Java Utility Classes Summary (2): No More Fear of Form String Handling - String / StringBuilder / StringBuffer

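String, StringBuilder, and StringBuffer are Java's string classes.
This article analyzes and summarizes them from the source code:
1. Commonly used String methods and points to watch out for;
2. The respective characteristics of StringBuilder and StringBuffer;
3. How mutable and immutable classes are implemented.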

1. A brief summary of the String class

1.1 String source code analysis

/**
 * The String class represents character strings. All
 * string literals in Java programs, such as {@code "abc"}, are
 * implemented as instances of this class.
 * The String class represents character strings; a literal such as "abc" is an instance of String.
 *
 * Strings are constant; their values cannot be changed after they
 * are created. String buffers support mutable strings.
 * Because String objects are immutable they can be shared. For example:
 * <blockquote><pre>
 *     String str = "abc";
 * </pre></blockquote><p>
 * is equivalent to:
 * <blockquote><pre>
 *     char data[] = {'a', 'b', 'c'};
 *     String str = new String(data);
 * </pre></blockquote><p>
 * Here are some more examples of how strings can be used:
 * <blockquote><pre>
 *     System.out.println("abc");
 *     String cde = "cde";
 *     System.out.println("abc" + cde);
 *     String c = "abc".substring(2,3);
 *     String d = cde.substring(1, 2);
 * </pre></blockquote>
 * <p>
 * String is immutable, so one String instance can be shared by several variables.
 * (I could not see how the examples above demonstrate sharing, though.)
 * Sharing as I understand it:
 * String str1 = "Hello";
 * String str2 = "Hello";
 * Here str1 and str2 refer to the same String object, i.e. str1 == str2 is true.
 *
 * The class {@code String} includes methods for examining
 * individual characters of the sequence, for comparing strings, for
 * searching strings, for extracting substrings, and for creating a
 * copy of a string with all characters translated to uppercase or to
 * lowercase. Case mapping is based on the Unicode Standard version
 * specified by the {@link java.lang.Character Character} class.
 * <p>
 * String provides the common operations: comparing, searching, extracting
 * substrings, and creating upper-case or lower-case copies.
 *
 * The Java language provides special support for the string
 * concatenation operator (+), and for conversion of
 * other objects to strings. String concatenation is implemented
 * through the {@code StringBuilder}(or {@code StringBuffer})
 * class and its {@code append} method.
 * String conversions are implemented through the method
 * {@code toString}, defined by {@code Object} and
 * inherited by all classes in Java. For additional information on
 * string concatenation and conversion, see Gosling, Joy, and Steele,
 * <i>The Java Language Specification</i>.
 *
 * <p> Unless otherwise noted, passing a <tt>null</tt> argument to a constructor
 * or method in this class will cause a {@link NullPointerException} to be
 * thrown.
 * Do not call methods on a String variable that holds null (String str = null);
 * doing so throws a NullPointerException.
 *
 * <p>A {@code String} represents a string in the UTF-16 format
 * in which <em>supplementary characters</em> are represented by <em>surrogate
 * pairs</em> (see the section <a href="Character.html#unicode">Unicode
 * Character Representations</a> in the {@code Character} class for
 * more information).
 * Index values refer to {@code char} code units, so a supplementary
 * character uses two positions in a {@code String}.
 * <p>The {@code String} class provides methods for dealing with
 * Unicode code points (i.e., characters), in addition to those for
 * dealing with Unicode code units (i.e., {@code char} values).
 *
 * @author  Lee Boynton
 * @author  Arthur van Hoff
 * @author  Martin Buchholz
 * @author  Ulf Zibis
 * @see     java.lang.Object#toString()
 * @see     java.lang.StringBuffer
 * @see     java.lang.StringBuilder
 * @see     java.nio.charset.Charset
 * @since   JDK1.0
 */
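The sharing mentioned in the annotation above can be checked directly. The following is a minimal sketch, relying on the fact that identical string literals are interned; the class name StringSharingDemo and its method names are made up for illustration and do not come from the JDK.

```java
class StringSharingDemo {
    // Two identical literals refer to the same interned String instance.
    static boolean literalsShareInstance() {
        String str1 = "Hello";
        String str2 = "Hello";
        return str1 == str2;
    }

    // new String(...) always allocates a distinct object, even for equal content.
    static boolean newStringSharesInstance() {
        String literal = "Hello";
        String copy = new String("Hello");
        return literal == copy;
    }

    // intern() returns the canonical pooled instance for the given content.
    static boolean internedCopySharesInstance() {
        String literal = "Hello";
        String copy = new String("Hello").intern();
        return literal == copy;
    }

    public static void main(String[] args) {
        System.out.println(literalsShareInstance());      // true
        System.out.println(newStringSharesInstance());    // false
        System.out.println(internedCopySharesInstance()); // true
    }
}
```

Because == compares references rather than content, application code should still compare strings with equals(); the == checks here only serve to make the sharing visible.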


1.2 String demo

// 1. A string is a sequence of Unicode characters
String str = "Java\u2122";
System.out.println(str);

// 2. Frequently used methods
String str2 = str.substring(0, 1);                            // "J"
String str3 = str.substring(str.length() - 1, str.length()); // the last char, \u2122

// 3. String is an immutable class
String str4 = str + "Hello";        // concatenated at run time: a new String object
String str5 = "Java\u2122Hello";    // a literal taken from the string pool
System.out.println(str4 == str5);   // false

// 4. A String built from a char[] is a separate object
String str8 = "hel";
char[] chs = {'h', 'e', 'l'};
String str9 = new String(chs);
System.out.println(str8 == str9);   // false

// 5. "" versus null
String str6 = "";   // str6 is a String object of length 0 with empty content
String str7 = null; // str7 is null: it refers to no object at all
                    // Do not call any method on str7 !!!
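Form fields often arrive as either "" or null, so the last point of the demo matters in practice. One possible way to guard against both is sketched below; FormStrings, isBlank, and safeTrim are hypothetical helper names chosen for this example, not JDK APIs.

```java
class FormStrings {
    // Hypothetical helper: true for null, "" and whitespace-only input.
    static boolean isBlank(String s) {
        return s == null || s.trim().isEmpty();
    }

    // Hypothetical helper: never returns null, so callers can chain methods safely.
    static String safeTrim(String s) {
        return s == null ? "" : s.trim();
    }

    public static void main(String[] args) {
        String empty = "";
        String missing = null;
        System.out.println(empty.length());       // 0
        System.out.println(isBlank(missing));     // true
        System.out.println(isBlank("  "));        // true
        System.out.println(safeTrim("  Java  ")); // Java
    }
}
```

The null check comes first in isBlank, so the trim() call is only reached when s refers to a real String object.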

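1.4 About Java's use of UTF-16 encoding

An open question (about Java's use of UTF-16):
A Java string is a sequence of char values, and the char type is a code unit that represents Unicode code points in the UTF-16 encoding.

Most characters can be represented by a single code unit, while supplementary characters need two code units (a surrogate pair).
So how do we work with strings that contain characters made up of two code units?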
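One way to approach this question is to use the code point methods that String already provides (codePointCount, offsetByCodePoints, codePointAt, and codePoints, the last one available since Java 8). The sketch below uses U+1D546 as a sample supplementary character; the class name CodePointDemo and its helper methods are made up for illustration.

```java
class CodePointDemo {
    // "a", then U+1D546 (a supplementary character stored as a surrogate pair), then "b"
    static final String TEXT = "a\uD835\uDD46b";

    // Number of char code units: the supplementary character counts twice.
    static int codeUnitCount() {
        return TEXT.length();
    }

    // Number of Unicode code points: the supplementary character counts once.
    static int codePointCount() {
        return TEXT.codePointCount(0, TEXT.length());
    }

    // Returns the n-th code point (0-based), skipping over surrogate pairs correctly.
    static int codePointAtPosition(int n) {
        int index = TEXT.offsetByCodePoints(0, n);
        return TEXT.codePointAt(index);
    }

    // All code points of the string, one int per character.
    static int[] allCodePoints() {
        return TEXT.codePoints().toArray();
    }

    public static void main(String[] args) {
        System.out.println(codeUnitCount());                             // 4
        System.out.println(codePointCount());                            // 3
        System.out.println(Integer.toHexString(codePointAtPosition(1))); // 1d546
        System.out.println(allCodePoints().length);                      // 3
    }
}
```

By contrast, charAt(1) on the same string would return only the high surrogate 0xD835, which is not a complete character on its own.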

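2. StringBuilder / StringBuffer from the source code

Summary: StringBuffer and StringBuilder are both mutable classes.

StringBuilder was introduced in JDK 1.5.

When the work is mainly storing and reading strings, use String.

When the work is mainly inserting into, updating, or deleting from strings, use StringBuilder (single-threaded) or StringBuffer (multi-threaded).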
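The rule of thumb above can be illustrated with a small sketch: a StringBuilder for single-threaded assembly, and a StringBuffer shared between two threads. The class BuilderChoiceDemo and both method names are invented for this example.

```java
import java.util.Arrays;
import java.util.List;

class BuilderChoiceDemo {
    // Single-threaded assembly: one StringBuilder is reused for every step
    // instead of creating a new String object per concatenation.
    static String joinWithComma(List<String> parts) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < parts.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            sb.append(parts.get(i));
        }
        return sb.toString();
    }

    // Shared between threads: every StringBuffer.append call is synchronized,
    // so none of the appended characters is lost.
    static int appendFromTwoThreads(int perThread) {
        StringBuffer shared = new StringBuffer();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                shared.append('x');
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(joinWithComma(Arrays.asList("a", "b", "c"))); // a,b,c
        System.out.println(appendFromTwoThreads(1000));                  // 2000
    }
}
```

Swapping the StringBuffer for a StringBuilder in appendFromTwoThreads would remove the synchronization, and the final length would no longer be guaranteed.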

2.2 StringBuilder source code analysis

StringBuilder and StringBuffer offer the same methods and differ only in thread safety and efficiency, so I only show part of the StringBuilder source.

/**
 * A mutable sequence of characters.  This class provides an API compatible
 * with {@code StringBuffer}, but with no guarantee of synchronization.
 * This class is designed for use as a drop-in replacement for
 * {@code StringBuffer} in places where the string buffer was being
 * used by a single thread (as is generally the case).   Where possible,
 * it is recommended that this class be used in preference to
 * {@code StringBuffer} as it will be faster under most implementations.
 *
 * StringBuilder is a mutable character sequence. Its API is compatible with
 * StringBuffer, but it is not thread-safe !!!
 * It was designed to replace StringBuffer in single-threaded code (where thread
 * safety is not required) so that programs run as fast as possible.
 *
 * <p>The principal operations on a {@code StringBuilder} are the
 * {@code append} and {@code insert} methods, which are
 * overloaded so as to accept data of any type. Each effectively
 * converts a given datum to a string and then appends or inserts the
 * characters of that string to the string builder.
 * The principal methods of StringBuilder are append() and insert().
 *
 * The {@code append} method always adds these characters at the end
 * of the builder; the {@code insert} method adds the characters at
 * a specified point.
 * <p>
 * append(): adds the new string at the end
 * insert(): adds the string at the given index
 *
 * For example, if {@code z} refers to a string builder object
 * whose current contents are "{@code start}", then
 * the method call {@code z.append("le")} would cause the string
 * builder to contain "{@code startle}", whereas
 * {@code z.insert(4, "le")} would alter the string builder to
 * contain "{@code starlet}".
 *
 * In general, if sb refers to an instance of a {@code StringBuilder},
 * then {@code sb.append(x)} has the same effect as
 * {@code sb.insert(sb.length(), x)}.
 *
 * Every string builder has a capacity (here, the space available for storing
 * characters). As long as the length of the
 * character sequence contained in the string builder does not exceed
 * the capacity, it is not necessary to allocate a new internal
 * buffer. If the internal buffer overflows, it is automatically made larger.
 * This says that StringBuilder adjusts its storage dynamically: if the space
 * runs out, it grows automatically.
 * In the parent class AbstractStringBuilder, the ensureCapacityInternal() method
 * is what guarantees enough space.
 * (StringBuffer has the same behavior, of course.)
 *
 * <p>Instances of {@code StringBuilder} are not safe for
 * use by multiple threads. If such synchronization is required then it is
 * recommended that {@link java.lang.StringBuffer} be used.
 *
 * <p>Unless otherwise noted, passing a {@code null} argument to a constructor
 * or method in this class will cause a {@link NullPointerException} to be
 * thrown.
 *
 * @author      Michael McCloskey
 * @see         java.lang.StringBuffer
 * @see         java.lang.String
 * @since       1.5
 */
public final class StringBuilder
    extends AbstractStringBuilder
    implements java.io.Serializable, CharSequence
{
    /**
     * Constructs a string builder with no characters in it and an
     * initial capacity of 16 characters.
     */
    public StringBuilder() {
        super(16);    // initial capacity: 16 characters
    }

    /**
     * Constructs a string builder with no characters in it and an
     * initial capacity specified by the {@code capacity} argument.
     *
     * @param      capacity  the initial capacity.
     * @throws     NegativeArraySizeException  if the {@code capacity}
     *               argument is less than {@code 0}.
     */
    public StringBuilder(int capacity) {
        super(capacity);
    }

    /**
     * Constructs a string builder initialized to the contents of the
     * specified string. The initial capacity of the string builder is
     * {@code 16} plus the length of the string argument.
     *
     * @param   str   the initial contents of the buffer.
     */
    public StringBuilder(String str) {
        super(str.length() + 16);    // with a String argument, the capacity is str.length() + 16 characters
        append(str);
    }

    // append() method
    // delete() method
    // deleteCharAt() method
    // replace() method
    // insert() method
    // indexOf() method
    // reverse() method
}
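The methods and capacity rules listed in the source above can be tried out with a short sketch. The class StringBuilderOpsDemo and its method names are made up for this example; only the documented initial capacities are asserted, since the exact growth policy is an implementation detail.

```java
class StringBuilderOpsDemo {
    // append() adds at the end: "start" becomes "startle".
    static String appendExample() {
        return new StringBuilder("start").append("le").toString();
    }

    // insert() adds at the given index: "start" becomes "starlet".
    static String insertExample() {
        return new StringBuilder("start").insert(4, "le").toString();
    }

    // The no-argument constructor starts with room for 16 characters.
    static int defaultCapacity() {
        return new StringBuilder().capacity();
    }

    // The String constructor starts with s.length() + 16.
    static int capacityFor(String s) {
        return new StringBuilder(s).capacity();
    }

    // Appending past the capacity makes the builder grow on its own.
    static boolean growsWhenFull() {
        StringBuilder sb = new StringBuilder();
        sb.append("12345678901234567"); // 17 characters, one more than 16
        return sb.length() == 17 && sb.capacity() >= 17;
    }

    // replace / deleteCharAt / delete / reverse all modify the same builder.
    static String editExample() {
        StringBuilder sb = new StringBuilder("Hello World");
        sb.replace(0, 5, "Howdy"); // "Howdy World"
        sb.deleteCharAt(5);        // "HowdyWorld"
        sb.delete(5, 10);          // "Howdy"
        sb.reverse();              // "ydwoH"
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(appendExample());      // startle
        System.out.println(insertExample());      // starlet
        System.out.println(defaultCapacity());    // 16
        System.out.println(capacityFor("start")); // 21
        System.out.println(growsWhenFull());      // true
        System.out.println(editExample());        // ydwoH
    }
}
```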

2.3 How mutable and immutable classes are implemented

So why are StringBuilder and StringBuffer mutable while String is not? Let's look at the source:

// from AbstractStringBuilder, the common parent class of StringBuilder and StringBuffer
abstract class AbstractStringBuilder implements Appendable, CharSequence {
    /**
     * The value is used for character storage.
     */
    char[] value;    // StringBuffer and StringBuilder both store their content in char[] value.
                     // Most importantly, the field is not final, so it can be pointed at a
                     // larger array, and the class has methods that write into it.

    /**
     * The count is the number of characters used.
     */
    int count;
}

// from String.java
public final class String
    implements java.io.Serializable, Comparable<String>, CharSequence {
    /** The value is used for character storage. */
    private final char value[];    // private and final, and no String method ever modifies
                                   // the array: this is why String is immutable
}
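The difference can also be observed from the outside, without reading the JDK source. The sketch below is only an illustration; MutabilityDemo and its method names are invented here. It also shows that final on an array only fixes the reference, so String's immutability additionally depends on the field being private and never written to after construction.

```java
class MutabilityDemo {
    // String methods return a new object; the receiver is left untouched.
    static String originalAfterCalls() {
        String s = "abc";
        s.concat("def");   // result discarded on purpose
        s.toUpperCase();   // result discarded on purpose
        return s;          // still "abc"
    }

    // StringBuilder.append modifies the receiver and returns the same instance.
    static boolean appendReturnsSameInstance() {
        StringBuilder sb = new StringBuilder("abc");
        StringBuilder returned = sb.append("def");
        return returned == sb && sb.toString().equals("abcdef");
    }

    // A final array reference cannot be reassigned, but its elements can still be written.
    static char[] finalArrayStillWritable() {
        final char[] value = {'a', 'b', 'c'};
        value[0] = 'x';
        return value;
    }

    public static void main(String[] args) {
        System.out.println(originalAfterCalls());                  // abc
        System.out.println(appendReturnsSameInstance());           // true
        System.out.println(new String(finalArrayStillWritable())); // xbc
    }
}
```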


