java中扩容机制

soulballad约 2757 字大约 9 分钟

ArrayList 扩容机制

ArrayList 构造函数

/**
 * Default initial capacity.
 */
private static final int DEFAULT_CAPACITY = 10;

/**
 * Shared empty array instance used for empty instances.
 */
private static final Object[] EMPTY_ELEMENTDATA = {};

private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

public ArrayList() {
    this.elementData = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;
}

public ArrayList(int initialCapacity) {
    if (initialCapacity > 0) {
        this.elementData = new Object[initialCapacity];
    } else if (initialCapacity == 0) {
        this.elementData = EMPTY_ELEMENTDATA;
    } else {
        throw new IllegalArgumentException("Illegal Capacity: "+ initialCapacity);
    }
}

通过构造函数可以看出arraylist的默认容量是10，在创建 arraylist 时如果传入一个capacity，就创建一个指定大小的数组，否则创建一个空数组

扩容的时机

arraylist 中调用 add 方法源码

public boolean add(E e) {
    ensureCapacityInternal(size + 1);  // Increments modCount!!
    elementData[size++] = e;
    return true;
}

add 方法中调用一个 ensureCapacityInternal，然后再添加一个元素

private static int calculateCapacity(Object[] elementData, int minCapacity) {
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
        // 如果最小容量小于默认容量，则取默认容量
        return Math.max(DEFAULT_CAPACITY, minCapacity);
    }
    return minCapacity;
}

private void ensureCapacityInternal(int minCapacity) {
    ensureExplicitCapacity(calculateCapacity(elementData, minCapacity));
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;

    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

在 ensureExplicitCapacity 中调用一个grow方法

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

可以看到新的容量大小=旧容量 * 1.5，即扩充为原来的1.5倍，然后在调用 Arrays 中的copyOf方法，把已有数据进行复制

public static <T> T[] copyOf(T[] original, int newLength) {
    return (T[]) copyOf(original, newLength, original.getClass());
}

public static <T,U> T[] copyOf(U[] original, int newLength, Class<? extends T[]> newType) {
    @SuppressWarnings("unchecked")
    T[] copy = ((Object)newType == (Object)Object[].class)
        ? (T[]) new Object[newLength]
        : (T[]) Array.newInstance(newType.getComponentType(), newLength);
    System.arraycopy(original, 0, copy, 0,
                     Math.min(original.length, newLength));
    return copy;
}

然后就完成了扩容的过程。

Vector 扩容机制

vector 的构造函数

public Vector() {
    this(10);
}

public Vector(int initialCapacity) {
    this(initialCapacity, 0);
}

public Vector(int initialCapacity, int capacityIncrement) {
    super();
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal Capacity: "+
                                           initialCapacity);
    this.elementData = new Object[initialCapacity];
    this.capacityIncrement = capacityIncrement;
}

vector 维护了一个Object类型的数组和一个int类型的标记，当调用 add 方法添加元素的时候

public synchronized boolean add(E e) {
    modCount++;
    ensureCapacityHelper(elementCount + 1);
    elementData[elementCount++] = e;
    return true;
}

和上面 arrayList 类似，也有调用一个 ensureCapacityXXX 的方法，来看一下这个方法

private void ensureCapacityHelper(int minCapacity) {
    // overflow-conscious code
    if (minCapacity - elementData.length > 0)
        grow(minCapacity);
}

也是调用了 grow 方法

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + ((capacityIncrement > 0) ?
                                     capacityIncrement : oldCapacity);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    elementData = Arrays.copyOf(elementData, newCapacity);
}

看到这里就发现，vector 和 arraylist的扩容机制很相似，不同的是 vector 中的 add 方法上面添加了 synchronized关键字，保证了线程安全，arraylist 没有。

HashMap 扩容机制

同上，先来看下 HashMap 的构造函数

static final int MAXIMUM_CAPACITY = 1 << 30;
static final float DEFAULT_LOAD_FACTOR = 0.75f; 

public HashMap() {
    this.loadFactor = DEFAULT_LOAD_FACTOR; // all other fields defaulted
}

public HashMap(int initialCapacity) {
    this(initialCapacity, DEFAULT_LOAD_FACTOR);
}

public HashMap(int initialCapacity, float loadFactor) {
    if (initialCapacity < 0)
        throw new IllegalArgumentException("Illegal initial capacity: " +
                                           initialCapacity);
    if (initialCapacity > MAXIMUM_CAPACITY)
        initialCapacity = MAXIMUM_CAPACITY;
    if (loadFactor <= 0 || Float.isNaN(loadFactor))
        throw new IllegalArgumentException("Illegal load factor: " + loadFactor);
    this.loadFactor = loadFactor;
    this.threshold = tableSizeFor(initialCapacity);
}

有个参数负载因子 load_factor ，默认为 0.75f，在构造函数的最后阶段调用了 tableSizeFor 方法

static final int tableSizeFor(int cap) {
    int n = cap - 1;
    n |= n >>> 1;
    n |= n >>> 2;
    n |= n >>> 4;
    n |= n >>> 8;
    n |= n >>> 16;
    return (n < 0) ? 1 : (n >= MAXIMUM_CAPACITY) ? MAXIMUM_CAPACITY : n + 1;
}

这里面涉及到位移和位运算，先来说下左移右移和位运算：

位移运算符：以4为例，4的32位二进制为 [0000 0000 0000 0000 0000 0000 0000 0100]
左移(<<)：整体向左移动，移除左边的高位，右边补0，左移n位，相当于2的n次方；
右移(>>)：整体向右移动，移除右边的低位，左边补位，正数补0，负数补1；
无符号右移(>>>)：整体向右移动，移除右边的低位，左边补位，无论正负都补0。
例子：4<<1=8，4>>1=2，-4>>1=-2，4>>>1=2，-4>>>1=2147483646
位运算：转化为二进制比较
与：1 & 1= 1，1 & 0 = 0， 0 & 1 = 0，0 & 0 = 0
或：1 | 1= 1，1 | 0 = 1， 0 | 1 = 1，0 | 0 = 0
异或：1 ^ 1= 0，1 ^ 0 = 1， 0 ^ 1 =1，0 ^ 0 = 0；相同的0，不同得1

再来看上面的方法，假设 n 的二进制为 01xxx...xxx，按照顺序运算：

n >>> 1：右移1位，001xxx...xxx，再异或运算：011xxx...xxx；
n >>> 2：右移2位，00011xxx...xxx，再异或运算：01111xxx...xxx
n >>> 4：右移4位，000001111xxx...xxx，再异或运算：011111111xxx...xxx
...

依此类推，右移1位前面有2个1，右移2位前面有4个1，最后右移16位，就把所有位数都变成1，然后再让结果n+1，就得到2的整数次幂。

接下来看下 hashmap 的 put 方法

public V put(K key, V value) {
    return putVal(hash(key), key, value, false, true);
}

final V putVal(int hash, K key, V value, boolean onlyIfAbsent, boolean evict) {
    Node<K,V>[] tab; Node<K,V> p; int n, i;
    if ((tab = table) == null || (n = tab.length) == 0)
        n = (tab = resize()).length;
    if ((p = tab[i = (n - 1) & hash]) == null)
        tab[i] = newNode(hash, key, value, null);
    else {
        Node<K,V> e; K k;
        if (p.hash == hash && ((k = p.key) == key || (key != null && key.equals(k))))
            e = p;
        else if (p instanceof TreeNode)
            e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
        else {
            for (int binCount = 0; ; ++binCount) {
                if ((e = p.next) == null) {
                    p.next = newNode(hash, key, value, null);
                    if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                        treeifyBin(tab, hash);
                    break;
                }
                if (e.hash == hash &&
                    ((k = e.key) == key || (key != null && key.equals(k))))
                    break;
                p = e;
            }
        }
        if (e != null) { // existing mapping for key
            V oldValue = e.value;
            if (!onlyIfAbsent || oldValue == null)
                e.value = value;
            afterNodeAccess(e);
            return oldValue;
        }
    }
    ++modCount;
    if (++size > threshold)
        resize();
    afterNodeInsertion(evict);
    return null;
}

里面有个 resize 方法

final Node<K,V>[] resize() {
    Node<K,V>[] oldTab = table;
    int oldCap = (oldTab == null) ? 0 : oldTab.length;
    int oldThr = threshold;
    int newCap, newThr = 0;
    if (oldCap > 0) {
        if (oldCap >= MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return oldTab;
        }
        else if ((newCap = oldCap << 1) < MAXIMUM_CAPACITY &&
                 oldCap >= DEFAULT_INITIAL_CAPACITY)
            newThr = oldThr << 1; // double threshold
    }
    else if (oldThr > 0) // initial capacity was placed in threshold
        newCap = oldThr;
    else {               // zero initial threshold signifies using defaults
        newCap = DEFAULT_INITIAL_CAPACITY;
        newThr = (int)(DEFAULT_LOAD_FACTOR * DEFAULT_INITIAL_CAPACITY);
    }
    if (newThr == 0) {
        float ft = (float)newCap * loadFactor;
        newThr = (newCap < MAXIMUM_CAPACITY && ft < (float)MAXIMUM_CAPACITY ?
                  (int)ft : Integer.MAX_VALUE);
    }
    threshold = newThr;
    @SuppressWarnings({"rawtypes","unchecked"})
    Node<K,V>[] newTab = (Node<K,V>[])new Node[newCap];
    table = newTab;
    if (oldTab != null) {
        for (int j = 0; j < oldCap; ++j) {
            Node<K,V> e;
            if ((e = oldTab[j]) != null) {
                oldTab[j] = null;
                if (e.next == null)
                    newTab[e.hash & (newCap - 1)] = e;
                else if (e instanceof TreeNode)
                    ((TreeNode<K,V>)e).split(this, newTab, j, oldCap);
                else { // preserve order
                    Node<K,V> loHead = null, loTail = null;
                    Node<K,V> hiHead = null, hiTail = null;
                    Node<K,V> next;
                    do {
                        next = e.next;
                        if ((e.hash & oldCap) == 0) {
                            if (loTail == null)
                                loHead = e;
                            else
                                loTail.next = e;
                            loTail = e;
                        }
                        else {
                            if (hiTail == null)
                                hiHead = e;
                            else
                                hiTail.next = e;
                            hiTail = e;
                        }
                    } while ((e = next) != null);
                    if (loTail != null) {
                        loTail.next = null;
                        newTab[j] = loHead;
                    }
                    if (hiTail != null) {
                        hiTail.next = null;
                        newTab[j + oldCap] = hiHead;
                    }
                }
            }
        }
    }
    return newTab;
}

在HashMap中有一下两个属性和扩容相关：

int threshold;
final float loadFactor;

其中threshold = Length * loadFactor，Length表示table数组的长度（默认值是16），loadFactor为负载因子（默认值是0.75），阀值threshold表示当table数组中存储的元素超过这个阀值的时候，就需要扩容了。以默认长度16，和默认负载因子0.75为例，threshold = 16 * 0.75 = 12，即当table数组中存储的元素个数超过12个的时候，table数组就该扩容了。

当然Java中的数组是无法自动扩容的，方法是使用一个新的数组代替已有的容量小的数组，然后将旧数组中的元素经过重新计算放到新数组中，那么怎样对旧元素进行重新映射呢？

其实很简单，由于我们在扩容时，是使用2的幂扩展，即数组的长度扩大到原来的2倍, 4倍, 8倍…，因此在resize时(Length - 1)这部分相当于在高位新增一个或多个1bit，我们以扩大到原来的两倍为例说明：

1562679650796

(a)中n为16，(b)中n扩大到两倍为32，相当于(n - 1)这部分的高位多了一个1, 然后和原hash码作与操作，这样元素在数组中映射的位置要么不变，要不向后移动16个位置：

1562679650796

这个设计确实非常的巧妙，既省去了重新计算hash值的时间，而且同时，由于新增的1bit是0还是1可以认为是随机的，因此resize的过程，均匀的把之前的冲突的节点分散到新的bucket了。这一块就是JDK1.8新增的优化点。有一点注意区别，JDK1.7中resize的时候，旧链表迁移新链表的时候，如果在新表的数组索引位置相同，则链表元素会倒置，但是从上图可以看出，JDK1.8不会倒置。

HashMap底层实现原理open in new window

HashSet 扩容机制

HashSet 的构造函数

private transient HashMap<E,Object> map;

public HashSet() {
    map = new HashMap<>();
}

public HashSet(int initialCapacity) {
    map = new HashMap<>(initialCapacity);
}

可以看到 hashSet 中维护一个 hashMap，初始化 hashset 其实是在初始化 hashMap，再看下 add 方法

private static final Object PRESENT = new Object();
public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

add 方法其实是向 map 中存储元素，key 不相同，value 是每次 new 一个 object 对象，所以 hashSet 的扩容机制实际上就是 hashMap 的扩容方法。

StringBuilder 扩容机制

StringBuilder 的构造函数

public final class StringBuilder extends AbstractStringBuilder 
		implements java.io.Serializable, CharSequence{

    static final long serialVersionUID = 4383685877147921099L;

    public StringBuilder() {
        super(16);
    }

    public StringBuilder(int capacity) {
        super(capacity);
    }
}

StringBuilder 继承自 AbstractStringBuilder，初始化 StringBuilder 实际上初始化 AbstractStringBuilder

abstract class AbstractStringBuilder implements Appendable, CharSequence {

    char[] value;

    int count;

    AbstractStringBuilder() {}
    
    AbstractStringBuilder(int capacity) {
        value = new char[capacity];
    }
}

AbstractStringBuilder 中维护一个 char 类型的数组，和一个 int 类型的计数器，当使用 append 方法向StringBuilder 中添加元素时

public StringBuilder append(char[] str) {
    super.append(str);
    return this;
}

调用父类的 append 方法

public AbstractStringBuilder append(char[] str) {
    int len = str.length;
    ensureCapacityInternal(count + len);
    System.arraycopy(str, 0, value, count, len);
    count += len;
    return this;
}

可以看到有调用一个 ensureCapacityInternal 方法，这个方法在上面 arraylist 扩容时见过，这里的功能也和上面类似

private void ensureCapacityInternal(int minimumCapacity) {
    // overflow-conscious code
    if (minimumCapacity - value.length > 0) {
        value = Arrays.copyOf(value, newCapacity(minimumCapacity));
    }
}

private int newCapacity(int minCapacity) {
    // overflow-conscious code
    int newCapacity = (value.length << 1) + 2;
    if (newCapacity - minCapacity < 0) {
        newCapacity = minCapacity;
    }
    return (newCapacity <= 0 || MAX_ARRAY_SIZE - newCapacity < 0)
        ? hugeCapacity(minCapacity)
        : newCapacity;
}

基本上到这里，就已经完全弄清楚了 StringBuilder 的扩容原理。

StringBuffer 扩容机制

StringBuffer 构造函数

public final class StringBuffer extends AbstractStringBuilder 
 			implements java.io.Serializable, CharSequence {

    public StringBuffer() {
        super(16);
    }

    public StringBuffer(int capacity) {
        super(capacity);
    }
}

可以看到 StringBuffer 也继承自 AbstractStringBuilder 类，那肯定已维护一个数组和一个计数器，再看下append方法

@Override
public synchronized StringBuffer append(char[] str) {
    toStringCache = null;
    super.append(str);
    return this;
}

StringBuffer 的 append 方法上有一个 synchronized 关键字，所以它是线程安全的，同时它也调用父类的append 方法来添加元素

public AbstractStringBuilder append(char[] str) {
    int len = str.length;
    ensureCapacityInternal(count + len);
    System.arraycopy(str, 0, value, count, len);
    count += len;
    return this;
}

这里看起来就和上面的是一样，所以它的扩容原理和 StringBuilder 是相同的。

注意一点，向 stringBuffer 中添加 int 类型，bool 类型

public AbstractStringBuilder append(int i) {
    if (i == Integer.MIN_VALUE) {
        append("-2147483648");
        return this;
    }
    // stringSize 方法用来判断 i 的位数，比如33是2位，333是3位
    int appendedLength = (i < 0) ? Integer.stringSize(-i) + 1
        : Integer.stringSize(i);
    int spaceNeeded = count + appendedLength;
    ensureCapacityInternal(spaceNeeded);
    Integer.getChars(i, spaceNeeded, value);
    count = spaceNeeded;
    return this;
}

// 添加布尔类型时，转为对应字符串
public AbstractStringBuilder append(boolean b) {
    if (b) {
        ensureCapacityInternal(count + 4);
        value[count++] = 't';
        value[count++] = 'r';
        value[count++] = 'u';
        value[count++] = 'e';
    } else {
        ensureCapacityInternal(count + 5);
        value[count++] = 'f';
        value[count++] = 'a';
        value[count++] = 'l';
        value[count++] = 's';
        value[count++] = 'e';
    }
    return this;
}

java中扩容机制

# ArrayList 扩容机制

# Vector 扩容机制

# HashMap 扩容机制

# HashSet 扩容机制

# StringBuilder 扩容机制

# StringBuffer 扩容机制