One of the most popular and frequently asked question from Core Java interview is, How HashMap works in Java or Internal Working of HashMap or How get() and put() methods of HashMap works or How HashMap ensures uniqueness of its keys.

How HashMap works in Java

So very simple answer to the question is, hashmap works on the principal of hashing, Hashing is a mechanism to calculate a identification numbers of the objects based on some algorithm. In Java every hash function should follow these contracts:

1) Whenever it is invoked on the same object more than once during an execution of a Java application, the hashCode method must consistently return the same integer, provided no information used in equals comparisons on the object is modified. This integer need not remain consistent from one execution of an application to another execution of the same application.

2) If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.

3) It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.

How put() method of HashMap works internally

HashMap uses put(K,V) method to store values in it, where the key should be unique; if a key already exists, a new put(K,V) operation with same key will override previously stored value.

HashMap stores values in key value pairs, every key value pair is stored as an instance of static inner class named Entry, here is the code of Entry class from Java docs:

static class Entry<K ,V> implements Map.Entry<K ,V> 
{ 
    final K key; 
    V value; 
    Entry<K ,V> next; 
    final int hash; 
    ...//More code goes here 
}
So every key value is stored as an instance of Entry class, key is marked as final; meaning that once a entry is added the key can not be changed. There are two more fields hash and next we will see them in next section of this article.

Internally HashCode maintains an array of type Entry to store values, the syntax is as follows:

transient Entry[] table; 
All the key value pairs are stored in this array based on their hashCode().

Now lets look at the put() function code from Java docs:
	     public V More ...put(K key, V value) {
	         if (key == null)
	             return putForNullKey(value);
	         int hash = hash(key.hashCode());
	         int i = indexFor(hash, table.length);
	         for (Entry<K,V> e = table[i]; e != null; e = e.next) {
	             Object k;
	             if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
	                 V oldValue = e.value;
	                 e.value = value;
	                 e.recordAccess(this);
	                 return oldValue;
	             }
	         }
	 
	         modCount++;
	         addEntry(hash, key, value, i);
	         return null;
	     }
1) As we can see the code above, once a new key value pair is entered using put(K,V), it checks for null key, if the key is null it's stored at 0th index of Entry array; because hashCode() for a null would be 0.

2) If key is not null, a hashCode() is calculated of the key based on it's hashCode() implementation. Based on the value returned the index of Entry array is determined where to put the Entry object of key value pair. Because of some poor hashCode() implementation the resulting hashCode may vary from too low to too high value and hence there is another hash() function of HashMap itself to calculate modified hash code to fit in the range of Entry Array.

3) Now int i = indexFor(hash, table.length); Determines the actual index value to store Entry object in the array.

4) So far so good, key value pairs are stored on an array based of hash calculations, but as we know two non equal objects can have same hashCode() that means for one indexc position of array there can be two or more Entry object; now how to store more than one Entry object having same index ?

The answer is linked list, in case of collision Entry objects are stored as a likedList, from Entry class code you can see there is a property next that determines next Entry object location of same hashCode.

A number of Entry objects with same hashCode(), makes a linkedList of Entry objects on same array index.

This goes like this, when a new entry comes in, hashCode() is calculated and index of array is decided to store Entry Object, if a Entry object is already there on that index, next variable of pre stored Entry is checked; if null new Entry stays there, if next is not null the list is traversed till the end to get a null and new entry resides there.

How HashMap ensure the uniqueness of keys.

As we have seen when a new Entry comes up based on hashCode() array index position is determined, and the linkedList is traversed at the end, during traversal new Entry object's key is compared with current node's key using equals() for true equality. If new key is found equal to the exiting key, the value of existing Entry node in the linkedList is replaced with the new one.

How get() methods of HashMap works internally

We have seen how put() method is used to store key value pairs on an array of type Entry, how objects having same hashCode() are stored forming a linkedList of Entry objects and how to replace unique keys.

The get() method too works the similar way, a key comes up, hash is calculated to determine the index on Entry array. Now using equals() the exact key is compared in the linkedList. Once matched the value is returned back. Here is the similar code from Java Docs:
public V get(Object key) { 
    if (key == null) 
    return getForNullKey(); 
    int hash = hash(key.hashCode()); 
    for (Entry<K , V> e = table[indexFor(hash, table.length)]; e != null; e = e.next) { 
        Object k; 
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) 
            return e.value; 
    } 
    return null; 
}

Improvement in Java 8

As we have seen the linked list may be very long in case of a big number of objects stored in the HashMap. Starting from Java 8 the algorithm of storing objects of same hash is slightly changed. Now if the number of objects of same hashCode() increases beyond a threshold. The linkedList is converted to a balanced tree and this will improve the performance from O(n) to O(logn) in worst case.

This is it for this lesson, we have see how hashmap works internally. How key value pairs are stored using put() method. How hashMap ensured uniqueness of keys and how get() method returns value based on a key. In upcoming articles we will see more about Java and Other technologies.
  • By Techburps.com
  • Aug 25, 2015
  • Core Java