Sunday, December 4, 2016

Internal working of HashSet

Set is a collection that contains no duplicate elements. So, it can contain at most one null.
HashSet implements Set interface in java. It is not synchronized and is not thread safe.

Example: Showing HashSet does not allow duplicate elements.

public class Test {
   public static void main(String[] args) throws IOException {
        HashSet hashSet = new HashSet();
        hashSet.add(20);
        hashSet.add("Test");
        hashSet.add("XYZ");
        hashSet.add(20);
        hashSet.add("Test");
        System.out.println("Set = "+hashSet);
   }
}


Output:
Set = [20, Test, XYZ]


Internal Working:
Internally when the duplicate elements are passed to the HashSet, the add(e) method in HashSet returns false when the element exists, else it returns true.

When we have a look at the HashSet.java in java API, we can see the following code:

public class HashSet extends AbstractSet implements Set, Cloneable, java.io.Serializable
{
    private transient HashMap<E,Object> map; 

    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();

    public HashSet() {
        map = new HashMap<>();
    }

    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }

    /**
    * Some code
    */
}

Set achieves the uniqueness in its elements through HashMap. In HashMap, each key is unique. So, when an object of HashSet is created, it will create an object of HashMap.

When an element is passed to Set, it is added as a key in the HashMap in the add(Element e) method. Now, a value needs to be associated to the key. Java uses a Dummy value (new Object) which is called PRESENT in HashSet.

In HashMap, the put(Key k,Value V) method returns null, if the key is unique and the key gets added to the map. It returns old value of the key, if the key is duplicated.

public V put(K key, V value) {
/* Some code */
}

In add(e) method, the return value of map.put(key,value) method is checked with null value.

public boolean add(E e) {
return map.put(e, PRESENT)==null;
}

If map.put(key,value) returns null, then map.put(e, PRESENT)==null will return true and element gets added to the HashSet.

If map.put(key,value) returns the old value of the key, then map.put(e, PRESENT)==null will return false and element wont be added to the HashSet.

remove() method also works in the same way.

public boolean remove(Object o) {
    return map.remove(o)==PRESENT;
}

As you know HashSet uses same values for all keys, it is really important to override equals() and hashCode() for any object you are going to store in HashSet thus making the object Immutable.

No comments:

Post a Comment

Home