Internal implementation of Set/HashSet (How Set Ensures Uniqueness) : Core Java Collection Interview Question

In core java interview questions , It is common to get bombarded with Collection framework questions . I was interviewed in Goldman Sachs , and there they asked a question where i got dumbstruck . Interviewer asked How do you implement Set in  Java in other words internal working of Hashset or How hashset works in java. That is , how will make sure each and every element is unique without using Set interfaces or Classes that implements Set Interface .

Read Also :   How hash map works in java

I gave the answer , although qualified the interview round as well , but the answer is far from satisfactory .
So I came back to  home and do some research . So finally i got the answer and sharing it with you .

How HashSet works internally in java




















Set Implementation Internally in Java

Each and every element in the set is unique .  So that there is no duplicate element in set .

So in java if we want to add elements in the set then we write code like this


public class JavaHungry {
    
    public static void main(String[] args)
    {
        // TODO Auto-generated method stub
        
        HashSet<Object> hashset = new HashSet<Object>();
        hashset.add(3);
        hashset.add("Java Hungry");
        hashset.add("Blogspot");
        System.out.println("Set is "+hashset);
    }
}

It will print the result :       Set is [3, Java Hungry, Blogspot]



Now let add duplicate element in the above code


public class JavaHungry {
    
    public static void main(String[] args)
    {
        HashSet<Object> hashset = new HashSet<Object>();
        hashset.add(3);
        hashset.add("Java Hungry");
        hashset.add("Blogspot");
        hashset.add(3);                     // duplicate elements
        hashset.add("Java Hungry");              // duplicate elements
        System.out.println("Set is "+hashset);
    }
}


It will print the result :       Set is [3, Java Hungry, Blogspot]


Now , what happens internally when you pass duplicate elements in the  add() method of the Set object , It will return false and do not add to the HashSet , as the element is already present .So far so good .

But the main problem arises that how it returns false . So here is the answer

When you open the HashSet implementation of the add() method in Java Apis that is rt.jar , you will find the following code in it


public class HashSet<E>
extends AbstractSet<E>
implements Set<E>, Cloneable, java.io.Serializable

{
    private transient HashMap<E,Object> map;
    
    // Dummy value to associate with an Object in the backing Map
    
    private static final Object PRESENT = new Object();
    
    
    
    public HashSet() {
        map = new HashMap<>();
    }
    
    // SOME CODE ,i.e Other methods in Hash Set
    
    
    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }
    
    
    
    // SOME CODE ,i.e Other methods in Hash Set
}


So , we are achieving uniqueness in Set,internally in java  through HashMap . Whenever you create an object of HashSet it will create an object of HashMap as you can see in the italic lines in the above code .
We already discussed   How HashMap works internally  in java .

As we know in HashMap each key is unique . So what we do in the set is that we pass the argument in the add(Elemene E) that is E as a key in the HashMap . Now we need to associate some value to the key , so what Java apis developer did is to pass the Dummy  value that is ( new Object () ) which is referred by Object reference PRESENT .

So , actually when you are adding a line in HashSet like  hashset.add(3)   what java does internally is that it will put that element E here 3 as a key in the HashMap(created during HashSet object creation) and some dummy value that is Object's object is passed as a value to the key .

Now if you see the code of the HashMap put(Key k,Value V) method , you will find something like this

 public V put(K key, V value) {
//Some code
}

The main point to notice in above code is that put (key,value) will return

1.  null , if key is unique and added to the map
2.  Old Value of the key , if key is duplicate

So , in HashSet add() method ,  we check the return value of map.put(key,value) method with null value
i.e.

   public boolean add(E e) {
            return map.put(e, PRESENT)==null;
       }

So , if map.put(key,value) returns null ,then
map.put(e, PRESENT)==null      will return true and element is added to the HashSet.



So , if map.put(key,value) returns old value of the key ,then
map.put(e, PRESENT)==null      will return false and element is  not added to the HashSet .

If you still have any doubts then please write in comments .

Read Also :   Best  Books for Learning Java

About The Author

Subham Mittal has worked in Oracle for 3 years.
Enjoyed this post? Never miss out on future posts by subscribing JavaHungry