What is LinkedHashSet ?
According to Oracle docs ,
LinkedHashSet is the Hashtable and linked list implementation of the Set interface with predictable iteration order.
The linked list defines the iteration ordering, which is the order in which elements were inserted into the set. Insertion order is not affected if an element is re-inserted into the set.
Read Also : How TreeMap works internally in java
Why we need LinkedHashSet when we already have the HashSet and TreeSet ?
HashSet and TreeSet classes were added in jdk 1.2 while LinkedHashSet was added to the jdk in java version 1.4
HashSet provides constant time performance for basic operations like (add, remove and contains) method but elements are in chaotic ordering i.e unordered.
In TreeSet elements are naturally sorted but there is increased cost associated with it .
So , LinkedHashSet is added in jdk 1.4 to maintain ordering of the elements without incurring increased cost.
How LinkedHashSet Works Internally in Java ?
Before understanding how LinkedHashSet works internally in java in detail, we need to understand two terms initial capacity and load factor .
What is Initial capacity and load factor?
The capacity is the number of buckets(used to store key and value) in the Hash table , and the initial capacity is simply the capacity at the time Hash table is created.
The load factor is a measure of how full the Hash table is allowed to get before its capacity is automatically increased.
Constructor of LinkedHashSet depends on above two parameters initial capacity and load factor .
There are four constructors present in the LinkedHashSet class .
All constructors have the same below pattern :
// Constructor 1
public LinkedHashSet (int initialCapacity , float loadFactor)
{ super(initialCapacity , loadFactor , true); }
Note : If initialCapacity or loadFactor parameter value is missing during LinkedHashSet object creation , then default value of initialCapacity or loadFactor is used .
Default value for initialCapacity : 16 ,
Default value for loadFactor : 0.75f
For example,
check the below overloaded constructor , loadFactor is missing in the LinkedHashSet constructor argument. So during super() call , we use the default value of the loadFactor(0.75f).
// Constructor 2
public LinkedHashSet (int initialCapacity)
{ super(initialCapacity , 0.75f , true); }
check the below overloaded constructor , initialCapacity and loadFactor both are missing in the LinkedHashSet constructor argument. So during super() call , we use the default value of both initialCapacity(16) and loadFactor(0.75f).
// Constructor 3
public LinkedHashSet ()
{ super(16 , 0.75f , true); }
below is the last overloaded constructor which uses Collection in the LinkedHashSet constructor argument. So during super() call , we use the default value of loadFactor(0.75f).
// Constructor 4
public LinkedHashSet (Collection c)
{ super(Math.max(2*c.size() ,11) , 0.75f , true); }
Note : Since LinkedHashSet extends HashSet class.
Above all the 4 constructors are calling the super class (i.e HashSet ) constructor , given below
public HashSet (int initialCapacity , float loadFactor , boolean dummy)
{ 1. map = new LinkedHashMap<>(initialCapacity , loadFactor); }
In the above HashSet constructor , there are two main points to notice :
a. We are using extra boolean parameter dummy . It is used to distinguish other int, float constructors present in the HashSet class.
b. Internally it is creating a LinkedHashMap object passing the initialCapacity and loadFactor as parameters.
How LinkedHashSet Maintains Unique Elements ?
public class HashSet<E> extends AbstractSet<E> implements Set<E>, Cloneable, java.io.Serializable { private transient HashMap<E,Object> map; // Dummy value to associate with an Object in the backing Map private static final Object PRESENT = new Object(); public HashSet(int initialCapacity , float loadFactor , boolean dummy) {
map = new LinkedHashMap<>(initialCapacity , loadFactor);
} // SOME CODE ,i.e Other methods in Hash Set public boolean add(E e) {
return map.put(e, PRESENT)==null;
} // SOME CODE ,i.e Other methods in Hash Set }
So , we are achieving uniqueness in LinkedHashSet,internally in java through LinkedHashMap . Whenever you create an object of LinkedHashSet it will indirectly create an object of LinkedHashMap as you can see in the italic lines of HashSet constructor.
Read Also : How LinkedHashMap works Internally in Java
As we know in LinkedHashMap each key is unique . So what we do in the LinkedHashSet is that we pass the argument in the add(Elemene E) that is E as a key in the LinkedHashMap . Now we need to associate some value to the key , so what Java apis developer did is to pass the Dummy value that is ( new Object () ) which is referred by Object reference PRESENT .
So , actually when you are adding a line in LinkedHashSet like linkedhashset.add(5) what java does internally is that it will put that element E here 5 as a key in the LinkedHashMap(created during LinkedHashSet object creation) and some dummy value that is Object's object is passed as a value to the key .
Since LinkedHashMap put(Key k , Value v ) method does not have its own implementation . LinkedHashMap put(Key k , Value v ) method uses HashMap put(Key k , Value v ) method.
Now if you see the code of the HashMap put(Key k,Value v) method , you will find something like this
public V put(K key, V value) {
//Some code
}
The main point to notice in above code is that put (key,value) will return
1. null , if key is unique and added to the map
2. Old Value of the key , if key is duplicate
So , in LinkedHashSet add() method , we check the return value of map.put(key,value) method with null value
i.e.
public boolean add(E e) {
return map.put(e, PRESENT)==null;
}
So , if map.put(key,value) returns null ,then
map.put(e, PRESENT)==null will return true and element is added to the LinkedHashSet.
So , if map.put(key,value) returns old value of the key ,then
map.put(e, PRESENT)==null will return false and element is not added to the LinkedHashSet .
How LinkedHashSet Maintains Insertion Order ?
LinkedHashSet differs from HashSet because it maintains the insertion order .
According to LinkedHashSet Oracle docs ,
LinkedHashSet implementation differs from HashSet in that it maintains a doubly-linked list running through all of its entries
LinkedHashSet internally uses LinkedHashMap to add elements to its object.
What is Entry
LinkedHashMap consists of a static inner class named as Entry
Insertion Order of the LinkedHashMap is maintained by two Entry
transient LinkedHashMap.Entry
transient LinkedHashMap.Entry
For double linked list we need to maintain the previous and next Entry objects for each Entry object .
Entry fields before and after are used to store the references to the previous and next Entry objects .
static class Entryextends HashMap.Node { Entry before, after ;
Entry( int hash , K key , V value , Nodenext ) {
super(hash,key,value,next);
}
}
Please mention in the comments in case if you have any questions regarding how LinkedHashSet works internally in Java