How TreeSet Works Internally in Java : TreeSet Interview Questions

We have already discussed difference between TreeSet and HashSet in java . The problem with java developers is that they know what TreeSet/TreeMap do but not  how. TreeSet is not frequently used collections framework class . So it is rare to get asked questions about this class  in the interview. Although , you should at least know when to prefer  TreeSet class  over other collection classes.


Read Also :  Difference between Iterator and ListIterator in Java with Example


What is TreeSet ?

TreeSet is like HashSet which contains the unique elements only but in a sorted manner. The major difference is that TreeSet provides a total ordering of the elements. The elements are ordered using their natural ordering, or by a Comparator typically provided at sorted set creation time. The set's iterator will traverse the set in ascending element order.

How TreeSet works in Java ?

If you look into the TreeSet Api in rt.jar , you will find the following code :



    
  public class TreeSet<E>
              extends AbstractSet<E>
              implements NavigableSet<E>, Cloneable, java.io.Serializable

{
    private transient NavigableMap<E,Object> map;
    
    // Dummy value to associate with an Object in the backing Map
    
    private static final Object PRESENT = new Object();
    
    
    
    public TreeSet() {

        this(new TreeMap<E,Object>());

    }
    
    // SOME CODE ,i.e Other methods in TreeSet
    
    
    public boolean add(E e) {

        return map.put(e, PRESENT)==null;
    }   

    // SOME CODE ,i.e Other methods in TreeSet

}

According to TreeSet Oracle doc :

TreeSet is a NavigableSet implementation backed by TreeMap instance. 


Hence , whenever you are adding element to the TreeSet object , it works just like HashSet , The only difference is that instead of HashMap here we have TreeMap object in the constructor.

As we know in TreeMap each key is unique as it internally uses HashMap . So what we do in the TreeSet is that we pass the argument in the add(Elemene E) that is E as a key in the TreeSet . Now we need to associate some value to the key , so what Java apis developer did is to pass the Dummy  value that is ( new Object () ) which is referred by Object reference PRESENT .

So , actually when you are adding a line in TreeSet like  treeset.add(3)   what java does internally is that it will put that element E here 3 as a key in the TreeMap(created during TreeSet object creation) and some dummy value that is Object's object is passed as a value to the key .

Now if you see the code of the TreeMap put(Key k,Value V) method , you will find something like this

 public V put(K key, V value) {
//Some code
}

The main point to notice in above code is that put (key,value) will return

1.  null , if key is unique and added to the map
2.  Old Value of the key , if key is duplicate

So , in TreeSet add() method ,  we check the return value of map.put(key,value) method with null value
i.e.

   public boolean add(E e) {
            return map.put(e, PRESENT)==null;
       }

So , if map.put(key,value) returns null ,then
map.put(e, PRESENT)==null      will return true and element is added to the TreeSet.



So , if map.put(key,value) returns old value of the key ,then
map.put(e, PRESENT)==null      will return false and element is  not added to the TreeSet .

How to find the index of  any element in the TreeSet ?

There are many ways to find out the index of element in the TreeSet. Below is the one liner :

set.headSet(element).size()


Note  :  headSet(element) method returns the sub TreeSet(portion of TreeSet) whose values  are less than input element. Then we are calling size() method on the sub TreeSet , which returns the index of the element as sub TreeSet is already sorted.


Why and when we use TreeSet ?

We prefer TreeSet in order  to maintain the unique elements  in the sorted order .

What is the runtime performance of the add() method of the TreeSet and HashSet , where n represents the number of elements?

According to TreeSet Oracle doc :

TreeSet implementation provides guaranteed log(n) time cost for the basic operations (add, remove
and contains ) method.

According to HashSet Oracle doc :

HashSet provides constant time performance O(1) for basic operations  (add, remove and contains) method assuming the hash  function disperses the elements properly among the buckets.

One-liner :                TreeSet : O(log(n))  HashSet : O(1)

What is natural ordering in TreeSet ?

"Natural" ordering is the ordering implied by the implementation of Comparable interface by the objects in the TreeSet . Essentially RBTree must be able to tell which object is smaller than other object , and there are two  ways to supply that logic to the RB Tree implementation :

1. We need to implement the Comparable interface in the class(es) used as objects in TreeSet.
2. Supply an implementation of the Comparator would do comparing outside the class itself.


Why do we need TreeSet when we already had SortedSet ?

sortedSet is an interface while TreeSet is the class implementing it. As we know, in java, we can not create the objects of the interface. The class implementing the interface must fulfill the contract of interface, i.e , concrete class must implement all the methods present in the interface. TreeSet is such an implementation.

Which data structure you will prefer  in your code : HashSet and TreeSet ?

TreeSet contains the elements in the sorted order while HashSet is faster. Thus , deciding which one to choose depends upon the conditions :

If you want to maintain the order of the elements then TreeSet should be used because the result is alphabetically sorted.

If you do not want to sort the elements and  avoid duplicate elements . Your task involves mainly insert and retrieve operations then prefer HashSet.While iterating HashSet there is no ordering of elements while TreeSet iterates in the natural order.


What happens if the TreeSet is concurrently modified while iterating the elements ?

The iterator's returned by the TreeSet class iterator method are fail-fast.  fail-fast means if the set is modified at any time after the iterator is created , in any way except the iterator's own remove method  , the iterator will throw a ConcurrentModificationException. Thus , in the face of concurrent modification , the iterator fails quickly and cleanly . You can find it here  difference between fail-fast and fail-safe iterator in java with example.

Which copy technique (deep or shallow ) is used by TreeSet clone() method ?

According to Oracle docs , clone() method returns the shallow copy of the TreeSet instance.  In shallow copy , Both objects A and B shared the same memory location .


How to convert HashSet to TreeSet object ?

One-liner :   Set    treeObject = new TreeSet( hashSetObject);


What is the difference between HashSet and TreeSet in Java ?

I have already shared the difference between HashSet and TreeSet in Java with Example

Write an example of TreeSet ?



import java.util.TreeSet;


public class TreeSetExample {
    
    public static void main(String[] args) { 
 
        TreeSet<String> treesetobj = new TreeSet<String>();
        treesetobj.add("Alive is awesome");
        treesetobj.add("Love yourself");
        System.out.println("TreeSet object output :"+ treesetobj); 
 }
}



Output : 

TreeSet object output :[Alive is awesome, Love yourself]

Please mention in the comments in case you have any other questions regarding TreeSet class in java.

About The Author

Subham Mittal has worked in Oracle for 3 years.
Enjoyed this post? Never miss out on future posts by subscribing JavaHungry