Skip to content
Advertisement

Removing Duplicates from linked list. Why are the position of “prev = head” and “p2 = p2.next” not outside the else statement?

For the first function, shouldn’t “prev = head” be outside of else because we want to set the previous every time before we change the head value?

For the second function, shouldn’t “p2 = p2.next” be outside of else because we want to go next every time?

Thank you guys.

    //This would take O(n) but would require extra space O(n)
    public static Node removeDuplicates(Node head){
        Node prev = null;
        Set<Integer> hs = new HashSet<Integer>();
        
        while(head!= null){
            if(hs.contains(head.data)){
                prev.next = head.next;
            }
            else{
                hs.add(head.data);
                //why is prev = head here instead of out of the else statement? 
                prev = head;
            }
            head = head.next;
        }
        return head;
    }

    //This would take O(n^2) but no extra space is required.
    public static Node removeDuplicatesTradeOff(Node head){
        //pointer 1 and pointer 2.
        Node p1 = head;

        while(p1.next != null){
            Node p2 = p1;
            while(p2.next != null){
                if(p1.data == p2.next.data){
                    p2.next = p2.next.next;
                }
                else{
                    //why is p2 = p2.next here instead of out of the else statement? 
                    p2 = p2.next;
                }  
            }
            p1 = p1.next;
        }

        return head;
    }

Advertisement

Answer

Shouldn’t “p2 = p2.next” be outside of else because we want to go next every time

I think it would be more accurate to say that we want to have a different next available every time, instead of saying that we want to “go next” every time.

We don’t always want to “go next”. When prev.next has changed due to another operation (in this case removal of a duplicate) we want to stay where we are, because prev.next has already changed and is now pointing to a node further ahead (because a duplicate node has just been removed).

In other words, we don’t want to have a different prev every time, we just want to have a different prev.next every time. So as long as prev.next advances every time, we don’t care if prev stays the same sometimes.

That’s why in both methods prev (or p2) only advances in the else branch, while prev.next (or p2.next) is updated (advances) only in the if branch.

Think of these two as different operations, the else branch being “go next” and the if branch being “drop next”. When you drop a node ahead of you, you did not move (true!), but since you did drop one node ahead of you, now there is a new node ahead of you, so did not have to move. So you can just continue with if/else checks and sooner or later you will reach the end or you will drop the last node ahead of you.

One input example to illustrate this point is

head(1) -> 1 -> 1 -> 1 -> null

With such an input, the algorithm would only do

  • drop next
  • drop next
  • drop next

and it would be done.

No “go next” happened at all.

Result

head(1) -> null
User contributions licensed under: CC BY-SA
8 People found this is helpful
Advertisement