# Linked List – remove duplicates algorithm in C#/Java

I am studying Data Structures and Algorithms in C#/Java. After encountering a solution to the problem of Linked List duplicate removal, I have been struggling to understand it.

The solution is the one proposed by the renowned book Cracking the coding Interview (5th edition, page 208).

```void RemoveDuplicates_HashSet(Node n)
{
HashSet<object> set = new HashSet<object>();

Node previous = null;
while (n != null)
{
if (set.Contains(n.Data))       // Condition 1
previous.Next = n.Next;
else                            // Condition 2
{
previous = n;
}

n = n.Next;
}
}
```

Running the code with the following linked list `A->B->A->B`:

```// Creating test Singly LinkedList
Node n = new Node("A");
n.Next = new Node("B");
n.Next.Next = new Node("A");
n.Next.Next.Next = new Node("B");

RemoveDuplicates_HashSet(n);
```

Works perfectly fine: the value of `n` after the method is `A->B`.

By following the code with a debugger, I can see that what happens in the method loop is the following:

```| Pass | HashSet | n          | previous   | Comment                  |
| ---- | ------- | ---------- | ---------- | ------------------------ |
| –    | –       | A->B->A->B | null       |                          |
| 1    | A       | B->A->B    | A->B->A->B | Condition 2 is triggered |
| 2    | A,B     | A->B       | B->A->B    | Condition 2 is triggered |
| 3    | A,B     | B          | B->B       | Condition 1 is triggered |
| 4    | A,B     | null       | B          | Condition 1 is triggered |
```

I fail to understand how this actually results in several ways:

1. Where/how exactly are duplicates dropped from `n`? I understand that HashSet contains only unique elements, and it will therefore detect if an element was encountered already, however I still cannot see how the algorithm works in its entirety.

2. How is it that the values pointed to by `n` are updated to be `A->B`? Where is it that, given that essentially the loop is simply iterating over the Linked List doing `n = n.Next`, `n` is actually updated with the final value `A->B`? I understand that the list is passed by reference, but I cannot see how it is actually modified.

This code: `if (set.Contains(n.Data)) previous.Next = n.Next` checks if the element has already been encountered and, if it has, removes n from the linked list. It removes the node by assigning n.Next to previous.Next (which means previous.Next no longer points to n).