C++

Delete the Duplicate Nodes from an Unsorted Linked List

In this article, we are going to see the different approaches to delete the duplicate nodes from the linked list using the C++ programming. Let’s start with an explanation of what the “duplicate nodes” in a linked list mean.

We are given an unsorted linked list in this challenge and requested to delete any duplicate members from the list. Let’s use some examples to attempt to grasp the issue.

The input unsorted linked list is as follows:

Elements 8, 10, and 9 appear more than once in the linked list, as seen in the previous list. This indicates that there are duplicates of 8, 10, and 9 in the linked list which we must eliminate. The output linked list is as follows once the duplicate entries are removed from the list:

A quick and easy way to find out is to compare all possible pairs of nodes in the list to see if they have the same information. When their information coincides, we get rid of the second node by deleting it. But this approach is more time-consuming.

Better efficiency is possible using hashing. The goal is to work through the provided list and add each node to a set as you go. If the currently viewed node appears in the previous set, it may be safely disregarded. When the process is complete, the list will no longer contain any duplicate nodes.

Let’s understand each approach in detail.

Approach 1: Using Two Loops

Algorithm Steps

  1. Create a linked list function, “createLinkedList“, which can create a linked list.
  2. Create a function called “deleteDuplicatesNodes” that can delete the duplicate node from the linked list.
  3. Create two local pointers, ptr1 and ptr2, inside of the “deleteDuplicatesNodes” function.
  4. Assign the ptr1=headNode and ptr2=NULL values, where ptr1 is used to track the node whose duplicates have to be checked and ptr2 is used to traverse each node of the linked list to check if any node data is the same as the ptr1 node data.
  5. The nested loop pointer ptr2 continues traversing the whole linked list node until it finds NULL.
  6. If the ptr1 node data is similar to the nested loop ptr2 node data, delete that node from the linked list.
  7. If not, increment the ptr1->next and check the next node’s data for duplicates.
  8. Steps 4 through 7 are repeated until ptr1 is not equal to NULL which indicates that it reached the end of the linked list.
  9. Finally, we print the linked list.
  10. End of the algorithm.

C++ Code Implementation (Using Two Loops)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
#include <bits/stdc++.h>
using namespace std;

/* The node has a data <strong>and</strong> address part */
class Node {
  public:
   int data;
   Node* next;
};

/* The below <strong>is</strong> a method to create a new node of the linked
list */
Node* newNode(int value) {
  Node* tempNode = new Node;
  tempNode->data = value;
  tempNode->next = NULL;
  return tempNode;
}

/*
This function will add new node into the existing linked
list and if the linked list is empty, then it will
create a new node <strong>as</strong> a header node. In this we are passing
pointer to pointer <strong>as</strong> a reference to the head of a linked list.
*/
void createLinkedList(Node** headNode, int value) {

  /* Create a new node*/
  Node* newNode = new Node();
  Node* lastNode = *headNode;
  newNode->data = value; /* Adding the data */
  /* Address of this new node intially kept NULL*/
  newNode->next = NULL;

  /* Checking the headNode reference is NULL or not.
  If yes, then the newNode will become the headNode*/
  if (*headNode == NULL) {
      *headNode = newNode;
      return;
  }
  /* If <strong>not</strong>, then this while loop will
  execute and find the lastNode of the linked
  list, so that newly created newNode append at the last*/
  while (lastNode->next != NULL) {
   lastNode = lastNode->next;
  }
  /* add the address of the newNode to the
  lastNode next
  */
  lastNode->next = newNode;
  return;
}

/*
This function will delete the duplicates of the node
*/
void deleteDuplicatesNodes(Node* headNode) {
  Node *ptr1, *ptr2, *duplicate;
  ptr1 = headNode;
  while (ptr1 != NULL && ptr1->next != NULL) {
   ptr2 = ptr1;
   while (ptr2->next != NULL) {
     /* If found, below code will delete
     duplicates node*/
     if (ptr1->data == ptr2->next->data) {
       duplicate = ptr2->next;
       ptr2->next = ptr2->next->next;
       delete (duplicate);
     } else /* If not found, ptr2 will be update to
     to the next node*/
      ptr2 = ptr2->next;
   }
   ptr1 = ptr1->next;
 }
}

/* This function will print the linked list*/
void display(Node* node) {
  while (node->next) {
    printf("%d -> ", node->data);
    node = node->next;
  }
  printf("%d ", node->data);
}

/* The main function of the program*/
int main() {
  /* Empty List*/
  Node* headNode = NULL;
  int N; /* Array size */
  cout << "Enter the size (N) of the array : " << endl;
  cin >> N;
  int arr[N];
  cout << "Enter elements in the array : " << endl;
  for (int k = 0; k < N; k++) {
    cin >> arr[k];
    createLinkedList(&headNode, arr[k]);
  }
  printf("Original Linked List :\n");
  display(headNode);
  deleteDuplicatesNodes(headNode);
  printf("\nLinked list after deleting duplicates nodes :\n");
  display(headNode);
  return 0;
}

Output:

1
2
3
4
5
6
7
8
9
10
11
12
Enter the size (N) of the array :
5
Enter elements in the array :
2
2
0
8
0
Original Linked List :
2 -> 2 -> 0 -> 8 -> 0
Linked list after deleting duplicates nodes :
2 -> 0 -> 8

Explanation:

Lines 21 to 52: We create a function with the name, “createLinkedList”. We pass two parameters (a headNode pointer to pointer and a value) in that function. As shown in the preceding program, whenever this function is called, it first creates a new node with a value (which we passed) and a node address with a NULL value.

/* Create a new node*/
Node* newNode = new Node();
Node* lastNode = *headNode;
newNode->data = value; /* Adding the data */
/* Address of this new node intially kept NULL*/
newNode->next = NULL;

Then, it checks to see if the headNode reference is NULL. If it is, the newly created node becomes the head.

/* Checking the headNode reference is NULL or not.
If yes, then the newNode will become the headNode*/
if (*headNode == NULL) {
*headNode = newNode;
return;
}

If the headNode reference is not NULL, it appends to the lastNode of the linked list. The address of this newly created newNode is assigned to the lastNode so that it can start pointing to the newly created newNode.

/* If not, then this while loop will
execute and find the lastNode of the linked
list, so that newly created newNode append at the last*/
while (lastNode->next != NULL) {
lastNode = lastNode->next;
}

Now, the newly created newNode becomes the lastNode. This process continues until that time when we call this function.

The previous steps create the linked list as per our requirements. Now, we delete the duplicate nodes from the linked list.

Lines 57 to 75: We create one function named “deleteDuplicatesNodes” which accepts one parameter which is the headNode pointer of the linked list. We create two local-level variables, ptr1 and ptr2, to track the linked list when we loop it to find out the duplicates in the linked list. We initialize the ptr1 with the headNode since this will be the upper loop, and the ptr2 with the NULL value.

ptr1 = headNode;

ptr1: This variable is at the outer loop and tracks the elements whose duplicates we are going to check.

ptr2: This variable is inside the loop and continues to check each node’s data to find out if it matches the ptr1 hold data. If it matches, its duplicates are removed from the linked list. This is checked and continues until it does not reach the last node whose value is NULL.

When ptr2->next == NULL, the nested while loop ends and the outer while loop increments the ptr1 = ptr1->next with the next node data.

Note: The ptr2 ends when the ptr1 loop is over because ptr2 is inside the ptr1 loop.

while (ptr1 != NULL && ptr1->next != NULL) {
    ptr2 = ptr1;
    while (ptr2->next != NULL) {
      /* If found, below code will delete
      duplicates node*/
      if (ptr1->data == ptr2->next->data) {
        duplicate = ptr2->next;
        ptr2->next = ptr2->next->next;
        delete (duplicate);
      } else /* If not found, ptr2 will be update to
      to the next node*/
       ptr2 = ptr2->next;
    }
    ptr1 = ptr1->next;
  }

Lines 78 to 84: This displays the final linked list without any duplication. In that case, we pass the headNode as a parameter which is the address of the linked list.

/* This function will print the linked list*/
void display(Node* node) {
  while (node->next) {
    printf("%d -> ", node->data);
    node = node->next;  
  }
  printf("%d", node->data);
}

Approach 2: Hashing Method

Algorithm Steps

  1. Create a linked list function, “createLinkedList”, which can create a linked list.
  2. Create a function called “deleteDuplicatesNodes” that can delete the duplicate node from the linked list.
  3. Create two local pointers, currentNode and previousNode, inside of the “deleteDuplicatesNodes” function.
  4. Create an unsorted hash set inside of the “deleteDuplicatesNodes”.
  5. Assign the currentNode=headNode and previousNode=NULL values where currentNode is used to track the node whose duplicates must be checked and previousNode is used to track the currentNode’s previous node.
  6. The while loop traverses the entire linked list node until currentNode == NULL.
  7. If the currentNode data is already in the hash set, the node is deleted.
  8. If not, it adds that node’s data to the hash set.
  9. Steps 5 through 8 are repeated until the currentNode is not equal to NULL which indicates that it reached the end of the linked list.
  10. Finally, we print the linked list.
  11. End of the algorithm.

C++ Code Implementation (Using the Hash Method)

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
#include <bits/stdc++.h>
using namespace std;

/* The node has a data and address part */
class Node {
   public:
     int data;
     Node* next;
};

/* The below is a method to create a new node of the linked
list */
Node* newNode(int value) {
  Node* tempNode = new Node;
  tempNode->data = value;
  tempNode->next = NULL;
  return tempNode;
}

/*
This function will add new node into the existing linked
list and if the linked list is empty, then it will
create a new node as a head. In this we are passing
pointer to pointer as a reference to the head of a list.
*/
void createLinkedList(Node** headNode, int value) {

  /* Create a new node*/
  Node* newNode = new Node();
  Node* lastNode = *headNode;
  newNode->data = value; /* Adding the data */
  /* Address of this new node intially kept NULL*/
  newNode->next = NULL;

  /* Checking the headNode reference is NULL or not.
  If yes, then the newNode will become the headNode*/
  if (*headNode == NULL) {
    *headNode = newNode;
    return;
  }
  /* If not, then this while loop will
  execute and find the lastNode of the linked
  list, so that newly created newNode append at the last*/
  while (lastNode->next != NULL) {
    lastNode = lastNode->next;
  }
  /* add the address of the newNode to the
  lastNode next
  */
  lastNode->next = newNode;
  return;
}

/*
This function will delete the duplicates of the node
*/
void deleteDuplicatesNodes(Node* headNode) {
  /* This will store the visited node list*/
  unordered_set<int> hash;

  struct Node* currentNode = headNode;
  struct Node* previousNode = NULL;
  while (currentNode != NULL) {
    /* If the currentNode data already visited, then this
      code will delete that node
    */
    if (hash.find(currentNode->data) != hash.end()) {
       previousNode->next = currentNode->next;
       delete (currentNode);
    }
    /*
    If not visited currentNode data, then
    insert into the hash
    */
    else {
      hash.insert(currentNode->data);
      previousNode = currentNode;
    }
    currentNode = previousNode->next;
  }
}

/* This function will print the linked list*/
void display(Node* node) {
  while (node->next) {
    printf("%d -> ", node->data);
    node = node->next;
  }
  printf("%d ", node->data);
}

/* The main function of the program*/
int main() {
  /* Empty List*/
  Node* headNode = NULL;
  int N; /* Array size */
  cout << "Enter the size (N) of the array : " << endl;
  cin >> N;
  int arr[N];
  cout << "Enter elements in the array : " << endl;
  for (int k = 0; k < N; k++) {
    cin >> arr[k];
    createLinkedList(&headNode, arr[k]);
  }
  printf("Original Linked List :\n");
  display(headNode);
  deleteDuplicatesNodes(headNode);
  printf("\nLinked list after deleting duplicates nodes :\n");
  display(headNode);
  return 0;
}

Output:

1
2
3
4
5
6
7
8
9
10
11
12
Enter the size (N) of the array :
5
Enter elements in the array :
8
9
1
10
1
Original Linked List :
8 -> 9 -> 1 -> 10 -> 1
Linked list after deleting duplicates nodes :
8 -> 9 -> 1 -> 10

Lines 26 to 52: We create a function with the name “createLinkedList”. In that function, we pass two parameters (a headNode pointer to pointer and a value). As shown in the preceding program, whenever this function is called, it first creates a new node with a value (which we passed) and an address with a NULL value.

/* Create a new node*/
Node* newNode = new Node();
Node* lastNode = *headNode;
newNode->data = value; /* Adding the data */
/* Address of this new node intially kept NULL*/
newNode->next = NULL;

Then, it checks to see if the headNode reference is NULL. If it is, the newly created node becomes the head.

/* Checking the headNode reference is NULL or not.
  If yes, then the newNode will become the headNode*/
  if (*headNode == NULL) {
      *headNode = newNode;
      return;
  }

If the headNode reference is not NULL, it appends to the lastNode of the linked list. The address of this newly created newNode is assigned to the lastNode so that it can start pointing to the newly created newNode.

/* If not, then this while loop will
   execute and find the lastNode of the linked
   list, so that newly created newNode append at the last*/
   while (lastNode->next != NULL) {
   lastNode = lastNode->next;
}

Now, the newly created newNode becomes the lastNode. This process continues until that time when we call this function.

The previous steps create the linked list as per our requirements. Now, we delete the duplicate nodes from the linked list.

Lines 57 to 75: We create one function named “deleteDuplicatesNodes” which accepts one parameter which is the headNode pointer of the linked list. We create an unsorted HashSet hash. We also create two local-level variables, currentNode and previousNode, to track the linked list when we loop it to find out the duplicates in the linked list. We initialize the currentNode with the headNode since this will be the upper loop, and the previousNode with the NULL value.

struct Node* currentNode = headNode;
struct Node* previousNode = NULL;

currentNode: This variable is at the outer loop and tracks the elements whose duplicates we are going to check.

previousNode:  This handles the currentNode’s previous node. We create a while loop that executes until the currentNode does not find the NULL value, which means at the end of the linked list. If the currentNode data is already in the hash map, assign the value of the currentNode’s next pointer to the previousNode’s next pointer.

previousNode->next = currentNode->next;

If not, add the data of the currentNode to the hash map and make the previousNode equal to the currentNode.

else {
    hash.insert(currentNode->data);
    previousNode = currentNode;
}

At the end, assign the value of the next pointer of the previousNode to the currentNode.

while (currentNode != NULL) {
     /* If the currentNode data already visited, then this
       code will delete that node
     */
     if (hash.find(currentNode->data) != hash.end()) {
        previousNode->next = currentNode->next;
        delete (currentNode);
     }
     /*
     If not visited currentNode data, then
     insert into the hash
     */
     else {
        hash.insert(currentNode->data);
        previousNode = currentNode;
     }
     currentNode = previousNode->next;
  }

Lines 84 to 90: This displays the final linked list without any duplication. In that case, we pass the headNode as a parameter which is the address of the linked list.

/* This function will print the linked list*/
void display(Node* node) {
  while (node->next) {
    printf("%d -> ", node->data);
    node = node->next;
  }
  printf("%d ", node->data);
}

Conclusion

In the first method, to get rid of the duplicates, we utilize two loops: an outer loop that iterates over the linked list and selects an element, and a second loop that iterates over that element to look for any possible duplicates. As soon as a duplicate is detected, it is deleted from the list. This method uses a nested loop to examine and eliminate duplicates from an unsorted linked list which increases the time complexity of the process. Time complexity is O(n2).

In the second method, hashing can be used to minimize the temporal complexity of removing the duplicates from an unsorted linked list. In this case, if the node appears in the hashmap twice, it is a duplication and the first occurrence should be deleted. If a node is missing from the map, it’s because there’s a new one that has to be added. It takes on average O(n) time to go over a linked list of length “n” and checks if its nodes are in the map. The temporal complexity for looking up a value in a hash table is O(1). Considering the aforementioned prerequisites together, the total time complexity is O(n).

About the author

Shekhar Pandey