Integer Division by Two
Integer division is necessary when doing a merge sort.
When an odd number is divided by two, there is a remainder of 1. For example:
7 / 2 = 3 R 1
Integer division means taking the whole number as your answer and abandon the 1.
When an even number is divided by two, there is a remainder of 0. For example:
6 / 2 = 3 R 0
Integer division means taking the whole number as your answer and abandon the 0.
In either case, the whole number is taken and the remainder is abandoned.
The aim of this article is to determine the time complexity of the merge sort, written also as mergesort. Sorting can be ascending or descending. Sorting in this article refers to sort ascending.
Merge Sort Algorithm
Merge Sort is a divide and conquer sorting algorithm. In this algorithm, the unsorted list is divided into two halves using the integer division. Each of the halves is further divided into two halves again using the integer division. This division continues until the list is considered as single separate elements.
Beginning from the left, the consecutive elements are then paired in a sorted manner. The paired elements are then paired again in a sorted form. This grouping by pairs continues until the whole list is reformed and sorted.
Linear and Logarithmic Time Complexity
Consider the following code in the C language:
int k = i + 1;
printf("%d ", k);
}
printf("\n");
The output is:
1 2 3 4 5 6 7 8
The code is a for-loop (ignoring the print statement just after the for-loop). It prints the integers from 1 to 8, for a total of 8 numbers. The content of the body of the for-loop is:
printf("%d ", k);
These two statements are considered as one main operation in this situation. There are 8 operations. Let n be 8. This is a linear time complexity and it is written as follows:
O(n)
Where n is the possible maximum number of operations. This is the Big-O notation. It begins with O in uppercase and followed by parentheses. Inside the parentheses is the maximum possible number of operations.
Consider now the following code where out of 8 numbers, 3 numbers are printed:
int k = i + 1;
printf("%d ", k);
if (k == 3) break;
}
printf("\n");
The output is:
1 2 3
The code is a for-loop (ignoring the print statement just after the for-loop). It prints the integers from 1 to 3 out of 8 numbers. The content of the body of the for-loop is:
printf("%d ", k);
if (k == 3) break;
Still, these three statements are considered as one main operation in this situation. There are 3 operations out of 8.
Now,
8 = 23
=> 3 = log2(8)
So, the number of main operations carried out by the previous code is 3 (out of 8).
Let n be 8. This is a logarithmic time complexity and it is written as:
O(log2n)
Where (log2 n) means 3 for the previous code.
There were 8 numbers. In general, when the number of operations for the code is between 2 and 5 out of 8 and not just 3, the time complexity is described as log2(n).
Example of Unsorted and Sorted List
An example of unsorted list is as follows:
PÂ Â Â VÂ Â Â DÂ Â Â QÂ Â Â SÂ Â Â XÂ Â Â TÂ Â Â B
There are eight elements in the list. When this list is sorted, it becomes:
BÂ Â Â DÂ Â Â PÂ Â Â QÂ Â Â SÂ Â Â TÂ Â Â VÂ Â Â X
Counting the Number of Main Operations in the Merge Sort
The following list:
PÂ Â Â VÂ Â Â DÂ Â Â QÂ Â Â SÂ Â Â XÂ Â Â TÂ Â Â B
is used to count the number of the main operations in merge sort.
The integer division by two gives the following result:
PÂ Â Â VÂ Â Â DÂ Â Â QÂ Â Â Â Â Â Â SÂ Â Â XÂ Â Â TÂ Â Â B
This is one operation. Another division of both parts by two gives the following result:
PÂ Â Â VÂ Â Â Â Â Â Â DÂ Â Â QÂ Â Â Â Â Â Â SÂ Â Â XÂ Â Â Â Â Â Â TÂ Â Â B
These are three operations so far (one plus two). Another division of each new part by two gives the following result:
PÂ Â Â Â Â Â Â VÂ Â Â Â Â Â Â DÂ Â Â Â Â Â Â QÂ Â Â Â Â Â Â SÂ Â Â Â Â Â Â XÂ Â Â Â Â Â Â TÂ Â Â Â Â Â Â B
These are seven operations so far (three plus four). The list of the eight elements are now considered as eight separate elements with a total of seven operations so far. The next phase is to pair while sorting, beginning from the left. So, the next result is:
PÂ Â Â VÂ Â Â Â Â Â Â DÂ Â Â QÂ Â Â Â Â Â Â SÂ Â Â XÂ Â Â Â Â Â Â BÂ Â Â T
There are two changes in position for the last pair. Two changes are two operations. This makes a total of nine operations so far (seven plus two).
The pairing and sorting continues, always beginning from the left to give the following result:
DÂ Â Â PÂ Â Â QÂ Â Â VÂ Â Â Â Â Â Â BÂ Â Â SÂ Â Â TÂ Â Â X
Each of these two groups had four changes in position, making eight operations. There are seventeen operations so far (nine plus eight). The pairing and sorting continues and lastly, to give the following result:
BÂ Â Â DÂ Â Â PÂ Â Â QÂ Â Â SÂ Â Â TÂ Â Â VÂ Â Â X
There are seven changes in position here, making seven operations. This makes twenty-four operations (seventeen plus seven) for the complete sorting.
Time Complexity for Merge Sort
The previous counting (illustration) is used to quote the time complexity for the mergesort. There are twenty-four operations.
24 = 8 x 3
That is:
24 = 8 x log2(8)
because log2(8) is 3.
Let 8 be n. With that, the mergesort time complexity is given by:
O(n.log2n)
where the dot means multiplication.
In practice, out of 8 elements, the number of operations is approximately 24 or 24.
Conclusion
The mergesort is a particular divide and conquer sorting algorithm. It is a very efficient sorting algorithm. Its time complexity is very satisfactory, compared to the other sorting algorithms. Its time complexity is:
O(nlog2n)
Note: The number of operations must not necessarily be exactly n.log2n . However, it should approximate it.
Coding mergesort needs recursive functions. It is not too difficult to code it once the algorithm has been understood. Coding the Merge Sort is a discussion for some other time.