C++

Majority Element with C++

Let n be the number of elements in a vector or array. The majority element or leader is the element that occurs more than half the times of all the elements in the vector. Half of n means: if n is 10, then half the time is 5. The element at half the position is at index 4, for zero based index counting. The least integer that is more than half of 10 is clearly 6 (corresponding to index 5). If n is 11, then half of n is considered still as 5. This is the whole number taken when 11 is divided by 2. The index for half is still 4 for zero based indexing. When n is 11, the least integer that is clearly more than the literal half of 11 is still 6 (corresponding to index 5).

So, half the length (size) of a vector is the result of integer division of n by 2. That is, the whole number of the quotient, after dividing by 2, is taken whether or not there is a remainder.

The majority element or leader is the element that occurs more than half the times of all the elements in the vector or array. It must be more than half the times and not just half the times. That is, it must be more than n/2 times (taking the resulting whole number). The function should return -1, if no majority element exists.

This article explains how to determine the majority element for a time complexity of O(n). That is, a maximum of n main operations are needed to obtain the majority element.

Vector Examples

Consider the following vector:

    vector<int> A = {6, 8, 4, 6, 8, 6, 6}

 
The majority element (leader) is 6. It occurs 4 times out of 7 times (elements).

Consider the following vector:

    vector<int> B = {3, 4, 3, 2, 3, -1, 3, 3}

 
The majority element is 3. It occurs 5 times out of 8 elements.

Consider the following vector:

    vector<int> C = {4, 3, 4, 4, 4, 2}

 
The majority element is 4. It occurs 4 times out of 6 elements.

Consider the following vector:

    vector<int> D = {5, 4, 7, 1, 7, 2, 3, 7, 8}

 
There is no majority element here. So, the function has to return -1. There are nine elements here. The number, 7,occurs three times and each of the other elements occur once. Three is not more than half of nine. An acceptable majority element should have occurred, at least 5 times.

Illustration for Algorithm for O(n) Time Complexity

From the following vectors, two different elements are removed at the same time.

Consider again the vector:

    vector<int> A = {6, 8, 4, 6, 8, 6, 6}

 
The leader (majority) element is 6. The first two elements are different. If both of them are removed, the remaining set would be, {4, 6, 8, 6, 6}. In this remaining set, 6 is still the leader: three times out of 5 times. The next two elements, 4 and 6 are different. If they are removed, the remaining set will be {8, 6, 6}. In the remaining set, 6 is still the leader. The next two elements, 8 and 6 are different. If they are removed, the remaining set will be {6}. In this last set of only one element, 6 is still the leader. It therefore looks as if two different numbers are removed repeatedly. The final remaining element would be the majority element.

Consider now the vector:

    vector<int> B = {3, 4, 3, 2, 3, -1, 3, 3}

 
The leader (majority) element is 3. The first two elements are different. If both of them are removed, the remaining set would be, {3, 2, 3, -1, 3, 3}. In this remaining set, 3 is still the leader: four times out of six times. The next two elements, 3 and 2 are different. If they are removed, the remaining set will be {3, -1, 3, 3}. In the remaining set, 3 is still the leader. The next two elements, 3 and -1 are different. If they are removed, the remaining set will be {3, 3}. In this last set of two elements, 3 is still the leader. It therefore still looks as if two different numbers are removed repeatedly. The final same remaining elements would be the majority element.

Consider next the vector:

    vector<int> C = {4, 3, 4, 4, 4, 2}

 
The leader (majority) element is 4. The first two elements are different. If both of them are removed, the remaining set would be, {4, 4, 4, 2}. In this remaining set, 4 is still the leader: three times out of four times. The next two elements, 4 and 4 are the same and they should not be removed. However, the first element here, and the third element here, can be considered for removal. It happens that these two are also the same. Still, the first element and the fourth element can be considered for removal. They are different, so they are removed. The remaining last set is {4, 4}. It therefore still looks as if two different numbers are removed repeatedly. The final remaining same elements would be the majority element.

Consider then the vector:

    vector<int> D = {5, 4, 7, 1, 7, 2, 3, 7, 8}

 
We already know that this vector has no leader, though 7 occurs three times and each other number  occurs once. 7 occurs three out of nine times and that does not make it a leader. Still, different pairs can be removed repeatedly to see what the final remaining set would look like. The first two elements, 5 and 4 are different. If they are removed, the remaining set would be, {7, 1, 7, 2, 3, 7, 8}. In this remaining set, 7 is still the predominant element. But it is not still the leader of the remaining set. Remember that the leader must occur more than half the number of times. The next two elements, 7 and 1 are different. If they are removed, the remaining set would be {7, 2, 3, 7, 8}. 7 is still the predominant element, but it is still not the leader. The next two elements, 7 and 2 are different. They are removed to have the set, {3, 7, 8}. There is no predominant element remaining this time and there is no leader. The next two elements, 3 and 7 are different. When they are removed, the remaining set would be {8}.

For the previous three vectors, the final remaining element or the final remaining same elements is the majority element. It is already known that there is no majority (leader) element in this last vector. So, the fact that an element is finally remaining does not necessarily mean that it is the majority element.

Now, consider the case where n is an even number and each element in the vector occurs once. In this case, all the pairs of elements will be removed and there will be no element remaining in the final set. Clearly, in this case, the function has to return -1 as there is no majority element.

For the final remaining one element or the final remaining same elements has/have to be checked if the element occurs more than half the number of times in the vector. This remaining element is called the candidate.

O(n) Time Complexity Algorithm for Majority Element

The strategy is to be removing pairs of elements that are different, repeatedly: beginning from the left in the given vector. If no element is remaining, then there is no majority element and the function should return -1. If one or more of the same elements are remaining, then it has to be checked if the element occurs more than half of the times in the vector. This element is called the candidate. It becomes the majority element if it occurs more than half the number of times.

This checking can be done in linear time, by scanning the vector from the left, and to stop as soon as the number of occurrence is just greater than half the length of the vector. If the whole vector is scanned, and the number of times the candidate occurs is not more than half the times, then there is no majority element (according to the definition).

Strategy with C++

With C++, the elements do not have to be removed from the given vector. Instead, a stack is used. The first element of the vector is pushed into the top of the stack. If the next element is different from the top element in the stack, then the top element in the stack is removed (popped off); otherwise, this next element is pushed to the top of the stack (if the stack top element and this next element, are the same). This scheme continues for the rest of the elements.

At the end of the scanning (one pass of the vector), if no element is in the stack, there is no majority element. One or more elements may remain in the stack. If more than one element remains in the stack, then these remaining elements have to be the same. This element is called the candidate.

Whether one or more of the same element are remaining in the stack, that element or the same element occurring more than once is a possible majority element. The vector has to be re-scanned to see if this element occurs more than half the number of times for the total number of elements in the vector. If it occurs more than half the times, then that element is the majority (leader) element; otherwise, the vector (or array) has no majority element (the function should return -1).

C++ Coding

Whenever a vector is used in a C++ program, the heading of the program has to be something like:

    #include <iostream>
    #include <vector>
    #include <stack>
    using namespace std;

 
The stack library has to be included. Standard namespace is used. Vector is the main list, so its library is included. The iostream library should always be included; it is responsible for input/output. The name of the function for the O(n) algorithm is majorityElement. The first half of the function code is:

    int majorityElement (vector<int> &A) {
        int N = A.size();
        int size = 0;
        stack<int> st;
        st.push(A[0]);

        for (int i=1; i<N; i++) {
            if (st.size() > 0) {  //to avoid, Segmentation fault (core dumped) error
                if (A[i] == st.top()) {
                    st.push(A[i]);
                }
                else {
                    st.pop();  //if different
                }
            }
            else {
                st.push(A[i]);  //push to top of stack if empty
            }
        }

 
This has the first for-loop which is the main for-loop. Before the for-loop, the first element of the vector is sent to the stack (the top). This for-loop implements the C++ strategy is mentioned above. The second and last part of the majorityElement() function is:

        int candidate = -1;
        if (st.size() > 0)
            candidate = st.top();
        int leader = -1;
        int count = 0;
        for (int i=0; i<N; i++) {
            if (A[i] == candidate) {
                count += 1;
            }
        }
        if (count > N/2)
            leader = candidate;
        return leader;
    }

 
This checks if the candidate is actually the majority element. The variable leader is a synonym of the majority element. The variable candidate is the possible majority element. As soon as the value for count exceeds N/2, then the candidate is the majority element. This part has the for-loop that checks if the vector has a majority element. The above two parts should be joined to form one function. The function returns -1, if no majority element exists.

A suitable C++ main function, for the above code is:

    {
        vector<int> v1 = {6, 8, 4, 6, 8, 6, 6};
        int ret1 = majorityElement(v1);
        cout << ret1 << endl;
       
        vector<int> v2 = {3, 4, 3, 2, 3, -1, 3, 3};
        int ret2 = majorityElement(v2);
        cout << ret2 << endl;
       
        vector<int> v3 = {4, 3, 4, 4, 4, 2};
        int ret3 = majorityElement(v3);
        cout << ret3 << endl;

        vector<int> v4 = {5, 4, 7, 1, 7, 2, 3, 7, 8};
        int ret4 = majorityElement(v4);
        cout << ret4 << endl;

        return 0;
    }

 

Time Complexity

Since there are two for-loops with the vector being scanned twice, the reader might be tempted to say that the time complexity is, O(n+n). Now, the body of the first for-loop is much longer than the body for the second for-loop. So, the time for the second for-loop body to execute is much smaller than the time for the first for-loop body to execute. In other words, that time for the second body is relatively negligible. So, the time complexity for the above algorithm is quoted as:

    O(n)

 
Time complexity is the approximate number of main operations for the function in question.

Conclusion

The general strategy to find the majority element in O(n) time is to be removing pairs of elements that are different, repeatedly, beginning from the left in the given list. If no element is remaining, finally in the list, then there is no majority element and the function should return -1. If one or more of the same element are remaining, then it has to be checked if the element occurs more than half of the times in the list. This element is called the candidate. If the candidate occurs more than half the times, in the given list, then the candidate is the majority element.

Chrys

About the author

Chrysanthus Forcha

Discoverer of mathematics Integration from First Principles and related series. Master’s Degree in Technical Education, specializing in Electronics and Computer Software. BSc Electronics. I also have knowledge and experience at the Master’s level in Computing and Telecommunications. Out of 20,000 writers, I was the 37th best writer at devarticles.com. I have been working in these fields for more than 10 years.