
Pandas Union

The pandas method “Index.union()” returns the union of two indexes, in the same sense as the union operation from set theory in mathematics. We can also build a combined index with the “concat()” function, which stacks DataFrames and, when “ignore_index=True” is passed, numbers the resulting rows sequentially.

The Syntax for the concat() Method
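
As a rough sketch of the call form: the signature in the comment is abridged, and the two small frames “left” and “right” are hypothetical data used only for illustration.

```python
import pandas as pd

# Abridged call form (the pandas documentation lists further parameters):
#   pd.concat(objs, axis=0, join="outer", ignore_index=False, ...)
# objs is a sequence, for example a list, of DataFrames or Series.

left = pd.DataFrame({"x": [1, 2]})     # hypothetical sample data
right = pd.DataFrame({"x": [3, 4]})    # hypothetical sample data

# Stack the rows of both frames and renumber the index from 0.
stacked = pd.concat([left, right], ignore_index=True)
print(stacked)
```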

 

The Syntax for the “Index.union()” Method
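
A minimal sketch of the call form, assuming two small integer indexes “a” and “b” that are made up for illustration:

```python
import pandas as pd

# Call form: Index.union(other, sort=None)
# With the default sort=None, the result is sorted when the labels allow it.

a = pd.Index([1, 2, 3])    # hypothetical sample index
b = pd.Index([3, 4, 5])    # hypothetical sample index

combined = a.union(b)
print(combined.tolist())   # [1, 2, 3, 4, 5]
```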

 

Example 1: The Union of Two DataFrame Indexes Using the concat() Method

In this example, we use the “concat()” function to combine two DataFrames and give the result a single sequential index. Concatenating two DataFrames is a straightforward process. Note that “concat()” on its own keeps every row, including repeated ones; to get a set-style union of rows, it is commonly paired with “drop_duplicates()”. The “Index.union()” method, by contrast, behaves like the union operation on sets in mathematics: it combines the labels of both indexes, and for indexes with unique labels, a label present in both appears only once in the result.
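
How “concat()” and dropping duplicates relate can be sketched as follows. The frames “first” and “second” and their contents are hypothetical and chosen only so that one row repeats.

```python
import pandas as pd

first = pd.DataFrame({"course": ["Python", "Java"]})    # hypothetical data
second = pd.DataFrame({"course": ["Java", "SQL"]})      # hypothetical data

# concat() stacks the rows as they are, so "Java" appears twice.
stacked = pd.concat([first, second], ignore_index=True)
print(len(stacked))                     # 4

# Dropping duplicate rows afterwards gives a set-style union of the rows.
unique_rows = stacked.drop_duplicates().reset_index(drop=True)
print(unique_rows["course"].tolist())   # ['Python', 'Java', 'SQL']
```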

We use the “Spyder” IDE to run the code for the article’s first example. The first step in any pandas script is to import the pandas library as “pd”.


We will now generate the DataFrames. To concatenate DataFrames and obtain a combined index, we create two of them. The first DataFrame is named “table1”. It contains two columns, “courses” and “Fee”. In the first column, we list the names of a few programming courses: “OOP”, “Python”, “Java”, and “Android Studio”. The second column, “Fee”, holds the course fees: “30000”, “35000”, “32000”, and “25000”. To generate “table1”, we use “pd.DataFrame”.
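
Based on the description above, the first DataFrame could be built roughly like this. It is a sketch: the fees are assumed to be stored as integers, which the text does not state explicitly.

```python
import pandas as pd

# First DataFrame: course names and their fees (fees assumed to be integers).
table1 = pd.DataFrame({
    "courses": ["OOP", "Python", "Java", "Android Studio"],
    "Fee": [30000, 35000, 32000, 25000],
})
print(table1)
```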


The next step is to create the second DataFrame, “table2”. Its column names are the same as those in the previous DataFrame, but the values differ. We have “Graphic design”, “PHP”, “SQL”, and “Swift” in the “courses” column, and “34000”, “32000”, “22000”, and “24000” in the “Fee” column. To generate “table2”, we once more use “pd.DataFrame”.
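
The second DataFrame follows the same pattern. Again, this is a sketch that assumes integer fees.

```python
import pandas as pd

# Second DataFrame: same column names as table1, different values.
table2 = pd.DataFrame({
    "courses": ["Graphic design", "PHP", "SQL", "Swift"],
    "Fee": [34000, 32000, 22000, 24000],
})
print(table2)
```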


Now we implement the main step of this example: combining the two DataFrames with the “concat()” function so that the result has a single combined index. The “concat()” function concatenates pandas objects along a particular axis, with optional set logic (union or intersection) applied to the labels on the other axes. Here, we use “pd.concat([table1, table2], ignore_index=True)” to combine the DataFrames; note that the DataFrames are passed together as a list. We pass “ignore_index=True” because, without it, each DataFrame would keep its own labels “0” to “3” and the index would repeat. With it, the rows are renumbered incrementally. Finally, we save the result in the “union” variable and display it using the “print()” function. The result is one DataFrame that contains the rows of both inputs under a single index.
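
Put together, the step might look like the sketch below. It rebuilds both tables so that it runs on its own, assumes integer fees, and also shows the repeated labels that appear when “ignore_index” is left at its default.

```python
import pandas as pd

table1 = pd.DataFrame({
    "courses": ["OOP", "Python", "Java", "Android Studio"],
    "Fee": [30000, 35000, 32000, 25000],
})
table2 = pd.DataFrame({
    "courses": ["Graphic design", "PHP", "SQL", "Swift"],
    "Fee": [34000, 32000, 22000, 24000],
})

# Without ignore_index, each table keeps its own labels 0..3.
repeated = pd.concat([table1, table2])
print(list(repeated.index))   # [0, 1, 2, 3, 0, 1, 2, 3]

# With ignore_index=True, the rows are renumbered 0..7.
union = pd.concat([table1, table2], ignore_index=True)
print(union)
```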


Let’s turn to the output. Combining our DataFrames with the “concat()” function worked: the two columns, “courses” and “Fee”, are displayed with the rows of both DataFrames. Because we passed “ignore_index=True” to “concat()”, the index is not repeated and instead runs continuously. The result has “8” rows, so the index spans the range “0 to 7”.

Example 2: Combining Indexes by Using the Index.union() Method

This is a simple and compact example. Here, we merge two indexes with the “Index.union()” method. As always, we must import the pandas library as “pd” before running this code. In this example, we build only indexes rather than DataFrames. To create “index1”, we use “pd.Index” with the numbers “4”, “5”, “6”, and “7”, and we follow the same procedure for “index2”, whose values are “8”, “9”, “10”, and “11”.
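
The two indexes described here could be created as in this short sketch:

```python
import pandas as pd

# Two standalone indexes; no DataFrame is involved in this example.
index1 = pd.Index([4, 5, 6, 7])
index2 = pd.Index([8, 9, 10, 11])

print(index1.tolist())   # [4, 5, 6, 7]
print(index2.tolist())   # [8, 9, 10, 11]
```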


Next, we call “index1.union(index2)”, which returns a new index containing the labels of both indexes, sorted where possible. We then use the “print()” function to display the result.
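
A self-contained sketch of this step follows. The variable name “result” is our own choice, and the exact printed form of the index differs between pandas versions, so only the labels are shown in the comment.

```python
import pandas as pd

index1 = pd.Index([4, 5, 6, 7])
index2 = pd.Index([8, 9, 10, 11])

# Union of the two indexes: every label from either index.
result = index1.union(index2)
print(result)            # printed form depends on the pandas version
print(result.tolist())   # [4, 5, 6, 7, 8, 9, 10, 11]
```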


The output shows that the combined index starts at four and ends at eleven. This time, only the index is displayed, on a single line. The output also indicates that its data type is “int64”.

Example 3: A Union of Three DataFrame Indexes Using the concat() Method

This example is similar to the first one, but it combines three DataFrames and numbers their rows sequentially. In plain Python, the “index()” method of a list or string returns the position of a given element or substring. A pandas index plays a related role: it labels each row, and by default those labels are simply the row positions.

We must import the pandas library as “pd” before we can start writing the code for this example. In this scenario, we create three DataFrames. The first DataFrame is named “data1” and has three columns: “Student_Name”, “Marks”, and “Remarks”. The first column, “Student_Name”, contains “Noah”, “Emma”, “Enna”, and “George”. The second column, “Marks”, lists the students’ marks: “450”, “490”, “482”, and “209”. The final column, “Remarks”, records either “Pass” or “Fail” for each student. We create this DataFrame by using “pd.DataFrame”.
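
A sketch of “data1” is shown below. The text does not say which student passed or failed, so the “Remarks” values here are hypothetical placeholders; the marks are assumed to be integers.

```python
import pandas as pd

data1 = pd.DataFrame({
    "Student_Name": ["Noah", "Emma", "Enna", "George"],
    "Marks": [450, 490, 482, 209],
    # Hypothetical values: the article only says each entry is Pass or Fail.
    "Remarks": ["Pass", "Pass", "Pass", "Fail"],
})
print(data1)
```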


It’s time to construct the second DataFrame, “data2”. It has the same three columns as the first one, “Student_Name”, “Marks”, and “Remarks”, but with different values. In the first column, “Student_Name”, we have “Watson”, “Henry”, “James”, and “Oliver”. In the second column, “Marks”, we have “499”, “390”, “290”, and “400”. In the last column, we have the remarks “Pass” or “Fail”. We again use “pd.DataFrame” to produce “data2”.
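
A sketch of “data2”, built the same way. As before, the individual “Remarks” values are not given in the text, so the ones below are hypothetical placeholders.

```python
import pandas as pd

data2 = pd.DataFrame({
    "Student_Name": ["Watson", "Henry", "James", "Oliver"],
    "Marks": [499, 390, 290, 400],
    # Hypothetical values: the article only says each entry is Pass or Fail.
    "Remarks": ["Pass", "Pass", "Fail", "Pass"],
})
print(data2)
```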


It’s time to create the third DataFrame, “data3”, which has three columns with the same names as the previous DataFrames but with different values. The values in the first column are “Archie”, “Ethan”, “Michael”, and “Samuel”. We have “230”, “498”, “290”, and “403” in the second column, and “Fail”, “Pass”, “Fail”, and “Pass” in the third. To generate “data3”, we again use “pd.DataFrame”.
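
The third DataFrame is fully specified in the text, so a sketch of it needs no placeholder values (the marks are still assumed to be integers):

```python
import pandas as pd

data3 = pd.DataFrame({
    "Student_Name": ["Archie", "Ethan", "Michael", "Samuel"],
    "Marks": [230, 498, 290, 403],
    "Remarks": ["Fail", "Pass", "Fail", "Pass"],
})
print(data3)
```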


Here, we combine our three DataFrames and number their rows in one continuous sequence by using the “concat()” function. We use “pd.concat([data1, data2, data3], ignore_index=True)” to combine the DataFrames “data1”, “data2”, and “data3”. Setting “ignore_index=True” instructs the concatenation to disregard the existing indexes, so the repeated labels are dropped and the result receives a new index that starts from “0” and continues in order. We store the result in the variable “union” and then invoke the “print()” function to display it.
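
The whole step can be sketched as follows. The three DataFrames are rebuilt so that the snippet runs on its own; the “Remarks” values for “data1” and “data2” are hypothetical, since the text does not list them.

```python
import pandas as pd

data1 = pd.DataFrame({
    "Student_Name": ["Noah", "Emma", "Enna", "George"],
    "Marks": [450, 490, 482, 209],
    "Remarks": ["Pass", "Pass", "Pass", "Fail"],   # hypothetical values
})
data2 = pd.DataFrame({
    "Student_Name": ["Watson", "Henry", "James", "Oliver"],
    "Marks": [499, 390, 290, 400],
    "Remarks": ["Pass", "Pass", "Fail", "Pass"],   # hypothetical values
})
data3 = pd.DataFrame({
    "Student_Name": ["Archie", "Ethan", "Michael", "Samuel"],
    "Marks": [230, 498, 290, 403],
    "Remarks": ["Fail", "Pass", "Fail", "Pass"],
})

# Pass the DataFrames as one list; ignore_index=True renumbers rows 0..11.
union = pd.concat([data1, data2, data3], ignore_index=True)
print(union)
```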


In the output, the three DataFrames are shown as a single DataFrame. They were combined by the “concat()” function, and the parameter “ignore_index=True” gave them one continuous index. Because all three DataFrames have the same column names, no extra columns appear in the result: “Student_Name”, “Marks”, and “Remarks” are the three columns displayed after concatenation. The index starts at “0” and ends at “11”, so the result has “12” rows.

Conclusion

In this article, we concatenated DataFrames with the “concat()” function to obtain a single combined index. We passed the parameter “ignore_index=True” because we did not want repeated index labels. In the second example, which is brief and simple, we used the “Index.union()” method to combine two indexes and saw the data type of the result. We hope that these methods will simplify your task.

About the author

Aqsa Yasin

I am a self-motivated information technology professional with a passion for writing. I am a technical writer and love to write for all Linux flavors and Windows.