
Pandas Merge by Index

In Python, the “pandas” library provides many functions and methods for working with its objects. A common operation is merging DataFrames, that is, combining two DataFrames into one by matching rows on shared columns or on the index. DataFrames can be merged with several methods, such as “pandas.merge()”, “pandas.concat()”, and “DataFrame.join()”.

This guide explains how to merge Pandas DataFrames by index, with an example for each method.

How to Merge Pandas DataFrame by Index in Python?

Pandas DataFrames can be merged by index with any of the following three methods: “pandas.merge()”, “pandas.concat()”, and “DataFrame.join()”.

Merge Pandas DataFrame by Index Using “pandas.merge()” Method

The “pandas.merge()” method combines two DataFrames with a database-style join. By default it performs an inner join, so when merging by index, only the rows whose index labels appear in both DataFrames are returned.

Syntax

pandas.merge(left, right, how='inner', on=None, left_on=None, right_on=None, left_index=False, right_index=False, sort=False, suffixes=('_x', '_y'), copy=None, indicator=False, validate=None)

 

Parameters

According to the above-provided syntax:

  • The “right” parameter specifies the DataFrame or named Series to merge with. When the function is called as “pandas.merge()”, the first DataFrame is passed before it as “left”.
  • The “how” parameter sets the type of merge to perform: “left”, “right”, “outer”, “inner”, or “cross”.
  • The “on” parameter specifies the column or index level names to join on. These names must be present in both objects.
  • The “left_on” and “right_on” parameters specify the column or index level names to join on in the left and right objects, respectively.
  • The “left_index” and “right_index” parameters are Boolean values that indicate whether to use the index of the left or right object as the join key.
  • The remaining parameters control other aspects of the merge, such as sorting and column suffixes. For full details, see the official pandas documentation.
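The effect of the “how” and “indicator” parameters on an index merge can be sketched as follows. The DataFrames “left” and “right” and their values are made up for illustration and are not part of the examples in this guide:

```python
import pandas

# Hypothetical sample data; names and values are illustrative only.
left = pandas.DataFrame({'Name': ['Anna', 'Lily', 'Jena']}, index=['A', 'B', 'C'])
right = pandas.DataFrame({'Height': [5.1, 4.5, 4.1]}, index=['A', 'B', 'H'])

# how='outer' keeps every index label found in either DataFrame.
# indicator=True adds a "_merge" column that records whether each row
# came from the left DataFrame, the right DataFrame, or both.
merged = pandas.merge(left, right, how='outer',
                      left_index=True, right_index=True, indicator=True)
print(merged)
```

The “_merge” column is a quick way to see which index labels had no partner in the other DataFrame before deciding which join type to use.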

The following code merges the two DataFrames “df1” and “df2” by index using the “pandas.merge()” method. Setting “left_index=True” and “right_index=True” makes both indexes the join keys, and because the default join is “inner”, the returned DataFrame contains only the index labels that appear in both DataFrames:

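import pandas
# First DataFrame, indexed by the labels A to D
df1 = pandas.DataFrame({'Name': ['Anna', 'Lily', 'Jena', 'Jane'], 'Age': [15, 19, 14, 18]}, index=['A', 'B', 'C', 'D'])
print(df1, '\n')
# Second DataFrame; it shares only the labels A, B, and C with df1
df2 = pandas.DataFrame({'Height': [5.1, 4.5, 4.1, 5.4], 'Salary': [1245, 1580, 1364, 1677]}, index=['A', 'B', 'H', 'C'])
print(df2, '\n')
# Use the index of both DataFrames as the join key (inner join by default)
df = pandas.merge(df1, df2, left_index=True, right_index=True)
print(df)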

 

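The output prints both input DataFrames followed by the merged DataFrame, which contains only the rows labeled “A”, “B”, and “C”, the index labels shared by “df1” and “df2”.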
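The “left_on” and “right_index” parameters can also be combined to match a column of one DataFrame against the index of the other. The sketch below assumes two made-up DataFrames, “people” and “details”, where the hypothetical “Code” column holds the same labels that “details” uses as its index:

```python
import pandas

# Hypothetical data: the 'Code' column of people holds the labels
# that details uses as its index.
people = pandas.DataFrame({'Name': ['Anna', 'Lily', 'Jena'],
                           'Code': ['A', 'B', 'C']})
details = pandas.DataFrame({'Height': [5.1, 4.5]}, index=['A', 'C'])

# Match the 'Code' column on the left against the index on the right.
# The default inner join drops 'Lily', whose code 'B' is not in details.
matched = pandas.merge(people, details, left_on='Code', right_index=True)
print(matched)
```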

Merge Pandas DataFrame by Index Using “pandas.concat()” Method

The “pandas.concat()” method concatenates Pandas objects along a specified axis. With “axis=1”, it can also be used to merge two DataFrames by index. Its syntax is:

pandas.concat(objs, *, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=False, copy=None)

 

Check this guide for a detailed understanding of the “pandas.concat()” method.

In the code below, “pandas.concat()” is called with “axis=1”, so the DataFrames are placed side by side and aligned on their index. Because the default is “join='outer'”, the new DataFrame contains every index label from either DataFrame:

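import pandas
# First DataFrame, indexed 0 to 3
df1 = pandas.DataFrame({'Name': ['Anna', 'Lily', 'Jena', 'Jane'], 'Age': [15, 19, 14, 18]}, index=[0, 1, 2, 3])
print(df1, '\n')
# Second DataFrame; label 4 is not in df1, and label 3 is missing here
df2 = pandas.DataFrame({'Height': [5.1, 4.5, 4.1, 5.4], 'Salary': [1245, 1580, 1364, 1677]}, index=[0, 1, 4, 2])
print(df2, '\n')
# Place the columns side by side, aligned on the index (outer join by default)
df = pandas.concat([df1, df2], axis=1)
print(df)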

 

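The output shows the concatenated DataFrame with the index labels 0 through 4. Cells with no matching data are filled with “NaN”: the “Height” and “Salary” columns for label 3, and the “Name” and “Age” columns for label 4.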
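If only the index labels common to both DataFrames are wanted, the “join” parameter of “pandas.concat()” can be set to “inner”. A minimal sketch, using shortened versions of the DataFrames from this section:

```python
import pandas

df1 = pandas.DataFrame({'Name': ['Anna', 'Lily', 'Jena', 'Jane']}, index=[0, 1, 2, 3])
df2 = pandas.DataFrame({'Height': [5.1, 4.5, 4.1, 5.4]}, index=[0, 1, 4, 2])

# join='inner' keeps only the index labels present in every DataFrame,
# so labels 3 and 4 are dropped and no NaN values are introduced.
common = pandas.concat([df1, df2], axis=1, join='inner')
print(common)
```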

Merge Pandas DataFrame by Index Using “DataFrame.join()” Method

In Python, the “DataFrame.join()” method joins the columns of another DataFrame to the calling DataFrame. By default it matches rows on the index, so it can be used to merge DataFrames by index. Its syntax is:

DataFrame.join(other, on=None, how='left', lsuffix='', rsuffix='', sort=False, validate=None)

 

For a detailed overview of “DataFrame.join()”, check this tutorial.

In this example, the “DataFrame.join()” method performs a left join, which is its default. The returned DataFrame keeps every index label of the first DataFrame and adds the matching columns from the second:

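import pandas
# Left DataFrame; its index labels 0 to 3 are all kept by the join
df1 = pandas.DataFrame({'Name': ['Anna', 'Lily', 'Jena', 'Jane'], 'Age': [15, 19, 14, 18]}, index=[0, 1, 2, 3])
print(df1, '\n')
# Right DataFrame; only the rows whose labels match df1 are used
df2 = pandas.DataFrame({'Height': [5.1, 4.5, 4.1, 5.4], 'Salary': [1245, 1580, 1364, 1677]}, index=[0, 1, 4, 2])
print(df2, '\n')
# Join the columns of df2 onto df1 by index (left join by default)
df = df1.join(df2)
print(df)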

 

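The output shows the joined DataFrame, which keeps the index labels 0 through 3 of “df1”. Label 3 has no match in “df2”, so its “Height” and “Salary” values are “NaN”, and label 4 of “df2” is dropped.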
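The “how”, “lsuffix”, and “rsuffix” parameters from the syntax above change the join type and resolve clashing column names. The following sketch assumes two made-up DataFrames that both have a column named “Score”; the suffix strings are arbitrary choices:

```python
import pandas

# Hypothetical data: both DataFrames have a column called 'Score'.
df1 = pandas.DataFrame({'Score': [10, 20, 30]}, index=[0, 1, 2])
df2 = pandas.DataFrame({'Score': [1.5, 2.5]}, index=[1, 2])

# Overlapping column names need suffixes, otherwise join() raises an error.
# how='inner' keeps only the index labels found in both DataFrames.
joined = df1.join(df2, how='inner', lsuffix='_left', rsuffix='_right')
print(joined)
```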

Conclusion

The “pandas.merge()”, “pandas.concat()”, and “DataFrame.join()” methods can all be used to merge Pandas DataFrames by index. By default, they perform an inner join, an outer join, and a left join, respectively. This blog explained how to merge Pandas DataFrames by index with an example for each method.

About the author

Haroon Javed

Hi, I'm Haroon. I am an electronics engineer and a technical content writer. I am a tech geek who loves to help people to the best of my knowledge.