
Pandas Flatten MultiIndex

In Pandas, a “MultiIndex” is an index with more than one level. A DataFrame can have a multi-level index on its rows, its columns, or both. Analyzing or manipulating data with a multi-index can be cumbersome, so it is often simpler to flatten the multi-index rows or columns, either entirely or at specified levels.
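As a quick check before flattening anything, the short sketch below shows one way to see how many levels an index has. It reuses the sample names and ages from the examples in this post; the variable names (“row_levels”, “col_levels”, “is_multi”) are only illustrative.

```python
import pandas

# Two-level row index built from (Name, Age) tuples
index = pandas.MultiIndex.from_tuples(
    [('Joseph', 22), ('Lily', 21), ('Anna', 19)], names=['Name', 'Age'])
df = pandas.DataFrame({'Score-1': [28, 29, 30]}, index=index)

row_levels = df.index.nlevels     # number of levels on the row index
col_levels = df.columns.nlevels   # number of levels on the column index
is_multi = isinstance(df.index, pandas.MultiIndex)
print(row_levels, col_levels, is_multi)
```

Here the rows have two levels while the columns have one, so only the row index would need flattening.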

This post discusses the following methods for flattening the multi-index of a DataFrame in Python: “DataFrame.reset_index()”, “DataFrame.to_records()”, and “DataFrame.columns.get_level_values()”.

Method 1: Flatten MultiIndex in Pandas Using the “DataFrame.reset_index()” Method

The “DataFrame.reset_index()” method resets the index, or one or more specified levels of a multi-index, and moves those levels into regular columns (unless “drop=True” is passed).

Syntax

DataFrame.reset_index(level=None, *, drop=False, allow_duplicates=_NoDefault.no_default, inplace=False, col_level=0, col_fill='', names=None)

To get a detailed overview of the “DataFrame.reset_index()” method you can check this dedicated guide.

Example 1: Flattening the MultiIndex of All Levels

Here, we import the “pandas” module and create a multi-index from a list of tuples using the “pandas.MultiIndex.from_tuples()” method. Next, “pandas.DataFrame()” builds a DataFrame with that multi-index. Lastly, “DataFrame.reset_index()” flattens all levels of the multi-index:

import pandas
data = pandas.MultiIndex.from_tuples([('Joseph', 22),('Lily', 21),('Anna', 19)],names=['Name', 'Age'])
df = pandas.DataFrame({'Score-1': [28, 29, 30],'Score-2': [48, 19, 20],'Score-3': [18, 29, 30]},index=data)
print(df, '\n')
df.reset_index(inplace=True)
print(df)

After the call, “Name” and “Age” are ordinary columns and the DataFrame has a default integer index.
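The example above modifies the DataFrame in place. As a minimal sketch of the non-inplace form, the code below calls “reset_index()” without “inplace=True”, which returns a new DataFrame and leaves the original untouched. The names “flat” and “flat_columns” are illustrative only.

```python
import pandas

index = pandas.MultiIndex.from_tuples(
    [('Joseph', 22), ('Lily', 21), ('Anna', 19)], names=['Name', 'Age'])
df = pandas.DataFrame({'Score-1': [28, 29, 30], 'Score-2': [48, 19, 20],
                       'Score-3': [18, 29, 30]}, index=index)

# Without inplace=True, reset_index() returns a new, flattened DataFrame
flat = df.reset_index()
flat_columns = list(flat.columns)

print(flat_columns)       # former index levels come first
print(list(flat.index))   # default integer index
print(df.index.nlevels)   # the original still has its two-level index
```

Assigning the result to a new variable keeps the original multi-index available if it is needed later.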

Example 2: Flattening the MultiIndex of Specified Levels

In this code, the “level” parameter of “DataFrame.reset_index()” receives the name of the index level to flatten, so only that level of the multi-index is moved into a column:

import pandas
data = pandas.MultiIndex.from_tuples([('Joseph', 22),('Lily', 21),('Anna', 19)],names=['Name', 'Age'])
df = pandas.DataFrame({'Score-1': [28, 29, 30],'Score-2': [48, 19, 20],'Score-3': [18, 29, 30]},index=data)
print(df, '\n')
df.reset_index(inplace=True, level=['Age'])
print(df)

Only the “Age” level has been flattened into a column, while “Name” remains as the index.
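Two variations on the “level” parameter may be useful here. The sketch below, which uses the same sample data, selects the level by position instead of by name, and then uses “drop=True” from the syntax above to discard the level rather than turn it into a column. The variable names “partial” and “dropped” are illustrative.

```python
import pandas

index = pandas.MultiIndex.from_tuples(
    [('Joseph', 22), ('Lily', 21), ('Anna', 19)], names=['Name', 'Age'])
df = pandas.DataFrame({'Score-1': [28, 29, 30], 'Score-2': [48, 19, 20]},
                      index=index)

# Select the level by position: level 1 is 'Age'
partial = df.reset_index(level=1)
print(list(partial.columns), partial.index.name)

# drop=True removes the level instead of inserting it as a column
dropped = df.reset_index(level='Age', drop=True)
print(list(dropped.columns), dropped.index.name)
```

In both cases “Name” stays as a single-level index; only the handling of “Age” differs.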

Method 2: Flatten MultiIndex in Pandas Using the “DataFrame.to_records()” Method

The “DataFrame.to_records()” method converts a DataFrame into a NumPy record array and, by default, includes the index levels as fields. Passing that record array back to “pandas.DataFrame()” therefore produces a DataFrame with a flat index.

Syntax

DataFrame.to_records(index=True, column_dtypes=None, index_dtypes=None)

We cover the “DataFrame.to_records()” method in our dedicated guide.

In the code below, we first use “df.to_records()” to convert the multi-index DataFrame into a NumPy record array. We then pass the result to “pandas.DataFrame()” to create a new DataFrame in which all index levels have become columns:

import pandas
data = pandas.MultiIndex.from_tuples([('Joseph', 22),('Lily', 21),('Anna', 19)],names=['Name', 'Age'])
df = pandas.DataFrame({'Score-1': [28, 29, 30],'Score-2': [48, 19, 20],'Score-3': [18, 29, 30]},index=data)
print(df, '\n')
df1 = pandas.DataFrame(df.to_records())
print(df1)

The new DataFrame “df1” has “Name” and “Age” as regular columns and a default integer index.
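To see why this works, it can help to inspect the field names of the record array. The following sketch, using the same sample data, prints them with and without the index; “field_names” and “no_index_names” are illustrative variable names.

```python
import pandas

index = pandas.MultiIndex.from_tuples(
    [('Joseph', 22), ('Lily', 21), ('Anna', 19)], names=['Name', 'Age'])
df = pandas.DataFrame({'Score-1': [28, 29, 30], 'Score-2': [48, 19, 20],
                       'Score-3': [18, 29, 30]}, index=index)

# By default (index=True) each index level becomes a field of the record array
records = df.to_records()
field_names = list(records.dtype.names)
print(field_names)

# With index=False the index levels are left out entirely
no_index_names = list(df.to_records(index=False).dtype.names)
print(no_index_names)

# The field names become the columns of the rebuilt DataFrame
flat = pandas.DataFrame(records)
print(list(flat.columns))
```

Because the index levels are plain fields in the record array, the rebuilt DataFrame has no multi-index left to flatten.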

Method 3: Flatten Multi-Index Columns of Pandas DataFrame Using the “DataFrame.columns.get_level_values()” Method

The “get_level_values()” method of a Pandas “Index” or “MultiIndex” returns the labels of the requested level. We can use it to flatten multi-index columns by keeping a single level. Here, “df.groupby()” groups the data by the “Age” column and “agg()” applies the aggregate functions to the selected columns, which produces two-level columns. Lastly, we assign “df1.columns.get_level_values(1)” back to “df1.columns” so that only the second level (the function names) remains:

import pandas
df = pandas.DataFrame({'Name': ['Joseph', 'Anna', 'Lily'],'Age': ['22', '22', '19'],
                   'Score-1': [28, 29, 30],'Score-2': [48, 19, 20]})
df1 = df.groupby('Age').agg({ 'Score-1': ['mean', 'sum'], 'Score-2': 'sum'})
print(df1, '\n')
df1.columns = df1.columns.get_level_values(1)
print(df1)

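After the assignment, the columns carry only the second-level labels “mean”, “sum”, and “sum”. Note that the two “sum” columns now share a label, because the first level that distinguished “Score-1” from “Score-2” was discarded.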
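Keeping a single level can leave duplicate column labels when the same function is applied to several columns. One common alternative, shown here only as a sketch and not as part of the method above, is to join both levels of each column tuple into one string. The underscore separator is an arbitrary choice.

```python
import pandas

df = pandas.DataFrame({'Name': ['Joseph', 'Anna', 'Lily'], 'Age': ['22', '22', '19'],
                       'Score-1': [28, 29, 30], 'Score-2': [48, 19, 20]})
df1 = df.groupby('Age').agg({'Score-1': ['mean', 'sum'], 'Score-2': 'sum'})

# Each column label is a tuple such as ('Score-1', 'mean');
# joining its parts keeps every flattened label unique
df1.columns = ['_'.join(col) for col in df1.columns]
flat_columns = list(df1.columns)
print(df1)
```

With this approach each flattened label still shows which source column and which aggregate function it came from.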

Conclusion

The “df.reset_index()”, “df.to_records()”, and “df.columns.get_level_values()” methods can be used to flatten the multi-index of a DataFrame in Python. The “df.reset_index()” method flattens all levels or only specified levels of a row multi-index. The “df.to_records()” method, combined with “pandas.DataFrame()”, flattens all levels of the row multi-index. The “df.columns.get_level_values()” method flattens multi-index columns by keeping a single level. This guide demonstrated each approach with examples.

About the author

Haroon Javed

Hi, I'm Haroon. I am an electronics engineer and a technical content writer. I am a tech geek who loves to help people to the best of my knowledge.