In Pandas, a “MultiIndex” is an index with more than one level. A DataFrame can have a multi-level index on its rows, its columns, or both. Analyzing or manipulating data with a multi-index can be cumbersome, so it often helps to flatten the multi-index rows or columns, either entirely or at specified levels.
This post discusses the following methods for flattening the multi-index of a DataFrame in Python:
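As a quick illustration of a multi-index on both axes, the following minimal sketch builds a small DataFrame and checks how many levels each axis has. The column labels “first” and “second” are made up for this illustration and are not used elsewhere in this post:

```python
import pandas

# Two-level row index built from (Name, Age) tuples
rows = pandas.MultiIndex.from_tuples(
    [("Joseph", 22), ("Lily", 21)], names=["Name", "Age"]
)
# Two-level column index (labels are illustrative only)
cols = pandas.MultiIndex.from_tuples([("Score", "first"), ("Score", "second")])

df = pandas.DataFrame([[28, 48], [29, 19]], index=rows, columns=cols)

# nlevels reports the number of levels on each axis
print(df.index.nlevels, df.columns.nlevels)
```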
- Using the “DataFrame.reset_index()” Method
- Using the “DataFrame.to_records()” Method
- Using the “DataFrame.columns.get_level_values()” Method
Method 1: Flatten MultiIndex in Pandas Using the “DataFrame.reset_index()” Method
The “DataFrame.reset_index()” method moves all levels, or only the specified levels, of a multi-index into regular columns.
Syntax
To get a detailed overview of the “DataFrame.reset_index()” method you can check this dedicated guide.
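As a rough sketch of how the method is typically called (not a full signature reference), the snippet below compares the default call with the “drop=True” option. The variable names “moved” and “dropped” are illustrative:

```python
import pandas

index = pandas.MultiIndex.from_tuples(
    [("Joseph", 22), ("Lily", 21), ("Anna", 19)], names=["Name", "Age"]
)
df = pandas.DataFrame({"Score-1": [28, 29, 30]}, index=index)

# Default call: every index level becomes a regular column.
# Without inplace=True a new DataFrame is returned and df is unchanged.
moved = df.reset_index()

# drop=True: the index levels are discarded instead of becoming columns.
dropped = df.reset_index(drop=True)

print(list(moved.columns))    # ['Name', 'Age', 'Score-1']
print(list(dropped.columns))  # ['Score-1']
```

Returning a new DataFrame rather than using “inplace=True” keeps the original object available for comparison.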
Example 1: Flattening the MultiIndex of All Levels
Here, we import the “pandas” module and create the multi-index from a list of tuples using the “pandas.MultiIndex.from_tuples()” method. Next, “pandas.DataFrame()” builds a DataFrame with that multi-index. Lastly, the “DataFrame.reset_index()” method flattens all the levels of the multi-index:
import pandas

# Build a two-level row index from (Name, Age) tuples
data = pandas.MultiIndex.from_tuples([('Joseph', 22),('Lily', 21),('Anna', 19)],names=['Name', 'Age'])
df = pandas.DataFrame({'Score-1': [28, 29, 30],'Score-2': [48, 19, 20],'Score-3': [18, 29, 30]},index=data)
print(df, '\n')
# Move every index level into a regular column, modifying df in place
df.reset_index(inplace=True)
print(df)
After the call, “Name” and “Age” are regular columns and the DataFrame has a default integer index.
Example 2: Flattening the MultiIndex of Specified Levels
In this code, the “DataFrame.reset_index()” method receives the name of an index level in the “level” parameter, so only that level of the multi-index is flattened:
import pandas

# Build a two-level row index from (Name, Age) tuples
data = pandas.MultiIndex.from_tuples([('Joseph', 22),('Lily', 21),('Anna', 19)],names=['Name', 'Age'])
df = pandas.DataFrame({'Score-1': [28, 29, 30],'Score-2': [48, 19, 20],'Score-3': [18, 29, 30]},index=data)
print(df, '\n')
# Move only the 'Age' level into a column; 'Name' stays in the index
df.reset_index(inplace=True, level=['Age'])
print(df)
Only the “Age” level has been moved into a column; “Name” remains as the row index.
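The “level” parameter also accepts an integer position instead of a name. The following minimal sketch, with illustrative variable names, checks that both forms give the same result for this data:

```python
import pandas

index = pandas.MultiIndex.from_tuples(
    [("Joseph", 22), ("Lily", 21), ("Anna", 19)], names=["Name", "Age"]
)
df = pandas.DataFrame({"Score-1": [28, 29, 30]}, index=index)

# Select the level by name or by position (0 = 'Name', 1 = 'Age')
by_name = df.reset_index(level="Age")
by_position = df.reset_index(level=1)
print(by_name.equals(by_position))

# A list of levels flattens several levels in one call
both = df.reset_index(level=["Name", "Age"])
print(list(both.columns))
```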
Method 2: Flatten MultiIndex in Pandas Using the “DataFrame.to_records()” Method
The “df.to_records()” method converts a DataFrame into a NumPy record array in which the index levels are stored as ordinary fields. Passing that array back to “pandas.DataFrame()” therefore flattens the multi-index.
Syntax
We cover the “DataFrame.to_records()” method in our dedicated guide.
In the code below, we first use the “df.to_records()” method to convert the multi-index DataFrame into a NumPy record array. Next, we pass it to “pandas.DataFrame()” to create a new DataFrame in which all the multi-index levels are flattened:
import pandas

# Build a two-level row index from (Name, Age) tuples
data = pandas.MultiIndex.from_tuples([('Joseph', 22),('Lily', 21),('Anna', 19)],names=['Name', 'Age'])
df = pandas.DataFrame({'Score-1': [28, 29, 30],'Score-2': [48, 19, 20],'Score-3': [18, 29, 30]},index=data)
print(df, '\n')
# Convert to a record array, then rebuild a DataFrame with a flat index
df1 = pandas.DataFrame(df.to_records())
print(df1)
The resulting DataFrame has “Name” and “Age” as regular columns and a default integer index.
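To see why this works, it can help to inspect the intermediate record array. This sketch, using a shortened version of the data above and illustrative variable names, prints the field names of the array before it is turned back into a DataFrame:

```python
import pandas

index = pandas.MultiIndex.from_tuples(
    [("Joseph", 22), ("Lily", 21), ("Anna", 19)], names=["Name", "Age"]
)
df = pandas.DataFrame(
    {"Score-1": [28, 29, 30], "Score-2": [48, 19, 20]}, index=index
)

records = df.to_records()
# The index levels appear as the first fields, followed by the columns
print(records.dtype.names)

flat = pandas.DataFrame(records)
print(list(flat.columns))
```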
Method 3: Flatten Multi-Index Columns of Pandas DataFrame Using the “DataFrame.columns.get_level_values()” Method
The “get_level_values()” method of a MultiIndex returns the labels of the requested level as a single-level Index. Assigning that result back to “df.columns” flattens multi-index columns. Here, the “df.groupby()” method groups the data by the specified column and applies aggregate functions to particular columns, which produces two-level columns. Lastly, we use the “df.columns.get_level_values()” method with the level number passed as an argument to keep only that level:
import pandas

df = pandas.DataFrame({'Name': ['Joseph', 'Anna', 'Lily'],'Age': ['22', '22', '19'],
'Score-1': [28, 29, 30],'Score-2': [48, 19, 20]})
# Aggregating with several functions creates two-level columns
df1 = df.groupby('Age').agg({ 'Score-1': ['mean', 'sum'], 'Score-2': 'sum'})
print(df1, '\n')
# Keep only the second level (the aggregate function names)
df1.columns = df1.columns.get_level_values(1)
print(df1)
After the assignment, the columns form a single level with the labels “mean”, “sum”, and “sum”, while “Age” remains the row index. Note that two columns now share the label “sum” because the top-level names were dropped.
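Because keeping only level 1 discards the top-level names, a common alternative is to join both levels into one label. The sketch below is an addition to the methods listed in this post; the underscore separator and the variable name “flat” are illustrative choices:

```python
import pandas

df = pandas.DataFrame({
    "Name": ["Joseph", "Anna", "Lily"],
    "Age": ["22", "22", "19"],
    "Score-1": [28, 29, 30],
    "Score-2": [48, 19, 20],
})
df1 = df.groupby("Age").agg({"Score-1": ["mean", "sum"], "Score-2": "sum"})

# Each column label is a (top level, second level) tuple; join the parts
flat = df1.copy()
flat.columns = ["_".join(col) for col in df1.columns]
print(list(flat.columns))  # ['Score-1_mean', 'Score-1_sum', 'Score-2_sum']
```

This keeps every column label unique, at the cost of longer names.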
Conclusion
The “df.reset_index()”, “df.to_records()”, and “df.columns.get_level_values()” methods can be used to flatten the multi-index of a DataFrame in Python. The “df.reset_index()” method flattens all levels, or only the specified levels, of a row multi-index. The “df.to_records()” method combined with “pandas.DataFrame()” flattens all levels of the row multi-index, and “df.columns.get_level_values()” reduces multi-index columns to a single level. This guide demonstrated each approach with examples.