The average, or mean, of a set of values is found by adding all the values and dividing the sum by the number of values. When working with grouped data in a Pandas DataFrame, we sometimes need the mean of specific columns within each group. The “df.groupby()” method is used along with the “mean()” method to determine the mean of one or more DataFrame columns for each group.
This post explains how to determine the mean/average of grouped DataFrame data.
How to Determine the Mean/Average by Group in Pandas DataFrame?
The “groupby()” method is used along with the “mean()” method to group the data by one or more columns and find the mean/average of one or more columns within each group.
Let’s explore this method using the examples below:
Example 1: Determine the Mean of a Column Grouped by a Single DataFrame Column
Let’s use the code below to determine the mean of a column that is grouped by a single column:
import pandas

data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Henry', 'Joseph', 'Anna'], 'Age': [15, 23, 32, 18, 14, 32], 'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pandas.DataFrame(data1)
print(df, '\n')
# Group the rows by "Name" and average "Age" within each group
df1 = df.groupby(['Name'])['Age'].mean()
print(df1)
Here in this code:
- The “pandas” module is imported.
- The “pandas.DataFrame()” method takes the dictionary data as an argument and creates the DataFrame.
- The “df.groupby()” method is used to group the data based on the single column “Name”.
- After grouping data based on a single column, the “mean()” method is used to determine the mean or average of another column named “Age”, based on the group data.
Output
The mean/average of the “Age” column has been calculated for each “Name” group.
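Because each group mean depends only on the rows in that group, the result can be checked by hand. The sketch below reuses the data from this example and compares the “groupby()” result with a manual sum-divided-by-count calculation for one group. The names “means”, “joseph_ages”, and “manual_mean” are illustrative and not part of the original example.

```python
import pandas

data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Henry', 'Joseph', 'Anna'],
         'Age': [15, 23, 32, 18, 14, 32],
         'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pandas.DataFrame(data1)

# Mean of "Age" for each distinct value of "Name"
means = df.groupby('Name')['Age'].mean()

# Manual check for one group: sum of the values divided by their count
joseph_ages = df.loc[df['Name'] == 'Joseph', 'Age']
manual_mean = joseph_ages.sum() / len(joseph_ages)

print(means)
print(manual_mean)  # (15 + 14) / 2 = 14.5
```

The group labels in the result are sorted, so the rows appear in the order Anna, Henry, Joseph, Lily.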
Example 2: Determine the Mean of a Column Grouped by Multiple DataFrame Columns
Let’s look at the code below:
import pandas

data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Lily', 'Joseph', 'Anna'], 'Age': [15, 32, 23, 18, 15, 23], 'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pandas.DataFrame(data1)
print(df, '\n')
# Group the rows by each ("Name", "Age") pair and average "Height" within each group
df1 = df.groupby(['Name', 'Age'])['Height'].mean()
print(df1)
In the above code:
- The “df.groupby()” method groups data based on the multiple columns “Name” and “Age”.
- The “mean()” method is used along with the “groupby()” method to determine the mean or average of the single column “Height” for each group.
Output
The mean/average of the single column “Height” has been calculated for each group formed by the multiple columns “Name” and “Age”.
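Grouping by more than one column returns a Series whose index combines the group keys (a MultiIndex). As a minimal sketch, reusing the data from this example, one group can be looked up with a tuple of keys, and “reset_index()” turns the keys back into ordinary columns. The names “by_name_age” and “flat” are illustrative only.

```python
import pandas

data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Lily', 'Joseph', 'Anna'],
         'Age': [15, 32, 23, 18, 15, 23],
         'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pandas.DataFrame(data1)

# The result is indexed by (Name, Age) pairs
by_name_age = df.groupby(['Name', 'Age'])['Height'].mean()

# Look up a single group with a tuple of its keys
print(by_name_age.loc[('Joseph', 15)])

# reset_index() moves the group keys into regular columns
flat = by_name_age.reset_index()
print(flat)
```

The flat form is often easier to work with when the result needs to be merged with other data or written to a file.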
Example 3: Determine the Mean of Multiple Columns Grouped by a Single DataFrame Column
This example determines the mean of multiple columns for each group:
import pandas

data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Lily', 'Joseph', 'Anna'], 'Age': [15, 32, 23, 18, 15, 23], 'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pandas.DataFrame(data1)
print(df, '\n')
# Group the rows by "Name" and average both "Age" and "Height" within each group
df1 = df.groupby(['Name'])[['Age','Height']].mean()
print(df1)
In the above code:
- The “df.groupby()” method is used along with the “mean()” method to determine the mean of the multiple columns “Age” and “Height” for each group of the single column “Name”.
Output
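The result in this case is a DataFrame rather than a Series, with one column of means for each selected column. The following is a minimal sketch that reuses the data from this example; the name “means” is illustrative only.

```python
import pandas

data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Lily', 'Joseph', 'Anna'],
         'Age': [15, 32, 23, 18, 15, 23],
         'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pandas.DataFrame(data1)

# Selecting a list of columns yields a DataFrame of per-group means
means = df.groupby('Name')[['Age', 'Height']].mean()
print(means)

# Individual values can be read with .loc[row_label, column_label]
print(means.loc['Lily', 'Age'])  # (32 + 18) / 2 = 25.0
```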
Alternative Method: Using the “agg()” Function to Determine the Mean/Average of DataFrame Groups
The “agg()” function can also be used to determine the mean/average of Pandas DataFrame data grouped by single or multiple columns. Let’s apply this method in the example below:
import pandas

data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Henry', 'Joseph', 'Anna'], 'Age': [15, 23, 32, 18, 14, 32], 'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pandas.DataFrame(data1)
print(df, '\n')
# Group the rows by "Name" and pass the function name 'mean' to agg()
df1 = df.groupby(['Name'])['Age'].agg('mean')
print(df1)
In the above code:
- The “df.groupby()” method groups the DataFrame data based on the single column “Name”.
- The “agg()” method takes the function name “mean” as a string argument and determines the mean/average of the specified column “Age” for each group.
Output
The mean/average has been determined successfully.
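As a rough sketch of how the two approaches relate, the code below checks that “agg('mean')” and “mean()” give the same result on this data, and then passes a list of function names to “agg()” to compute several statistics in one call. The names “via_mean”, “via_agg”, and “summary” are illustrative and not part of the original example.

```python
import pandas

data1 = {'Name': ['Joseph', 'Lily', 'Anna', 'Henry', 'Joseph', 'Anna'],
         'Age': [15, 23, 32, 18, 14, 32],
         'Height': [5.6, 6.2, 3.7, 6.1, 4.3, 5.3]}
df = pandas.DataFrame(data1)

# The same per-group mean, computed in two ways
via_mean = df.groupby('Name')['Age'].mean()
via_agg = df.groupby('Name')['Age'].agg('mean')
print(via_mean.equals(via_agg))

# A list of function names gives one output column per function
summary = df.groupby('Name')['Age'].agg(['mean', 'min', 'max'])
print(summary)
```

For a single statistic, “mean()” is the shorter form; “agg()” becomes more useful when several statistics are needed at once.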
Conclusion
In Python, the “groupby()” method is used along with the “mean()” method to determine the mean/average of single or multiple columns for each group of a Pandas DataFrame. The “agg()” method can also be used as an alternative way to determine the mean/average for each group. This write-up demonstrated both approaches using several examples.