This article demonstrates how to sum all columns, or particular columns, of a Pandas DataFrame in Python. The examples in this tutorial use the DataFrame.sum() function together with a few of its helpful parameters.
The DataFrame.sum() function in Pandas returns the sum of the values along the specified axis. With axis=0 (the index axis, which is the default), it adds up the values in each column and returns a Series that holds one total per column. With axis=1, it adds up the values in each row instead. By default, missing values are ignored when the sum is computed.
Syntax
pandas.DataFrame.sum(axis, skipna, level, numeric_only, min_count)
Parameters
- axis: {index (0), columns (1)}. The axis to sum along: 0 returns one total per column, and 1 returns one total per row.
- skipna: Exclude NA/null values when calculating the result.
- level: If the specified axis is hierarchical (a MultiIndex), sum along a particular index level and collapse the result into a Series. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0; use groupby(level=...) instead.
- numeric_only: Include only float, int, and Boolean columns. If None (the default in older pandas versions), sum() tries to use every column and falls back to numeric columns only if that fails. Not implemented for Series.
- min_count: The number of non-NA values required to perform the operation. The result is NA if fewer than min_count non-NA values are present.
Return
DataFrame (if level specified) or Series.
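As a rough illustration of how skipna and min_count interact, the sketch below uses a small made-up DataFrame. The name demo and its values are assumptions for illustration only and are not part of the tutorial data.

```python
import numpy
import pandas

# Hypothetical data: column 'a' has one missing value, column 'b' is entirely missing
demo = pandas.DataFrame({
    'a': [1.0, numpy.nan, 3.0],
    'b': [numpy.nan, numpy.nan, numpy.nan],
})

# Default behaviour: NaN values are skipped, so an all-NaN column sums to 0.0
default_sum = demo.sum()

# skipna=False: any NaN in a column makes that column's total NaN
strict_sum = demo.sum(skipna=False)

# min_count=1: a column needs at least one non-NA value, otherwise its total is NaN
min_count_sum = demo.sum(min_count=1)

print(default_sum)
print(strict_sum)
print(min_count_sum)
```

Setting min_count=1 is a common way to tell an empty or all-missing column apart from one whose values really add up to zero.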
DataFrame
For all the examples, we will use the following ‘analysis’ DataFrame. It holds 12 rows with 5 columns.
import pandas

# Create the DataFrame using lists
analysis = pandas.DataFrame([[23,'sravan',1000,34,56],
[23,'sravan',700,11,0],
[23,'sravan',20,4,2],
[21,'siva',400,32,45],
[21,'siva',100,456,78],
[23,'sravan',0,90,12],
[21,'siva',400,32,45],
[20,'sahaja',120,1,67],
[23,'sravan',0,90,12],
[22,'suryam',450,76,56],
[22,'suryam',40,0,1],
[22,'suryam',12,45,0]
],columns=['id','name','points3','points1','points2'])
# Display the DataFrame - analysis
print(analysis)
Output
    id    name  points3  points1  points2
0   23  sravan     1000       34       56
1   23  sravan      700       11        0
2   23  sravan       20        4        2
3   21    siva      400       32       45
4   21    siva      100      456       78
5   23  sravan        0       90       12
6   21    siva      400       32       45
7   20  sahaja      120        1       67
8   23  sravan        0       90       12
9   22  suryam      450       76       56
10  22  suryam       40        0        1
11  22  suryam       12       45        0
Here, the ‘id’, ‘points3’, ‘points1’, and ‘points2’ columns are numeric. Make sure that you create this DataFrame before running the examples in this tutorial.
Scenario 1: Sum of All Columns
We can directly apply sum() on the DataFrame to return the sum of values in each column.
Example
print(analysis.sum())
Output
id                                                       264
name       sravansravansravansivasivasravansivasahajasrav...
points3                                                 3242
points1                                                  871
points2                                                  374
Explanation
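You can see that the sum of the values in each column is returned. For the ‘name’ column, which holds strings, sum() concatenates the values instead of adding numbers.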
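If only numeric totals are wanted, one option is the numeric_only parameter described earlier. The following is a minimal sketch that rebuilds the tutorial's ‘analysis’ DataFrame so it can run on its own; the variable name numeric_totals is chosen here for illustration.

```python
import pandas

# Rebuild the tutorial's DataFrame so this snippet is self-contained
analysis = pandas.DataFrame([[23,'sravan',1000,34,56],
                             [23,'sravan',700,11,0],
                             [23,'sravan',20,4,2],
                             [21,'siva',400,32,45],
                             [21,'siva',100,456,78],
                             [23,'sravan',0,90,12],
                             [21,'siva',400,32,45],
                             [20,'sahaja',120,1,67],
                             [23,'sravan',0,90,12],
                             [22,'suryam',450,76,56],
                             [22,'suryam',40,0,1],
                             [22,'suryam',12,45,0]
                            ],columns=['id','name','points3','points1','points2'])

# numeric_only=True leaves out the string column 'name'
numeric_totals = analysis.sum(numeric_only=True)
print(numeric_totals)
```

The result contains only the id, points3, points1, and points2 totals, so no string concatenation takes place.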
Scenario 2: Sum of Particular Column
If you want to return the sum of values in a particular column, select that column from the DataFrame by name and call sum() on it.
Example
Let’s return the sum of values in the ‘points1’, ‘points2’, and ‘points3’ columns separately.
# Return the sum of values in points1 column
print(analysis['points1'].sum())
# Return the sum of values in points2 column
print(analysis['points2'].sum())
# Return the sum of values in points3 column
print(analysis['points3'].sum())
Output
871
374
3242
Explanation
- Sum of values in the points1 column is 871.
- Sum of values in the points2 column is 374.
- Sum of values in the points3 column is 3242.
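The three separate calls above can also be combined by selecting several columns with a list of names. This is a small sketch that rebuilds the ‘analysis’ DataFrame so it runs on its own; the names column_totals and grand_total are illustrative assumptions.

```python
import pandas

# Rebuild the tutorial's DataFrame so this snippet is self-contained
analysis = pandas.DataFrame([[23,'sravan',1000,34,56],
                             [23,'sravan',700,11,0],
                             [23,'sravan',20,4,2],
                             [21,'siva',400,32,45],
                             [21,'siva',100,456,78],
                             [23,'sravan',0,90,12],
                             [21,'siva',400,32,45],
                             [20,'sahaja',120,1,67],
                             [23,'sravan',0,90,12],
                             [22,'suryam',450,76,56],
                             [22,'suryam',40,0,1],
                             [22,'suryam',12,45,0]
                            ],columns=['id','name','points3','points1','points2'])

# Selecting a list of columns returns a DataFrame; sum() then gives one total per column
column_totals = analysis[['points1','points2','points3']].sum()
print(column_totals)

# Summing the resulting Series gives a single grand total over all three columns
grand_total = column_totals.sum()
print(grand_total)
```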
Scenario 3: Sum Across Rows
If you want to return the sum of values across each row, then you need to specify the axis parameter in the sum() function and set it to 1.
Example
Let’s return the sum of values of ‘points1’, ‘points2’, and ‘points3’ across all rows and store the result in the ‘SUM’ column.
analysis['SUM']=analysis[['points1','points2','points3']].sum(axis=1)
print(analysis)
Output
    id    name  points3  points1  points2   SUM
0   23  sravan     1000       34       56  1090
1   23  sravan      700       11        0   711
2   23  sravan       20        4        2    26
3   21    siva      400       32       45   477
4   21    siva      100      456       78   634
5   23  sravan        0       90       12   102
6   21    siva      400       32       45   477
7   20  sahaja      120        1       67   188
8   23  sravan        0       90       12   102
9   22  suryam      450       76       56   582
10  22  suryam       40        0        1    41
11  22  suryam       12       45        0    57
Explanation
Now, the new ‘SUM’ column holds the sum of the three points columns for each row.
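When the columns to add share a naming pattern, listing each name by hand can be avoided. The sketch below is one possible variation, not the tutorial's own method: it uses DataFrame.filter(like=...) to pick every column whose name contains "points" and assign() to return a new DataFrame instead of modifying ‘analysis’ in place. The name with_total is an assumption for illustration.

```python
import pandas

# Rebuild the tutorial's DataFrame so this snippet is self-contained
analysis = pandas.DataFrame([[23,'sravan',1000,34,56],
                             [23,'sravan',700,11,0],
                             [23,'sravan',20,4,2],
                             [21,'siva',400,32,45],
                             [21,'siva',100,456,78],
                             [23,'sravan',0,90,12],
                             [21,'siva',400,32,45],
                             [20,'sahaja',120,1,67],
                             [23,'sravan',0,90,12],
                             [22,'suryam',450,76,56],
                             [22,'suryam',40,0,1],
                             [22,'suryam',12,45,0]
                            ],columns=['id','name','points3','points1','points2'])

# Pick every column whose label contains 'points', then sum across each row
row_totals = analysis.filter(like='points').sum(axis=1)

# assign() returns a copy with the extra column; 'analysis' itself is left unchanged
with_total = analysis.assign(SUM=row_totals)
print(with_total)
```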
We can also add across rows without using sum(). By using the “+” operator, we can achieve the previous functionality.
Example
- Add values in points1 and points2 columns and store the result in the ‘2 Added’ column.
- Add values in points1, points2, and points3 columns and store the result in the ‘3 Added’ column.
# Create the dataframe using lists
analysis = pandas.DataFrame([[23,'sravan',1000,34,56],
[23,'sravan',700,11,0],
[23,'sravan',20,4,2],
[21,'siva',400,32,45],
[21,'siva',100,456,78],
[23,'sravan',0,90,12],
[21,'siva',400,32,45],
[20,'sahaja',120,1,67],
[23,'sravan',0,90,12],
[22,'suryam',450,76,56],
[22,'suryam',40,0,1],
[22,'suryam',12,45,0]
],columns=['id','name','points3','points1','points2'])
# Add values in points1 and points2 columns and store the result in '2 Added' column
analysis['2 Added']=analysis['points1']+analysis['points2']
# Add values in points1, points2 and points3 columns and store the result in '3 Added' column
analysis['3 Added']=analysis['points1']+analysis['points2']+analysis['points3']
print(analysis)
Output
    id    name  points3  points1  points2  2 Added  3 Added
0   23  sravan     1000       34       56       90     1090
1   23  sravan      700       11        0       11      711
2   23  sravan       20        4        2        6       26
3   21    siva      400       32       45       77      477
4   21    siva      100      456       78      534      634
5   23  sravan        0       90       12      102      102
6   21    siva      400       32       45       77      477
7   20  sahaja      120        1       67       68      188
8   23  sravan        0       90       12      102      102
9   22  suryam      450       76       56      132      582
10  22  suryam       40        0        1        1       41
11  22  suryam       12       45        0       45       57
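The “+” operator and sum(axis=1) agree on this data because it has no missing values. With missing values they differ: “+” propagates NaN, while sum() skips it by default. The sketch below shows this on a small made-up frame; the name scores and its values are assumptions for illustration and are not part of the tutorial data.

```python
import numpy
import pandas

# Hypothetical two-row frame where the second row is missing its points1 value
scores = pandas.DataFrame({
    'points1': [34, numpy.nan],
    'points2': [56, 12],
})

# '+' propagates NaN: the second row's total is NaN
plus_total = scores['points1'] + scores['points2']

# sum(axis=1) skips NaN by default: the second row's total is 12.0
sum_total = scores[['points1', 'points2']].sum(axis=1)

# Series.add() with fill_value=0 treats a missing operand as 0, matching sum() here
add_total = scores['points1'].add(scores['points2'], fill_value=0)

print(plus_total)
print(sum_total)
print(add_total)
```

Which behaviour is preferable depends on whether a missing value should invalidate the row total or simply be ignored.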
Scenario 4: sum() With groupby()
If you want to return the sum of values for individual groups, use groupby() together with sum(). The groupby() function groups the rows that share the same value in a particular column, and sum() then returns the sum of each remaining column within each group.
Example
Let’s group the rows based on the name column and return the sum of values in each group for all columns.
# Create the dataframe using lists
analysis = pandas.DataFrame([[23,'sravan',1000,34,56],
[23,'sravan',700,11,0],
[23,'sravan',20,4,2],
[21,'siva',400,32,45],
[21,'siva',100,456,78],
[23,'sravan',0,90,12],
[21,'siva',400,32,45],
[20,'sahaja',120,1,67],
[23,'sravan',0,90,12],
[22,'suryam',450,76,56],
[22,'suryam',40,0,1],
[22,'suryam',12,45,0]
],columns=['id','name','points3','points1','points2'])
# group the rows based on name column and return sum of values in each group for all columns
print(analysis.groupby('name').sum())
Output
         id  points3  points1  points2
name
sahaja   20      120        1       67
siva     63      900      520      168
sravan  115     1720      229       82
suryam   66      502      121       57
Explanation
So there are 4 groups in the ‘name’ column. For each group, the sum of id, points3, points1, and points2 is returned.
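Summing the ‘id’ column per group is rarely meaningful, so it can help to select only the columns of interest after grouping. The following sketch shows one way to do that, along with as_index=False to keep ‘name’ as a regular column. It rebuilds the ‘analysis’ DataFrame so it runs on its own; the names points3_by_name and flat are illustrative assumptions.

```python
import pandas

# Rebuild the tutorial's DataFrame so this snippet is self-contained
analysis = pandas.DataFrame([[23,'sravan',1000,34,56],
                             [23,'sravan',700,11,0],
                             [23,'sravan',20,4,2],
                             [21,'siva',400,32,45],
                             [21,'siva',100,456,78],
                             [23,'sravan',0,90,12],
                             [21,'siva',400,32,45],
                             [20,'sahaja',120,1,67],
                             [23,'sravan',0,90,12],
                             [22,'suryam',450,76,56],
                             [22,'suryam',40,0,1],
                             [22,'suryam',12,45,0]
                            ],columns=['id','name','points3','points1','points2'])

# Sum a single column per group; the result is a Series indexed by name
points3_by_name = analysis.groupby('name')['points3'].sum()
print(points3_by_name)

# Sum selected columns per group and keep 'name' as an ordinary column
flat = analysis.groupby('name', as_index=False)[['points1','points2']].sum()
print(flat)
```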
Conclusion
This tutorial showed how to compute sums in a DataFrame using the Pandas sum() method. The examples covered column-wise and row-wise addition of values. Additionally, you learned how to add columns with the “+” operator and how to sum the values after grouping the rows by a column of the DataFrame. You should now be able to sum the columns of a DataFrame together, or sum the values within a single DataFrame column, by yourself.