The cumsum() function in Pandas allows you to calculate the cumulative sum over a given axis.
A cumulative sum is a running total: each element of the result is the sum of all values up to and including that position. The total therefore grows (or shrinks, for negative values) with every additional value.
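As a quick illustration of the idea, the sketch below computes a running total over a small Series. The `sales` values are made up for this example and do not come from the article's data.

```python
import pandas as pd

# Hypothetical figures, used only to illustrate a running total
sales = pd.Series([10, 20, 30, 40])

# Each element is the sum of all values up to and including that position
running_total = sales.cumsum()
print(running_total.tolist())  # [10, 30, 60, 100]
```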
Let us discuss how to use the cumsum() function in Pandas.
Function Syntax
The function syntax is as shown:
```python
DataFrame.cumsum(axis=None, skipna=True, *args, **kwargs)
```
Function Parameters
The function accepts the following parameters:
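- axis – the axis along which the cumulative sum is computed. 0 (or 'index') accumulates down each column; 1 (or 'columns') accumulates across each row. Defaults to 0.
- skipna – whether to exclude NA/null values from the running total. Defaults to True. If set to False, every value after the first NA along the axis is also NA.
- *args, **kwargs – additional arguments accepted for compatibility with NumPy. They have no effect on the result.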
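The effect of skipna is easiest to see on data with a gap. The following is a minimal sketch using a small illustrative frame (`df_na` is a name chosen for this example, not part of the article's data):

```python
import numpy as np
import pandas as pd

# Illustrative frame with one missing value in column "a"
df_na = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [4.0, 5.0, 6.0]})

# skipna=True (the default): the NaN stays NaN in the output,
# but the running total continues past it -> a: 1.0, NaN, 4.0
skipped = df_na.cumsum(skipna=True)

# skipna=False: everything after the first NaN is NaN -> a: 1.0, NaN, NaN
propagated = df_na.cumsum(skipna=False)

print(skipped)
print(propagated)
```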
Function Return Value
The function returns a new DataFrame of the same shape as the original, holding the cumulative sums along the specified axis. The original DataFrame is not modified.
Example
The example below shows how to use the cumsum() function on a Pandas DataFrame.
Suppose we have a sample DataFrame as shown:
```python
# import pandas
import pandas as pd
df = pd.DataFrame({
    "student_1": [80, 67, 55, 89, 93],
    "student_2": [76, 77, 50, 88, 76],
    "student_3": [88, 67, 80, 90, 92],
    "student_4": [70, 64, 70, 45, 60],
    "student_5": [98, 94, 92, 90, 92]},
    index=[0, 1, 2, 3, 4])
df
```
To perform the cumulative sum over the columns, we can do the following:
```python
df.cumsum(axis=0)
```
This returns a new DataFrame in which each value is the running total of its column up to and including that row. For example, the second row of student_1 is 80 + 67 = 147.
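For reference, here is a self-contained sketch that rebuilds the sample DataFrame and prints the column-wise result. The variable name `col_totals` is chosen for this example, and the exact column spacing of the printed table may vary with your display settings.

```python
import pandas as pd

# Same scores as the sample DataFrame above
df = pd.DataFrame({
    "student_1": [80, 67, 55, 89, 93],
    "student_2": [76, 77, 50, 88, 76],
    "student_3": [88, 67, 80, 90, 92],
    "student_4": [70, 64, 70, 45, 60],
    "student_5": [98, 94, 92, 90, 92]})

# Running total down each column
col_totals = df.cumsum(axis=0)
print(col_totals)
#    student_1  student_2  student_3  student_4  student_5
# 0         80         76         88         70         98
# 1        147        153        155        134        192
# 2        202        203        235        204        284
# 3        291        291        325        249        374
# 4        384        367        417        309        466
```

The last row of the result equals the plain column totals, so `col_totals.iloc[-1]` matches `df.sum()`.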
To accumulate across each row instead, set the axis parameter to 1 (or 'columns').
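A minimal sketch of the row-wise case, again rebuilding the sample DataFrame so the snippet runs on its own (`row_totals` is a name chosen for this example):

```python
import pandas as pd

# Same scores as the sample DataFrame above
df = pd.DataFrame({
    "student_1": [80, 67, 55, 89, 93],
    "student_2": [76, 77, 50, 88, 76],
    "student_3": [88, 67, 80, 90, 92],
    "student_4": [70, 64, 70, 45, 60],
    "student_5": [98, 94, 92, 90, 92]})

# Running total across each row, from left to right
row_totals = df.cumsum(axis=1)
print(row_totals)
#    student_1  student_2  student_3  student_4  student_5
# 0         80        156        244        314        412
# 1         67        144        211        275        369
# 2         55        105        185        255        347
# 3         89        177        267        312        402
# 4         93        169        261        321        413
```

Here the first column is unchanged, and each later column holds the sum of itself and every column to its left.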
Conclusion
This article discussed how to perform a cumulative sum over a specific axis in a Pandas DataFrame using the cumsum() function.
Thanks for reading!!