Pandas Rolling Groupby

“pandas” is a Python library that provides many functions/methods for performing simple as well as complex operations on data, which makes data analysis easier for developers. Combined with the rest of the Python ecosystem, it offers a convenient environment for exploring data.

In this post, we will cover what the “rolling()” function is and how to use it to perform calculations on a DataFrame.

What is the “rolling()” Function in Python?

The “pandas” library offers multiple useful functions/methods for performing complex calculations on data, and the “rolling()” function is one of the most useful of them. It performs rolling window calculations: for each row, a calculation such as a sum or mean is applied to a window made up of that row and the rows just before it in a Series or DataFrame. The rolling window concept is mostly used in signal processing and with time series data.
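To make the window idea concrete, here is a minimal sketch that compares a rolling mean with the same value computed by hand. The sample values and the names `s`, `rolling_mean`, and `manual_last` are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for illustration.
s = pd.Series([1, 2, 3, 4, 5])

# Rolling mean over a window of 3 rows: each result uses the
# current row and the two rows before it.
rolling_mean = s.rolling(3).mean()

# The same value computed by hand for the last row (rows 2, 3, 4).
manual_last = s.iloc[2:5].mean()

print(rolling_mean)
print(manual_last)
```

The first two results are “NaN” because those windows do not yet contain three values.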

How to Perform Complex Calculations on a DataFrame Using the “rolling()” Groupby Function in Python?

To perform calculations on a DataFrame using the “rolling()” function in Python, first import the “pandas” and “numpy” libraries:

import pandas as pd
import numpy as np

 
Then, use “DataFrame()” to create the data, including a “NaN” value. Next, call the “rolling()” function with the desired window size, followed by the “sum()” function, inside the “print()” statement to view the resultant data:

data = pd.DataFrame({'Values': [15, 24, 35, 45, np.nan, 50, 60, 70]})
print(data.rolling(2).sum())

 
In the output, the first value is “NaN” because the first window contains only one value, while the second value is “39”, which is the sum of the first value “15” and the second value “24”, since the window size is “2”. Each later result is the sum of the current value and the one before it. The values at index 4 and 5 (the fifth and sixth rows) are also “NaN”, not because of the window size but because each of those windows contains the “NaN” input value.
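To check the values described above, the following sketch reruns the same call and lists the sums worked out by hand in a comment. The variable name `result` is an assumption for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'Values': [15, 24, 35, 45, np.nan, 50, 60, 70]})

# Sum over a window of 2 rows; by default a result is produced
# only when both values in the window are present.
result = data.rolling(2).sum()
print(result)

# Values by row index, worked out by hand:
# 0: NaN, 1: 39.0, 2: 59.0, 3: 80.0, 4: NaN, 5: NaN, 6: 110.0, 7: 130.0
```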


Now, to compute the sum while requiring only a minimum number of observations in each window, specify the window size “2” and “min_periods=1” as parameters of the “rolling()” method, and call the “sum()” method inside the “print()” statement:

print(data.rolling(2, min_periods=1).sum())

 
Output
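A sketch of what this call should produce, with the values worked out by hand from the input; the exact print formatting may differ between pandas versions. The variable name `result_min` is an assumption for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'Values': [15, 24, 35, 45, np.nan, 50, 60, 70]})

# With min_periods=1, a window needs only one non-NaN value,
# so NaN entries are skipped instead of propagating.
result_min = data.rolling(2, min_periods=1).sum()
print(result_min)

# Values by row index, worked out by hand:
# 0: 15.0, 1: 39.0, 2: 59.0, 3: 80.0, 4: 45.0, 5: 50.0, 6: 110.0, 7: 130.0
```

Compared with the default behavior, the first row and the rows around the “NaN” input now hold sums of whatever values are available in their windows.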


To see how the “rolling()” function works on time/date data, let’s check out the following example.

First, import the “pandas” and “numpy” libraries:

import pandas as pd
import numpy as np

 
Then, use “DataFrame()” to create a DataFrame whose index holds a timestamp for each row, and assign it to the variable named “data”:

data = pd.DataFrame({'Values': [15, 24, np.nan]},
                    index = [pd.Timestamp('20230101 00:00:00'),
                            pd.Timestamp('20230101 00:00:01'),
                            pd.Timestamp('20230101 00:00:02')])

 
Now, invoke the “rolling()” method with the desired time period as a parameter (here “3s”, meaning three seconds) along with the “sum()” method inside the “print()” statement to display the resultant values:

print(data.rolling('3s').sum())

 
Output
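A sketch of what this call should produce, with the values worked out by hand; the exact print formatting may differ between pandas versions. The variable name `result_time` is an assumption for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({'Values': [15, 24, np.nan]},
                    index=[pd.Timestamp('20230101 00:00:00'),
                           pd.Timestamp('20230101 00:00:01'),
                           pd.Timestamp('20230101 00:00:02')])

# A time-based window covers the 3 seconds up to and including each row.
# For time-based windows, min_periods defaults to 1, so NaN is skipped.
result_time = data.rolling('3s').sum()
print(result_time)

# Values by row, worked out by hand:
# 00:00:00 -> 15.0, 00:00:01 -> 39.0, 00:00:02 -> 39.0
```

Unlike the fixed-size window of “2” rows used earlier, the “NaN” in the last row does not turn the result into “NaN”, because the window still contains two valid values.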


That’s it! We have explained how the pandas “rolling()” function works in Python.
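The “rolling()” function can also be chained after “groupby()” so that each group gets its own windows. The following is a minimal sketch under stated assumptions: the column names “Group” and “Values”, the sample values, and the variable names are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical data: the 'Group' and 'Values' columns are assumptions.
df = pd.DataFrame({
    'Group': ['a', 'a', 'a', 'b', 'b', 'b'],
    'Values': [1, 2, 3, 10, 20, 30],
})

# Rolling sum with a window of 2 rows, computed separately per group.
# Windows never cross from one group into the next.
grouped = df.groupby('Group')['Values'].rolling(2).sum()
print(grouped)

# Values worked out by hand:
# group 'a': NaN, 3.0, 5.0
# group 'b': NaN, 30.0, 50.0
```

The result is indexed by the group key first and the original row label second, and the first row of each group is “NaN” because its window holds only one value.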

Conclusion

The “pandas” library offers multiple useful functions/methods for performing complex calculations on data, and the “rolling()” function is one of the most useful of them: it performs rolling window calculations over the data in a Series or DataFrame, using either a fixed number of rows or a time period as the window. This post demonstrated how to use the pandas “rolling()” function in Python.

About the author

Maria Naz

I hold a master's degree in computer science. I am passionate about my work, exploring new technologies, learning programming languages, and I love to share my knowledge with the world.