The moving average, also known as a rolling or running average, is a time-series analysis tool that computes the averages of successive subsets of the entire dataset. It is also known as a moving mean (MM) or rolling mean because each value is the mean of the data over a certain period. The moving average can be computed in a variety of ways, one of which is to slide a fixed-size window along the sequence of numbers and average the values inside it.
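As a rough illustration of the fixed-window idea, the following sketch computes a simple moving average with the pandas rolling() method. The sample values and the window size of 3 are chosen here only for illustration.

```python
import pandas

# Illustrative sample values (the same numbers reappear as the "quantity" column later on)
quantity = pandas.Series([200, 455, 800, 900, 900, 122, 400, 700, 80, 500], dtype="float64")

# Simple moving average: the plain mean of the current value and the two before it.
# The first two rows are NaN because a full window of 3 values is not yet available.
simple_ma = quantity.rolling(window=3).mean()
print(simple_ma)
```

Every value inside the window counts equally here, which is the main difference from the exponentially weighted version described next.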
Pandas DataFrame.ewm()
The exponentially weighted moving average (EWMA, also called EWM or EMA) gives more weight to recent observations and progressively less weight to data further back in time, so it picks up recent trends faster than a simple moving average. The “DataFrame.ewm()” Pandas method, followed by mean(), is used to compute it.
Syntax:
DataFrame.ewm(com=None,span=None,halflife=None,alpha=None,min_periods=0,adjust=True,ignore_na=False,axis=0).mean()
Parameters:
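- The first parameter, “com”, specifies the decay in terms of the center of mass: alpha = 1 / (1 + com), with com >= 0.
- The “span” specifies the decay in terms of the span: alpha = 2 / (span + 1), with span >= 1.
- The “halflife” specifies the decay in terms of the half-life: alpha = 1 - exp(-ln(2) / halflife), with halflife > 0.
- The “alpha” parameter is the smoothing factor given directly. Its value must be greater than 0 and at most 1. Only one of “com”, “span”, “halflife”, and “alpha” should be specified.
- The “min_periods” specifies the minimum number of observations in the window that is needed to produce a value. Otherwise, NA is returned.
- The “adjust” parameter, when True, divides by a decaying adjustment factor in the beginning periods to correct for the imbalance in relative weightings. When False, the average is computed recursively.
- The “ignore_na” parameter, when True, disregards the missing values when calculating weights. When False, the weights are based on the absolute positions of the observations.
- The “axis” is the axis to use. The rows are identified by the value 0, while the columns are identified by the value 1.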
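The four decay parameters are different ways of expressing the same smoothing factor. The sketch below, which assumes a span of 5 and the sample series shown, converts that span to the equivalent alpha, com, and halflife values and checks that each one yields the same result.

```python
import math
import pandas

# Illustrative sample values
quantity = pandas.Series([200, 455, 800, 900, 900, 122, 400, 700, 80, 500], dtype="float64")

span = 5
alpha = 2 / (span + 1)                          # span -> alpha
com = (span - 1) / 2                            # chosen so that alpha = 1 / (1 + com)
halflife = -math.log(2) / math.log(1 - alpha)   # chosen so that alpha = 1 - exp(-ln(2) / halflife)

by_span = quantity.ewm(span=span).mean()
by_alpha = quantity.ewm(alpha=alpha).mean()
by_com = quantity.ewm(com=com).mean()
by_halflife = quantity.ewm(halflife=halflife).mean()

# Largest absolute difference between the span-based result and each alternative
print((by_span - by_alpha).abs().max())
print((by_span - by_com).abs().max())
print((by_span - by_halflife).abs().max())
```

The printed differences should be zero or within floating-point rounding error, so the choice among the four parameters is mostly a matter of which description of the decay is most natural for the data.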
Example 1: With Span Parameter
In this example, we have an analytics DataFrame that stores a company's product stock information. It has the Product, quantity, and cost columns, and the company needs to estimate the exponential moving average of the quantity with a span of 5 days.
import pandas

# Create a pandas DataFrame with 3 columns for calculating
# the Exponential Moving Average.
analytics=pandas.DataFrame({'Product':[11,22,33,44,55,66,77,88,99,110],
'quantity':[200,455,800,900,900,122,400,700,80,500],
'cost':[2400,4500,5090,600,8000,7800,1100,2233,500,1100]})
print(analytics)
print()
# Calculate Exponential Moving Average for 5 days
analytics['5 Day EWM']=analytics['quantity'].ewm(span=5).mean()
print(analytics)
Output:
Product quantity cost
0 11 200 2400
1 22 455 4500
2 33 800 5090
3 44 900 600
4 55 900 8000
5 66 122 7800
6 77 400 1100
7 88 700 2233
8 99 80 500
9 110 500 1100
Product quantity cost 5 Day EWM
0 11 200 2400 200.000000
1 22 455 4500 353.000000
2 33 800 5090 564.736842
3 44 900 600 704.000000
4 55 900 8000 779.241706
5 66 122 7800 539.076692
6 77 400 1100 489.835843
7 88 700 2233 562.734972
8 99 80 500 397.525846
9 110 500 1100 432.286704
Explanation:
In the first output, we displayed the entire analytics DataFrame. In the second output, we calculated the exponential moving average for the quantity column and stored the values in the “5 Day EWM” column.
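To make the numbers in the “5 Day EWM” column less opaque, here is a minimal sketch that rebuilds them by hand. It assumes the default adjust=True behavior, where the observation i steps in the past receives the weight (1 - alpha)**i and the weighted sum is divided by the sum of the weights. The variable names are illustrative only.

```python
import pandas

values = [200, 455, 800, 900, 900, 122, 400, 700, 80, 500]
span = 5
alpha = 2 / (span + 1)

manual = []
for t in range(len(values)):
    recent_first = values[t::-1]                         # current value first, oldest last
    weights = [(1 - alpha) ** i for i in range(t + 1)]   # weight decays with age
    weighted_sum = sum(w * x for w, x in zip(weights, recent_first))
    manual.append(weighted_sum / sum(weights))

# Compare against pandas
from_pandas = pandas.Series(values, dtype="float64").ewm(span=span).mean()
max_diff = max(abs(m - p) for m, p in zip(manual, from_pandas))
print(manual[:4])
print(max_diff)
```

For instance, the second row works out to (455 + (2/3) * 200) / (1 + 2/3) = 353, which matches the output above.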
Example 2: Visualize the EWM
Let’s visualize the exponential moving average for the “quantity” column over a span of 5 days using the Matplotlib module.
import pandas
from matplotlib import pyplot

analytics=pandas.DataFrame({'Product':[11,22,33,44,55,66,77,88,99,110],
'quantity':[200,455,800,900,900,122,400,700,80,500],
'cost':[2400,4500,5090,600,8000,7800,1100,2233,500,1100]})
# Plot the actual quantity
pyplot.plot(analytics['quantity'],label='Quantity')
# Calculate Exponential Moving Average for 5 days
analytics['5 Day EWM']=analytics['quantity'].ewm(span=5).mean()
# Plot the 5 Day exponential moving average quantity
pyplot.plot(analytics['5 Day EWM'],label='5-Day EWM')
# Place the legend in the upper-right corner (loc=1)
pyplot.legend(loc=1)
# Display the figure
pyplot.show()
Output:
Explanation:
We calculate the exponential moving average for the quantity column and store the values in the “5 Day EWM” column. In the graph, the blue line shows the actual “quantity” and the orange line shows the Exponential Moving Average with a span of 5 days.
Example 3: With Span and Adjust Parameters
Estimate the exponential moving average for the “cost” column with a span of 2 days by setting adjust to False, and visualize it.
import pandas
from matplotlib import pyplot

analytics=pandas.DataFrame({'Product':[11,22,33,44,55,66,77,88,99,110],
'quantity':[200,455,800,900,900,122,400,700,80,500],
'cost':[2400,4500,5090,600,8000,7800,1100,2233,500,1100]})
# Plot the actual cost
pyplot.plot(analytics['cost'],label='Purchase')
# Calculate Exponential Moving Average for 2 days
analytics['2 Day EWM']=analytics['cost'].ewm(span=2,adjust=False).mean()
# Plot the 2 Day exponential moving average cost
pyplot.plot(analytics['2 Day EWM'],label='2-Day EWM')
# Place the legend in the upper-right corner (loc=1)
pyplot.legend(loc=1)
print(analytics)
# Display the figure
pyplot.show()
Output:
Product quantity cost 2 Day EWM
0 11 200 2400 2400.000000
1 22 455 4500 3800.000000
2 33 800 5090 4660.000000
3 44 900 600 1953.333333
4 55 900 8000 5984.444444
5 66 122 7800 7194.814815
6 77 400 1100 3131.604938
7 88 700 2233 2532.534979
8 99 80 500 1177.511660
9 110 500 1100 1125.837220
Explanation:
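We calculate the exponential moving average for the cost column with adjust set to False, store the values in the “2 Day EWM” column, and display the DataFrame. Finally, we visualize the result using Matplotlib Pyplot.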
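With adjust=False, the average follows a simple recursion: the first value is the first observation, and each later value is (1 - alpha) times the previous average plus alpha times the new observation. The sketch below reproduces the “2 Day EWM” column under that assumption. The variable names are illustrative only.

```python
import pandas

cost = [2400, 4500, 5090, 600, 8000, 7800, 1100, 2233, 500, 1100]
span = 2
alpha = 2 / (span + 1)

# Recursive form used when adjust=False
manual = [float(cost[0])]
for x in cost[1:]:
    manual.append((1 - alpha) * manual[-1] + alpha * x)

# Compare against pandas
from_pandas = pandas.Series(cost, dtype="float64").ewm(span=span, adjust=False).mean()
max_diff = max(abs(m - p) for m, p in zip(manual, from_pandas))
print(manual[:3])
print(max_diff)
```

The second row, for example, is (1/3) * 2400 + (2/3) * 4500 = 3800, as in the output above.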
Example 4: With ignore_na Parameter
Compute the exponential moving average for the “Product” column, which contains None values, with a span of 3 days by setting ignore_na to False.
import pandas
from matplotlib import pyplot

analytics=pandas.DataFrame({'Product':[None,22,33,None,55,None],
'quantity':[200,455,None,900,900,122]})
# Plot the actual Product values
pyplot.plot(analytics['Product'],label='Product')
# Calculate Exponential Moving Average with span of 3 days without ignoring NaN values.
analytics['3 Day EWM with NaN']=analytics['Product'].ewm(span=3,ignore_na=False).mean()
# Plot the 3 Day exponential moving average Product
pyplot.plot(analytics['3 Day EWM with NaN'],label='3-Day EWM')
# Place the legend in the upper-right corner (loc=1)
pyplot.legend(loc=1)
# Display the figure
pyplot.show()
Output:
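To see numerically what ignore_na changes, the following sketch compares both settings on the same “Product” values. It is only an illustration, and the column labels in the comparison table are made up for this example.

```python
import pandas

# Same values as the "Product" column; None becomes NaN in a float64 Series
product = pandas.Series([None, 22, 33, None, 55, None], dtype="float64")

# ignore_na=False: the gap at row 3 still ages the older observations
keep_gaps = product.ewm(span=3, ignore_na=False).mean()
# ignore_na=True: the gap is skipped when the weights are assigned
skip_gaps = product.ewm(span=3, ignore_na=True).mean()

comparison = pandas.DataFrame({'Product': product,
                               'ignore_na=False': keep_gaps,
                               'ignore_na=True': skip_gaps})
print(comparison)
```

The two columns agree until the first gap that is followed by another observation. At row 4, ignore_na=False gives 48.0 because the older values 33 and 22 are discounted by one extra step for the missing row, while ignore_na=True gives 44.0. The leading missing value stays NaN in both cases, and a missing row in the middle or at the end carries the previous average forward.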
Conclusion
This article discussed how to calculate the exponentially weighted moving average. In the introduction, we explained the idea of EWM. We then presented the “DataFrame.ewm().mean()” Pandas method and briefly described its parameters. Finally, we worked through four examples that cover the basic strategies for computing the EWM.