How to Conduct Time Series Analysis and Predict Future Trends in R

Time series forecasting is the process of analyzing data that accumulates over time and using statistical models to forecast upcoming trends and patterns. Here, we discuss how to perform a time series analysis in R and predict the future trend of a series. For this, we need to install the forecast package. Its forecast() function accepts a fitted time series model (or a time series itself) along with a range of optional parameters, and it returns the predictions together with their prediction intervals. The forecast package is installed with the following command:

install.packages('forecast')

Once the forecast package is installed successfully in the R library, we can load it and use its functions to conduct the time series analysis and predict future values.
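As a small convenience, the installation can be guarded so that the package is only downloaded when it is missing. The following is a minimal sketch, assuming that a CRAN mirror is reachable from the R session; requireNamespace() and packageVersion() are standard R functions and are not part of the original example.

```r
# Install the forecast package only if it is not already available.
# Assumption: a CRAN mirror is configured for this R session.
if (!requireNamespace("forecast", quietly = TRUE)) {
  install.packages("forecast")
}

# Load the package and report which version is in use.
library(forecast)
print(packageVersion("forecast"))
```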

Example 1: Time Series Forecasting Using the ARIMA Model

The most commonly used approach for time series forecasting is ARIMA modeling. ARIMA stands for Auto-Regressive Integrated Moving Average. It describes a time series in terms of its own past values (the auto-regressive part), differencing that removes trends (the integrated part), and past forecast errors (the moving average part), and then uses that description to predict future values. We use the auto.arima() function to select and fit an ARIMA model for the time series.

library('forecast')

m1<-auto.arima(AirPassengers)

f1<-forecast(m1, level=c(80))

plot(f1)

Here, the library() function loads the forecast package into the R session. After that, we employ the auto.arima() function, which automatically selects and fits an ARIMA model for the built-in “AirPassengers” dataset. The fitted model is stored in the “m1” variable. Typing “m1” at the console displays the selected model for the “AirPassengers” dataset.
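To look at what auto.arima() actually selected, the fitted object can be printed and its orders extracted. This is a minimal sketch that reuses the “m1” name from the example; arimaorder() comes from the forecast package, “orders” is a hypothetical variable name, and the orders that are chosen may vary with the package version.

```r
library(forecast)

# Fit the model exactly as in the example above.
m1 <- auto.arima(AirPassengers)

# Print the selected model, its coefficients, and the information criteria.
print(m1)

# Extract the (p, d, q) orders, plus the seasonal (P, D, Q) orders and
# the seasonal period when a seasonal model was selected.
orders <- arimaorder(m1)
print(orders)
```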

Next, we define the “f1” variable where the forecasting is performed for the future values of the dataset. We call the forecast() method with “m1”, the fitted ARIMA model, as the input, and set the “level” parameter to 80 to request an 80% prediction interval. Finally, the plot() method, with “f1” passed to it, visually represents the forecasted data together with its uncertainty interval. The x-axis indicates time, whereas the y-axis shows the values of the time series.
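The horizon and the interval levels can also be set explicitly, and the numbers behind the plot can be read from the returned object. The sketch below assumes a 12-month horizon and two interval levels, which are illustrative choices and not part of the original example; “f2” is a hypothetical variable name.

```r
library(forecast)

m1 <- auto.arima(AirPassengers)

# Forecast 12 months ahead with 80% and 95% prediction intervals.
f2 <- forecast(m1, h = 12, level = c(80, 95))

# Point forecasts (the line drawn in the plot).
print(f2$mean)

# Lower and upper interval bounds: one column per requested level.
print(f2$lower)
print(f2$upper)
```

A wider level gives a wider band, so the 95% bounds enclose the 80% bounds for every forecasted month.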

The forecasted values are shown in the following image together with the original time series. Because no horizon is given, forecast() uses its default for seasonal data, which is two full seasonal periods, or 24 months here. The shaded region is the 80% prediction interval, the range in which each future value is expected to fall with 80% probability, and the blue line represents the point forecasts:

Example 2: Generate a Time Series for Forecasting and Predict the Value Using the ts() Function

In the prior example, we used a built-in dataset that is already a time series and then generated the predicted values. However, we can also use the ts() method to build a time series object from our own data and then fit a model and forecast the values from that object.

library('forecast')

Monsoon <- c(967,1345,998,884,1863,466,1551,879,1274,655,1491,1599)

Monsoon_ts <- ts(Monsoon,start = c(2023,5),frequency = 10)

print(Monsoon_ts)

summary(Monsoon_ts)

plot(Monsoon_ts,main = "Before Monsoon Prediction")

model <- auto.arima(Monsoon_ts)

model

fdata <- forecast(model, 10)

print(fdata)

plot(fdata, main = "Data Forecast for Monsoon_ts")

Here, we create a vector of twelve sample values using the c() function and store it in the “Monsoon” variable. After that, we define another variable, “Monsoon_ts”, where the ts() method transforms the data in “Monsoon” into a time series object. The ts() method takes the “Monsoon” variable as its first argument. The “start” parameter gives the starting point of the series as a pair: the year, and the position of the first observation within that year. The “frequency” parameter sets the number of observations in one complete cycle. With a frequency of 10, start = c(2023,5) means the fifth of ten periods in 2023; for monthly data, the frequency would be 12 and the 5 would stand for May. We then display the time series data by passing “Monsoon_ts” to the print() method of R, and summary() reports the minimum, quartiles, mean, and maximum of the values.
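The effect of the “frequency” parameter is easier to see by building the same values into two series and asking R where each one starts and ends. This is a minimal sketch in base R; the names “rain”, “rain_monthly”, and “rain_tenths” are hypothetical, and the monthly variant is an illustration rather than what the example above uses.

```r
# The same twelve sample values as in the example.
rain <- c(967, 1345, 998, 884, 1863, 466, 1551, 879, 1274, 655, 1491, 1599)

# Monthly reading: 12 observations per year, starting in May 2023.
rain_monthly <- ts(rain, start = c(2023, 5), frequency = 12)
print(start(rain_monthly))      # 2023 5
print(end(rain_monthly))        # 2024 4  (April 2024)
print(frequency(rain_monthly))  # 12

# The example's setting: 10 observations per cycle, starting at period 5.
rain_tenths <- ts(rain, start = c(2023, 5), frequency = 10)
print(end(rain_tenths))         # 2024 6  (sixth of ten periods in 2024)
```

With a frequency of 12, the twelve values cover exactly one year, from May 2023 to April 2024. With a frequency of 10, the same values run past a full cycle and end in the sixth period of 2024.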

We can see the time series data in the following output where the start and end times are mentioned. The frequency for the time series and the time series points are also displayed:

In the next step, we use the plot() method to draw the time series. We pass “Monsoon_ts” as the data and use the “main” parameter to set the title of the graph. The Monsoon time series graph before the prediction is shown in the following:

With the time series object created, we now fit an ARIMA model to it. For this, we have a “model” variable that stores the result of calling auto.arima() on the “Monsoon_ts” time series. After that, we display the information about the fitted ARIMA model for the “Monsoon_ts” data, which can be seen in the following R console:

Next, we forecast the “Monsoon_ts” time series using the forecast() method and store the result in the “fdata” variable. The second argument, 10, is the forecast horizon, so ten future periods are predicted. After that, we print the predicted data as seen in the following image:
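The printed forecast can also be turned into an ordinary data frame, which is convenient for filtering or saving the results. The following is a minimal sketch that repeats the steps of the example; “fc_table” and the file name in the commented line are hypothetical.

```r
library(forecast)

# Rebuild the series, model, and forecast from the example.
Monsoon <- c(967, 1345, 998, 884, 1863, 466, 1551, 879, 1274, 655, 1491, 1599)
Monsoon_ts <- ts(Monsoon, start = c(2023, 5), frequency = 10)
model <- auto.arima(Monsoon_ts)
fdata <- forecast(model, h = 10)

# Convert the forecast to a data frame: one row per future period, with
# the point forecast and the default 80% and 95% interval bounds.
fc_table <- as.data.frame(fdata)
print(fc_table)

# Optionally save the table (hypothetical file name):
# write.csv(fc_table, "monsoon_forecast.csv")
```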

We also plot the forecast for the “Monsoon_ts” data using the plot() function, which is shown in the following:

Example 3: Time Series for Forecasting and Predicting the Value Using the TBATS Model

TBATS is another model for time series forecasting that is designed for data with complex or multiple seasonal patterns. TBATS is an abbreviation for Trigonometric seasonality, Box-Cox transformation, ARMA errors, Trend, and Seasonal components. We use the tbats() function to fit the TBATS model for time series forecasting.

library(forecast)

fitTBATS <- tbats(USAccDeaths)

prediction <- predict(fitTBATS)

prediction

plot(forecast(fitTBATS))

We load the forecast package into the R session using the library() function. After that, we define the “fitTBATS” variable where the tbats() function fits the TBATS model to the “USAccDeaths” data, a built-in monthly series of accidental deaths in the USA from 1973 to 1978. Then, we predict the future values by passing the fitted model to the predict() method and storing the result in the “prediction” variable. Typing “prediction” displays the predicted data for “USAccDeaths”, which is seen in the following snap.

The output contains the expected number of accidental deaths for the upcoming months along with the prediction intervals.

Lastly, we call the plot() function on forecast(fitTBATS) to create the graph of the predicted values that are obtained from the TBATS model for USAccDeaths.
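One way to judge how far such a forecast can be trusted is to hide the last part of the series, forecast it, and compare the result against the real values. This is a minimal sketch under the assumption that the final 12 months are held out; the split point and the names “train_part”, “test_part”, “fit_train”, “fc”, and “acc” are illustrative and not part of the original example.

```r
library(forecast)

# USAccDeaths covers January 1973 to December 1978 (72 monthly values).
# Keep the first five years for fitting and hold out the last year.
train_part <- window(USAccDeaths, end = c(1977, 12))
test_part  <- window(USAccDeaths, start = c(1978, 1))

# Fit TBATS on the training part only and forecast the held-out months.
fit_train <- tbats(train_part)
fc <- forecast(fit_train, h = length(test_part))

# Compare the forecast with the held-out values. The result has one row
# for the training set and one for the test set, with error measures
# such as RMSE and MAE in the columns.
acc <- accuracy(fc, test_part)
print(acc)
```

Lower error values in the test row indicate that the model generalizes better to months it has not seen.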

Conclusion

We performed a time series analysis and predicted future trends based on that time series. We used different approaches: the automatic ARIMA model on a built-in dataset, a time series that we built ourselves with the ts() function, and the TBATS model. These are effective approaches for conducting time series forecasting.

About the author

Saeed Raza
