What is Time Series Analysis?
Time series analysis is a statistical technique for working with time series data, that is, data recorded over a series of time periods or intervals. It is often used for trend analysis and is closely related to time series forecasting, which uses the same data to predict future values.
After reading this post, you will know:
- Standard definitions of time series, time series analysis, and time series forecasting.
- The important components to consider in time series data.
- Time series analysis using fbProphet.
- Time series analysis courses.
We analyze time series data in order to forecast the future. Forecasting is the process of making predictions about the future based on past and present data, most commonly by analyzing trends. Use cases include inventory management, predicting financial market prices, analyzing traffic, and many others.
Time series analysis differs from conventional machine learning because the observations are ordered in time, so we take a different approach when building a model. The dataset available for time series analysis may also be very limited compared to the data used by most common machine learning algorithms, which makes forecasting even more challenging.
For example, we may be asked to predict a store's sales for the upcoming month on the basis of data for the past 3 months. This is not an easy task given the limited data, but the silver lining is that several time series analysis methods can help keep the forecasting error low.
There are four major components in time series data:
- Level: The baseline value for the series if it were a straight line.
- Trend: Continuous increase or decrease in the series.
- Seasonality: The repeating short-term cycle in the series which can be determined beforehand by analyzing past data.
- Noise: The random variation in the series.
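To make the four components concrete, here is a minimal sketch that builds a synthetic monthly series by adding them together. The additive form and all of the numbers (a level of 100, a slope of 0.5 per month, a 12-month cycle) are assumptions chosen for illustration, not values from any real dataset.

```python
import math
import random

random.seed(0)  # fixed seed so the noise is reproducible

n_months = 48
level = 100.0                                        # baseline value
trend = [0.5 * t for t in range(n_months)]           # steady increase
seasonality = [10.0 * math.sin(2 * math.pi * t / 12)  # repeats every 12 months
               for t in range(n_months)]
noise = [random.gauss(0, 2) for _ in range(n_months)]  # random variation

# Additive model (an assumption): series = level + trend + seasonality + noise
series = [level + trend[t] + seasonality[t] + noise[t] for t in range(n_months)]

print([round(v, 1) for v in series[:12]])
```

Real series do not come pre-split like this; the job of a forecasting model is to estimate these parts from the observed values.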
The most common methods used for forecasting are Autoregression (AR), Moving Average (MA), and Autoregressive Integrated Moving Average (ARIMA).
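As a rough illustration of the simplest of these, the sketch below fits a first-order autoregression, y_t = c + phi * y_(t-1), by ordinary least squares using only the standard library. The helper names fit_ar1 and forecast_ar1 are hypothetical, and a real project would more likely use a statistics library than hand-rolled code.

```python
def fit_ar1(series):
    """Estimate c and phi in y_t = c + phi * y_(t-1) by least squares."""
    x = series[:-1]   # previous values
    y = series[1:]    # current values
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    phi = cov / var
    c = mean_y - phi * mean_x
    return c, phi


def forecast_ar1(series, steps, c, phi):
    """Roll the fitted equation forward, feeding each prediction back in."""
    preds = []
    last = series[-1]
    for _ in range(steps):
        last = c + phi * last
        preds.append(last)
    return preds


# Toy series generated exactly by y_t = 2 + 0.5 * y_(t-1), starting at 10.
toy = [10.0]
for _ in range(9):
    toy.append(2 + 0.5 * toy[-1])

c, phi = fit_ar1(toy)
print(round(c, 6), round(phi, 6))
print(forecast_ar1(toy, 3, c, phi))
```

Because the toy series contains no noise, the fit recovers the generating coefficients; on real data the estimates would only approximate the underlying process.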
Time Series Analysis Application
Time Series Analysis using fbProphet
Let’s get hands-on. Time series analysis in Python can be applied to financial time series and many other kinds of time series data, but for this article we’ll predict airline passenger traffic with fbProphet, using the popular Airline Passengers dataset.
We’ll start by importing the libraries required.
Let’s check out the dataset. It can be loaded in pandas with the pd.read_csv('filepath') command.
We will drop blank (Null/NaN) values using the dataframe's dropna() method. Setting inplace=True updates the dataframe in place and returns nothing, whereas inplace=False, which is the default, leaves the original untouched and returns a new dataframe.
Along with this, we’ll also rename the columns to 'ds' (the dates) and 'y' (the values) respectively, since fbProphet requires these names before proceeding.
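The loading and cleaning steps above might look like the sketch below. The inline CSV is a made-up four-row sample so that the snippet runs on its own, and the column names Month and Passengers are assumptions; with the real file you would pass its path to pd.read_csv instead.

```python
import io

import pandas as pd

# Made-up sample standing in for the real CSV file (one value left blank).
raw_csv = io.StringIO(
    "Month,Passengers\n"
    "1949-01,112\n"
    "1949-02,118\n"
    "1949-03,\n"
    "1949-04,129\n"
)

df = pd.read_csv(raw_csv)  # with a real file: pd.read_csv('filepath')
df = df.dropna()           # default inplace=False returns a new dataframe

# Prophet expects the date column to be named 'ds' and the value column 'y'.
df = df.rename(columns={"Month": "ds", "Passengers": "y"})
df["ds"] = pd.to_datetime(df["ds"], format="%Y-%m")

print(df)
```

Converting 'ds' to a datetime type is optional here, but it makes any later plotting and date arithmetic behave predictably.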
Now we’ll plot our data using matplotlib and label the axes.
Now, let’s start training our model. We create it with Prophet(interval_width=0.95) and train it by calling fit(dataframe) on the model. The interval_width parameter sets the width of the uncertainty interval for our forecast; more on its intuition later.
To predict passenger traffic for the future, we have to add future dates to our data. We are going to use the built-in make_future_dataframe function for that, which takes two arguments: the number of periods and the frequency. Here we set periods=36 to get dates for 36 months (3 years), and the frequency is set to 'MS' (month start) to indicate monthly intervals.
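To see what periods=36 with a month-start frequency means in practice, the sketch below builds the same kind of date range directly in pandas. It assumes the observed data ends in December 1960, as the classic airline passenger dataset does, and it only illustrates the dates involved; it is not a substitute for calling make_future_dataframe on a fitted model.

```python
import pandas as pd

last_observed = pd.Timestamp("1960-12-01")  # assumed last date in the data

# Ask for one extra period starting at the last observed date, then drop it,
# leaving 36 month-start dates strictly after the history.
future_dates = pd.date_range(start=last_observed, periods=36 + 1, freq="MS")[1:]

print(future_dates[0], future_dates[-1], len(future_dates))
```

By default, make_future_dataframe also includes the historical dates, so the frame it returns contains the past as well as these 36 new rows.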
Now let’s see our forecast for the future dates. If you print forecast, you’ll get a large number of columns, but we’re only interested in ds, yhat, yhat_lower, and yhat_upper. These will help you understand the intuition behind the uncertainty interval we talked about before.
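The training, future-date, and prediction steps can be put together roughly as follows. This is a sketch under assumptions: the function name forecast_passengers is hypothetical, the input dataframe is expected to already have 'ds' and 'y' columns, and the import uses the package name prophet from current releases (older installs exposed it as fbprophet).

```python
# Columns of the forecast dataframe that this walkthrough looks at.
FORECAST_COLUMNS = ["ds", "yhat", "yhat_lower", "yhat_upper"]


def forecast_passengers(df, periods=36, interval_width=0.95):
    """Fit Prophet on a ds/y dataframe and forecast `periods` months ahead."""
    # Imported here so that the module loads even where Prophet is missing.
    from prophet import Prophet

    model = Prophet(interval_width=interval_width)
    model.fit(df)

    # History plus `periods` new month-start dates.
    future = model.make_future_dataframe(periods=periods, freq="MS")
    forecast = model.predict(future)
    return model, forecast


# Usage, assuming `df` is the cleaned dataframe with 'ds' and 'y' columns:
# model, forecast = forecast_passengers(df)
# print(forecast[FORECAST_COLUMNS].tail())
```

Returning the fitted model alongside the forecast keeps it available for the plotting calls later on.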
If you recall, we set interval_width to 0.95, which is a high value. Look at the margin between yhat and the yhat_lower and yhat_upper columns: it is large, reflecting the uncertainty about the future.
Now try setting interval_width to a small value, say 0.30, and observe the difference. You’ll see a very small margin between yhat, yhat_lower, and yhat_upper, since we left little room for uncertainty. The point forecast yhat itself does not change; only the band around it does. A narrow band is not a more accurate forecast, it simply covers less of the plausible range, which is why a high value such as 0.95 is usually preferred.
Note that these intervals assume the future will see the same frequency and magnitude of rate changes as the past. This assumption is probably not true, so you should not expect to get accurate coverage on these uncertainty intervals.
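One rough way to picture interval_width is as the share of simulated outcomes that the band should contain. The sketch below is only an analogy and not Prophet's implementation: it draws made-up samples for a single future month and takes the central interval at two widths. The helper name central_interval and the sample parameters are assumptions.

```python
import random


def central_interval(samples, interval_width):
    """Return (lower, upper) bounds covering the central share of samples."""
    ordered = sorted(samples)
    lower_q = (1 - interval_width) / 2
    upper_q = (1 + interval_width) / 2
    last_index = len(ordered) - 1
    return ordered[int(lower_q * last_index)], ordered[int(upper_q * last_index)]


random.seed(42)
# Made-up simulated passenger counts for one future month.
simulated = [random.gauss(500, 30) for _ in range(1000)]

wide = central_interval(simulated, 0.95)
narrow = central_interval(simulated, 0.30)

print("0.95 interval:", [round(v, 1) for v in wide])
print("0.30 interval:", [round(v, 1) for v in narrow])
```

The 0.30 interval sits inside the 0.95 one: the samples are the same in both cases, and only the share of them that the band is asked to cover has changed.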
We plot our forecast and see an upward trend in passenger traffic. It states the obvious: more people started traveling by air as time progressed. You may observe some black dots and different shades of blue in the graph.
The black dots are the observed values, the blue line is the predicted value, and the light blue shaded region is the uncertainty interval. Try lowering the value of interval_width and you’ll see less of the blue shaded region, since we provide less room for uncertainty, as discussed above. Regardless, the forecast follows the observed values closely.
Another awesome feature of fbProphet is that we can see the trend and seasonality of our forecast separately. There’s nothing astonishing about the trend line, but the seasonality graph is quite intriguing. As you can observe, the graph peaks from May to mid-July, which means we had the maximum traffic during this time of the year. And why not? It is vacation time!
We can also plot daily and weekly seasonality, account for holiday effects, and take many other factors into account in our forecast. I didn’t cover these in this post for the sake of keeping it simple.
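Both plots come from methods on the fitted model. The sketch below wraps the two calls in a hypothetical helper, plot_forecast_and_components, and assumes you already have a fitted Prophet model and the forecast dataframe returned by its predict method.

```python
def plot_forecast_and_components(model, forecast):
    """Draw the forecast plot and the trend/seasonality breakdown."""
    # Observed points, predicted line, and the shaded uncertainty interval.
    forecast_figure = model.plot(forecast)
    # One panel per component, such as trend and yearly seasonality.
    components_figure = model.plot_components(forecast)
    return forecast_figure, components_figure


# Usage, assuming `model` is a fitted Prophet model and `forecast` is the
# dataframe returned by model.predict(...):
# forecast_figure, components_figure = plot_forecast_and_components(model, forecast)
# forecast_figure.savefig("forecast.png")
```

Both methods return matplotlib figures, so they can be saved or customized like any other matplotlib output.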
Time Series Analysis Courses
If you want to learn more about time series analysis, a good way to start is to take a time series analysis course that matches your interests: time series analysis in Python, Introduction to Time Series Analysis and Forecasting in R, or time series analysis with machine learning.