Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.
SUMMARY: The purpose of this project is to construct a time series prediction model and document the end-to-end steps using a template. The Annual Canadian Lynx Trappings dataset is a time series problem in which we forecast future values from past observations.
INTRODUCTION: The problem is to forecast the annual number of lynx trappings in Canada, using data taken from Brockwell & Davis (1991) and also considered by Campbell & Walker (1977). The dataset records the number of lynx trapped each year from 1821 to 1934, for a total of 114 observations. We used the first 75% of the observations for training and testing various models, and held back the last 25% for validating the final model.
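The 75%/25% hold-out described above can be sketched as a chronological split with no shuffling. This is a minimal illustration, not the project's actual code: the function name `chronological_split`, the placeholder values, and the choice to round the split point down are all assumptions.

```python
def chronological_split(series, train_fraction=0.75):
    """Split a series into an earlier and a later part, preserving time order."""
    # Assumption: the split point is rounded down to a whole observation.
    split_point = int(len(series) * train_fraction)
    return series[:split_point], series[split_point:]


# Placeholder values standing in for the 114 annual observations (1821 to 1934);
# these are NOT the real lynx counts.
observations = list(range(114))

dataset, validation = chronological_split(observations)
print(len(dataset), len(validation))
```

Under these assumptions, 114 observations yield 85 for model development and 29 for final validation. Keeping the split chronological matters because a shuffled split would let the model see values from after the period it is asked to forecast.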
ANALYSIS: The baseline (persistence) prediction for the dataset resulted in an RMSE of 1117. After a grid search over ARIMA parameters, the final non-seasonal order was (2, 0, 2), with a seasonal order of (0, 0, 0, 0). The chosen model processed the validation data with an RMSE of 737, about 34% lower than the baseline RMSE.
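For reference, a persistence baseline predicts each value as the previous observation, and RMSE is the square root of the mean squared prediction error. The sketch below shows one common way to compute it with walk-forward evaluation. The function name `persistence_rmse` and the small series are illustrative assumptions, not the project's code or data.

```python
import math


def persistence_rmse(history, test):
    """Walk-forward persistence forecast scored with RMSE.

    Each test value is predicted as the most recent observed value,
    then the true value is appended to the history before the next step.
    """
    history = list(history)
    squared_errors = []
    for actual in test:
        prediction = history[-1]          # persistence: repeat the last observation
        squared_errors.append((actual - prediction) ** 2)
        history.append(actual)            # reveal the true value for the next step
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative numbers only, not the lynx data.
train = [10.0, 12.0, 15.0]
test = [14.0, 18.0, 17.0]

baseline = persistence_rmse(train, test)
print(round(baseline, 3))
```

Any candidate ARIMA order can be scored with the same walk-forward loop by replacing the persistence prediction with a one-step model forecast, which makes the resulting RMSE values directly comparable.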
CONCLUSION: For this dataset, the chosen ARIMA model achieved a satisfactory result and should be considered for further modeling.
Dataset Used: Annual Canadian Lynx Trappings 1821–1934
Dataset ML Model: Time series forecast with numerical attributes
The HTML-formatted report can be found on GitHub.