Time Series Model for Female Births in California Using Python and ETS

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a time series prediction model and document the end-to-end steps using a template. The California Female Births dataset poses a time series problem in which we try to forecast future outcomes from past data points.

Additional Notes: This is a replication, with some small modifications, of Dr. Jason Brownlee’s blog post, How to Grid Search Triple Exponential Smoothing for Time Series Forecasting in Python (https://machinelearningmastery.com/how-to-grid-search-triple-exponential-smoothing-for-time-series-forecasting-in-python/). I plan to leverage Dr. Brownlee’s exponential smoothing, or ETS (Error, Trend, and Seasonality), tutorial examples to build an ETS-based notebook template for future use.

INTRODUCTION: The problem is to forecast the daily number of female births in California. The dataset describes a time series of daily births over the 12 months of 1959, with 365 observations. We used the first 200 observations to train the model and the remaining 165 observations to validate it.
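The 200/165 split described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function name train_test_split and the placeholder values are assumptions, and the real series would be loaded from the dataset file. The split is positional rather than random because the observations are ordered in time.

```python
def train_test_split(series, n_train=200):
    """Split an ordered series into a leading training part and a trailing validation part."""
    # Slicing by position preserves temporal order; no shuffling is applied.
    return series[:n_train], series[n_train:]


# Placeholder values standing in for the 365 daily birth counts (hypothetical data).
observations = list(range(365))

train, test = train_test_split(observations)
print(len(train), len(test))
```

Keeping the validation block strictly after the training block means the model is never evaluated on days that precede its training data.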

ANALYSIS: The ETS configuration with a multiplicative trend, no trend damping, a Box-Cox transform, and bias removal appeared to achieve the best (lowest) RMSE, at 6.984.
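For reference, the RMSE (root mean squared error) used to compare configurations can be computed as in the sketch below. The function name rmse and the sample numbers are illustrative assumptions and are not taken from the project's results.

```python
from math import sqrt


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    # Average the squared differences, then take the square root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical daily counts and forecasts, for illustration only.
actual_births = [40, 45, 50]
forecast_births = [42, 44, 53]
score = rmse(actual_births, forecast_births)
print(round(score, 3))
```

Because the error is expressed in the same units as the series, an RMSE of 6.984 can be read as a typical forecast error of about seven births per day.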

CONCLUSION: For this dataset, the chosen ETS model achieved a satisfactory result and should be considered for further modeling.

Dataset Used: Daily total female births in California, 1959

Dataset ML Model: Time series forecast with numerical attributes

Dataset Reference: Rob Hyndman and Yangzhuoran Yang (2018). tsdl: Time Series Data Library. v0.1.0. https://pkg.yangzhuoranyang./tsdl/

The HTML formatted report can be found here on GitHub.