Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.
SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The Human Activity Recognition Using Smartphones dataset presents a multi-class classification problem in which we try to predict one of several (more than two) possible outcomes.
INTRODUCTION: Researchers collected the dataset from experiments with a group of 30 volunteers, each of whom performed six activities while wearing a smartphone on the waist. Using the phone's embedded accelerometer and gyroscope, the researchers captured measurements for the activities of WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, and LAYING. The dataset has been randomly partitioned into two sets, where 70% of the volunteers were selected for generating the training data and 30% for the test data.
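Because the partition is made by volunteer rather than by individual record, no person's measurements appear in both sets. The sketch below illustrates that idea only. The function name `split_by_subject`, the seed, and the numbering of volunteers from 1 to 30 are assumptions for illustration, and the actual assignment of volunteers in the published dataset is fixed by its authors, not by this code.

```python
import random


def split_by_subject(subject_ids, train_fraction=0.7, seed=0):
    """Randomly assign whole subjects (not single records) to train or test.

    Hypothetical helper: it only demonstrates a subject-wise split and does
    not reproduce the split shipped with the dataset.
    """
    unique = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(unique)
    n_train = round(len(unique) * train_fraction)
    return set(unique[:n_train]), set(unique[n_train:])


# Assumption: volunteers are labelled 1..30.
volunteers = list(range(1, 31))
train_subjects, test_subjects = split_by_subject(volunteers)
print(len(train_subjects), len(test_subjects))  # 21 9
```

With 30 volunteers, a 70/30 split by person gives 21 volunteers for training and 9 for testing.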
In previous iterations, the script focused on evaluating various classic machine learning algorithms and identifying the algorithm that produced the best accuracy. Those iterations established a baseline performance in terms of accuracy and processing time.
In iteration Take1, we constructed and tuned an XGBoost machine learning model for this dataset. We also observed the best accuracy result that we could obtain using the XGBoost model with the training and test datasets.
In this Take2 iteration, we will construct several Multilayer Perceptron (MLP) models with one hidden layer. These simple MLP models will serve as the baseline models as we build more complex MLP models in future iterations.
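To make the structure of a one-hidden-layer MLP concrete, here is a minimal sketch in plain Python, independent of whichever framework the project script uses. It makes several assumptions that this write-up does not state: 561 input features (the feature count of the published dataset), a ReLU activation on the hidden layer, a softmax output over the six activities, and a simple uniform weight initialization. All function names are hypothetical.

```python
import math
import random

N_FEATURES = 561  # assumption: feature count of the published dataset
N_CLASSES = 6     # the six activities listed in the introduction


def init_mlp(n_hidden, n_features=N_FEATURES, n_classes=N_CLASSES, seed=0):
    """Create untrained weights for input -> hidden -> output."""
    rng = random.Random(seed)

    def layer(n_in, n_out):
        scale = 1.0 / math.sqrt(n_in)
        weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
                   for _ in range(n_out)]
        return weights, [0.0] * n_out

    return {"hidden": layer(n_features, n_hidden),
            "output": layer(n_hidden, n_classes)}


def dense(inputs, weights, biases):
    """Fully connected layer: one weighted sum plus bias per output node."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def forward(model, features):
    """Forward pass: ReLU hidden layer (assumed), then softmax output."""
    hidden = [max(0.0, v) for v in dense(features, *model["hidden"])]
    return softmax(dense(hidden, *model["output"]))


def count_parameters(model):
    """Number of trainable weights and biases across both layers."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in model.values())


model = init_mlp(n_hidden=32)
probs = forward(model, [0.1] * N_FEATURES)
print(len(probs))               # 6 class probabilities
print(count_parameters(model))  # 18182
```

Under these assumptions, the parameter count is 561 × h + h + 6 × h + 6 for a hidden layer of h nodes, so capacity grows roughly linearly with the layer width.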
ANALYSIS: For iteration Take1, the XGBoost model achieved an accuracy of 99.45% in training. When configured with the optimized parameters, the XGBoost model processed the test dataset with an accuracy of 94.94%. The gap of roughly 4.5 percentage points between training and test accuracy indicated a high-variance (overfitting) issue. We will need to explore regularization techniques or other modeling approaches before deploying the model for production use.
For this Take2 iteration, the one-layer MLP models achieved a training accuracy of between 98.8% and 99.3% after 50 epochs. Those same models processed the test dataset with an accuracy of between 93.0% and 95.9%.
- One-layer 32 nodes: Training – 98.84% Testing – 94.84%
- One-layer 64 nodes: Training – 99.22% Testing – 94.40%
- One-layer 96 nodes: Training – 99.30% Testing – 95.86%
- One-layer 128 nodes: Training – 99.02% Testing – 95.86%
- One-layer 192 nodes: Training – 99.27% Testing – 94.63%
- One-layer 256 nodes: Training – 99.22% Testing – 95.18%
- One-layer 384 nodes: Training – 99.18% Testing – 93.07%
- One-layer 512 nodes: Training – 98.93% Testing – 94.23%
- One-layer 768 nodes: Training – 99.15% Testing – 94.97%
- One-layer 1024 nodes: Training – 99.10% Testing – 94.06%
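The gap between training and test accuracy is a quick way to compare how much each configuration overfits. The short script below takes the figures from the list above and computes that gap for each hidden-layer size. The variable names are ours, and the script only rearranges the reported numbers rather than re-running any model.

```python
# (training accuracy %, test accuracy %) per hidden-layer size, from the list above
results = {
    32: (98.84, 94.84),
    64: (99.22, 94.40),
    96: (99.30, 95.86),
    128: (99.02, 95.86),
    192: (99.27, 94.63),
    256: (99.22, 95.18),
    384: (99.18, 93.07),
    512: (98.93, 94.23),
    768: (99.15, 94.97),
    1024: (99.10, 94.06),
}

# Train-test gap in percentage points for every configuration
gaps = {nodes: round(train - test, 2) for nodes, (train, test) in results.items()}

best_test = max(test for _, test in results.values())
best_nodes = [nodes for nodes, (_, test) in results.items() if test == best_test]
widest_gap_nodes = max(gaps, key=gaps.get)
narrowest_gap_nodes = min(gaps, key=gaps.get)

for nodes, gap in gaps.items():
    print(f"{nodes:>5} nodes: gap {gap:.2f} points")
print("Best test accuracy:", best_test, "at", best_nodes)
print("Widest gap:", widest_gap_nodes, "nodes")
print("Narrowest gap:", narrowest_gap_nodes, "nodes")
```

On these figures, the 96-node and 128-node models tie for the best test accuracy (95.86%), the 128-node model has the narrowest gap (3.16 points), and the 384-node model has the widest (6.11 points). Adding nodes beyond 128 does not improve test accuracy in this set of runs.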
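CONCLUSION: For this iteration, the one-layer MLP models produced mixed results with noticeable variance: test accuracy trailed training accuracy by roughly three to six percentage points and did not improve consistently as nodes were added. For this dataset, we will need to explore regularization techniques or other modeling approaches to reduce variance before deploying the model for production use.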
Dataset Used: Human Activity Recognition Using Smartphones
Dataset ML Model: Multi-class classification with numerical attributes
The HTML-formatted report can be found here on GitHub.