Multi-Class Model for Kaggle Tabular Playground Series 2021 December Using AutoKeras

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The Kaggle Tabular Playground Series December 2021 dataset is a multi-class modeling situation where we are trying to predict one of several (more than two) possible outcomes.

INTRODUCTION: Kaggle wants to provide an approachable environment for relatively new people in their data science journey. Since January 2021, they have hosted playground-style competitions on Kaggle with fun but less complex tabular datasets. The dataset used for this competition is synthetic but based on a real dataset and generated using a CTGAN. This dataset is based on the original Forest Cover Type Prediction competition.

ANALYSIS: After a series of tuning trials, the best AutoKeras model processed the training dataset with an accuracy score of 72.54%. When we processed the test dataset with the final model, the model achieved an accuracy score of 70.71%.
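For readers unfamiliar with the metric, the accuracy scores quoted above are simply the fraction of rows whose predicted class matches the true class. The following is a minimal sketch in plain Python; the function name and the sample labels are hypothetical and are not taken from the project notebook or the competition data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true class labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


# Hypothetical labels, for illustration only (not competition data).
y_true = [1, 2, 2, 3, 1, 2, 3, 3]
y_pred = [1, 2, 3, 3, 1, 1, 3, 2]
print(f"accuracy = {accuracy(y_true, y_pred):.2%}")
```

Comparing this value on the training and test sets, as the analysis does, gives a rough indication of how well the model generalizes.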

CONCLUSION: In this iteration, the AutoKeras model did not appear to be a suitable algorithm for modeling this dataset without additional tuning trials.

Dataset Used: Kaggle Tabular Playground Series December 2021 Data Set

Dataset ML Model: Multi-Class classification with numerical and categorical attributes

Dataset Reference: https://www.kaggle.com/c/tabular-playground-series-dec-2021

One potential source of performance benchmark: https://www.kaggle.com/c/tabular-playground-series-dec-2021/leaderboard

The HTML formatted report can be found here on GitHub.

Multi-Class Model for Kaggle Tabular Playground Series 2021 June Using AutoKeras

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The Kaggle Tabular Playground June 2021 dataset is a multi-class modeling situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: Kaggle wants to provide an approachable environment for relatively new people in their data science journey. Since January 2021, they have hosted playground-style competitions on Kaggle with fun but less complex tabular datasets. The dataset used for this competition is synthetic but based on a real dataset and generated using a CTGAN. The original dataset deals with predicting the category of an eCommerce product given various attributes about the listing. Although the features are anonymized, they have properties relating to real-world features.

ANALYSIS: After a series of tuning trials, the best AutoKeras model processed the training dataset with a logarithmic loss of 1.7691. When we processed the test dataset with the final model, the model achieved a logarithmic loss of 1.7686.
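As a point of reference for the logarithmic loss figures above, the metric is the mean negative log-probability that the model assigns to the true class, so lower is better. The sketch below is a plain-Python illustration under stated assumptions: the function names and the three-class sample predictions are hypothetical, and library implementations may differ in how they clip or normalize probabilities. The uniform baseline helper shows the loss of a model that guesses every class with equal probability, which can help put a reported score in context once the number of classes is known.

```python
import math


def multiclass_log_loss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class.

    y_true: list of integer class indices (0-based).
    probs:  list of per-sample probability lists, one entry per class.
    """
    if len(y_true) != len(probs) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip to avoid log(0) when a model assigns zero probability.
        p = min(max(row[label], eps), 1.0 - eps)
        total -= math.log(p)
    return total / len(y_true)


def uniform_baseline(num_classes):
    """Log loss of a model that predicts 1/num_classes for every class."""
    return math.log(num_classes)


# Hypothetical three-class predictions, for illustration only.
y_true = [0, 2, 1]
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.2, 0.5, 0.3],
]
print(f"log loss         = {multiclass_log_loss(y_true, probs):.4f}")
print(f"uniform baseline = {uniform_baseline(3):.4f}")
```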

CONCLUSION: In this iteration, the AutoKeras model appeared to be a suitable algorithm for modeling this dataset.

Dataset Used: Kaggle Tabular Playground 2021 June Data Set

Dataset ML Model: Multi-Class classification with numeric attributes

Dataset Reference: https://www.kaggle.com/c/tabular-playground-series-jun-2021/

One potential source of performance benchmark: https://www.kaggle.com/c/tabular-playground-series-jun-2021/leaderboard

The HTML formatted report can be found here on GitHub.

Multi-Class Model for Kaggle Tabular Playground Series 2021 May Using AutoKeras

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The Kaggle Tabular Playground May 2021 dataset is a multi-class modeling situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: Kaggle wants to provide an approachable environment for relatively new people in their data science journey. Since January 2021, they have hosted playground-style competitions on Kaggle with fun but less complex tabular datasets. The dataset used for this competition is synthetic but based on a real dataset and generated using a CTGAN. The original dataset deals with predicting the category of an eCommerce product given various attributes about the listing. Although the features are anonymized, they have properties relating to real-world features.

ANALYSIS: After a series of tuning trials, the best AutoKeras model processed the training dataset with a logarithmic loss of 1.0984. When we processed the test dataset with the final model, the model achieved a logarithmic loss of 1.1023.

CONCLUSION: In this iteration, the AutoKeras model appeared to be a suitable algorithm for modeling this dataset.

Dataset Used: Kaggle Tabular Playground 2021 May Data Set

Dataset ML Model: Multi-Class classification with categorical attributes

Dataset Reference: https://www.kaggle.com/c/tabular-playground-series-may-2021/

One potential source of performance benchmark: https://www.kaggle.com/c/tabular-playground-series-may-2021/leaderboard

The HTML formatted report can be found here on GitHub.

Tabular Data Analytics Project Templates Using Python and AutoKeras Version 3

As I work on practicing and solving machine learning (ML) problems, I find myself repeating the same set of steps and activities.

Thanks to Dr. Jason Brownlee’s suggestions on creating a machine learning template, I have pulled together a project template that I use to experiment with modeling ML problems using Python and the AutoKeras library.

Like the previous versions, Version 3 of the AutoKeras templates contains updated structures and code. I designed the templates to address regression, binary classification, and multi-class classification modeling exercises from beginning to end.

You will find the Python templates on the Analytics Project Templates page.

Multi-Label Tabular Data Classification Analytics Project Template Using TensorFlow Version 2

As I work on practicing and solving machine learning (ML) problems, I find myself repeating the same set of steps and activities.

Thanks to Dr. Jason Brownlee’s suggestions on creating a machine learning template, I have pulled together a project template that I use to experiment with modeling multi-label classification problems using Python and the TensorFlow library.

Like the previous multi-label classification TensorFlow templates, Version 2 contains updated structures and code. I designed the templates to address multi-class and multi-label modeling exercises from beginning to end.

You will find the Python templates on the Analytics Project Templates page.

Multi-Class Image Classification Analytics Project Template Using TensorFlow Version 2

As I work on practicing and solving machine learning (ML) problems, I find myself repeating the same set of steps and activities.

Thanks to Dr. Jason Brownlee’s suggestions on creating a machine learning template, I have pulled together a project template that I use to experiment with modeling image classification problems using Python and the TensorFlow library.

Like the previous image classification TensorFlow templates, Version 2 contains updated structures and code. I designed the templates to address multi-class modeling exercises from beginning to end.

You will find the Python templates on the Analytics Project Templates page.

Tabular Data Analytics Project Templates Using Python and TensorFlow Version 9

As I work on practicing and solving machine learning (ML) problems, I find myself repeating the same set of steps and activities.

Thanks to Dr. Jason Brownlee’s suggestions on creating a machine learning template, I have pulled together a project template that I use to experiment with modeling ML problems using Python and the TensorFlow library.

Like the previous versions, Version 9 of the TensorFlow templates contains updated structures and code. I designed the templates to address regression, binary classification, and multi-class classification modeling exercises from beginning to end.

You will find the Python templates on the Analytics Project Templates page.

Multi-Class Model for Crop Mapping with Fused Optical and Radar Data Using TensorFlow

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The Crop Mapping with Fused Optical Radar Data dataset is a multi-class modeling situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: This dataset combines optical and PolSAR remote sensing images for cropland classification. The organization collected the images using RapidEye satellites (optical) and the Unmanned Aerial Vehicle Synthetic Aperture Radar (UAVSAR) system (radar) over an agricultural region near Winnipeg, Manitoba, Canada, in 2012. There are two sets of 49 radar features and two sets of 38 optical features, one set each for 05 July and 14 July 2012. Seven crop type classes exist for this data set: 1-Corn; 2-Peas; 3-Canola; 4-Soybeans; 5-Oats; 6-Wheat; and 7-Broadleaf.
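The feature layout and class codes described above can be captured in a few lines of Python. This is only a sketch: the constant and function names are hypothetical, and the date strings are just one way to write the two acquisition dates. It shows how the per-date feature counts combine into the total number of input features and how a numeric class code maps back to a crop name.

```python
# Class codes as listed in the dataset description above.
CROP_CLASSES = {
    1: "Corn",
    2: "Peas",
    3: "Canola",
    4: "Soybeans",
    5: "Oats",
    6: "Wheat",
    7: "Broadleaf",
}

# Feature counts per acquisition date, as stated in the description.
RADAR_FEATURES_PER_DATE = 49
OPTICAL_FEATURES_PER_DATE = 38
ACQUISITION_DATES = ("2012-07-05", "2012-07-14")


def total_feature_count():
    """Number of input features across both dates and both sensors."""
    per_date = RADAR_FEATURES_PER_DATE + OPTICAL_FEATURES_PER_DATE
    return per_date * len(ACQUISITION_DATES)


def decode_label(code):
    """Translate a numeric class code into its crop name."""
    return CROP_CLASSES[code]


print(total_feature_count())
print(decode_label(3))
```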

ANALYSIS: The preliminary TensorFlow models achieved an average accuracy of 0.9942 after running for 20 epochs. When we applied the final model to the test dataset, the model achieved an accuracy score of 0.9951.

CONCLUSION: In this iteration, the simple TensorFlow model appeared to be a suitable algorithm for modeling this dataset.

Dataset Used: Crop Mapping with Fused Optical Radar Data

Dataset ML Model: Multi-class classification with numerical attributes

Dataset Reference: https://archive-beta.ics.uci.edu/ml/datasets/crop+mapping+using+fused+optical+radar+data+set

The HTML formatted report can be found here on GitHub.