Multi-Class Image Classification Deep Learning Model for Chinese MNIST Characters Using TensorFlow

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Chinese MNIST Characters dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: The Chinese MNIST dataset uses data collected by Dr. K Nazarpour and Dr. M Chen for a project at Newcastle University. One hundred Chinese nationals took part in data collection. Using a standard black ink pen, each participant wrote all 15 numbers in a table with 15 designated regions drawn on a sheet of white A4 paper. Each participant repeated this process ten times, and each sheet was scanned at a resolution of 300×300 pixels. The result is a dataset of 15,000 images, each representing one character from a set of 15 characters.
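The dataset size follows directly from the collection protocol described above. A quick arithmetic check, assuming every participant completed every sheet (the variable names are illustrative and not taken from the original project):

```python
# Collection protocol figures as stated in the introduction.
participants = 100   # Chinese nationals who took part
characters = 15      # distinct characters written on each sheet
repetitions = 10     # sheets completed by each participant

# Assumption: no sheets or regions are missing.
total_images = participants * characters * repetitions
images_per_character = total_images // characters

print(total_images)          # 15000
print(images_per_character)  # 1000
```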

In this iteration, we will construct a few simple CNN models to predict the Chinese numeral character based on the available images.

ANALYSIS: The one-layer CNN model achieved an average accuracy of 92.79% on the test dataset after 15 epochs. The three-layer CNN model achieved an average accuracy of 97.92% on the same test dataset.
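The difference between the two models is easier to judge as an error rate than as an accuracy. The sketch below uses only the two accuracy figures reported above; the variable names are illustrative.

```python
# Average test accuracies (in percent) as reported in the analysis.
one_layer_acc = 92.79
three_layer_acc = 97.92

# Error rate is the complement of accuracy.
one_layer_err = round(100 - one_layer_acc, 2)
three_layer_err = round(100 - three_layer_acc, 2)

# Share of the one-layer model's errors removed by the three-layer model.
relative_reduction = round(
    100 * (one_layer_err - three_layer_err) / one_layer_err, 1
)

print(one_layer_err, three_layer_err, relative_reduction)
```

By this measure the three-layer model makes roughly 71% fewer errors on the test dataset than the one-layer model. The 5.13-point accuracy difference alone does not make that obvious.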

CONCLUSION: In this iteration, the simple CNN models appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Chinese MNIST Digit Recognizer

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/fedesoriano/chinese-mnist-digit-recognizer

One potential source of performance benchmarks: https://data.ncl.ac.uk/articles/Handwritten_Chinese_Numbers/10280831/1

The HTML formatted report can be found here on GitHub.

Multi-Class Image Classification Deep Learning Model for Kaggle UT Zappos50K Shoe Dataset Using TensorFlow Take 5

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Kaggle UT Zappos50K Shoe dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

UT Zappos50K (UT-Zap50K) is a large shoe dataset consisting of 50,025 catalog images collected from Zappos.com. The dataset divided the photos into four major categories — shoes, sandals, slippers, and boots — followed by functional types and individual brands. The research team created this dataset in the context of an online shopping task, where users pay special attention to fine-grained visual differences.

In this Take5 iteration, we will construct a CNN model based on the InceptionV3 architecture to predict the shoe category based on the available images.

ANALYSIS: In this Take5 iteration, the InceptionV3 model’s performance achieved an accuracy score of 98.34% after ten epochs using the training dataset. The final model processed the validation dataset with an accuracy measurement of 87.28%.

CONCLUSION: In this iteration, the InceptionV3-based CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Kaggle UT Zappos50K Shoe Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/grassknoted/asl-alphabet

One potential source of performance benchmarks: https://www.kaggle.com/grassknoted/asl-alphabet/code

The HTML formatted report can be found here on GitHub.

Multi-Class Image Classification Deep Learning Model for Kaggle UT Zappos50K Shoe Dataset Using TensorFlow Take 4

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Kaggle UT Zappos50K Shoe dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

UT Zappos50K (UT-Zap50K) is a large shoe dataset consisting of 50,025 catalog images collected from Zappos.com. The dataset divided the photos into four major categories — shoes, sandals, slippers, and boots — followed by functional types and individual brands. The research team created this dataset in the context of an online shopping task, where users pay special attention to fine-grained visual differences.

In this Take4 iteration, we will construct a CNN model based on the MobileNetV3Large architecture to predict the shoe category based on the available images.

ANALYSIS: In this Take4 iteration, the MobileNetV3Large model’s performance achieved an accuracy score of 99.20% after ten epochs using the training dataset. The final model processed the validation dataset with an accuracy measurement of 86.09%.

CONCLUSION: In this iteration, the MobileNetV3Large-based CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Kaggle UT Zappos50K Shoe Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/grassknoted/asl-alphabet

One potential source of performance benchmarks: https://www.kaggle.com/grassknoted/asl-alphabet/code

The HTML formatted report can be found here on GitHub.

Multi-Class Image Classification Deep Learning Model for Kaggle UT Zappos50K Shoe Dataset Using TensorFlow Take 3

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Kaggle UT Zappos50K Shoe dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

UT Zappos50K (UT-Zap50K) is a large shoe dataset consisting of 50,025 catalog images collected from Zappos.com. The dataset divided the photos into four major categories — shoes, sandals, slippers, and boots — followed by functional types and individual brands. The research team created this dataset in the context of an online shopping task, where users pay special attention to fine-grained visual differences.

In this Take3 iteration, we will construct a CNN model based on the EfficientNetB7 architecture to predict the shoe category based on the available images.

ANALYSIS: In this Take3 iteration, the EfficientNetB7 model’s performance achieved an accuracy score of 99.32% after ten epochs using the training dataset. The final model processed the validation dataset with an accuracy measurement of 83.39%.

CONCLUSION: In this iteration, the EfficientNetB7-based CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Kaggle UT Zappos50K Shoe Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/grassknoted/asl-alphabet

One potential source of performance benchmarks: https://www.kaggle.com/grassknoted/asl-alphabet/code

The HTML formatted report can be found here on GitHub.

Multi-Class Image Classification Deep Learning Model for Kaggle UT Zappos50K Shoe Dataset Using TensorFlow Take 2

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Kaggle UT Zappos50K Shoe dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

UT Zappos50K (UT-Zap50K) is a large shoe dataset consisting of 50,025 catalog images collected from Zappos.com. The dataset divided the photos into four major categories — shoes, sandals, slippers, and boots — followed by functional types and individual brands. The research team created this dataset in the context of an online shopping task, where users pay special attention to fine-grained visual differences.

In this Take2 iteration, we will construct a CNN model based on the DenseNet201 architecture to predict the shoe category based on the available images.

ANALYSIS: In this Take2 iteration, the DenseNet201 model’s performance achieved an accuracy score of 97.66% after ten epochs using the training dataset. The final model processed the validation dataset with an accuracy measurement of 84.96%.

CONCLUSION: In this iteration, the DenseNet201-based CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Kaggle UT Zappos50K Shoe Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/grassknoted/asl-alphabet

One potential source of performance benchmarks: https://www.kaggle.com/grassknoted/asl-alphabet/code

The HTML formatted report can be found here on GitHub.

Multi-Class Image Classification Deep Learning Model for Kaggle UT Zappos50K Shoe Dataset Using TensorFlow Take 1

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Kaggle UT Zappos50K Shoe dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

UT Zappos50K (UT-Zap50K) is a large shoe dataset consisting of 50,025 catalog images collected from Zappos.com. The dataset divided the photos into four major categories — shoes, sandals, slippers, and boots — followed by functional types and individual brands. The research team created this dataset in the context of an online shopping task, where users pay special attention to fine-grained visual differences.

In this Take1 iteration, we will construct a CNN model based on the ResNet152V2 architecture to predict the shoe category based on the available images.

ANALYSIS: In this Take1 iteration, the ResNet152V2 model’s performance achieved an accuracy score of 96.20% after ten epochs using the training dataset. The final model processed the validation dataset with an accuracy measurement of 85.93%.
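The training and validation accuracies reported across the five UT Zappos50K iterations in this listing can be compared side by side. The sketch below tabulates them and computes the gap between training and validation accuracy for each architecture. Reading that gap as a rough indicator of overfitting is an interpretation made here, not a claim from the original reports.

```python
# (training accuracy %, validation accuracy %) as reported for each iteration.
results = {
    "InceptionV3": (98.34, 87.28),
    "MobileNetV3Large": (99.20, 86.09),
    "EfficientNetB7": (99.32, 83.39),
    "DenseNet201": (97.66, 84.96),
    "ResNet152V2": (96.20, 85.93),
}

# Gap between training and validation accuracy, in percentage points.
gaps = {name: round(train - val, 2) for name, (train, val) in results.items()}

best_val = max(results, key=lambda name: results[name][1])
smallest_gap = min(gaps, key=gaps.get)

# Print the architectures ordered by validation accuracy, best first.
for name, (train, val) in sorted(results.items(), key=lambda kv: -kv[1][1]):
    print(f"{name:<18} train={train:6.2f}  val={val:6.2f}  gap={gaps[name]:5.2f}")
```

On these figures, InceptionV3 has the highest validation accuracy (87.28%) and ResNet152V2 has the smallest gap (10.27 points). Every architecture shows a gap of more than ten points after ten epochs.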

CONCLUSION: In this iteration, the ResNet152V2-based CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Kaggle UT Zappos50K Shoe Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/grassknoted/asl-alphabet

One potential source of performance benchmarks: https://www.kaggle.com/grassknoted/asl-alphabet/code

The HTML formatted report can be found here on GitHub.

Multi-Class Image Classification Deep Learning Model for ASL Alphabet Images Using TensorFlow Take 5

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The ASL Alphabet Images dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: The dataset is a collection of images of letters from the American Sign Language alphabet, separated into 29 folders representing the various classes. The training dataset contains 87,000 images of 200×200 pixels each. There are 29 classes: 26 for the letters A-Z and three for SPACE, DELETE, and NOTHING. The test dataset contains only 28 images to encourage the use of real-world test images.
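The class structure described above can be written out explicitly. This is a minimal sketch: the three extra label names are spelled as in the description, and the folder names on disk may differ. The per-class count assumes the 87,000 training images are spread evenly across the classes, which the description does not state.

```python
import string

# 26 letter classes, A-Z.
letters = list(string.ascii_uppercase)

# Three additional labels, named as in the description (assumed spelling).
extra_labels = ["SPACE", "DELETE", "NOTHING"]

class_names = letters + extra_labels
num_classes = len(class_names)

# Assumption: the training images are balanced across classes.
train_images = 87_000
images_per_class = train_images // num_classes

print(num_classes)       # 29
print(images_per_class)  # 3000
```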

In this Take5 iteration, we will construct a CNN model based on the DenseNet201 architecture to predict the ASL alphabet letters based on the available images.

ANALYSIS: In this Take5 iteration, the DenseNet201 model’s performance achieved an accuracy score of 99.94% after ten epochs using the training dataset. The same model processed the validation dataset with an accuracy measurement of 94.52%. Finally, the model processed the test dataset with an accuracy score of 100%.

CONCLUSION: In this iteration, the DenseNet201-based CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Kaggle ASL Alphabet Images

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/grassknoted/asl-alphabet

One potential source of performance benchmarks: https://www.kaggle.com/grassknoted/asl-alphabet/code

The HTML formatted report can be found here on GitHub.

Multi-Class Image Classification Deep Learning Model for ASL Alphabet Images Using TensorFlow Take 4

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The ASL Alphabet Images dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: The dataset is a collection of images of letters from the American Sign Language alphabet, separated into 29 folders representing the various classes. The training dataset contains 87,000 images of 200×200 pixels each. There are 29 classes: 26 for the letters A-Z and three for SPACE, DELETE, and NOTHING. The test dataset contains only 28 images to encourage the use of real-world test images.

In this Take4 iteration, we will construct a CNN model based on the ResNet152V2 architecture to predict the ASL alphabet letters based on the available images.

ANALYSIS: In this Take4 iteration, the ResNet152V2 model’s performance achieved an accuracy score of 99.83% after ten epochs using the training dataset. The same model processed the validation dataset with an accuracy measurement of 95.71%. Finally, the model processed the test dataset with an accuracy score of 100%.
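The test-set figure is a coarse measurement, because the test dataset contains only 28 images. The arithmetic sketch below shows how much a single image moves the score; the helper name `accuracy_if_wrong` is hypothetical and introduced only for illustration.

```python
# Size of the test dataset as stated in the introduction.
test_images = 28

# Each test image accounts for this many percentage points of accuracy.
points_per_image = round(100 / test_images, 2)

def accuracy_if_wrong(n_wrong, n_total=test_images):
    """Accuracy (in percent) when n_wrong of n_total images are misclassified."""
    return round(100 * (n_total - n_wrong) / n_total, 2)

print(points_per_image)      # 3.57
print(accuracy_if_wrong(0))  # 100.0
print(accuracy_if_wrong(1))  # 96.43
```

A score of 100% therefore means that all 28 images were classified correctly, and a single misclassification would lower it to about 96.43%. The validation accuracy is likely the more informative number when comparing architectures.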

CONCLUSION: In this iteration, the ResNet152V2-based CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Kaggle ASL Alphabet Images

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.kaggle.com/grassknoted/asl-alphabet

One potential source of performance benchmarks: https://www.kaggle.com/grassknoted/asl-alphabet/code

The HTML formatted report can be found here on GitHub.