Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.
SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Robusta Coffee Leaf Images dataset presents a multi-class classification problem, where we attempt to predict one of several (more than two) possible outcomes.
INTRODUCTION: The dataset contains 1560 Robusta coffee leaf images. Infected leaves show visible mites and spots, while healthy leaves show no such signs. The dataset also includes labels for the health state (healthy and unhealthy) and the severity of the disease (leaf area with spots).
In iteration Take1, we constructed a CNN model using a simple three-block VGG architecture and tested the model’s performance using a validation dataset (20%) set aside from the training images.
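A three-block VGG-style network of the kind described above could be sketched as follows. This is a minimal illustration, not the exact Take1 model: the input size, filter counts, dense layer width, optimizer, and the class count (NUM_CLASSES) are all assumptions, since the text does not state them.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Assumed values (not given in the text): input resolution and class count.
IMAGE_SIZE = (128, 128)
NUM_CLASSES = 6


def vgg_block(filters):
    """One VGG-style block: two 3x3 convolutions followed by 2x2 max pooling."""
    return [
        layers.Conv2D(filters, 3, padding="same", activation="relu"),
        layers.Conv2D(filters, 3, padding="same", activation="relu"),
        layers.MaxPooling2D(pool_size=2),
    ]


def build_vgg3(input_shape=IMAGE_SIZE + (3,), num_classes=NUM_CLASSES):
    """Build and compile a small three-block VGG-style classifier."""
    model = models.Sequential(
        [layers.Input(shape=input_shape), layers.Rescaling(1.0 / 255)]
        + vgg_block(32)
        + vgg_block(64)
        + vgg_block(128)
        + [
            layers.Flatten(),
            layers.Dense(128, activation="relu"),
            layers.Dense(num_classes, activation="softmax"),
        ]
    )
    # Sparse categorical cross-entropy assumes integer class labels.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_vgg3()
```

With Keras, a 20% validation split like the one mentioned above is commonly obtained by passing validation_split=0.2 to the data-loading utility or to model.fit; which approach the original notebook used is not stated here.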
In this Take2 iteration, we will construct a CNN model using the DenseNet121 architecture and test the model’s performance using a validation dataset (20%) set aside from the training images.
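A DenseNet121 classifier can be assembled from the architecture shipped with Keras, roughly as sketched below. The input size, class count (NUM_CLASSES), optimizer, and the choice to train from randomly initialized weights (weights=None) are assumptions made for illustration; the text does not say whether pretrained weights were used.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Assumed values (not given in the text): input resolution and class count.
IMAGE_SIZE = (224, 224)
NUM_CLASSES = 6


def build_densenet121(input_shape=IMAGE_SIZE + (3,), num_classes=NUM_CLASSES):
    """Build and compile a DenseNet121 classifier with a softmax output."""
    # weights=None means random initialization (an assumption); with
    # include_top=True the network ends in a `classes`-way softmax layer.
    backbone = tf.keras.applications.DenseNet121(
        include_top=True,
        weights=None,
        input_shape=input_shape,
        classes=num_classes,
    )
    inputs = layers.Input(shape=input_shape)
    x = layers.Rescaling(1.0 / 255)(inputs)  # scale pixel values to [0, 1]
    outputs = backbone(x)
    model = models.Model(inputs, outputs)
    # Sparse categorical cross-entropy assumes integer class labels.
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model


model = build_densenet121()
```

Training would then proceed with model.fit for 15 epochs, matching the epoch count reported in the analysis.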
ANALYSIS: In iteration Take1, the baseline model achieved an accuracy of 98.16% on the training dataset after 15 epochs. The final model achieved an accuracy of 51.28% on the validation dataset.
In this Take2 iteration, the DenseNet121 model achieved an accuracy of 97.61% on the training dataset after 15 epochs. The final model achieved an accuracy of 73.08% on the validation dataset.
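To compare the two iterations, it can help to look at the gap between training and validation accuracy. The short sketch below uses only the four figures reported above; the helper name generalization_gap is hypothetical.

```python
# Accuracy figures as reported in the text, in percent.
results = {
    "Take1 (three-block VGG)": {"train": 98.16, "val": 51.28},
    "Take2 (DenseNet121)": {"train": 97.61, "val": 73.08},
}


def generalization_gap(train_acc, val_acc):
    """Training minus validation accuracy, in percentage points."""
    return round(train_acc - val_acc, 2)


gaps = {
    name: generalization_gap(r["train"], r["val"])
    for name, r in results.items()
}

for name, gap in gaps.items():
    print(f"{name}: train/validation gap = {gap:.2f} percentage points")
```

A smaller gap at similar training accuracy indicates that the Take2 model generalized better to the held-out images, although a sizable gap remains in both iterations.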
CONCLUSION: In this iteration, the DenseNet121 CNN model raised validation accuracy from 51.28% (Take1) to 73.08% and appeared to be suitable for modeling this dataset. We should consider experimenting with more CNN architectures for further modeling.
Dataset Used: Robusta Coffee Leaf Images
Dataset ML Model: Multi-class image classification with numerical attributes
Dataset Reference: Parraga-Alava, Jorge; Cusme, Kevin; Loor, Angélica; Santander, Esneider (2019), “RoCoLe: A robusta coffee leaf images dataset”, Mendeley Data, V2, doi: 10.17632/c5yvn32dzg.2 http://dx.doi.org/10.17632/c5yvn32dzg.2
A potential source of performance benchmarks: https://data.mendeley.com/datasets/c5yvn32dzg/2
The HTML-formatted report can be found on GitHub.