Multi-Class Tabular Classification Model for Dry Bean Identification Using Python and TensorFlow Decision Forests

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The Dry Bean Identification dataset is a multi-class modeling situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: Dry beans, the most widely produced edible legume crop in the world, show a wide range of genetic diversity. Seed classification is essential for the marketing and production of sustainable agricultural systems. The primary objective of this research study is to provide a method for obtaining uniform seed varieties from crop production. The research team developed a computer vision system to classify seven different types of dry beans with similar features. For the classification model, the researchers used a high-resolution camera to gather 13,611 images of dry beans.

ANALYSIS: The preliminary Random Forest model achieved a benchmark accuracy of 97.26% on the training dataset. When we applied the finalized model to Kaggle’s test dataset, it achieved an accuracy score of 97.35%.
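For readers unfamiliar with the metric, an accuracy score is the fraction of samples whose predicted class matches the true class. The following is a minimal sketch in plain Python, not the project's code. The class names are the seven bean varieties in the Dry Bean dataset, but the label lists and the helper names (accuracy, per_class_recall) are hypothetical and made up for illustration, so the printed numbers are not the project's results.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_recall(y_true, y_pred):
    """For each true class, the share of its samples predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


# Hypothetical labels for illustration only (not taken from the project).
y_true = ["SEKER", "BOMBAY", "CALI", "SIRA", "SIRA", "DERMASON", "HOROZ", "BARBUNYA"]
y_pred = ["SEKER", "BOMBAY", "CALI", "DERMASON", "SIRA", "DERMASON", "HOROZ", "CALI"]

overall = accuracy(y_true, y_pred)
by_class = per_class_recall(y_true, y_pred)
print(f"overall accuracy = {overall:.2%}")
for label, recall in sorted(by_class.items()):
    print(f"{label}: {recall:.2%}")
```

In a multi-class problem, the per-class breakdown is often worth checking alongside the overall score, because a high overall accuracy can hide one variety that is frequently confused with another.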

CONCLUSION: In this iteration, the TensorFlow Decision Forests model appeared to be a suitable algorithm for modeling this dataset.

Dataset Used: Dry Bean Dataset

Dataset ML Model: Multi-Class classification with numerical features

Dataset Reference: https://www.muratkoklu.com/datasets/

One source of potential performance benchmarks: https://doi.org/10.1016/j.compag.2020.105507

The HTML formatted report can be found here on GitHub.

Multi-Class Tabular Classification Model for Dry Bean Identification Using Python and Scikit-Learn

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The project aims to construct a predictive model using various machine learning algorithms and document the end-to-end steps using a template. The Dry Bean Identification dataset is a multi-class modeling situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: Dry beans, the most widely produced edible legume crop in the world, show a wide range of genetic diversity. Seed classification is essential for the marketing and production of sustainable agricultural systems. The primary objective of this research study is to provide a method for obtaining uniform seed varieties from crop production. The research team developed a computer vision system to classify seven different types of dry beans with similar features. For the classification model, the researchers used a high-resolution camera to gather 13,611 images of dry beans.

ANALYSIS: On the training dataset, the machine learning algorithms achieved an average accuracy of 91.39%, which serves as the benchmark. We selected Random Forest as the final model because it achieved a final accuracy score of 92.51% on the training dataset. When we processed the test dataset with the final model, it achieved an accuracy score of 91.92%.
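The selection step described above (average the candidates to get a benchmark, then keep the best performer) can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the project's code: the algorithm names, the fold accuracies, and the summarize helper are all hypothetical, and the numbers are made up rather than measured.

```python
from statistics import mean

# Hypothetical cross-validation fold accuracies; NOT the project's results.
cv_scores = {
    "logistic_regression": [0.90, 0.92, 0.91],
    "k_nearest_neighbors": [0.89, 0.91, 0.90],
    "random_forest": [0.92, 0.93, 0.94],
}


def summarize(scores_by_model):
    """Return per-model mean accuracy, the average across models, and the best model."""
    per_model = {name: mean(folds) for name, folds in scores_by_model.items()}
    benchmark = mean(list(per_model.values()))
    best = max(per_model, key=per_model.get)
    return per_model, benchmark, best


per_model, benchmark, best = summarize(cv_scores)
print(f"benchmark (average across algorithms) = {benchmark:.2%}")
print(f"selected model = {best} ({per_model[best]:.2%})")
```

Comparing each candidate against the cross-algorithm average gives a quick sanity check that the selected model is meaningfully better than a typical alternative, not just the winner of a close race.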

CONCLUSION: In this iteration, the Random Forest model appeared to be a suitable algorithm for modeling this dataset.

Dataset Used: Dry Bean Dataset

Dataset ML Model: Multi-Class classification with numerical features

Dataset Reference: https://www.muratkoklu.com/datasets/

One source of potential performance benchmarks: https://doi.org/10.1016/j.compag.2020.105507

The HTML formatted report can be found here on GitHub.

Erika Andersen on Be Bad First, Part 1

In the book, Be Bad First: Get Good at Things Fast to Stay Ready for the Future, Erika Andersen shares her mindset and techniques for learning new things well and quickly.

These are some of my favorite concepts and takeaways from reading the book.

Chapter 1: The New Need to Learn – and Our Mixed Response

In this chapter, Erika Andersen explains why being bad first is so essential in today’s world. She offers the following observations and recommendations for us to think about:

The amount of knowledge acquired by humans has been growing exponentially. This explosion of knowledge, and the technological, scientific, and cultural advances that have resulted, have also dramatically changed how we learn and work.

The advance in knowledge acquisition also hastens the pace of change, both societal and personal. The rate of change means that our career progression is much different from our parents’. Likewise, our children’s career progression will look very different from ours.

The same shift – from stability to fluidity – has happened on an organizational level. In the early 20th century, the business landscape was dominated by big companies we all assumed would last forever. Instead, many of those companies have been replaced by successful enterprises that have arisen from new technologies spawned by new knowledge.

“The ability to learn faster than your competitors may be the only sustainable competitive advantage,” said Arie de Geus.

The proliferation of knowledge and options over the past centuries, supported by the most potent knowledge distribution mechanism to date, the Internet, means:

            More Knowledge = More Communication = More Knowledge

Given all this, it seems clear that those who can acquire and apply new knowledge and skills quickly and continuously will likely succeed in today’s world. We face a reality where knowledge is increasing exponentially, where work is changing daily, and where advancements in every discipline nearly outpace our ability to communicate. The natural conclusion is that the ability to learn well and quickly is the most essential skill we can have.

While we may realize intellectually that being successful these days requires being open to continuous, disruptive learning, that does not mean we like this way of life. Change is hard for most people to deal with, and we naturally often resist learning new things. Moreover, we dislike the feeling of being a newbie over and over.

We don’t like being thrown into that “be bad first” position when we need to learn new things. As adults, we don’t want to do what feels like going backward, to being novices all over again. So a key question we need to ask ourselves is: How can we overcome our hesitation and our resistance to new learning to become those “masters of mastery” who will best succeed in the 21st century?

The good news is that we have all got something inside us that will help. We may hate to be bad at things – but we love getting good at things.

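Metaphors in Traffic

(From a writer I respect, Seth Godin)

How long or how short a distance you can run will change.

Speeding up to run a red light faster only wastes energy.

Leaning on the horn will not make the traffic jam disappear.

No matter how fancy your car is, if the road is closed, even the best car is worth nothing.

One of the worst ways to reach your destination is to always drive in the fastest lane and avoid every toll road. Traffic does not always flow toward where you want to go.

Following another car all the time will eventually get you lost.

If you do not stop for gas now and then, you will end up stranded somewhere.

Giving someone a break or the right of way, and letting them into traffic, will not really slow you down.

In our culture, we devote too much attention to cars and their efficiency, and not enough resources to pedestrians and the opportunities they deserve.

The map is not the territory, but it is good to have a map.

We overrate acceleration. Persistence, good direction, and a reliable vehicle usually beat horsepower alone.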

Binary-Class Image Classification Model for Concrete Crack Images Using Python and TensorFlow Take 5

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Concrete Crack Images dataset is a binary-class modeling situation where we attempt to predict one of two possible outcomes.

INTRODUCTION: In this study, the research team developed a computerized vision system to recognize whether there are cracks on concrete surfaces. The dataset contains concrete images with different surface finishes and illumination conditions. The photos were collected from various college campus buildings. The dataset is divided into two classes, negative and positive, and each class has 20,000 images of 227 x 227 pixels.

ANALYSIS: The InceptionV3 model achieved an accuracy score of 99.74% after 5 epochs using a separate validation dataset. When we applied the model to the validation dataset, it achieved an accuracy score of 99.86%.
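Holding out a separate validation dataset usually means splitting each class in the same proportion, so that the balance between negative and positive images is preserved. The sketch below shows one way to do that with the Python standard library. It is an assumption-laden illustration, not the project's code: the 80/20 ratio, the seed, the file naming pattern, and the stratified_split helper are all hypothetical, and only the class names and the count of 20,000 images per class come from the dataset description.

```python
import random


def stratified_split(files_by_class, val_fraction=0.2, seed=42):
    """Shuffle each class separately and hold out the same fraction of each for validation."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, val = [], []
    for label, files in sorted(files_by_class.items()):
        shuffled = list(files)
        rng.shuffle(shuffled)
        n_val = int(len(shuffled) * val_fraction)
        val += [(path, label) for path in shuffled[:n_val]]
        train += [(path, label) for path in shuffled[n_val:]]
    return train, val


# Hypothetical file names; the dataset has 20,000 images in each class.
files_by_class = {
    "negative": [f"negative/{i:05d}.jpg" for i in range(1, 20001)],
    "positive": [f"positive/{i:05d}.jpg" for i in range(1, 20001)],
}

train, val = stratified_split(files_by_class)
print(len(train), len(val))  # 32000 8000
```

Because the split is done per class, the validation set keeps the same 50/50 class balance as the full dataset, which keeps accuracy a meaningful metric.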

CONCLUSION: In this iteration, the TensorFlow InceptionV3 CNN model appeared suitable for modeling this dataset.

Dataset ML Model: Binary-Class classification with numerical features

Dataset Used: Özgenel, Çağlar Fırat (2019), “Concrete Crack Images for Classification”, Mendeley Data, V2, doi: 10.17632/5y9wdsg2zt.2

Dataset Reference: https://data.mendeley.com/datasets/5y9wdsg2zt/2

One source of potential performance benchmarks: https://www.kaggle.com/datasets/arunrk7/surface-crack-detection

The HTML formatted report can be found here on GitHub.

Binary-Class Image Classification Model for Concrete Crack Images Using Python and TensorFlow Take 4

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Concrete Crack Images dataset is a binary-class modeling situation where we attempt to predict one of two possible outcomes.

INTRODUCTION: In this study, the research team developed a computerized vision system to recognize whether there are cracks on concrete surfaces. The dataset contains concrete images with different surface finishes and illumination conditions. The photos were collected from various college campus buildings. The dataset is divided into two classes, negative and positive, and each class has 20,000 images of 227 x 227 pixels.

ANALYSIS: The EfficientNetV2S model achieved an accuracy score of 99.76% after 5 epochs using a separate validation dataset. When we applied the model to the validation dataset, it achieved an accuracy score of 99.76%.

CONCLUSION: In this iteration, the TensorFlow EfficientNetV2S CNN model appeared suitable for modeling this dataset.

Dataset ML Model: Binary-Class classification with numerical features

Dataset Used: Özgenel, Çağlar Fırat (2019), “Concrete Crack Images for Classification”, Mendeley Data, V2, doi: 10.17632/5y9wdsg2zt.2

Dataset Reference: https://data.mendeley.com/datasets/5y9wdsg2zt/2

One source of potential performance benchmarks: https://www.kaggle.com/datasets/arunrk7/surface-crack-detection

The HTML formatted report can be found here on GitHub.

Binary-Class Image Classification Model for Concrete Crack Images Using Python and TensorFlow Take 3

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery. [https://machinelearningmastery.com/]

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Concrete Crack Images dataset is a binary-class modeling situation where we attempt to predict one of two possible outcomes.

INTRODUCTION: In this study, the research team developed a computerized vision system to recognize whether there are cracks on concrete surfaces. The dataset contains concrete images with different surface finishes and illumination conditions. The photos were collected from various college campus buildings. The dataset is divided into two classes, negative and positive, and each class has 20,000 images of 227 x 227 pixels.

ANALYSIS: The VGG19 model achieved an accuracy score of 99.65% after 5 epochs using a separate validation dataset. When we applied the model to the validation dataset, it achieved an accuracy score of 99.83%.

CONCLUSION: In this iteration, the TensorFlow VGG19 CNN model appeared suitable for modeling this dataset.

Dataset ML Model: Binary-Class classification with numerical features

Dataset Used: Özgenel, Çağlar Fırat (2019), “Concrete Crack Images for Classification”, Mendeley Data, V2, doi: 10.17632/5y9wdsg2zt.2

Dataset Reference: https://data.mendeley.com/datasets/5y9wdsg2zt/2

One source of potential performance benchmarks: https://www.kaggle.com/datasets/arunrk7/surface-crack-detection

The HTML formatted report can be found here on GitHub.

Binary-Class Image Classification Model for Concrete Crack Images Using Python and TensorFlow Take 2

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Concrete Crack Images dataset is a binary-class modeling situation where we attempt to predict one of two possible outcomes.

INTRODUCTION: In this study, the research team developed a computerized vision system to recognize whether there are cracks on concrete surfaces. The dataset contains concrete images with different surface finishes and illumination conditions. The photos were collected from various college campus buildings. The dataset is divided into two classes, negative and positive, and each class has 20,000 images of 227 x 227 pixels.

ANALYSIS: The DenseNet201 model achieved an accuracy score of 99.87% after 5 epochs using a separate validation dataset. When we applied the model to the validation dataset, it achieved an accuracy score of 99.86%.

CONCLUSION: In this iteration, the TensorFlow DenseNet201 CNN model appeared suitable for modeling this dataset.

Dataset ML Model: Binary-Class classification with numerical features

Dataset Used: Özgenel, Çağlar Fırat (2019), “Concrete Crack Images for Classification”, Mendeley Data, V2, doi: 10.17632/5y9wdsg2zt.2

Dataset Reference: https://data.mendeley.com/datasets/5y9wdsg2zt/2

One source of potential performance benchmarks: https://www.kaggle.com/datasets/arunrk7/surface-crack-detection

The HTML formatted report can be found here on GitHub.