Hearing happens when we can recognize sounds.

Listening happens when we work to understand what something means.

Listening requires not only concentration but also a commitment to taking in the experiences, intentions, and emotions behind the words. That commitment can be frightening, because if we expose ourselves to those emotions and ideas, we might discover things we would rather avoid.

SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The Crop Mapping in Canada dataset is a multi-class classification situation where we are trying to predict one of several (more than two) possible outcomes.

INTRODUCTION: This dataset consists of fused bi-temporal optical-radar data for cropland classification. The organization collected the images using RapidEye satellites (optical) and the Unmanned Aerial Vehicle Synthetic Aperture Radar (UAVSAR) system (radar) over an agricultural region near Winnipeg, Manitoba, Canada in 2012. There are 2 * 49 radar features and 2 * 38 optical features for two dates: 05 and 14 July 2012. Seven crop type classes exist for this dataset as follows: 1-Corn; 2-Peas; 3-Canola; 4-Soybeans; 5-Oats; 6-Wheat; and 7-Broadleaf.

In iteration Take1, we constructed and tuned machine learning models for this dataset using the Scikit-Learn library. We also observed the best accuracy result that we could obtain using the tuned models with the training and test datasets.

In this Take2 iteration, we will construct and tune machine learning models for this dataset using the XGBoost library. We will observe the best accuracy result that we can obtain using the tuned models with the training and test datasets.

ANALYSIS: From iteration Take1, the baseline performance of the machine learning algorithms achieved an average accuracy of 94.87%. Two algorithms (Extra Trees and Random Forest) achieved the top accuracy metrics after the first round of modeling. After a series of tuning trials, Extra Trees turned in a better overall result. Extra Trees achieved an accuracy metric of 99.71%. When configured with the optimized parameters, the Extra Trees model processed the testing dataset with an accuracy of 99.74%, which was even better than the prediction accuracy from the training data.

From this Take2 iteration, the performance of the XGBoost model achieved an accuracy score of 99.82%. After a series of tuning trials, the XGBoost model processed the test dataset with an accuracy score of 99.84%, which was consistent with the training result.

CONCLUSION: For this iteration, the XGBoost model achieved the best overall results using the training and test datasets. For this dataset, we should consider using the XGBoost algorithm for further modeling.

Dataset Used: Crop Mapping in Canada Data Set

Dataset ML Model: Multi-Class classification with numerical attributes

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/Crop+mapping+using+fused+optical-radar+data+set

The HTML formatted report can be found here on GitHub.

SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The Crop Mapping in Canada dataset is a multi-class classification situation where we are trying to predict one of several (more than two) possible outcomes.

INTRODUCTION: This dataset consists of fused bi-temporal optical-radar data for cropland classification. The organization collected the images using RapidEye satellites (optical) and the Unmanned Aerial Vehicle Synthetic Aperture Radar (UAVSAR) system (radar) over an agricultural region near Winnipeg, Manitoba, Canada in 2012. There are 2 * 49 radar features and 2 * 38 optical features for two dates: 05 and 14 July 2012. Seven crop type classes exist for this dataset as follows: 1-Corn; 2-Peas; 3-Canola; 4-Soybeans; 5-Oats; 6-Wheat; and 7-Broadleaf.

In this Take1 iteration, we will construct and tune machine learning models for this dataset using the Scikit-Learn library. We will observe the best accuracy result that we can obtain using the tuned models with the training and test datasets.

ANALYSIS: The baseline performance of the machine learning algorithms achieved an average accuracy of 94.87%. Two algorithms (Extra Trees and Random Forest) achieved the top accuracy metrics after the first round of modeling. After a series of tuning trials, Extra Trees turned in a better overall result. Extra Trees achieved an accuracy metric of 99.71%. When configured with the optimized parameters, the Extra Trees model processed the testing dataset with an accuracy of 99.74%, which was even better than the prediction accuracy from the training data.
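The Extra Trees tuning step might be sketched as follows with Scikit-Learn; the parameter grid and synthetic data here are illustrative assumptions rather than the project's exact setup.

```python
# Sketch of tuning an Extra Trees classifier with cross-validated grid search;
# synthetic data stands in for the crop-mapping features.
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = make_classification(n_samples=600, n_features=20, n_informative=12,
                           n_classes=7, random_state=7)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=7, stratify=y)

grid = GridSearchCV(ExtraTreesClassifier(random_state=7),
                    {"n_estimators": [100, 300],
                     "max_features": ["sqrt", None]},
                    scoring="accuracy", cv=3)
grid.fit(X_train, y_train)
print("best params:", grid.best_params_)
print("test accuracy:", grid.best_estimator_.score(X_test, y_test))
```

Scoring the tuned model on the held-out split is what allows the training-versus-test comparison described above.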

CONCLUSION: For this iteration, the Extra Trees model achieved the best overall results using the training and test datasets. For this dataset, we should consider using the Extra Trees algorithm for further modeling.

Dataset Used: Crop Mapping in Canada Data Set

Dataset ML Model: Multi-Class classification with numerical attributes

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/Crop+mapping+using+fused+optical-radar+data+set

The HTML formatted report can be found here on GitHub.

Thanks to Dr. Jason Brownlee’s suggestions on creating a machine learning template, I have pulled together a set of project templates that I use to experiment with modeling ML problems using Python and Scikit-Learn.

Version 14 of the templates contains minor adjustments and corrections to the previous version. The updated templates include the following:

- Replaced SMTP-based email notification code with AWS Simple Notification Service (SNS)
- Replaced the code for pre-processing and scaling data before training models
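
For reference, the SNS-based notification step might look like the sketch below. The function and topic ARN are hypothetical, and the client is injected so that `boto3.client("sns")` can be passed in real use without hard-wiring AWS calls into the template.

```python
# Hypothetical sketch of an SNS completion notice for a training run;
# the client is injected, so pass boto3.client("sns") in real use.
def notify_completion(sns_client, topic_arn, subject, message):
    """Publish a model-run status message to an SNS topic."""
    response = sns_client.publish(TopicArn=topic_arn, Subject=subject,
                                  Message=message)
    return response["MessageId"]
```

In practice this would be called at the end of a run, e.g. `notify_completion(boto3.client("sns"), topic_arn, "Run finished", "Model training completed.")`, with the topic ARN supplied by your own AWS configuration.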

You will find the Python templates on the Machine Learning Project Templates page.

SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The Santander Customer Satisfaction dataset is a binary classification situation where we are trying to predict one of the two possible outcomes.

INTRODUCTION: Santander Bank sponsored a Kaggle competition to help them identify dissatisfied customers early in their relationship. Doing so would allow Santander to take proactive steps to improve a customer’s happiness before it’s too late. In this competition, Santander has provided hundreds of anonymized features to predict if a customer is satisfied or dissatisfied with their banking experience. The exercise evaluates the submissions on the area under the ROC curve (AUC) between the predicted probability and the observed target.
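The competition metric can be computed from predicted probabilities as in this minimal sketch; the labels and scores below are synthetic, for illustration only.

```python
# Minimal illustration of the competition metric: ROC-AUC between the
# predicted probability and the observed binary target.
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1, 0, 1]                # observed target
y_prob = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9]   # model's predicted probability
auc = roc_auc_score(y_true, y_prob)
print(f"ROC-AUC: {auc:.3f}")
```

With no ties, ROC-AUC equals the fraction of positive/negative pairs the model ranks correctly, which is why it is a natural yardstick for probability-ranking submissions like this one.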

In iteration Take1, we constructed several Multilayer Perceptron (MLP) models with two hidden layers. We also observed the best result that we could obtain using the two-layer model. Lastly, we applied the MLP model to Kaggle’s test dataset and submitted a list of predictions to Kaggle for evaluation.

In iteration Take2, we constructed several Multilayer Perceptron (MLP) models with three hidden layers. We also observed the best result that we could obtain using the three-layer model. Lastly, we applied the MLP model to Kaggle’s test dataset and submitted a list of predictions to Kaggle for evaluation.

In iteration Take3, we constructed several Multilayer Perceptron (MLP) models with four hidden layers. We also observed the best result that we could obtain using the four-layer model. Lastly, we applied the MLP model to Kaggle’s test dataset and submitted a list of predictions to Kaggle for evaluation.

In this Take4 iteration, we will construct several Multilayer Perceptron (MLP) models with five hidden layers. We will also observe the best result that we can obtain using the five-layer model. Lastly, we will apply the MLP model to Kaggle’s test dataset and submit a list of predictions to Kaggle for evaluation.
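
As a rough sketch of the five-hidden-layer topology (256/224/160/128/64 nodes), scikit-learn's MLPClassifier can express the same structure; the actual project may use a different framework, training schedule, and data pipeline.

```python
# Rough sketch of a five-hidden-layer MLP (256/224/160/128/64 nodes);
# synthetic binary data stands in for the Santander features.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=30, random_state=7)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=7, stratify=y)

mlp = MLPClassifier(hidden_layer_sizes=(256, 224, 160, 128, 64),
                    max_iter=25, random_state=7)
mlp.fit(X_train, y_train)
print("validation accuracy:", mlp.score(X_val, y_val))
```

Here `max_iter=25` loosely mirrors the 25-epoch budget mentioned above, though the two are not strictly equivalent across frameworks.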

ANALYSIS: From iteration Take1, all two-layer models achieved a ROC-AUC performance of between 79.9% and 81.1% after 25 epochs using the validation dataset. The 64/32-node model appeared to have the highest ROC-AUC of 81.142% with low variance. Lastly, when we applied the two-layer neural network model to the test dataset from Kaggle, we obtained a ROC-AUC score of 80.460%.

From iteration Take2, all three-layer models achieved a ROC-AUC performance of between 79.3% and 81.5% after 25 epochs using the validation dataset. The 224/160/96-node model appeared to have the highest ROC-AUC of 81.56% with low variance. Lastly, when we applied the three-layer neural network model to the test dataset from Kaggle, we obtained a ROC-AUC score of 81.193%.

From iteration Take3, all four-layer models achieved a ROC-AUC performance of between 78.5% and 81.3% after 25 epochs using the validation dataset. The 224/160/128/64-node model appeared to have the highest ROC-AUC of 81.51% with low variance. Lastly, when we applied the four-layer neural network model to the test dataset from Kaggle, we obtained a ROC-AUC score of 81.665%.

From this Take4 iteration, all five-layer models achieved a ROC-AUC performance of between 78.9% and 82.0% after 25 epochs using the validation dataset. The 256/224/160/128/64-node model appeared to have the highest ROC-AUC of 81.80% with low variance. Lastly, when we applied the five-layer neural network model to the test dataset from Kaggle, we obtained a ROC-AUC score of 81.615%.

CONCLUSION: For this iteration, the five-layer model with 256/224/160/128/64 nodes appeared to have yielded the best result. For this dataset, we should consider experimenting with more MLP models with different configurations.

Dataset Used: Santander Customer Satisfaction Data Set

Dataset ML Model: Binary classification with numerical and categorical attributes

Dataset Reference: https://www.kaggle.com/c/santander-customer-satisfaction/overview

One potential source of performance benchmark: https://www.kaggle.com/c/santander-customer-satisfaction/leaderboard

The HTML formatted report can be found here on GitHub.

SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The Santander Customer Satisfaction dataset is a binary classification situation where we are trying to predict one of the two possible outcomes.

INTRODUCTION: Santander Bank sponsored a Kaggle competition to help them identify dissatisfied customers early in their relationship. Doing so would allow Santander to take proactive steps to improve a customer’s happiness before it’s too late. In this competition, Santander has provided hundreds of anonymized features to predict if a customer is satisfied or dissatisfied with their banking experience. The exercise evaluates the submissions on the area under the ROC curve (AUC) between the predicted probability and the observed target.

In iteration Take1, we constructed several Multilayer Perceptron (MLP) models with two hidden layers. We also observed the best result that we could obtain using the two-layer model. Lastly, we applied the MLP model to Kaggle’s test dataset and submitted a list of predictions to Kaggle for evaluation.

In iteration Take2, we constructed several Multilayer Perceptron (MLP) models with three hidden layers. We also observed the best result that we could obtain using the three-layer model. Lastly, we applied the MLP model to Kaggle’s test dataset and submitted a list of predictions to Kaggle for evaluation.

In this Take3 iteration, we will construct several Multilayer Perceptron (MLP) models with four hidden layers. We will also observe the best result that we can obtain using the four-layer model. Lastly, we will apply the MLP model to Kaggle’s test dataset and submit a list of predictions to Kaggle for evaluation.

ANALYSIS: From iteration Take1, all two-layer models achieved a ROC-AUC performance of between 79.9% and 81.1% after 25 epochs using the validation dataset. The 64/32-node model appeared to have the highest ROC-AUC of 81.142% with low variance. Lastly, when we applied the two-layer neural network model to the test dataset from Kaggle, we obtained a ROC-AUC score of 80.460%.

From iteration Take2, all three-layer models achieved a ROC-AUC performance of between 79.3% and 81.5% after 25 epochs using the validation dataset. The 224/160/96-node model appeared to have the highest ROC-AUC of 81.56% with low variance. Lastly, when we applied the three-layer neural network model to the test dataset from Kaggle, we obtained a ROC-AUC score of 81.193%.

From this Take3 iteration, all four-layer models achieved a ROC-AUC performance of between 78.5% and 81.3% after 25 epochs using the validation dataset. The 224/160/128/64-node model appeared to have the highest ROC-AUC of 81.51% with low variance. Lastly, when we applied the four-layer neural network model to the test dataset from Kaggle, we obtained a ROC-AUC score of 81.665%.

CONCLUSION: For this iteration, the four-layer model with 224/160/128/64 nodes appeared to have yielded the best result. For this dataset, we should consider experimenting with more MLP models with different configurations.

Dataset Used: Santander Customer Satisfaction Data Set

Dataset ML Model: Binary classification with numerical and categorical attributes

Dataset Reference: https://www.kaggle.com/c/santander-customer-satisfaction/overview

One potential source of performance benchmark: https://www.kaggle.com/c/santander-customer-satisfaction/leaderboard

The HTML formatted report can be found here on GitHub.

These are some of my favorite quotes and takeaways from reading the book.

So, how can we minimize our tendency for Loserthink by thinking like a psychologist?

What to watch out for: The Mind Reading Illusion.

“If our opinion depends on reliably knowing another person’s inner thoughts, we might be experiencing Loserthink.”

“If an ordinary explanation fits the facts, but we have chosen an extraordinary interpretation instead, we might have too much confidence in our opinion.”

What to watch out for: Branding People Evil

“If we think we can gaze into the soul of a stranger and see evil, we might be experiencing a Loserthink hallucination.”

What to watch out for: Occam’s Razor

“In science, the simplest explanation that fits the facts is preferred. In life, we are all under the illusion that our explanations of things are the simplest ones.”

What to watch out for: Projection

“Psychological projection is a real phenomenon, but if we think untrained people can identify it in strangers, we might be experiencing Loserthink.”

What to watch out for: The Ego Problem

“When we are fit, we will feel more confident in any situation.”

“Effectiveness is more important than ego.”

“If we think ego is who we are, as opposed to a tool we can dial up and down as needed, we might be experiencing Loserthink.”

“Put ourselves in potentially embarrassing situations on a regular basis for practice. If we get embarrassed as planned, notice that one year later we are still alive. Maybe we even have a funny story because of it.”

“Note how other people’s embarrassments mean little to us when we are an observer. That is how much our embarrassments mean to them: nothing.”

What to watch out for: Focusing on What is Wrong

“If we cannot think of anything good about a situation, and yet we observe that others can, we might be experiencing Loserthink.”

“If we allow our mental shelf space to fill up with negative thoughts, we are punishing ourselves with an unhealthy form of Loserthink.”

Consider one of three paths. Which one suits you?

Honor the noise in your head. Do the work you believe you were born to do. Create something you can envision but that does not yet exist. Do it without regard for the critics, the market, or any other influence. It is a masterpiece entirely of your own making.

Embrace the market you can see. Meet its needs. Earn your place by building an asset, and use it to create real value for the people you serve. Price accordingly.

Stay busy, doing projects that are slightly better than typical work at prices slightly below average.

It is hard to see how you could do all three at once for the same kind of customer. All three choices are viable, and any one of them can lead to success, but this is a decision worth making deliberately.

SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The Santander Customer Satisfaction dataset is a binary classification situation where we are trying to predict one of the two possible outcomes.

INTRODUCTION: Santander Bank sponsored a Kaggle competition to help them identify dissatisfied customers early in their relationship. Doing so would allow Santander to take proactive steps to improve a customer’s happiness before it’s too late. In this competition, Santander has provided hundreds of anonymized features to predict if a customer is satisfied or dissatisfied with their banking experience. The exercise evaluates the submissions on the area under the ROC curve (AUC) between the predicted probability and the observed target.

In iteration Take1, we constructed several Multilayer Perceptron (MLP) models with two hidden layers. We also observed the best result that we could obtain using the two-layer model. Lastly, we applied the MLP model to Kaggle’s test dataset and submitted a list of predictions to Kaggle for evaluation.

In this Take2 iteration, we will construct several Multilayer Perceptron (MLP) models with three hidden layers. We will also observe the best result that we can obtain using the three-layer model. Lastly, we will apply the MLP model to Kaggle’s test dataset and submit a list of predictions to Kaggle for evaluation.

ANALYSIS: From iteration Take1, all two-layer models achieved a ROC-AUC performance of between 79.9% and 81.1% after 25 epochs using the validation dataset. The 64/32-node model appeared to have the highest ROC-AUC of 81.142% with low variance. Lastly, when we applied the two-layer neural network model to the test dataset from Kaggle, we obtained a ROC-AUC score of 80.460%.

From this Take2 iteration, all three-layer models achieved a ROC-AUC performance of between 79.3% and 81.5% after 25 epochs using the validation dataset. The 224/160/96-node model appeared to have the highest ROC-AUC of 81.56% with low variance. Lastly, when we applied the three-layer neural network model to the test dataset from Kaggle, we obtained a ROC-AUC score of 81.193%. We captured additional performance measurements using different model configurations.

CONCLUSION: For this iteration, the three-layer model with 224/160/96 nodes appeared to have yielded the best result. For this dataset, we should consider experimenting with more MLP models with different configurations.

Dataset Used: Santander Customer Satisfaction Data Set

Dataset ML Model: Binary classification with numerical and categorical attributes

Dataset Reference: https://www.kaggle.com/c/santander-customer-satisfaction/overview

One potential source of performance benchmark: https://www.kaggle.com/c/santander-customer-satisfaction/leaderboard

The HTML formatted report can be found here on GitHub.

In this Take1 iteration, we will construct several Multilayer Perceptron (MLP) models with two hidden layers. We will also observe the best result that we can obtain using the two-layer model. Lastly, we will apply the MLP model to Kaggle’s test dataset and submit a list of predictions to Kaggle for evaluation.

ANALYSIS: From this Take1 iteration, all two-layer models achieved a ROC-AUC performance of between 79.9% and 81.1% after 25 epochs using the validation dataset. The 64/32-node model appeared to have the highest ROC-AUC of 81.142% with low variance. Lastly, when we applied the two-layer neural network model to the test dataset from Kaggle, we obtained a ROC-AUC score of 80.460%. We captured additional performance measurements using different model configurations.

- Two-Layer 32/16-Node MLP Model – ROC-AUC: 80.9190%
- Two-Layer 64/32-Node MLP Model – ROC-AUC: 81.1427%
- Two-Layer 128/64-Node MLP Model – ROC-AUC: 80.0112%
- Two-Layer 160/96-Node MLP Model – ROC-AUC: 79.9403%
- Two-Layer 256/128-Node MLP Model – ROC-AUC: 80.6845%
- Two-Layer 320/192-Node MLP Model – ROC-AUC: 80.5817%
- Two-Layer 512/256-Node MLP Model – ROC-AUC: 80.8535%
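
The configuration sweep above can be sketched as a simple loop over layer sizes; the data, framework, and iteration budget here are illustrative stand-ins for the project's actual setup.

```python
# Sketch of sweeping two-hidden-layer MLP configurations and comparing
# validation ROC-AUC; synthetic data stands in for the Santander features.
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=400, n_features=30, random_state=7)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=7, stratify=y)

results = {}
for layers in [(32, 16), (64, 32), (128, 64)]:
    mlp = MLPClassifier(hidden_layer_sizes=layers, max_iter=25, random_state=7)
    mlp.fit(X_train, y_train)
    prob = mlp.predict_proba(X_val)[:, 1]   # P(positive class)
    results[layers] = roc_auc_score(y_val, prob)

best = max(results, key=results.get)
print("best configuration:", best, "ROC-AUC:", round(results[best], 4))
```

Ranking configurations by validation ROC-AUC, rather than accuracy, matches how the candidate models are compared in the list above.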

CONCLUSION: For this iteration, the two-layer model with 64/32 nodes appeared to have yielded the best result. For this dataset, we should consider experimenting with more MLP models with different configurations.

Dataset Used: Santander Customer Satisfaction Data Set

Dataset ML Model: Binary classification with numerical and categorical attributes

Dataset Reference: https://www.kaggle.com/c/santander-customer-satisfaction/overview

The HTML formatted report can be found here on GitHub.
