Annie Duke on How to Decide, Part 8

In her book, How to Decide: Simple Tools for Making Better Choices, Annie Duke discusses how to train our brains to combat our own bias and help ourselves make more confident and better decisions.

These are some of my favorite concepts and takeaways from reading the book.

Chapter 7, “Breaking Free from Analysis Paralysis”

In this chapter, Annie Duke discusses how we can spend our decision-making time more wisely and reach working decisions faster. She offers the following recommendations:

  • A Sheep in Wolf’s Clothing:
    • We often get trapped and slowed down in a decision-making process by the closeness of the options. When two options are close to each other in payoff or quality, we become much slower in choosing.
    • Annie asserts that, when weighing two close options, the decision is actually easy. She also suggests that we ask this assessment question: “Whichever option I choose, how wrong can I be?”
    • When two options are close in payoff or quality, we can break through the bottleneck and decide quickly because, whichever one we choose, we cannot be far off or very wrong.
  • Quitters Often Win, and Winners Often Quit:
    • Opportunity cost is another tool we can use to enhance our decision-making skills. When we pick an option, we give up the potential gains associated with the options we do not pick.
    • Part of a good decision process is asking ourselves, “If I pick this option, what’s the cost of quitting?” The lower the cost of quitting, the faster we can go, because it is easier to unwind the decision and choose a different option, including options we may have rejected before.
    • Once we understand the importance of quitting a decision and making course adjustments, we can use the tool of decision stacking. Decision stacking is the habit of finding ways to make low-impact, easy-to-quit decisions in advance of a high-impact, harder-to-quit decision.
  • Is This Your Final Answer?
    • For every decision, there comes a point when we should stop analyzing and just decide. If our goal is to reach certainty about our choice on every decision, we will never be finished with the analysis.
    • We can ask ourselves this question: “Is there additional information that would establish a clearly preferred option or cause us to change our preferred option?” If yes, find that information first. If no, decide and move on.

Each One Can Lead to More

(From a writer I respect, Seth Godin)

We can choose to commit to a recursive and infinite path that elegantly creates more of the same.

We can choose possibility.

We can choose connection.

We can choose optimism.

We can choose justice.

We can choose kindness.

We can choose resilience.

We can decide to take responsibility.

Each one of these can, in the same way, lead to more goodness.

NLP Model for Disaster Tweets Classification Using TensorFlow Take 2

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a text classification model using a neural network and document the end-to-end steps using a template. The Disaster Tweets Classification dataset is a binary classification situation where we attempt to predict one of the two possible outcomes.

INTRODUCTION: Twitter has become an important communication channel in times of emergency. The ubiquitous nature of smartphones enables people to announce an emergency they are observing in real-time. Because of this, more agencies are interested in programmatically monitoring Twitter. In this practice Kaggle competition, we want to build a machine learning model that predicts which Tweets are about real disasters and which ones are not. This dataset was created by Figure-Eight and shared initially on their ‘Data for Everyone’ website.

From iteration Take1, we deployed a bag-of-words model to classify the Tweets. We also made predictions on Kaggle’s test dataset and submitted the results for evaluation.

In this Take2 iteration, we will deploy a word-embedding model to classify the Tweets. We will also submit the test predictions to Kaggle and obtain the performance score for the model.
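
The sketch below shows the general shape of such a word-embedding classifier in TensorFlow/Keras. It is a minimal illustration only: the vocabulary size, sequence length, embedding dimension, and layer sizes are assumptions rather than the exact values used in the project notebook, and train_texts/train_labels are hypothetical placeholders for the prepared Kaggle training data.

```python
# Minimal sketch of a word-embedding classifier for the disaster-tweet task.
# Vocabulary size, sequence length, and layer sizes are illustrative choices,
# not the exact values used in the project notebook.
import tensorflow as tf
from tensorflow.keras import layers

max_tokens = 10000   # vocabulary size (assumption)
seq_length = 50      # maximum tweet length in tokens (assumption)

vectorizer = layers.TextVectorization(
    max_tokens=max_tokens, output_sequence_length=seq_length)
# vectorizer.adapt(train_texts)  # fit the vocabulary on the training tweets

model = tf.keras.Sequential([
    tf.keras.Input(shape=(1,), dtype=tf.string),  # raw tweet text in
    vectorizer,                                   # text -> integer tokens
    layers.Embedding(input_dim=max_tokens, output_dim=64),
    layers.GlobalAveragePooling1D(),              # average the word vectors
    layers.Dense(32, activation="relu"),
    layers.Dense(1, activation="sigmoid"),        # binary: disaster vs. not
])
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_texts, train_labels, epochs=20, validation_split=0.2)
```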

ANALYSIS: From iteration Take1, the bag-of-words model’s performance achieved an average accuracy score of 75.49% after 20 epochs with five iterations of cross-validation. Furthermore, the final model processed the test dataset with an accuracy measurement of 75.02%.

In this Take2 iteration, the word-embedding model’s performance achieved an average accuracy score of 72.45% after 20 epochs with five iterations of cross-validation. Furthermore, the final model processed the test dataset with an accuracy measurement of 74.65%.

CONCLUSION: In this modeling iteration, the word-embedding TensorFlow model did not do as well as the bag-of-words model. However, we should continue to experiment with both natural language processing techniques for further modeling.

Dataset Used: Disaster Tweets Classification

Dataset ML Model: Binary class text classification with text-oriented features

Dataset Reference: https://www.kaggle.com/c/nlp-getting-started/

The HTML formatted report can be found here on GitHub.

Multi-Class Image Classification Deep Learning Model for Flower Photos Using TensorFlow Take 1

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Flower Photos dataset is a multi-class classification situation where we attempt to predict one of several (more than two) possible outcomes.

INTRODUCTION: The Flower Photos dataset is a collection of 3,670 flower photos of five different species. This dataset is part of the TensorFlow standard dataset collection.

In this Take1 iteration, we will construct and tune a machine learning model using a simple three-layer MLP network. We will also observe the best result that we can obtain using the validation dataset.
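
As a rough illustration of this kind of baseline, the sketch below builds a small image classifier on the tf_flowers dataset with TensorFlow. The image size, layer counts, and layer widths are illustrative assumptions rather than the tuned architecture from the notebook, and loading the data through tensorflow_datasets is just one possible approach.

```python
# Minimal sketch of a small image classifier for the tf_flowers dataset
# (five classes). Layer counts and sizes are illustrative assumptions.
import tensorflow as tf
import tensorflow_datasets as tfds
from tensorflow.keras import layers

IMG_SIZE = 180  # resize target (assumption)

def preprocess(image, label):
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE)) / 255.0
    return image, label

(train_ds, val_ds), info = tfds.load(
    "tf_flowers", split=["train[:80%]", "train[80%:]"],
    as_supervised=True, with_info=True)
train_ds = train_ds.map(preprocess).batch(32).prefetch(tf.data.AUTOTUNE)
val_ds = val_ds.map(preprocess).batch(32).prefetch(tf.data.AUTOTUNE)

model = tf.keras.Sequential([
    layers.Conv2D(32, 3, activation="relu",
                  input_shape=(IMG_SIZE, IMG_SIZE, 3)),
    layers.MaxPooling2D(),
    layers.Conv2D(64, 3, activation="relu"),
    layers.MaxPooling2D(),
    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dense(info.features["label"].num_classes, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=25)
```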

ANALYSIS: In this Take1 iteration, the baseline model’s performance achieved an accuracy score of 80.24% after 25 epochs using the training dataset. After tuning the model, the best model processed the validation dataset with an accuracy score of 74.69%.

CONCLUSION: In this iteration, the TensorFlow CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Flower Photos Dataset

Dataset ML Model: Multi-class image classification with numerical attributes

Dataset Reference: https://www.tensorflow.org/datasets/catalog/tf_flowers

One potential source of performance benchmarks: https://www.tensorflow.org/tutorials/images/classification

The HTML formatted report can be found here on GitHub.

Algorithmic Trading Model for Trend-Following with Holding Period Strategy Using Python Take 4

NOTE: This script is for learning purposes only and does not constitute a recommendation for buying or selling any stock mentioned in this script.

SUMMARY: This project aims to construct and test an algorithmic trading model and document the end-to-end steps using a template.

INTRODUCTION: This algorithmic trading model examines a simple trend-following strategy for a stock. The model buys a stock when the price reaches the highest price of the last X days and exits the position when the stock price crosses below the mean of the same window.

In addition to the stock price, the model also uses a trading volume indicator to further confirm the buy/sell signal. Finally, the strategy incorporates a fixed holding window: the system exits the position when the holding period reaches the maximum window size.
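
The sketch below translates these rules into a pandas-based signal generator: enter long on a new N-day high confirmed by above-average volume, and exit on a cross below the rolling mean or when the holding period hits its maximum. The column names, default window sizes, and the exact volume-confirmation rule are assumptions for illustration, not the project's actual backtest template.

```python
# Minimal sketch of the trend-following rules described above.
# Column names ('Close', 'Volume') and default windows are assumptions.
import pandas as pd

def generate_signals(prices: pd.DataFrame,
                     trend_window: int = 20,
                     volume_window: int = 10,
                     max_hold: int = 30) -> pd.DataFrame:
    """prices must contain 'Close' and 'Volume' columns indexed by date."""
    df = prices.copy()
    df["rolling_high"] = df["Close"].rolling(trend_window).max()
    df["rolling_mean"] = df["Close"].rolling(trend_window).mean()
    df["volume_mean"] = df["Volume"].rolling(volume_window).mean()

    position, holding = 0, 0
    signals = []
    for _, row in df.iterrows():
        if position == 0:
            # Enter long on a new N-day high confirmed by above-average volume.
            if (row["Close"] >= row["rolling_high"]
                    and row["Volume"] > row["volume_mean"]):
                position, holding = 1, 0
        else:
            holding += 1
            # Exit when price crosses below the window mean
            # or the holding period reaches its maximum.
            if row["Close"] < row["rolling_mean"] or holding >= max_hold:
                position, holding = 0, 0
        signals.append(position)
    df["position"] = signals
    return df
```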

From iteration Take1, we set up the models using a trend window size for long trades only. The window size varied from 10 to 50 trading days at a 5-day increment. The models also examined a volume indicator with a varying window size between 10 and 15 days to confirm the buy/sell signal further. We did not limit the holding period for this iteration, setting the maximum holding period to 99 days.

From iteration Take2, we set up the models using a trend window size for long trades only. The window size varied from 10 to 50 trading days at a 5-day increment. The model also examined a volume indicator with a varying window size between 10 and 15 days to confirm the buy/sell signal further. Furthermore, we limited the holding period by setting the maximum holding period between 10 and 90 days at a 10-day increment.

From iteration Take3, we set up the models using a trend window size for long and short trades. The window size varied from 10 to 50 trading days at a 5-day increment. The models also examined a volume indicator with a varying window size between 10 and 15 days to confirm the buy/sell signal further. We did not limit the holding period by setting the maximum holding period to 99 days for this iteration.

In this Take4 iteration, we will set up the models using a trend window size for long and short trades. The window size will vary from 10 to 50 trading days at a 5-day increment. The models will also examine a volume indicator with a varying window size between 10 and 15 days to confirm the buy/sell signal further. Furthermore, we will limit the holding period by setting the maximum holding period between 10 and 90 days at a 10-day increment.

ANALYSIS: From iteration Take1, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2018 and December 31, 2020. The top trading model produced a profit of 95.90 dollars per share. The buy-and-hold approach yielded a gain of 91.54 dollars per share.

From iteration Take2, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2018 and December 31, 2020. The top trading model produced a profit of 89.76 dollars per share. The buy-and-hold approach yielded a gain of 91.54 dollars per share.

From iteration Take3, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2018 and December 31, 2020. The top trading model produced a profit of 83.43 dollars per share. The buy-and-hold approach yielded a gain of 91.54 dollars per share.

In this Take4 iteration, we analyzed the stock prices for Apple Inc. (AAPL) between January 1, 2018 and December 31, 2020. The top trading model produced a profit of 83.43 dollars per share. The buy-and-hold approach yielded a gain of 91.54 dollars per share.

CONCLUSION: For the stock of AAPL during the modeling time frame, the trading strategy did not produce a better return than the buy-and-hold approach. However, we should consider modeling this stock further by experimenting with more variations of the strategy.

Dataset ML Model: Time series analysis with numerical attributes

Dataset Used: Quandl

The HTML formatted report can be found here on GitHub.

Binary Image Classification Deep Learning Model for Yosemite Summer vs. Winter Using TensorFlow Take 4

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Yosemite Summer vs. Winter dataset is a binary classification situation where we attempt to predict one of the two possible outcomes.

INTRODUCTION: The CycleGAN dataset collection contains datasets that consist of images from two classes, A and B (for example, apple vs. orange, horses vs. zebras, and so on). The researchers used the images to train machine learning models for research work in the area of Generative Adversarial Networks.

From iteration Take1, we constructed and tuned a machine learning model for this dataset using TensorFlow with a simple VGG-3 network. We also observed the best result that we could obtain using the test dataset. The final model from this iteration became our baseline model for future iterations.

From iteration Take2, we constructed and tuned a machine learning model for this dataset using TensorFlow with a VGG-16 network. We also observed the best result that we could obtain using the test dataset. The final model from this iteration became our baseline model for future iterations.

From iteration Take3, we constructed and tuned a machine learning model for this dataset using TensorFlow with an Inception V3 network. We also observed the best result that we could obtain using the test dataset. The final model from this iteration became our baseline model for future iterations.

In this Take4 iteration, we will construct and tune a machine learning model for this dataset using TensorFlow with a ResNet50 V2 network. We will also observe the best result that we can obtain using the test dataset and compare the performance with the baseline model.
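
For reference, the sketch below shows a typical way to set up a ResNet50 V2 transfer-learning classifier in TensorFlow for a binary task like this one. The input size, classifier head, and training settings are illustrative assumptions, not the exact configuration used in the notebook.

```python
# Minimal sketch of a ResNet50 V2 transfer-learning classifier for the
# summer-vs-winter task. Input size, head layers, and settings are assumptions.
import tensorflow as tf
from tensorflow.keras import layers

IMG_SIZE = 224  # standard ResNet input size (assumption)

base = tf.keras.applications.ResNet50V2(
    include_top=False, weights="imagenet",
    input_shape=(IMG_SIZE, IMG_SIZE, 3))
base.trainable = False  # use the pretrained backbone as a fixed feature extractor

inputs = tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3))
x = tf.keras.applications.resnet_v2.preprocess_input(inputs)
x = base(x, training=False)
x = layers.GlobalAveragePooling2D()(x)
x = layers.Dense(128, activation="relu")(x)
outputs = layers.Dense(1, activation="sigmoid")(x)  # binary: summer vs. winter

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=20)
```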

ANALYSIS: From iteration Take1, the baseline model’s performance achieved an accuracy score of 99.00% after 20 epochs using the training dataset. The final model also processed the test dataset with 74.95% accuracy.

From iteration Take2, the VGG-16 model’s performance achieved an accuracy score of 74.44% after 20 epochs using the training dataset. The final model also processed the test dataset with 70.02% accuracy.

From iteration Take3, the Inception V3 model’s performance achieved an accuracy score of 83.36% after 20 epochs using the training dataset. The final model also processed the test dataset with 52.83% accuracy.

In this Take4 iteration, the ResNet50 V2 model’s performance achieved an accuracy score of 57.48% after 20 epochs using the training dataset. The final model also processed the test dataset with 46.80% accuracy.

CONCLUSION: In this iteration, the ResNet50 V2 CNN model did not appear to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Yosemite Summer vs. Winter Dataset

Dataset ML Model: Binary image classification with numerical attributes

Dataset Reference: https://people.eecs.berkeley.edu/%7Etaesung_park/CycleGAN/datasets/

One potential source of performance benchmarks: https://arxiv.org/abs/1703.10593 or https://junyanz.github.io/CycleGAN/

The HTML formatted report can be found here on GitHub.

Binary Image Classification Deep Learning Model for Yosemite Summer vs. Winter Using TensorFlow Take 3

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: This project aims to construct a predictive model using a TensorFlow convolutional neural network (CNN) and document the end-to-end steps using a template. The Yosemite Summer vs. Winter dataset is a binary classification situation where we attempt to predict one of the two possible outcomes.

INTRODUCTION: The CycleGAN dataset collection contains datasets that consist of images from two classes, A and B (for example, apple vs. orange, horses vs. zebras, and so on). The researchers used the images to train machine learning models for research work in the area of Generative Adversarial Networks.

From iteration Take1, we constructed and tuned a machine learning model for this dataset using TensorFlow with a simple VGG-3 network. We also observed the best result that we could obtain using the test dataset. The final model from this iteration became our baseline model for future iterations.

From iteration Take2, we constructed and tuned a machine learning model for this dataset using TensorFlow with a VGG-16 network. We also observed the best result that we could obtain using the test dataset. The final model from this iteration became our baseline model for future iterations.

In this Take3 iteration, we will construct and tune a machine learning model for this dataset using TensorFlow with an Inception V3 network. We will also observe the best result that we can obtain using the test dataset and compare the performance with the baseline model.

ANALYSIS: From iteration Take1, the baseline model’s performance achieved an accuracy score of 99.00% after 20 epochs using the training dataset. The final model also processed the test dataset with 74.95% accuracy.

From iteration Take2, the VGG-16 model’s performance achieved an accuracy score of 74.44% after 20 epochs using the training dataset. The final model also processed the test dataset with 70.02% accuracy.

In this Take3 iteration, the Inception V3 model’s performance achieved an accuracy score of 83.36% after 20 epochs using the training dataset. The final model also processed the test dataset with 52.83% accuracy.

CONCLUSION: In this iteration, the Inception V3 CNN model appeared to be suitable for modeling this dataset. We should consider experimenting with TensorFlow for further modeling.

Dataset Used: Yosemite Summer vs. Winter Dataset

Dataset ML Model: Binary image classification with numerical attributes

Dataset Reference: https://people.eecs.berkeley.edu/%7Etaesung_park/CycleGAN/datasets/

One potential source of performance benchmarks: https://arxiv.org/abs/1703.10593 or https://junyanz.github.io/CycleGAN/

The HTML formatted report can be found here on GitHub.

Annie Duke on How to Decide, Part 7

In her book, How to Decide: Simple Tools for Making Better Choices, Annie Duke discusses how to train our brains to combat our own bias and help ourselves make more confident and better decisions.

These are some of my favorite concepts and takeaways from reading the book.

Chapter 7, “Breaking Free from Analysis Paralysis”

In this chapter, Annie Duke discusses how we can spend our decision-making time more wisely and reach working decisions faster. She offers the following four recommendations:

  • The Happiness Test:
    • We spend an enormous amount of time on routine, inconsequential decisions. Recognizing when decisions are low-impact can maximize our opportunities to make high-impact decisions more effectively.
    • Every decision will have a time-accuracy trade-off. Increasing accuracy costs time and saving time costs accuracy.
    • The Happiness Test asks whether how our decision turns out is likely to have a significant effect on our happiness in a year. If the answer is no, the decision passes the test, and we can speed up the decision-making process.
    • Often, a decision can fall into the “repeating options” category. That is when the same type of decision comes up repeatedly. When we have these “repeating options” decisions, we can go even faster.
  • Freerolling:
    • Freerolling is a situation where there is an asymmetry between the upside and downside because the potential losses are insignificant. The key feature of a freeroll is a limited downside.
    • When we identify a freeroll, we should spend less time deciding whether to engage with the opportunity. The faster we engage, the less likely it is that the opportunity goes away.
    • We should go fast on a freeroll, but we still want to take the time to plan and execute the decision.