Seth Godin’s Akimbo: Difficult Conversations

In his Akimbo podcast, Seth Godin teaches us how to adopt a posture of possibility, change the culture, and choose to make a difference. Here are my takeaways from the episode.

In this episode, Seth discusses the art of difficult conversations: how to have one, and how not to have one.

We all dislike, or maybe even fear, having difficult conversations with someone. Our culture has taught us that difficult conversations are something to avoid.

Seth argues that a difficult conversation is awkward because we want two things at once. We want someone to change, and we want them to like it.

When we try to achieve these two goals in one conversation, the conversation gets a lot more complicated. We magnify the impact of one or the other, whichever one is harder, and we end up avoiding it altogether.

It is essential to keep several things in mind as we break down our feelings about a difficult conversation. The first has to do with authority and the other person's feelings. When we hold the power, and the other person's feelings are not a critical consideration, we usually do not view it as an awkward conversation.

Also, when we need to give a direct statement or feedback because the situation requires it, we do not see it as a difficult conversation.

But most of the time, we are stuck because we cannot see how the conversation can be direct while still caring about the other person's feelings. In those situations, the concept of enrollment becomes essential.

Enrollment means that two people want to reach the same destination, and one person is relying on the other to show the way. When two people are enrolled in the same journey, difficult conversations get easier. That is because the conversation is more about reaching the goal than about someone's feelings.

So the best way to move past a difficult conversation is to spend far more time working on enrollment. Enrollment means sharing not only a common objective but also resources, gains, and risks. Enrollment also requires mutual trust between the two parties.

Too often, in an industrialized culture, enrollment is hard to achieve between the boss and the worker. The industrialized culture does not promote an environment where the boss and worker share gains and risks equitably. The same culture also does not encourage mutual trust.

So, the best way to change our culture is by earning trust and enrollment first. Once we have trust and enrollment, then difficult conversations become a lot less difficult. With enrollment, the conversation is no longer personal.

In summary, there are two approaches to changing a difficult conversation. One approach is to make it direct with a solid basis. That is no longer a difficult conversation because it rests on a clear, mutually understood standard or guideline.

The second approach is to invest in building involvement, trust, and enrollment. When we agree to go on the same journey by enrolling ourselves, the conversation becomes much less about anyone's feelings.

Forward

(From a writer I respect, Seth Godin)

Our experience of time moves in only one direction.

It may seem to speed up or slow down, but entropy and the outside world keep moving things from yesterday toward tomorrow.

Given that nothing ever stays the same, and going backward is not an option, our only choice is to move forward.

Life has always been this way. Once we encounter something, anything, our circumstances will never return to the way they were.

Fortunately, though, we have the chance to make things better, every single day.

Algorithmic Trading Model for Cramer COVID-19 Index Using Python Take 1

SUMMARY: The purpose of this project is to construct and test an algorithmic trading model and document the end-to-end steps using a template.

INTRODUCTION: CNBC’s Jim Cramer, the host of the Mad Money show, presented a list of stocks on April 27th that he believes will work well in this coronavirus-plagued market. The ‘Cramer COVID-19 Index’ contains 100 companies spanning 17 sectors where investors can expect a positive return in this volatile market environment. The project aims to analyze these 100 stocks and develop strategies for trading them, either individually or in groups.

In this Take1 iteration, we will construct the code segments needed to download and visualize the index and its component stocks. The script leverages various data sources, both free and paid, to pull the required dataset together.

NOTE: This script calculates the index value by using the number of outstanding shares from each company. Such an approach may not match how CNBC calculates this index (https://www.cnbc.com/cramer-covid-19-stocks/). This script is for educational purposes only and does not constitute a recommendation for buying or selling any stock mentioned in this script.
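The shares-outstanding weighting described in the note can be sketched in a few lines of pandas. The tickers, prices, and share counts below are made-up placeholders (the actual script pulls live data from Yahoo Finance, IEX Cloud, and Quandl), so this only illustrates the calculation, not the published index values.

```python
import pandas as pd

# Placeholder prices for two hypothetical index members over three days.
prices = pd.DataFrame(
    {"AAA": [10.0, 10.5, 11.0], "BBB": [20.0, 19.0, 21.0]},
    index=pd.to_datetime(["2020-04-27", "2020-04-28", "2020-04-29"]),
)
# Placeholder shares outstanding for each company.
shares_outstanding = pd.Series({"AAA": 1_000_000, "BBB": 500_000})

# Market cap per stock per day, summed across the index members,
# then rebased so that the first day reads 100.
market_cap = prices * shares_outstanding
index_value = market_cap.sum(axis=1)
index_value = index_value / index_value.iloc[0] * 100
print(index_value.round(2))
```

Rebasing to 100 on the first day is a common convention for custom indexes; CNBC's own methodology may differ, as noted above.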

ANALYSIS: Not applicable for this iteration.

CONCLUSION: Not applicable for this iteration.

Dataset ML Model: Time series analysis with numerical attributes

Dataset Used: Yahoo Finance, IEX Cloud, and Quandl

The HTML formatted report can be found here on GitHub.

Algorithmic Trading Model for Exponential Moving Average Crossover Take 4

SUMMARY: The purpose of this project is to construct and test an algorithmic trading model and document the end-to-end steps using a template.

INTRODUCTION: This algorithmic trading model uses the 20-day and 50-day exponential moving averages to generate trading signals. When the fast moving average crosses above the slow moving average, the strategy goes long on (buys) the stock. When the opposite occurs, we exit the position. We apply the analysis to the GOOG stock for the three-year period of 2017-01-01 through 2019-12-31.
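The crossover rule above can be sketched with pandas' exponentially weighted functions. The price series here is synthetic; the actual model pulls daily GOOG closes from its data source, so this only illustrates how the signals are derived.

```python
import numpy as np
import pandas as pd

# Synthetic daily close prices standing in for the real GOOG series.
rng = np.random.default_rng(42)
close = pd.Series(100 + rng.normal(0, 1, 300).cumsum())

# 20-day (fast) and 50-day (slow) exponential moving averages.
ema_fast = close.ewm(span=20, adjust=False).mean()
ema_slow = close.ewm(span=50, adjust=False).mean()

# Hold a long position (1) while the fast EMA is above the slow EMA,
# otherwise stay in cash (0); trades happen where the position changes.
position = (ema_fast > ema_slow).astype(int)
trades = position.diff().fillna(0)  # +1 = buy signal, -1 = exit signal
print(int((trades != 0).sum()), "signals generated")
```

Using `adjust=False` gives the recursive EMA definition most charting tools use, which matters if you want the signals to line up with a platform's indicator values.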

In iteration Take1, we constructed and tested a trading strategy using the daily stock data from Yahoo! Finance with an initial investment pool of 1,500 USD.

In iteration Take2, we constructed and tested a trading strategy using the daily stock data from Alpha Vantage with an initial investment pool of 1,500 USD. We also leveraged the technical indicators available from Alpha Vantage.

In iteration Take3, we constructed and tested a trading strategy using the daily stock data from IEX Cloud with an initial investment pool of 1,500 USD. We also leveraged the technical indicators available from IEX Cloud.

In this Take4 iteration, we will construct and test a trading strategy using the daily stock data from Quandl with an initial investment pool of 1,500 USD.

ANALYSIS: In iteration Take1, the trading strategy returned 27.94%, while the long-only approach achieved an accumulated return of 37.21%.

In iteration Take2, the trading strategy returned 27.94%, while the long-only approach achieved an accumulated return of 37.21%.

In iteration Take3, the trading strategy returned 27.94%, while the long-only approach achieved an accumulated return of 37.21%.

In this Take4 iteration, the trading strategy returned 27.94%, while the long-only approach achieved an accumulated return of 37.21%.

CONCLUSION: For this period, the trading strategy did not outperform the simpler long-only approach. We should consider modeling this stock further with different methods. Nevertheless, we were able to use Quandl's data sources successfully for our modeling.

Dataset ML Model: Time series analysis with numerical attributes

Dataset Used: Quandl

The HTML formatted report can be found here on GitHub.

Algorithmic Trading Model for Simple Moving Average Crossover Take 4

SUMMARY: The purpose of this project is to construct and test an algorithmic trading model and document the end-to-end steps using a template.

INTRODUCTION: This algorithmic trading model uses the 20-day and 50-day simple moving averages to generate trading signals. When the fast moving average crosses above the slow moving average, the strategy goes long on (buys) the stock. When the opposite occurs, we exit the position. We apply the analysis to the GOOG stock for the three-year period of 2017-01-01 through 2019-12-31.
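The SMA version of the rule, including the strategy-versus-long-only comparison reported in the ANALYSIS section, can be sketched as follows. The prices are synthetic stand-ins for the daily GOOG closes, so the returns printed here will not match the figures in the post.

```python
import numpy as np
import pandas as pd

# Synthetic daily close prices standing in for the real GOOG series.
rng = np.random.default_rng(7)
close = pd.Series(100 + rng.normal(0.05, 1, 300).cumsum())

# 20-day (fast) and 50-day (slow) simple moving averages.
sma_fast = close.rolling(20).mean()
sma_slow = close.rolling(50).mean()

# Long (1) while the fast SMA is above the slow SMA; shift by one day so
# each signal is traded on the following day's return, avoiding lookahead.
position = (sma_fast > sma_slow).astype(int).shift(1).fillna(0)

daily_ret = close.pct_change().fillna(0)
strategy_ret = (1 + daily_ret * position).prod() - 1
long_only_ret = (1 + daily_ret).prod() - 1
print(f"strategy {strategy_ret:.2%} vs long-only {long_only_ret:.2%}")
```

The one-day shift on the position is the detail that most often separates a realistic backtest from an accidentally optimistic one.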

In iteration Take1, we constructed and tested a trading strategy using the daily stock data from Yahoo! Finance with an initial investment pool of 1,500 USD.

In iteration Take2, we constructed and tested a trading strategy using the daily stock data from Alpha Vantage with an initial investment pool of 1,500 USD. We also leveraged the technical indicators available from Alpha Vantage.

In iteration Take3, we constructed and tested a trading strategy using the daily stock data from IEX Cloud with an initial investment pool of 1,500 USD. We also leveraged the technical indicators available from IEX Cloud.

In this Take4 iteration, we will construct and test a trading strategy using the daily stock data from Quandl with an initial investment pool of 1,500 USD.

ANALYSIS: In iteration Take1, the trading strategy returned 32.79%, while the long-only approach achieved an accumulated return of 37.21%.

In iteration Take2, the trading strategy returned 32.79%, while the long-only approach achieved an accumulated return of 37.21%.

In iteration Take3, the trading strategy returned 32.79%, while the long-only approach achieved an accumulated return of 37.21%.

In this Take4 iteration, the trading strategy returned 32.79%, while the long-only approach achieved an accumulated return of 37.21%.

CONCLUSION: For this period, the trading strategy did not outperform the simpler long-only approach. We should consider modeling this stock further with different methods. Nevertheless, we were able to use Quandl's data sources successfully for our modeling.

Dataset ML Model: Time series analysis with numerical attributes

Dataset Used: Quandl

The HTML formatted report can be found here on GitHub.

Multi-Class Deep Learning Model for Forest Cover Type Using TensorFlow Take 7

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The Forest Cover Type dataset is a multi-class classification situation where we are trying to predict one of several (more than two) possible outcomes.

INTRODUCTION: This experiment tries to predict forest cover type from cartographic variables only. This study area includes four wilderness areas located in the Roosevelt National Forest of northern Colorado. These areas represent forests with minimal human-caused disturbances, so that existing forest cover types are more a result of ecological processes rather than forest management practices.

The actual forest cover type for a given observation (30 x 30-meter cell) was determined from the US Forest Service (USFS) Region 2 Resource Information System (RIS) data. Independent variables were derived from data initially obtained from the US Geological Survey (USGS) and USFS data. Data is in raw form (not scaled) and contains binary (0 or 1) columns of data for qualitative independent variables (wilderness areas and soil types).

In iteration Take1, we established the baseline accuracy for comparison with future rounds of modeling.

In iteration Take2, we examined the feature selection technique of attribute importance ranking by using the Gradient Boosting algorithm. By selecting the essential attributes, we decreased the modeling time and still maintained a similar level of accuracy when compared to the baseline model.

In iteration Take3, we examined the feature selection technique of recursive feature elimination (RFE) with the use of the Extra Trees algorithm. By selecting no more than 40 attributes, we maintained a similar level of accuracy when compared to the baseline model.

In iteration Take4, we constructed and tuned an XGBoost machine learning model for this dataset. We also observed the best accuracy result that we could obtain using the XGBoost model with the training dataset from Kaggle. Furthermore, we applied the XGBoost model to Kaggle’s test dataset and submitted a list of predictions to Kaggle for evaluation.

In iteration Take5, we constructed several Multilayer Perceptron (MLP) models with one hidden layer. These simple MLP models will serve as the baseline models as we build more complex MLP models in future iterations. Furthermore, we applied the MLP model to Kaggle’s test dataset and submitted a list of predictions to Kaggle for evaluation.

In iteration Take6, we constructed several Multilayer Perceptron (MLP) models with two hidden layers. These MLP models will serve as a benchmark as we build more complex MLP models in future iterations. Furthermore, we applied the MLP model to Kaggle’s test dataset and submitted a list of predictions to Kaggle for evaluation.

In this Take7 iteration, we will construct several Multilayer Perceptron (MLP) models with three hidden layers. These MLP models will serve as a benchmark as we build more complex MLP models in future iterations. Furthermore, we will apply the MLP model to Kaggle’s test dataset and submit a list of predictions to Kaggle for evaluation.
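The three-hidden-layer architecture described above can be sketched compactly. The posts build these models in TensorFlow; as a quick stand-in, this sketch uses scikit-learn's MLPClassifier with the same 36/28/24 hidden layout on a synthetic multi-class dataset (the real data has 54 attributes and 7 cover types; the accuracy printed here is not comparable to the post's numbers).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic 7-class dataset mimicking the shape of the cover type data.
X, y = make_classification(n_samples=2000, n_features=54, n_informative=20,
                           n_classes=7, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Scale inputs, then fit an MLP with three hidden layers of 36/28/24 nodes.
scaler = StandardScaler().fit(X_train)
model = MLPClassifier(hidden_layer_sizes=(36, 28, 24), max_iter=300,
                      random_state=1)
model.fit(scaler.transform(X_train), y_train)
print(f"test accuracy: {model.score(scaler.transform(X_test), y_test):.3f}")
```

Scaling matters for MLPs regardless of framework; the raw cover type data is unscaled, so a step like StandardScaler (or normalization inside the network) is typically part of the pipeline.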

ANALYSIS: Note: Performance measurements for iterations Take1, Take2, and Take3 are available in the Take4 blog post.

In iteration Take4, the XGBoost algorithm achieved a baseline accuracy performance of 75.29%. After a series of tuning trials, XGBoost turned in an accuracy result of 85.58%. When we applied the tuned XGBoost algorithm to the test dataset, we obtained an accuracy score of 87.72%, which was even better than the predictions from the training data.

However, when we applied the tuned XGBoost algorithm to the test dataset from Kaggle, we obtained an accuracy score of only 75.45%. Keep in mind that Kaggle uses only 2.6% of the original dataset for training and scores predictions on the remaining 97.4% of test data.

In iteration Take5, all single-layer models achieved an accuracy performance of between 70.6% and 77.8% after 75 epochs using the test dataset. The 36-node model appears to have the highest accuracy with low variance. However, when we applied the single-layer 36-node neural network model to the test dataset from Kaggle, we obtained an accuracy score of only 60.96%.

In iteration Take6, all dual-layer models achieved an accuracy performance of between 75.5% and 80.4% after 75 epochs using the test dataset. The 36/28-node model appears to have the highest accuracy with low variance. However, when we applied the dual-layer 36/28-node neural network model to the test dataset from Kaggle, we obtained an accuracy score of only 65.872%.

In this Take7 iteration, all three-layer models achieved an accuracy performance of between 78.1% and 80.1% after 75 epochs using the test dataset. The 36/28/24-node model appears to have the highest accuracy with low variance.

However, when we applied the three-layer 36/28/8-node neural network model to the test dataset from Kaggle, we obtained an accuracy score of only 65.744%. We captured additional performance measurements using different model configurations.

  • Three-Layer 36/28/08-Node MLP Model – Accuracy: 65.744%
  • Three-Layer 36/28/12-Node MLP Model – Accuracy: 63.707%
  • Three-Layer 36/28/16-Node MLP Model – Accuracy: 65.485%
  • Three-Layer 36/28/20-Node MLP Model – Accuracy: 63.663%
  • Three-Layer 36/28/24-Node MLP Model – Accuracy: 63.455%

CONCLUSION: For this iteration, the baseline model with a three-layer of 36/28/8 nodes appeared to have yielded the best result. For this dataset, we should consider experimenting with more and different MLP models.

Dataset Used: Forest Cover Type Data Set

Dataset ML Model: Multi-Class classification with numerical attributes

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/Covertype

One source of potential performance benchmarks: https://www.kaggle.com/c/forest-cover-type-prediction/overview

The HTML formatted report can be found here on GitHub.

Multi-Class Deep Learning Model for Forest Cover Type Using TensorFlow Take 6

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a predictive model using various machine learning algorithms and to document the end-to-end steps using a template. The Forest Cover Type dataset is a multi-class classification situation where we are trying to predict one of several (more than two) possible outcomes.

INTRODUCTION: This experiment tries to predict forest cover type from cartographic variables only. This study area includes four wilderness areas located in the Roosevelt National Forest of northern Colorado. These areas represent forests with minimal human-caused disturbances, so that existing forest cover types are more a result of ecological processes rather than forest management practices.

The actual forest cover type for a given observation (30 x 30-meter cell) was determined from the US Forest Service (USFS) Region 2 Resource Information System (RIS) data. Independent variables were derived from data initially obtained from the US Geological Survey (USGS) and USFS data. Data is in raw form (not scaled) and contains binary (0 or 1) columns of data for qualitative independent variables (wilderness areas and soil types).

In iteration Take1, we established the baseline accuracy for comparison with future rounds of modeling.

In iteration Take2, we examined the feature selection technique of attribute importance ranking by using the Gradient Boosting algorithm. By selecting the essential attributes, we decreased the modeling time and still maintained a similar level of accuracy when compared to the baseline model.

In iteration Take3, we examined the feature selection technique of recursive feature elimination (RFE) with the use of the Extra Trees algorithm. By selecting no more than 40 attributes, we maintained a similar level of accuracy when compared to the baseline model.

In iteration Take4, we constructed and tuned an XGBoost machine learning model for this dataset. We also observed the best accuracy result that we could obtain using the XGBoost model with the training dataset from Kaggle. Furthermore, we applied the XGBoost model to Kaggle’s test dataset and submitted a list of predictions to Kaggle for evaluation.

In iteration Take5, we constructed several Multilayer Perceptron (MLP) models with one hidden layer. These simple MLP models will serve as the baseline models as we build more complex MLP models in future iterations. Furthermore, we applied the MLP model to Kaggle’s test dataset and submitted a list of predictions to Kaggle for evaluation.

In this Take6 iteration, we will construct several Multilayer Perceptron (MLP) models with two hidden layers. These MLP models will serve as a benchmark as we build more complex MLP models in future iterations. Furthermore, we will apply the MLP model to Kaggle’s test dataset and submit a list of predictions to Kaggle for evaluation.

ANALYSIS: Note: Performance measurements for iterations Take1, Take2, and Take3 are available in the Take4 blog post.

In iteration Take4, the XGBoost algorithm achieved a baseline accuracy performance of 75.29%. After a series of tuning trials, XGBoost turned in an accuracy result of 85.58%. When we applied the tuned XGBoost algorithm to the test dataset, we obtained an accuracy score of 87.72%, which was even better than the predictions from the training data.

However, when we applied the tuned XGBoost algorithm to the test dataset from Kaggle, we obtained an accuracy score of only 75.45%. Keep in mind that Kaggle uses only 2.6% of the original dataset for training and scores predictions on the remaining 97.4% of test data.

In iteration Take5, all single-layer models achieved an accuracy performance of between 70.6% and 77.8% after 75 epochs using the test dataset. The 36-node model appears to have the highest accuracy with low variance. However, when we applied the single-layer 36-node neural network model to the test dataset from Kaggle, we obtained an accuracy score of only 60.96%.

In this Take6 iteration, all dual-layer models achieved an accuracy performance of between 75.5% and 80.4% after 75 epochs using the test dataset. The 36/28-node model appears to have the highest accuracy with low variance.

However, when we applied the dual-layer 36/28-node neural network model to the test dataset from Kaggle, we obtained an accuracy score of only 65.872%. We captured additional performance measurements using different model configurations.

  • Dual-Layer 36/08-Node MLP Model – Accuracy: 62.255%
  • Dual-Layer 36/12-Node MLP Model – Accuracy: 62.410%
  • Dual-Layer 36/16-Node MLP Model – Accuracy: 60.732%
  • Dual-Layer 36/20-Node MLP Model – Accuracy: 63.096%
  • Dual-Layer 36/24-Node MLP Model – Accuracy: 61.015%
  • Dual-Layer 36/28-Node MLP Model – Accuracy: 65.872%
  • Dual-Layer 36/32-Node MLP Model – Accuracy: 62.934%
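A configuration sweep like the one above can be automated in a short loop: hold the first hidden layer at 36 nodes, vary the second layer's width, and record each model's accuracy. scikit-learn stands in for the TensorFlow code in the post, and the dataset is synthetic rather than the actual forest cover data, so the accuracies will differ from the list above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic 7-class dataset mimicking the shape of the cover type data.
X, y = make_classification(n_samples=1000, n_features=54, n_informative=20,
                           n_classes=7, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Train one dual-layer model per candidate second-layer width.
results = {}
for width in (8, 16, 24, 32):
    model = MLPClassifier(hidden_layer_sizes=(36, width), max_iter=150,
                          random_state=1)
    model.fit(X_train, y_train)
    results[width] = model.score(X_test, y_test)

best = max(results, key=results.get)
print(f"best second-layer width: {best} nodes ({results[best]:.3f} accuracy)")
```

Fixing the random seed per configuration keeps the sweep comparable; in practice, averaging over several seeds gives a more reliable ranking than a single run.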

CONCLUSION: For this iteration, the baseline model with a dual-layer of 36/28 nodes appeared to have yielded the best result. For this dataset, we should consider experimenting with more and different MLP models.

Dataset Used: Forest Cover Type Data Set

Dataset ML Model: Multi-Class classification with numerical attributes

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/Covertype

One source of potential performance benchmarks: https://www.kaggle.com/c/forest-cover-type-prediction/overview

The HTML formatted report can be found here on GitHub.

Kathy Sierra on Making Users Awesome, Part 10

In the book, Badass: Making Users Awesome, Kathy Sierra analyzed and discussed the new ways of thinking about designing and sustaining successful products and services.

These are some of my takeaways from reading the book.

In previous chapters, we learned from Kathy that we should help our users train their brains to pay attention. Our minds continuously try to filter out spam and noise, so we need to work hard to keep our tasks from getting caught in the brain's spam filter.

Even more so, we need to create situations where the user's brain considers the task significant enough to care about and interact with.

  • We want our tasks to be something the brain will care about.
  • We want our tasks to be something the brain will pay attention to.
  • We want our tasks to be something the brain will want to remember.

Why? Because …

  • The brain cares about things that cause a feeling or an emotional response.
  • The brain pays attention to things that are odd, surprising, and unexpected.
  • The brain wants things resolved so that it can remember the answer/outcome for future reference.

Most marketing content is more learnable and memorable than most learning content, for good reason. The critical difference is that marketing focuses on the compelling context, while manuals and learning content focus on the tool.

It is no wonder that the brain chooses to remember marketing messages: it considers messages that trigger an emotional response more worthy. Learning content that is only about the tool becomes emotionally flat spam that the brain wants to filter out.

Kathy stated, “The best way to deal with the brain’s spam filter is to reduce the number of things that need to get past it.” This means we should strive to deliver learning and knowledge to our brain in a Just-In-Time fashion.

Unfortunately, Just-In-Case is the predominant model for most forms of learning. To the brain, the Just-In-Case style of knowledge can seem useless and like spam.

Not every piece of knowledge can be delivered in a Just-In-Time style. There will be some Just-In-Case knowledge the user must learn before they need it. For the pieces of knowledge that users don't know but need to know before their skill can progress, we need to do two things.

  1. We need to validate the absolute need for the knowledge
  2. We need to help the users convince their brain that the knowledge is necessary

We need to be selective about the Just-In-Case knowledge to acquire now because our cognitive resources for a given day are limited.

Kathy also asserted, “If we really care about our users, we’ll help them do what they want, not what we want.” The goal of our users is not merely to become badass at our product or service.

Our users want to become more skillful, more knowledgeable, and more capable. Our users want to be badass in life!