Drucker on Effective Decisions, Part 4

In his book, The Essential Drucker: The Best of Sixty Years of Peter Drucker’s Essential Writings on Management, Peter Drucker analyzed the ways that management practices and principles affect the performance of organizations, individuals, and society. The book covers the basic principles of management and gives professionals the tools to perform the tasks that the environment of tomorrow will require of them.

These are my takeaways from reading the book.

In the chapter “Effective Decisions,” Drucker discussed the five aspects of the effective decision-making process.

Step 1. The decision-maker reaches a clear realization that the problem is a generic situation and not a random exception.

Step 2. The decision-maker understands the specifications that the answer to the problem has to satisfy.

Step 3. The decision-maker thinks through what the “right” solution is.

Step 4. The decision-maker builds actions into the decision.

Step 5. The decision-maker gathers feedback that tests the validity and effectiveness of the decision.

After steps one through four, Drucker asserted that we must build feedback into the decision process. The purpose of the feedback process is to test our expectations that underlie the decision against actual events.

Feedback is necessary because humans make decisions, and human beings are fallible. Even the best decision has a high probability of being wrong. Decisions also might have a long shelf life, and even the most effective one eventually becomes obsolete.

Drucker saw that the feedback step would become even more critical in the information age. With computers assisting in decision-making, we run the risk that decision-makers become removed from reality. Drucker suggested that we verify our abstractions with constant checks against the concrete. Otherwise, we run the danger of making decisions based on assumptions that are not aligned with reality. Computers can, however, make the laborious work of feedback verification easier through automation.

Drucker encouraged us to go out and look for evidence that tests our assumptions about a decision, and the results of that decision, against reality. Reality never stands still for very long, so we all need organized information for the feedback.

In summary, Drucker believed that effective people do not make many decisions. Instead, they concentrate on important decisions. The important decisions will be strategic and generic, rather than tailored to solve one particular problem. Effective people also try to make the few important decisions on the highest level of conceptual understanding. They try to find the constants in a situation.

Most importantly, effective people know that the most time-consuming step in the process is not making the decision but putting it into action. Unless a decision has “degenerated into work,” it is not a decision; it is at best a good intention. While the effective decision is based on the highest level of conceptual understanding, the action to carry it out should be as close as possible to the working level and as simple as possible.

Time Series Model for Annual Water Usage in Baltimore Using Python

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

Code Credit: Adapted from a blog post made available by Dr. Jason Brownlee of Machine Learning Mastery.

PREFACE: This is a replication of Python code from Dr. Brownlee’s blog post on time series. I have combined all the code snippets into one script so that I can turn the whole process into a template. The comments and analysis were also part of the blog post and are annotated here to explain each code block.

SUMMARY: The purpose of this project is to construct a time series prediction model and document the end-to-end steps using a template. The Annual Water Usage in Baltimore dataset is a time series situation where we are trying to forecast future outcomes based on the past data points.

INTRODUCTION: The problem is to predict annual water usage. The dataset provides the annual water usage in Baltimore from 1885 to 1963, or 79 years of data. The dataset contains 79 observations in the units of liters per capita per day and is credited to Hipel and McLeod, 1994.

ANALYSIS: The baseline prediction (or persistence) for the dataset resulted in an RMSE of 21.975. The manually configured model was simplified to ARIMA(4,1,1) and produced an RMSE of 31.097, which was worse than the persistence baseline. After applying the grid search technique to the dataset, the final RMSE of the ARIMA(2,1,0) model was 21.733. This is only a slightly smaller error than the persistence model, and the difference may or may not be statistically significant.
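For illustration, here is a minimal Python sketch of how such a persistence baseline and a grid-searched ARIMA order can be evaluated with walk-forward validation. The file name (water.csv), the column layout, and the 50/50 train/test split are assumptions for the sketch, not details taken from the original report.

import pandas as pd
from math import sqrt
from sklearn.metrics import mean_squared_error
from statsmodels.tsa.arima.model import ARIMA

# Load the annual series; file name and layout are assumed.
series = pd.read_csv("water.csv", header=0, index_col=0).squeeze("columns")
values = series.values
split = int(len(values) * 0.5)
train, test = list(values[:split]), list(values[split:])

# Persistence baseline: predict each year with the previous observation.
history, predictions = list(train), []
for obs in test:
    predictions.append(history[-1])
    history.append(obs)
print("Persistence RMSE: %.3f" % sqrt(mean_squared_error(test, predictions)))

# Walk-forward ARIMA with an order found by grid search, e.g. (2, 1, 0).
history, predictions = list(train), []
for obs in test:
    fit = ARIMA(history, order=(2, 1, 0)).fit()
    predictions.append(fit.forecast()[0])
    history.append(obs)
print("ARIMA(2,1,0) RMSE: %.3f" % sqrt(mean_squared_error(test, predictions)))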

CONCLUSION: The final RMSE for the validation period is predicted at 16 liters per capita per day. This is not too different from the expected error of 21, but we would expect that it is also not too different from a simple persistence model. The forecast does have the characteristics of a persistence forecast. This suggests that although this time series does have an obvious trend, it is still a reasonably difficult problem.

Dataset Used: Annual Water Usage in Baltimore

Dataset ML Model: Time series forecast with numerical attributes

Dataset Reference: https://datamarket.com/data/set/22sl/baltmore-city-annual-water-use-liters-per-capita-per-day-1885-1968#!ds=22sl&display=line

One potential source of performance benchmark: https://machinelearningmastery.com/time-series-forecast-study-python-annual-water-usage-baltimore/

The HTML formatted report can be found here on GitHub.

The Chicken and the Egg, Part 1

In his podcast, Akimbo, Seth Godin teaches us how to adopt a posture of possibility, change the culture, and choose to make a difference. Here are my takeaways from the episode.

Seth explains how ideas and culture work in a way very similar to the mechanics of genetic inheritance and the evolution of species.

For genes and species:

  • Two creatures get together and create a third creature. In other words, two sets of genes combine to create a third set. The baby creature is not a replica of either parent but inherits, randomly, many of the traits that the parents carry.
  • The traits that help a creature survive its environment get passed on to the offspring. The traits that do not help the creature survive long enough to produce offspring eventually die off, because the environment is not hospitable to those baby creatures.
  • Over time, the randomized changes in traits add up. The species and the traits they carry either evolve successfully to survive the environment, or they die off and become extinct.
  • Given enough time and trait changes, species can also evolve away from one branch to form an entirely different branch.
  • Often, we may say a species has adapted and survived. That sounds like a planned move on the species’ part, but that is not the case. The species is not responding to the outside world; Mother Nature operates on her own terms and timelines. The outside world determines whether those traits get passed on, and the species has little say in the evolution process.

Looking at our culture through the eyes of genetics and evolution:

  • Ideas are like genes. They are often inherited from other ideas but with some mutation.
  • Some ideas spread and some do not – nothing is guaranteed. When a culture adopts an idea, the ideas that help the culture sustain itself or prosper stand a much better chance of surviving.
  • Ideas that are part of a dying or extinct culture eventually die off along with the culture.
  • The world keeps changing as ideas spread and cultures adapt or fail to adapt.
  • Changes in ideas and cultures can add up. Given enough time, a culture will spread and even evolve away from its original set of ideas and beliefs.

Along the way, a myth developed that great ideas appear as one solid, well-formed idea born out of nowhere. “Oh yeah, a genius thought that one up.” But that is not actually what happens.

As a human society, we have built an incredibly fertile ground for ideas to replicate and spread. Ideas continue to change the culture and may end up making themselves extinct. Along the way, ideas can replicate, mutate, and become something completely unrecognizable to the person who originally put the idea into the world. That is what our culture is: the sum total of all the ideas we have intercepted and spread to others.

Multi-Class Classification Model for Human Activities and Postural Transitions Using Python Take 1

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a prediction model using various machine learning algorithms and to document the end-to-end steps using a template. The Human Activities and Postural Transitions dataset is a classic multi-class classification situation where we are trying to predict one of the 12 possible outcomes.

INTRODUCTION: The research team carried out experiments with a group of 30 volunteers who performed a protocol of activities composed of six basic activities. There are three static postures (standing, sitting, lying) and three dynamic activities (walking, walking downstairs and walking upstairs). The experiment also included postural transitions that occurred between the static postures. These are stand-to-sit, sit-to-stand, sit-to-lie, lie-to-sit, stand-to-lie, and lie-to-stand. All the participants were wearing a smartphone on the waist during the experiment execution. The research team also video-recorded the activities to label the data manually. The research team randomly partitioned the obtained data into two sets, 70% for the training data and 30% for the testing.

In the current iteration Take1, the script will focus on evaluating various machine learning algorithms and identifying the model that produces the best overall metrics. Because the dataset has many attributes that are collinear with other attributes, we will eliminate the attributes that have a collinearity measurement of 99% or higher. Iteration Take1 will establish the baseline performance for accuracy and processing time.
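As a rough illustration of that filtering step, the sketch below drops one column from every pair of features whose absolute correlation reaches 0.99; the DataFrame name X_train is an assumption, not the variable used in the actual script.

import numpy as np
import pandas as pd

def drop_collinear(features: pd.DataFrame, threshold: float = 0.99) -> pd.DataFrame:
    # Absolute pairwise correlations; upper triangle only, so each pair is seen once.
    corr = features.corr().abs()
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] >= threshold).any()]
    return features.drop(columns=to_drop)

# Example usage (X_train is assumed): reduces the 561 columns to roughly 453.
# X_train_reduced = drop_collinear(X_train)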

ANALYSIS: In the current iteration Take1, the baseline performance of the machine learning algorithms achieved an average accuracy of 88.52%. Two algorithms (Linear Discriminant Analysis and Stochastic Gradient Boosting) achieved the top accuracy metrics after the first round of modeling. After a series of tuning trials, Linear Discriminant Analysis turned in the top overall result and achieved an accuracy metric of 94.19%. By using the optimized parameters, the Linear Discriminant Analysis algorithm processed the testing dataset with an accuracy of 94.71%, which was even better than the training data.
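The evaluation flow behind those numbers can be sketched roughly as follows: score a few candidate algorithms with 10-fold cross-validation, then fit the best performer and check it on the hold-out split. Synthetic data from make_classification stands in for the prepared HAPT features here; the real script loads the project’s training and testing splits instead.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for the feature matrix and 12-class labels.
X, y = make_classification(n_samples=2000, n_features=50, n_informative=20,
                           n_classes=12, random_state=7)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3,
                                                    random_state=7)

# Score candidate algorithms with 10-fold cross-validation on the training split.
for name, model in [("LDA", LinearDiscriminantAnalysis()),
                    ("GBM", GradientBoostingClassifier(n_estimators=50, random_state=7))]:
    scores = cross_val_score(model, X_train, y_train, cv=10, scoring="accuracy")
    print("%s: mean %.4f, std %.4f" % (name, scores.mean(), scores.std()))

# Fit the top performer on all training data, then score the hold-out split.
best = LinearDiscriminantAnalysis().fit(X_train, y_train)
print("Hold-out accuracy: %.4f" % accuracy_score(y_test, best.predict(X_test)))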

From the model-building perspective, the number of attributes decreased by 108, from 561 down to 453.

CONCLUSION: For this iteration, the Linear Discriminant Analysis algorithm achieved the best overall results. For this dataset, we should consider using the Linear Discriminant Analysis algorithm for further modeling or production use.

Dataset Used: Smartphone-Based Recognition of Human Activities and Postural Transitions Data Set

Dataset ML Model: Multi-class classification with numerical attributes

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/Smartphone-Based+Recognition+of+Human+Activities+and+Postural+Transitions

The HTML formatted report can be found here on GitHub.

Web Scraping of Merely Do It Blog Entries Using Python and Scrapy

SUMMARY: The purpose of this project is to practice web scraping by gathering specific pieces of information from a website. The web scraping code was written in Python and leveraged the Scrapy framework.

INTRODUCTION: David Lowe hosts his blog at merelydoit.blog. The purpose of this exercise is to practice web scraping by gathering the blog entries from Merely Do It’s RSS feed. This iteration of the script automatically traverses the RSS feed to capture all entries from the blog site.

Starting URLs: https://merelydoit.blog/feed or https://merelydoit.blog/feed/?paged=1
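A minimal sketch of such a spider is shown below. It assumes the feed is paginated through the paged query string and stops when a page returns no items; the spider name and the extracted fields are illustrative choices, not the exact ones from the project.

import scrapy

class MerelyDoItFeedSpider(scrapy.Spider):
    name = "merelydoit_feed"
    start_urls = ["https://merelydoit.blog/feed/?paged=1"]

    def parse(self, response):
        # The feed is XML; stripping namespaces keeps the XPath selectors simple.
        response.selector.remove_namespaces()
        items = response.xpath("//item")
        for item in items:
            yield {
                "title": item.xpath("title/text()").get(),
                "link": item.xpath("link/text()").get(),
                "pub_date": item.xpath("pubDate/text()").get(),
            }
        # Keep following the paged feed until a page comes back with no items.
        if items:
            page = int(response.url.split("paged=")[-1])
            next_url = "https://merelydoit.blog/feed/?paged=%d" % (page + 1)
            yield scrapy.Request(next_url, callback=self.parse)

Running the sketch with, for example, scrapy runspider feed_spider.py -o entries.json (a hypothetical file name) would collect the items into a JSON file similar to the output linked below.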

The source code and JSON output can be found here on GitHub.

The Top Five Percent

(From a writer I like and respect, Seth Godin)

In every field, the extraordinary rewards mostly go to the top five percent – one in twenty.

Of course, the biggest prizes usually go to the once-in-a-generation superstars, and for most people that is largely out of reach. But it turns out that, with thought and hard work, the top five percent is achievable.

The method is to choose the right market and become part of it. Not “the top five percent of all surgeons,” but perhaps “the top five percent of thoracic surgeons in Minnesota.” In other words, find a small niche within your market and serve it.

This is a challenging task, because once you get specific, you are deciding where you stand. The standards become clearer, and there is no room left for excuses.

The other half of the method is hard work. That work is to keep upgrading and to gain an honest understanding of the choices actually available to the people you seek to serve. If they knew what you know, would they choose you? What do you need to learn, practice enough, and invest enough in, to truly become one of the top five percent?

This is something you can achieve with focus and effort. Reaching the very top takes a combination of luck, magic, and talent. But getting into the top five percent – one in twenty – comes down mostly to personal choice.

The point is that you are not really competing against the other nineteen people; in truth, you are competing against yourself. You are on a journey of competition to determine how much you care about influencing a particular project.

Binary Classification Model for MiniBooNE Particle Identification Using Python Take 4

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a prediction model using various machine learning algorithms and to document the end-to-end steps using a template. The MiniBooNE Particle Identification dataset is a classic binary classification situation where we are trying to predict one of the two possible outcomes.

INTRODUCTION: This dataset is taken from the MiniBooNE experiment and is used to distinguish electron neutrinos (signal) from muon neutrinos (background). The data file is set up as follows: the first line contains the number of signal events followed by the number of background events. The records with the signal events come first, followed by the background events. Each line after the first holds the 50 particle ID variables for one event.
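Based on that description, loading and labeling the raw file could look roughly like the sketch below; the file name MiniBooNE_PID.txt and the generated column names are assumptions.

import pandas as pd

# The first line holds the signal and background counts.
with open("MiniBooNE_PID.txt") as f:
    n_signal, n_background = (int(x) for x in f.readline().split())

# Remaining lines: 50 whitespace-separated particle ID variables per event.
df = pd.read_csv("MiniBooNE_PID.txt", sep=r"\s+", skiprows=1, header=None)
df.columns = ["var_%02d" % i for i in range(1, 51)]

# Signal rows come first, so labels can be assigned by position:
# 1 = electron neutrino (signal), 0 = muon neutrino (background).
df["target"] = [1] * n_signal + [0] * n_background
print(df.shape, df["target"].value_counts().to_dict())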

From the previous iteration Take1, the script focused on evaluating various machine learning algorithms and identifying the model that produces the best overall metrics. Iteration Take1 established the baseline performance for accuracy and processing time.

From the previous iteration Take2, we examined the feature selection technique of eliminating collinear features. By eliminating the collinear features, we hoped to decrease the processing time and maintain a similar level of accuracy compared to iteration Take1.

From the previous iteration Take3, we examined the feature selection technique of attribute importance ranking. By taking only the most important attributes, we hoped to decrease the processing time and maintain a similar level of accuracy compared to iterations Take1 and Take2.

In the current iteration Take4, we will explore the Recursive Feature Elimination (or RFE) technique by recursively removing attributes and building a model on those attributes that remain.
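A rough sketch of that RFE step, using a tree-based estimator to rank the attributes and keep the top 40 of the 50 columns, might look like this; the estimator choice and the reuse of df from the loading sketch above are assumptions.

from sklearn.ensemble import ExtraTreesClassifier
from sklearn.feature_selection import RFE

# Assumes df from the loading sketch above; split into features and target.
X, y = df.drop(columns=["target"]), df["target"]

# Recursively drop the weakest attribute until 40 remain.
selector = RFE(estimator=ExtraTreesClassifier(n_estimators=50, random_state=7),
               n_features_to_select=40, step=1)
selector.fit(X, y)

# Keep only the surviving attributes, e.g. 50 columns reduced to 40.
X_reduced = X.loc[:, selector.support_]
print("Kept %d attributes" % X_reduced.shape[1])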

ANALYSIS: From the previous iteration Take1, the baseline performance of the machine learning algorithms achieved an average accuracy of 90.58%. Two algorithms (Bagged CART and Stochastic Gradient Boosting) achieved the top accuracy metrics after the first round of modeling. After a series of tuning trials, Stochastic Gradient Boosting turned in the top overall result and achieved an accuracy metric of 93.95%. By using the optimized parameters, the Stochastic Gradient Boosting algorithm processed the testing dataset with an accuracy of 93.85%, which was just slightly below the training result.

From the previous iteration Take2, the baseline performance of the machine learning algorithms achieved an average accuracy of 90.16%. Two algorithms (Bagged CART and Stochastic Gradient Boosting) achieved the top accuracy metrics after the first round of modeling. After a series of tuning trials, Stochastic Gradient Boosting turned in the top overall result and achieved an accuracy metric of 93.92%. By using the optimized parameters, the Stochastic Gradient Boosting algorithm processed the testing dataset with an accuracy of 93.82%, which was just slightly below the training data.

From the previous iteration Take3, the baseline performance of the machine learning algorithms achieved an average accuracy of 91.04%. Two algorithms (Random Forest and Stochastic Gradient Boosting) achieved the top accuracy metrics after the first round of modeling. After a series of tuning trials, Stochastic Gradient Boosting turned in the top overall result and achieved an accuracy metric of 93.84%. By using the optimized parameters, the Stochastic Gradient Boosting algorithm processed the testing dataset with an accuracy of 93.74%, which was just slightly below the training data.

In the current iteration Take4, the baseline performance of the machine learning algorithms achieved an average accuracy of 90.11%. Two algorithms (Bagged CART and Stochastic Gradient Boosting) achieved the top accuracy metrics after the first round of modeling. After a series of tuning trials, Stochastic Gradient Boosting turned in the top overall result and achieved an accuracy metric of 93.68%. By using the optimized parameters, the Stochastic Gradient Boosting algorithm processed the testing dataset with an accuracy of 93.70%, which was slightly above the training result.

From the model-building perspective, the number of attributes decreased by 10, from 50 down to 40 in iteration Take4. The processing time went from 7 hours 11 minutes in iteration Take1 down to 6 hours 56 minutes in Take4, which was a reduction of 3.4% from Take1. It was a slight increase in comparison to Take3, which had a processing time of 6 hours 22 minutes. It was also a slight increase in comparison to Take2, which had a processing time of 6 hours 33 minutes.

CONCLUSION: For this iteration, the Stochastic Gradient Boosting algorithm achieved the best overall results with an improved processing time after running Recursive Feature Elimination. For this dataset, the Stochastic Gradient Boosting algorithm should be considered for further modeling or production use.

Dataset Used: MiniBooNE particle identification Data Set

Dataset ML Model: Binary classification with numerical attributes

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/MiniBooNE+particle+identification

The HTML formatted report can be found here on GitHub.

Drucker on Effective Decisions, Part 3

In his book, The Essential Drucker: The Best of Sixty Years of Peter Drucker’s Essential Writings on Management, Peter Drucker analyzed the ways that management practices and principles affect the performance of organizations, individuals, and society. The book covers the basic principles of management and gives professionals the tools to perform the tasks that the environment of tomorrow will require of them.

These are my takeaways from reading the book.

In the chapter “Effective Decisions,” Drucker discussed the five aspects of the effective decision-making process.

  • Step 1. The decision-maker reaches a clear realization that the problem is a generic situation and not a random exception.
  • Step 2. The decision-maker understands the specifications that the answer to the problem has to satisfy.
  • Step 3. The decision-maker thinks through what the “right” solution is.
  • Step 4. The decision-maker builds actions into the decision.
  • Step 5. The decision-maker gathers feedback that tests the validity and effectiveness of the decision.

In Step 3, Drucker suggested that we think about what the “right” solution is rather than just what is acceptable. The reason is that every decision ends up involving some form of compromise. If we do not know the specifications and boundary conditions that the right solution must satisfy, we cannot distinguish between the right compromise and the wrong compromise.

Drucker also suggested that there are two kinds of compromise. One kind holds that half of something is better than having nothing to work with; the other kind holds that half of something is worse than having nothing at all. In the first kind, some of the boundary conditions can be satisfied, even if not all of them. In the second kind, the partially satisfied boundary conditions are still not enough to solve the problem.

Rather than worrying about what is acceptable, begin the solution-formulation process by asking what the “right” thing to do is. When we focus on answering the “acceptable” question, we gain nothing and often lose any chance of coming up with an effective answer, let alone the right one.

While thinking through the boundary conditions (Step 3) is the most difficult part of decision-making, converting the decision into effective action (Step 4) is usually the most time-consuming. However, a decision will not become effective unless we build the action commitments into it from the start.

Drucker asserted that we have not made a decision unless we carry it out in specific steps that become someone’s actions and responsibility. Until then, all the work we have done amounts only to good intentions.

Drucker suggested we ask the following questions when converting a decision into action:

  • Who has to know of this decision?
  • What action has to be taken?
  • Who is to take it?
  • What does the action have to be so that the people who have to do it can do it?

The first and the last of these are too often overlooked—with dire results, according to Drucker.