The ShipIt Journal, Part 2

In the podcast series Seth Godin’s Startup School, Seth Godin gave a group of highly motivated early-stage entrepreneurs a guided tour of some of the questions they will have to dig deep and ask themselves as they build their businesses. Here are my takeaways from various topics discussed in the podcast episodes.

The ShipIt Journal, now in free PDF format

  • At some point, we define all the events and activities that need to happen. We exhaustively list every dependency and everything that everyone must do. The goal is to be clear with everybody on the first day about what all those dependencies are and to rip out every dependency that is there. We want to move the thrashing all the way to the first day.
  • The lean approach to startups involves building a minimum viable product. Minimum viable products (MVPs) allow us to iterate and produce versions of the product that are “less bad” than the previous versions. If our product is less than viable, we will likely launch something that has no chance to capture the hearts and minds of our audience and to extract the value we need. Building an MVP also requires us to hang out with potential customers and find out whether this is something they might want. Sometimes those customers come in groups. We might not know which group our MVP will resonate with, and sometimes we must pick one group to focus our effort on.
  • The question of who becomes our competition gets back to the fear. On the day we launch, we have competition. By outlining our potential competitors in advance, we have one less reason to hesitate to ship.
  • “Plus It” and “Minus It” allow us to fine-tune our offering for two reasons. First, we are more likely to finish our project. Second, we are more likely to become the best in the world at something. When we start with “This is not for you,” it becomes much easier to be remarkable. It becomes much easier to be the best in the world because we are not aiming for the impossible task of being there for everybody. After the project has shipped and is in the hands of people with whom we have built a relationship, we can do more legitimate testing to see what we should add, as opposed to trying to guess what we can take away.
  • For many projects that produce work that matters, we will often be doing things for the very first time. Putting ourselves in a position that feels foreign and might not work requires emotional labor. Professionals stay professionals by regularly doing things that require emotional labor, things that are outside of their comfort zone. That is why we are doing the project: we are doing it because we have never done it before.
  • Shame is the project killer. The fear of shame is what people have used to keep us in line. We often use one hundred words to answer a ten-word question because the other ninety words are designed to distract the person who asked the question. Just get to the ten words, because we have nothing to be ashamed of. We can trump the shame by being meaningful and by going into the world and doing something that needed to be done in the first place. Project after project, we learn to ship. Over time, the foundation of thrashing and shipping gets stronger, and we get better at it. Suddenly our dreams become projects, and our projects become businesses.

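A Job Without a Boss

(From a writer I like and respect, Seth Godin)

This is what many freelancers dream of.

You do work that is within your abilities, but without the hassle of someone else supervising how you do it.

The problem is that finding a well-paid job without a boss is now much harder than it used to be. The race to the bottom is fierce, and the only way to avoid it is to create projects, develop innovative strategies, and build something worth seeking out.

In other words, you need a better boss.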

Web Scraping of Quotes from Famous People using Python Take 1

SUMMARY: The purpose of this project is to practice web scraping by gathering specific pieces of information from a website. The web scraping code was written in Python 3 and leveraged the Scrapy framework maintained by Scrapinghub.

INTRODUCTION: A demo website, created by Scrapinghub, lists quotes from famous people. It has many endpoints that show the quotes in different ways, and each endpoint presents a different challenge for practicing web scraping. For this Take 1 iteration, the Python script follows the pagination links and scrapes the quote information from each page.

Starting URLs: http://quotes.toscrape.com/

import scrapy


class QuotesSpider(scrapy.Spider):
    """Scrape the quotes on quotes.toscrape.com, following pagination links."""

    name = "quotes"
    start_urls = ['http://quotes.toscrape.com/']

    def parse(self, response):
        self.log('I just visited: ' + response.url)
        # Each quote on the page sits inside its own <div class="quote">
        for quote in response.css('div.quote'):
            item = {
                'author_name': quote.css('small.author::text').extract_first(),
                'text': quote.css('span.text::text').extract_first(),
                # A quote can carry several tags, so collect all of them
                'tags': quote.css('div.tags a.tag::text').extract(),
                # href of the link to the author's detail page
                'author_url': quote.css('div.quote > span > a::attr(href)').extract_first(),
            }
            yield item

        # Follow the pagination link, if there is one, and parse the next page the same way
        next_page_url = response.css('li.next > a::attr(href)').extract_first()
        if next_page_url:
            next_page_url = response.urljoin(next_page_url)
            yield scrapy.Request(url=next_page_url, callback=self.parse)
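
The pagination step above depends on turning the link found on the page into an absolute URL before requesting it. As a minimal sketch of that resolution, the snippet below uses the standard library's urllib.parse.urljoin rather than Scrapy itself. The helper name resolve_next_page and the '/page/2/' href are assumptions made for illustration, not values captured from the site.

```python
from urllib.parse import urljoin


def resolve_next_page(current_url, next_href):
    """Return the absolute URL of the next page, or None when there is no link.

    Hypothetical helper for illustration only; the spider itself relies on
    response.urljoin for this step.
    """
    if not next_href:
        # No "next" link on the page means the last page has been reached
        return None
    return urljoin(current_url, next_href)


# Assumed example values, not scraped output
print(resolve_next_page('http://quotes.toscrape.com/', '/page/2/'))
print(resolve_next_page('http://quotes.toscrape.com/page/2/', None))
```

Returning None for a missing link mirrors the spider's `if next_page_url:` check, which is what stops the crawl on the last page.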

The source code and JSON output can be found here on GitHub.

Entrepreneurial Strategies, Part 6

In his book, Innovation and Entrepreneurship, Peter Drucker presented how innovation and entrepreneurship can be a purposeful and systematic discipline. That discipline is still as relevant to today’s business environment as when the book was published back in 1985. The book explains the challenges faced by many organizations and analyzes the opportunities which can be leveraged for success.

[https://smile.amazon.com/Innovation-Entrepreneurship-Routledge-Classics-Drucker/dp/1138168343]

Drucker wrote that entrepreneurship requires two combined approaches: entrepreneurial strategies and entrepreneurial management. Entrepreneurial management consists of practices and policies that live within the enterprise. Entrepreneurial strategies, on the other hand, are practices and policies required for working with the external element, the marketplace.

Drucker further believed that there are four important and distinct entrepreneurial strategies we should be aware of. These are:

  1. Being “Fustest with the Mostest”
  2. “Hitting Them Where They Ain’t”
  3. Finding and occupying a specialized “ecological niche”
  4. Changing the economic characteristics of a product, a market, or an industry.

These four strategies need not be mutually exclusive. A successful entrepreneur often combines two, sometimes even three, elements in one strategy.

Successful practitioners of “Fustest with the Mostest” and “Hitting Them Where They Ain’t” can become big and highly visible companies. Successful practitioners of the ecological niche take the cash and wallow in their anonymity. The whole point of the ecological niche strategy is to be so inconspicuous that no one is likely to try to compete in the same segment.

To practice the ecological niche strategy, Drucker outlined three distinct niche strategies, each with its own requirements, limitations, and risks:

  • the toll-gate strategy;
  • the specialty skill strategy; and
  • the specialty market strategy.

Within the “Ecological Niche” strategy, the specialty market builds around specialized knowledge of a market. The specialty skill, on the other hand, builds around a product or service. Other than that difference, both specialty strategies are similar.

The specialty market is found by looking at a new development with the question: “What opportunities are there in this that would give us a unique niche, and what do we have to do to fill it ahead of everybody else?”

The specialty market niche has the same requirements as the specialty skill niche: systematic analysis of a new trend, industry, or market; a specific innovative contribution; and continuous work to improve the product and service. Once specialty market organizations achieve leadership, they can often retain it.

And it has the same limitations. The greatest threat to the specialty market position is success. In the early days of a specialty market’s success, the market is not large enough to tempt anyone else. Furthermore, serving a specialty market often requires a specialized organization, which the leader has to maintain anyhow to service its customers, so nobody else has any reason to build one. Once the specialty market becomes a mass market, outside competitors have an incentive to invade the market and compete with the established leader.

Binary-Class Classification Model for German Credit Risks Using Python Take 2

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a prediction model using various machine learning algorithms and to document the end-to-end steps using a template. The German Credit Risks Dataset is a binary-class classification situation where we are trying to predict one of the two possible outcomes.

INTRODUCTION: This dataset contains 1,000 entries with 20 categorical/symbolic attributes prepared by Prof. Hofmann. In this dataset, each entry represents a person who takes credit from a German bank. Each person is classified as a good or bad credit risk according to the set of attributes.

The case study also stipulates that it is worse to classify a customer as good when they are bad (weight of 5) than it is to classify a customer as bad when they are good (weight of 1). For this iteration, the script focuses on tuning various machine learning algorithms and identifying the algorithm that produces the best cost-and-accuracy tradeoff.
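
To make the cost weighting concrete, here is a minimal sketch of how a weighted misclassification cost can be tallied next to plain accuracy. The weights of 5 and 1 come from the case study. The function names, the "good"/"bad" labels, and the five sample predictions are assumptions made up for illustration and are not taken from the project's script or results.

```python
# Weights stipulated by the case study
COST_BAD_PREDICTED_GOOD = 5  # customer is bad but was classified as good
COST_GOOD_PREDICTED_BAD = 1  # customer is good but was classified as bad


def misclassification_cost(actual, predicted):
    """Sum the case-study weights over every incorrect prediction."""
    cost = 0
    for truth, guess in zip(actual, predicted):
        if truth == "bad" and guess == "good":
            cost += COST_BAD_PREDICTED_GOOD
        elif truth == "good" and guess == "bad":
            cost += COST_GOOD_PREDICTED_BAD
    return cost


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual label."""
    correct = sum(1 for truth, guess in zip(actual, predicted) if truth == guess)
    return correct / len(actual)


# Made-up labels for illustration only
actual = ["good", "good", "bad", "bad", "good"]
predicted = ["good", "bad", "good", "bad", "good"]

print("cost:", misclassification_cost(actual, predicted))
print("accuracy:", accuracy(actual, predicted))
```

Two models with the same accuracy can carry very different costs under this weighting, which is why the comparison looks at cost and accuracy together rather than at accuracy alone.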

CONCLUSION: In the previous iteration (Take 1), the baseline performance of the eight algorithms achieved an average accuracy of 71.80%. Three algorithms (Logistic Regression, Extra Trees, and Stochastic Gradient Boosting) achieved the top three accuracy scores after the first round of modeling. After a series of tuning trials, Stochastic Gradient Boosting turned in the top result using the training data, with an average accuracy of 76.14%. Using the optimized tuning parameters, the Stochastic Gradient Boosting algorithm processed the validation dataset with an accuracy of 77.66%, which was slightly better than the accuracy from the training data.

From the cost vs accuracy comparison, both the Logistic Regression and Stochastic Gradient Boosting achieved high accuracy while keeping the costs of incorrect predictions low. Either algorithm should be considered for further modeling or production use.

Dataset Used: German Credit Data Set

Dataset ML Model: Binary classification with numerical and categorical attributes

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29

One potential source of performance benchmarks: https://www.kaggle.com/uciml/german-credit/home

The HTML formatted report can be found here on GitHub.

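The Cereal Entrepreneur

(From a writer I like and respect, Seth Godin)

Note: in English, “serial” and “cereal” sound very similar.

One of the biggest shifts in starting a new business is that you give up your salary.

Happiness is positive cash flow, and one of the simplest ways to get there is to reduce expenses.

An entrepreneur who sleeps on a friend’s couch and eats corn flakes for breakfast, lunch, and dinner is in a better mood than one who has to support a family of four in a luxury home.

Raising money first in order to pay yourself is tempting, but it is not easy, because funders want to invest in your organization’s assets and market. They do not want to invest in your expenses.

When a shortage of short-term capital overwhelms everything else, it becomes very hard to make wise long-term decisions.

Two strategies here go hand in hand:

First, cut your expenses before you need to. Every dollar you do not spend is a dollar you do not have to raise. Eat cereal for three meals a day, not sushi. (This is one of the best reasons to start a business while you are young and in college.)

Second, find some customers who are willing to pay in advance because your service or product is so useful to them that they feel they could not survive without it. If your service or product is not yet that useful, make it better first.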

Binary-Class Classification Model for German Credit Risks Using Python Take 1

Template Credit: Adapted from a template made available by Dr. Jason Brownlee of Machine Learning Mastery.

SUMMARY: The purpose of this project is to construct a prediction model using various machine learning algorithms and to document the end-to-end steps using a template. The German Credit Risks Dataset is a binary-class classification situation where we are trying to predict one of the two possible outcomes.

INTRODUCTION: This dataset contains 1,000 entries with 20 categorical/symbolic attributes prepared by Prof. Hofmann. In this dataset, each entry represents a person who takes credit from a German bank. Each person is classified as a good or bad credit risk according to the set of attributes.

For this iteration, the script focuses on evaluating various machine learning algorithms and identifying the algorithm that produces the best accuracy metric.

CONCLUSION: The baseline performance of the eight algorithms achieved an average accuracy of 71.80%. Three algorithms (Logistic Regression, Extra Trees, and Stochastic Gradient Boosting) achieved the top three accuracy scores after the first round of modeling. After a series of tuning trials, Stochastic Gradient Boosting turned in the top result using the training data, with an average accuracy of 76.14%. Using the optimized tuning parameters, the Stochastic Gradient Boosting algorithm processed the validation dataset with an accuracy of 77.66%, which was slightly better than the accuracy from the training data.

From the model-building activities, the Stochastic Gradient Boosting ensemble algorithm yielded the best training and validation results. It is the recommended algorithm to use from the accuracy perspective.

Dataset Used: German Credit Data Set

Dataset ML Model: Binary classification with numerical and categorical attributes

Dataset Reference: https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29

One potential source of performance benchmarks: https://www.kaggle.com/uciml/german-credit/home

The HTML formatted report can be found here on GitHub.

The ShipIt Journal, Part 1

In the podcast series Seth Godin’s Startup School, Seth Godin gave a group of highly motivated early-stage entrepreneurs a guided tour of some of the questions they will have to dig deep and ask themselves as they build their businesses. Here are my takeaways from various topics discussed in the podcast episodes.

The ShipIt Journal, now in free PDF format

  • To use the ShipIt journal, we need to fill it out with everyone on the team at the same time. Everyone on the team uses the journal to track their projects, and it works best in print. It is very important to involve pen and paper because people act differently when they must write something down.
  • The purpose of the ShipIt journal is closely related to the idea of the dip. If we care enough to fill out the journal in pen and tell the truth on every page, there is a good chance we will ship our project because we have committed. If we do not care enough to ship our project, we should not waste time filling out the journal for it.
  • The ShipIt journal helps the team to do the “thrashing” earlier on during the project, instead of later. “Thrashing” in the early phase costs much less to the organization than “thrashing” later in the project. Thrash at the beginning and then take people off the project until we launch.
  • Every project needs to be manageable, finite, time-dated and doable.
  • Every project needs to have the name of one person who is responsible for making this date happen.
  • We need to confront our fears by writing them down in the journal. All these emotions need to be brought up early, so we can be clear about what this project is about and what it is for.
  • Along the way to get this project to ship, we are going to have to make compromises. Every project needs to pick some edges that it stands for and those it does not care about.
  • Every project needs to decide whom we are trying to please, that is, who our customer is.
  • Every project has devil’s advocates. We write down the comments from devil’s advocates, so we do not need to discuss them again. “The Devil is doing fine, and he doesn’t need an advocate.”
  • We need to identify the people, if any, who can stop or disapprove of the project. Those people may introduce compromises into the project, so we need to stay aware of their opinions.
  • We need to write down and understand fully what a “perfect” project looks like. We also need to be 100% clear about what “good enough” looks like. Perfect should not be the same as the definition of good enough because perfect is the enemy of the good. We cannot be in the business of shipping on a regular basis if perfect is our only option.