My most memorable instance of data-driven decision-making was developing a simple and profitable strategy for online poker. It also gave me material for my guest lectures in Harvey Mudd College’s popular Mathematics of Games course, which show how the marriage of data and mathematics can produce successful strategies that defy conventional wisdom.
It all started when I saw a “poker corner” segment on TV stating that a player who is short-stacked (has few chips remaining) has only one move: all-in. This was presented as a bad situation, but in my mind it was a great opportunity to make the game tractable. Some poker sites allowed you to start with a short stack, so if my hypothesis was correct, I could actually profit. Being somewhat risk-averse, I only ever deposited $50 into my online poker account.
After using an initial all-in-or-fold strategy that allowed me to gather hand history files on my opponents, I engineered an exploitive strategy by calculating the expected call equity (value when my bet is called), fold equity (value when everyone folds), and the cost of patience (the blinds). Conventional wisdom states that repetitive strategies can’t work, and that your specific opponents and position at the table are the most important things to consider. However, my data was telling me that all of this was incorrect and that a handy profit could be made.
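To make the three quantities concrete, here is a minimal sketch of how the expected value of an all-in could be assembled from them. It is not my actual model: the function names, the single-caller simplification, and the numbers in the usage lines are assumptions chosen for illustration.

```python
def shove_ev(p_all_fold, pot, stack, equity_when_called):
    """Expected chips gained by moving all-in, relative to folding (EV = 0).

    Illustrative assumptions: at most one opponent calls, and the caller
    exactly matches our stack.
    """
    # Fold equity: everyone folds and we pick up the pot uncontested.
    fold_equity = p_all_fold * pot
    # Call equity: we win the pot plus the caller's matching stack,
    # or we lose our own stack.
    win = equity_when_called * (pot + stack)
    lose = (1.0 - equity_when_called) * stack
    call_equity = (1.0 - p_all_fold) * (win - lose)
    return fold_equity + call_equity


def cost_of_patience(small_blind, big_blind, players):
    """Average chips paid in blinds per hand while waiting for a better spot."""
    return (small_blind + big_blind) / players


# Hypothetical numbers: blinds of 0.5 and 1, a 10-chip stack, 10 players.
ev = shove_ev(p_all_fold=0.7, pot=1.5, stack=10.0, equity_when_called=0.4)
waiting_cost = cost_of_patience(0.5, 1.0, players=10)
print(ev, waiting_cost)
```

Under these assumptions, a shove is worth making when its expected value beats folding, and folding is not free: it carries the per-hand blind cost of waiting.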
While the strategy was simple, the analysis was not. In addition to creating a predictive model to evaluate potential strategies, I also had to estimate my precise edge in the game, in order to use the Kelly Criterion to minimize exposure to bad luck while maximizing hourly winnings.
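For readers unfamiliar with the Kelly Criterion, a minimal sketch follows. It shows the textbook formula for a simple win-or-lose bet and a common continuous approximation for sizing a bankroll from an estimated edge and variance. Poker outcomes are not binary, so this is only an illustration of the idea, not the analysis I actually ran; the function names and inputs are hypothetical.

```python
def kelly_fraction(p_win, net_odds):
    """Kelly fraction of bankroll for a binary bet: f* = p - (1 - p) / b.

    p_win    -- probability of winning the bet
    net_odds -- b, the amount won per unit staked
    """
    if net_odds <= 0:
        raise ValueError("net_odds must be positive")
    f = p_win - (1.0 - p_win) / net_odds
    # A negative result means there is no edge, so the right bet is nothing.
    return max(f, 0.0)


def kelly_bankroll(edge_per_hand, variance_per_hand):
    """Approximate bankroll at which the current stakes are Kelly-optimal.

    Uses the small-edge approximation f* ~ mean / variance, so the
    bankroll is roughly variance / mean (same currency units throughout).
    """
    if edge_per_hand <= 0:
        raise ValueError("a positive edge is required")
    return variance_per_hand / edge_per_hand


# Hypothetical inputs: a 60% even-money bet, and an edge of 0.5 per hand
# with a per-hand variance of 100.
print(kelly_fraction(0.6, 1.0))
print(kelly_bankroll(0.5, 100.0))
```

The second function shows why the precise edge matters: the suggested bankroll scales inversely with the edge, so overestimating the edge leads directly to playing stakes that are too large.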
In the end, my $50 became $30,000, and after sharing the strategy with friends, we all collected crazy stories to tell disbelieving family members.
Why is the average age of death for male rappers under 30? Why are the best scoring schools the smallest ones? Are large earthquakes on the rise? Do dead salmon have brain activity when shown photographs? Are the Sophomore Slump and the Sports Illustrated Jinx real or imaginary? Do drugs for relaxation help students score higher on the SAT? Why does punishment seem to work better than reward? Why are movie sequels rarely as good as the originals? These are the types of questions that data scientists should be well positioned to answer, but knowing the specific answers isn’t as important as being familiar with the underlying principles and pitfalls that lead less careful thinkers astray.
Businesses call themselves "data driven" and think they know what data is telling them ("up is up"). However, many are not analyzing things in a scientifically valid way and are setting themselves up to be duped by data. My goal is to help train the next generation of data scientists to avoid the pitfalls, whether it's through my book or by directly teaching them what I've learned. Most books contain success stories, but mine is mostly filled with "failure stories", which should be more instructive. Data science works, but only if you do it right.