Why is the average age of death for male rappers under 30? Why are the best-scoring schools the smallest ones? Are large earthquakes on the rise? Do dead salmon have brain activity when shown photographs? Are the Sophomore Slump and the Sports Illustrated Jinx real or imaginary? Do drugs for relaxation help students score higher on the SAT? Why does punishment seem to work better than reward? Why are movie sequels rarely as good as the originals? These are the types of questions that data scientists should be well positioned to answer, but knowing the specific answers isn’t as important as being familiar with the underlying principles and the pitfalls that lead less careful thinkers astray.
Businesses call themselves "data-driven" and think they know what their data is telling them ("up is up"). However, many are not analyzing their data in a scientifically valid way and are setting themselves up to be duped by it. My goal is to help train the next generation of data scientists and managers to avoid these pitfalls, whether through my book or by speaking to them directly about what I've learned. Most books contain success stories, but mine is mostly filled with "failure stories", which should be more instructive. Data science works, but only if you do it right.