Why is the average age of death for male rappers under 30? Why are the best-scoring schools the smallest ones? Are large earthquakes on the rise? Do dead salmon have brain activity when shown photographs? Are the Sophomore Slump and the Sports Illustrated Jinx real or imaginary? Do drugs for relaxation help students score higher on the SAT? Why does punishment seem to work better than reward? Why are movie sequels rarely as good as the originals? These are the types of questions that data scientists should be well positioned to answer, but knowing the specific answers isn’t as important as being familiar with the underlying principles and pitfalls that lead less careful thinkers astray.
Businesses call themselves "data driven" and think they know what data is telling them ("up is up"). However, many are not analyzing things in a scientifically valid way and are setting themselves up to be duped by data. My goal is to help train the next generation of data scientists and managers to avoid the pitfalls, whether it's through my book or by directly speaking to them about what I've learned. Most books contain success stories, but mine is mostly filled with "failure stories", which should be more instructive. Data science works, but only if you work like a scientist.
I'm Jay and I am passionate about data science. I love lecturing about it, reading about it, and blogging about it.
I earned my B.A. in Mathematics at Pomona College, but it wasn’t until my final year at Pomona that the seeds for my lifelong love of data science were planted. My enjoyment of classes like "Mathematical Modeling" and "Probability and Its Applications" provided the first clue that something like data science could be in my future.
Even after working as a software developer, I hadn’t realized the magic of combining programming with math until my former professor Art Benjamin challenged me with a proof he was working on for his upcoming book “Proofs that Really Count: The Art of Combinatorial Proof.” I didn’t think I could possibly solve such a difficult problem, but somehow he thought I could bring a unique approach to it. It turns out that he was right: I created a custom computer program that used a kind of Monte Carlo approach to search possible solutions while also allowing the user to nudge the program in the right direction. Before long, I had surprised myself by conquering four problems and was excited to get mentioned in his book.
After this, I was a strategic advisor for the winning entry in the international 2007 AAAI Computer Poker Competition. In an article in the San Bernardino County Sun, Michael Bowling, from the University of Alberta’s Computing Science Department, stated “they are going up against top-notch universities that are doing cutting-edge research in this area, so it was very impressive that they were not only competitive, but they won.”
After 11 years as a software developer, I followed my inner data wonk to the Analytics department at Oversee.net. I participated in many grand successes and epic failures in the years that followed, and it was there that I learned the value of scientific rigor.