Wasting Money With Data Science

Giovanni Lanzani has a post with the controversial title above:

Some data is gathered, given to data scientists, and — after two weeks — the first demo takes place. The results are promising, but they need a bit more time.

Fine. After all, the data was messy: they had to clean it up and go back to the source a couple of times.

Two weeks pass and the new results are even nicer. With 70% accuracy, they can predict if a patient will go home after their visit to the emergency room.

This is much better than random (50%)! A full-fledged pilot starts.

They are faced with a couple of challenges to go from model to data product:

  • How to send the source data to the model is unclear;

  • Where the model should run;

  • The hospital operations need to change, as the intake happens with pen and paper;

  • They realize that without knowing to which department the patient will go, they won’t add any value;

  • To predict the department, the model need the diagnosis. But once the diagnosis gets typed in the computer, the patient has reached their destination: the model is useless!

I think it’s a fair point:  it’s easy from the standpoint of internal researchers to look for things which they can do, but which don’t have much business value.  The risk on the other side is that you’ll start diving into a high-potential-value problem and then realize that the data isn’t there to draw conclusions or that the relationships you expected simply aren’t there.

Related Posts

Solving Naive Bayes By Hand

I have a post that requires math and is meaner toward the Buffalo Bills than I normally am: Trust the ProcessThere are three steps to the process of solving the simplest of Naive Bayes algorithms. They are:1. Find the probability of winning a game (that is, our prior probability).2. Find the probability of winning given each input variable: whether Josh Allen starts the game, whether the team is […]

Read More

The Basics Of Naive Bayes Classifiers

I have the first post in a series up on using the Naive Bayes class of algorithms for classifying inputs: Why Should We Use Naive Bayes? Is It the Best Classifier Out There?Probably not, no. In fact, it’s typically a mediocre classifier—it’s the one you strive to beat with your fancy algorithm. So why even […]

Read More

Categories

September 2018
MTWTFSS
« Aug Oct »
 12
3456789
10111213141516
17181920212223
24252627282930