The Data Science Delusion

Anand Ramanathan has a strong critique of “data science” as it stands today:

Illustration: Consider the sentiment-tagging task again. A Q1 resource uses an off-the-shelf model for movie reviews, and applies it to a new task (say, tweets about a customer service organization). Business is so blinded by spectacular charts [14] and anecdotal correlations (“Look at that spiteful tweet from a celebrity … so that’s why the sentiment is negative!”), that even questions about predictive accuracy are rarely asked until a few months down the road when the model is obviously floundering. Then too, there is rarely anyone to challenge the assumptions, biases and confidence intervals (Does the language in the tweets match the movie reviews? Do we have enough training data? Does the importance of tweets change over time?).

Overheard: “Survival analysis? Never heard of it … Wait … There is an R package for that!”
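
Two of the questions in that excerpt (does the language in the tweets match the movie reviews, and is there enough data to trust an accuracy figure) can be checked cheaply before a model ships. Below is a minimal Python sketch, not anything from the article itself. The function names `oov_rate` and `wilson_interval` and the crude regex tokenizer are my own illustrative assumptions: the first measures how much of the new domain's vocabulary the original corpus never saw, and the second puts a Wilson score interval around accuracy measured on a small hand-labeled sample.

```python
import math
import re


def tokenize(text):
    """Lowercase and keep runs of letters, digits, and apostrophes (a crude tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def oov_rate(source_texts, target_texts):
    """Fraction of target-domain tokens that never appear in the source corpus."""
    source_vocab = {tok for text in source_texts for tok in tokenize(text)}
    target_tokens = [tok for text in target_texts for tok in tokenize(text)]
    if not target_tokens:
        raise ValueError("target corpus has no tokens")
    unseen = sum(1 for tok in target_tokens if tok not in source_vocab)
    return unseen / len(target_tokens)


def wilson_interval(correct, total, z=1.96):
    """Wilson score interval for a proportion; z=1.96 gives roughly 95% coverage."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = correct / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Toy stand-ins for the two domains; real corpora would be far larger.
    reviews = ["a great movie", "the plot was dull"]
    tweets = ["great support", "a bad movie", "still on hold"]
    print("out-of-vocabulary rate:", oov_rate(reviews, tweets))

    # Hypothetical audit: 35 of 50 hand-labeled tweets tagged correctly.
    low, high = wilson_interval(35, 50)
    print(f"accuracy interval: {low:.2f} to {high:.2f}")
```

A high out-of-vocabulary rate or a wide interval does not prove the model is wrong, but either one is a reason to ask the questions the excerpt says rarely get asked.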

This is a really interesting article and I recommend reading it.
