Evaluating A Data Science Project

Tom Fawcett gives us an interesting evaluation of a data science case study:

The model is a fully connected neural network with three hidden layers, with a ReLU as the activation function. They state that data from Google Compute Engine was used to train the model (implemented in TensorFlow), and Cloud Machine Learning Engine’s HyperTune feature was used to tune hyperparameters.

I have no reason to doubt their representation choices or network design, but one thing looks odd. Their output is two ReLU (rectifier) units, each emitting the network’s accuracy (technically: recall) on that class. I would’ve chosen a single Softmax unit representing the probability of Large Loss driver, from which I could get a ROC or Precision-Recall curve. I could then threshold the output to get any achievable performance on the curve. (I explain the advantages of scoring over hard classification in this post.)

But I’m not a neural network expert, and the purpose here isn’t to critique their network design, just their general approach. I assume they experimented and are reporting the best performance they found.
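To make the point about scoring versus hard classification concrete, here is a minimal sketch of what thresholding a score buys you. It is not taken from the case study or from Fawcett's post: the function names (`roc_points`, `pick_threshold`) and the toy scores and labels are hypothetical, and the scores stand in for whatever probability-like output a model emits for the Large Loss class.

```python
# Illustrative sketch only: the scores and labels below are made-up toy values,
# not data from the case study.

def roc_points(scores, labels):
    """Return (threshold, fpr, tpr) triples, one per distinct score.

    scores: model outputs where higher means more likely positive.
    labels: 1 for the positive class (e.g. Large Loss), 0 otherwise.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one example of each class")
    points = []
    # Sweep thresholds from high to low; predict positive when score >= threshold.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((threshold, fp / negatives, tp / positives))
    return points


def pick_threshold(points, max_fpr):
    """Pick the point with the highest recall whose false positive rate is <= max_fpr."""
    feasible = [p for p in points if p[1] <= max_fpr]
    if not feasible:
        return None
    return max(feasible, key=lambda p: p[2])


toy_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0]

curve = roc_points(toy_scores, toy_labels)
best = pick_threshold(curve, max_fpr=0.25)
print(best)  # (threshold, false positive rate, recall)
```

Each distinct score yields one operating point, so a single scoring model covers the whole curve, whereas a hard classifier commits to exactly one of those points.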

Read the whole thing.

Related Posts

Lasso and Ridge Regression in Python

Kristian Larsen shows off a few regression techniques using Python: Variables with a regression coefficient equal to zero after the shrinkage process are excluded from the model. Variables with non-zero regression coefficients are most strongly associated with the response variable. Therefore, when you conduct a regression model it can be helpful to do a […]

Read More

Using Cohen’s D for Experiments

Nina Zumel takes us through Cohen’s D, a useful tool for determining effect sizes in experiments: Cohen’s d is a measure of effect size for the difference of two means that takes the variance of the population into account. It’s defined as d = |μ1 – μ2| / σ_pooled, where σ_pooled is the pooled standard deviation over both cohorts. […]

Read More

