Evaluating A Data Science Project

Tom Fawcett gives us an interesting evaluation of a data science case study:

The model is a fully connected neural network with three hidden layers, with a ReLU as the activation function. They state that data from Google Compute Engine was used to train the model (implemented in TensorFlow), and Cloud Machine Learning Engine’s HyperTune feature was used to tune hyperparameters.

I have no reason to doubt their representation choices or network design, but one thing looks odd. Their output is two ReLU (rectifier) units, each emitting the network’s accuracy (technically: recall) on that class. I would’ve chosen a single Softmax unit representing the probability of Large Loss driver, from which I could get a ROC or Precision-Recall curve. I could then threshold the output to get any achievable performance on the curve. (I explain the advantages of scoring over hard classification in this post.)

But I’m not a neural network expert, and the purpose here isn’t to critique their network design, just their general approach. I assume they experimented and are reporting the best performance they found.

Read the whole thing.

Related Posts

Calculating Lifetime Value With R

Sergey Bryl shows how to calculate the lifetime value of a subscription service: Predicting LTV is a common issue for a new, recently launched product/service/application when we don’t have a lot of historical data but want to calculate LTV as soon as possible. Even though we may have a lot of historical data on customer […]

Read More

Interpreting The Area Under The Receiver Operating Characteristic Curve

Roos Colman explains what a Receiver Operating Characteristic (ROC) curve is and how we interpret the Area Under the Curve (AUC): The AUC can be defined as “The probability that a randomly selected case will have a higher test result than a randomly selected control”. Let’s use this definition to calculate and visualize the estimated […]

Read More

Categories

August 2017
MTWTFSS
« Jul Sep »
 123456
78910111213
14151617181920
21222324252627
28293031