Comparing Classification Model Quality

Stephanie Glen compares three techniques for evaluating classification models:

In part 1, I compared a few model evaluation techniques that fall under the umbrella of ‘general statistical tools and tests’. Here in Part 2 I compare three of the more popular model evaluation techniques for classification and clustering: confusion matrix, gain and lift chart, and ROC curve. The main difference between the three techniques is that each focuses on a different type of result:

– Confusion matrix: false positives, false negatives, true positives and true negatives.
– Gain and lift: focus is on true positives.
– ROC curve: focus on true positives vs. false positives.

These are good tools for evaluation and Stephanie does a good job explaining each.
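To make the three views concrete, here is a minimal sketch in plain Python. The labels, scores, threshold, and function names are all made up for illustration and do not come from Stephanie's post. It assumes binary labels with both classes present and a model that outputs a score where higher means "more likely positive."

```python
# Hypothetical toy data: 1 = positive class, 0 = negative class.
y_true = [1, 0, 1, 1, 0, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.35, 0.3, 0.2, 0.1]


def confusion_counts(y_true, scores, threshold):
    """Return (tp, fp, fn, tn), predicting positive when score >= threshold."""
    tp = fp = fn = tn = 0
    for actual, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def gain_and_lift(y_true, scores, fraction):
    """Gain: share of all positives found in the top `fraction` of cases
    ranked by score. Lift: gain divided by the fraction actually targeted."""
    ranked = sorted(zip(scores, y_true), reverse=True)
    n_top = max(1, round(len(ranked) * fraction))
    captured = sum(actual for _, actual in ranked[:n_top])
    gain = captured / sum(y_true)
    lift = gain / (n_top / len(ranked))
    return gain, lift


def roc_point(y_true, scores, threshold):
    """Return (false positive rate, true positive rate) at one threshold.
    Sweeping the threshold from high to low traces out the ROC curve."""
    tp, fp, fn, tn = confusion_counts(y_true, scores, threshold)
    return fp / (fp + tn), tp / (tp + fn)


print(confusion_counts(y_true, scores, 0.5))  # (3, 2, 1, 4)
print(gain_and_lift(y_true, scores, 0.3))
print(roc_point(y_true, scores, 0.5))
```

On this toy data, the confusion matrix reports all four outcome counts at a single threshold, gain and lift look only at how many true positives land in the top-scored 30% of cases, and the ROC point trades the true positive rate against the false positive rate.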
