Downsides Of Logistic Regression

Vincent Granville points out a few flaws in logistic regression:

I recently read a very popular article entitled 5 Reasons “Logistic Regression” should be the first thing you learn when becoming a Data Scientist. Here I provide my opinion on why this should not be the case.

It is nice to have logistic regression on your resume, as many jobs request it, especially in some fields such as biostatistics. And if you learned the details during your college classes, good for you. However, for a beginner, this is not the first thing you should learn. In my career, being an isolated statistician (working with marketing guys, sales people, or engineers) in many of my roles, I had the flexibility to choose which tools and methodology to use. Many practitioners today are in a similar environment. If you are a beginner, chances are that you would use logistic regression as a black-box tool with little understanding about how it works: a recipe for disaster.

Read on for his reasons. I’m not totally convinced, but he does lay out his argument clearly.

