Overfitting With Polynomial Regression

Vincent Granville shows us a few problems with polynomial regression:

Even if the function to be estimated is very smooth, due to machine precision, only the first three or four coefficients can be accurately computed. With infinite precision, all coefficients would be correctly computed without over-fitting. We first explore this problem from a mathematical point of view in the next section, then provide recommendations for practical model implementations in the last section.

This is also a good read for professionals with a math background interested in learning more about data science, as we start with some simple math, then discuss how it relates to data science. Also, this is an original article, not something you will learn in college classes or data camps, and it even features the solution to a linear regression involving an infinite number of variables.

Granville’s point that overfitting is a relatively small concern is rather interesting, but the advice to avoid polynomial regression is generally solid.
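To see the machine-precision problem numerically, here is a minimal Python sketch (not from Granville's article; the degrees and sample grid are arbitrary illustrative choices). It fits noise-free samples of an exact polynomial with a monomial-basis design matrix and shows how the matrix's condition number limits coefficient recovery in double precision:

```python
import numpy as np

# Illustrative sketch: fit noise-free samples of an exact polynomial at
# increasing degree and watch finite precision degrade the recovered
# coefficients. Degrees and the sample grid are arbitrary choices.
x = np.linspace(0, 1, 200)

for degree in (3, 8, 15, 20):
    true = np.ones(degree + 1)                   # 1 + x + x^2 + ... + x^d
    V = np.vander(x, N=degree + 1, increasing=True)
    y = V @ true                                 # exact samples, no noise
    fitted, *_ = np.linalg.lstsq(V, y, rcond=None)
    worst = np.max(np.abs(fitted - true))
    print(f"degree {degree:2d}: cond(V) = {np.linalg.cond(V):9.2e}, "
          f"max coefficient error = {worst:.2e}")
```

Even with no noise at all, the recovery error grows with the condition number of the Vandermonde matrix, which is one reason practical implementations favor orthogonal bases (Chebyshev, Legendre) or regularization over raw high-degree monomial fits.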

