Regularization Prevents Overfitting

Hui Li has an explanation of what regularization is and how it works to reduce the likelihood of overfitting training data:

Assume that the red line is the regression model we learn from the training data set. It can be seen that the learned model fits the training data set perfectly, while it cannot generalize well to the data not included in the training set. There are several ways to avoid the problem of overfitting.

To remedy this problem, we could:

  • Get more training examples.
  • Use a simple predictor.
  • Select a subsample of features.

In this blog post, we focus on the second and third ways to avoid overfitting by introducing regularization on the parameters βi of the model.

Read the whole thing.

Related Posts

Python and R Data Reshaping

John Mount takes us through a couple of data shaping packages: The advantages of data_algebra and cdata are: – The user specifies their desired transform declaratively by example and in data. What one does is: work an example, and then write down what you want (we have a tutorial on this here).– The transform systems can print what a transform is going to […]

Read More

When to Use Different ML Algorithms

Stefan Franczuk explains the different categories of machine learning algorithms available in Talend: Clustering is the task of grouping together a set of objects in such a way, that objects in the same group are more similar to each other than to those in other groups. Clustering is really useful for identify separate groups and […]

Read More

Categories

July 2017
MTWTFSS
« Jun Aug »
 12
3456789
10111213141516
17181920212223
24252627282930
31