Regularization Prevents Overfitting

Hui Li has an explanation of what regularization is and how it works to reduce the likelihood of overfitting training data:

Assume that the red line [in the figure from the original post] is the regression model we learn from the training data set. It can be seen that the learned model fits the training data set perfectly, but it does not generalize well to data not included in the training set. There are several ways to avoid the problem of overfitting.

To remedy this problem, we could:

  • Get more training examples.
  • Use a simpler predictor.
  • Select a subsample of features.

In this blog post, we focus on the second and third ways to avoid overfitting by introducing regularization on the parameters β_i of the model.

Read the whole thing.
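To make the idea concrete, here is a minimal sketch of L2 (ridge) regularization, assuming Python with scikit-learn rather than anything from the linked post. Instead of minimizing only the squared error, ridge regression minimizes sum_i (y_i - ŷ_i)^2 + λ * sum_j β_j^2, so large coefficients are penalized and the fitted curve stays smoother:

    # A minimal illustration (not from the linked post): fit a noisy curve
    # with a high-degree polynomial, with and without an L2 penalty.
    import numpy as np
    from sklearn.linear_model import LinearRegression, Ridge
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures

    rng = np.random.RandomState(0)
    X = np.sort(rng.uniform(0, 1, 20)).reshape(-1, 1)
    y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, 20)

    # Degree-15 polynomial: flexible enough to memorize 20 points.
    plain = make_pipeline(PolynomialFeatures(degree=15), LinearRegression())
    ridge = make_pipeline(PolynomialFeatures(degree=15), Ridge(alpha=1e-3))
    plain.fit(X, y)
    ridge.fit(X, y)

    # The L2 penalty shrinks the coefficients toward zero; the unpenalized
    # fit typically has coefficients orders of magnitude larger.
    print(np.abs(plain.named_steps['linearregression'].coef_).max())
    print(np.abs(ridge.named_steps['ridge'].coef_).max())

Here alpha plays the role of λ: alpha = 0 recovers ordinary least squares, and larger values shrink the β_j more aggressively, trading a little bias for much lower variance.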

