If you’re familiar with linear regression in R, you’ve probably encountered the traditional lm() function. While it is a powerful tool, it may not be the best choice when your data contain outliers or influential observations. In such cases, robust regression comes to the rescue, and in R the rlm() function from the MASS package is a valuable resource. In this blog post, we’ll walk through the step-by-step process of performing robust regression in R, using a dataset to illustrate the differences between the base R lm() model and the robust rlm() model.
The short version is that ordinary least squares (the form of linear regression behind lm()) is quite sensitive to outliers. rlm(), by contrast, uses a technique known as M-estimation, which weights outlying points differently from inliers, so a small number of outliers is far less able to wreck the fit.
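To make that concrete, here's a minimal sketch (using simulated data rather than the post's dataset) that injects a single large outlier, fits both models, and inspects the M-estimation weights that rlm() assigns:

```r
# A minimal sketch: compare lm() and rlm() on data with one injected outlier.
library(MASS)  # provides rlm()

set.seed(42)
x <- 1:20
y <- 2 * x + rnorm(20, sd = 1)
y[20] <- y[20] + 40  # inject a single large outlier

fit_ols    <- lm(y ~ x)
fit_robust <- rlm(y ~ x)  # M-estimation (Huber weights by default)

coef(fit_ols)     # OLS slope gets pulled toward the outlier
coef(fit_robust)  # robust slope stays close to the true value of 2

# rlm() stores the final M-estimation weights in $w; the outlier
# receives a weight well below 1, so it contributes less to the fit.
round(fit_robust$w, 2)
```

Printing fit_robust$w is a quick way to see robustness in action: inlying points keep weights at or near 1, while the injected outlier is down-weighted automatically during fitting.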