So far so good. Let’s now remove the “intercept term” by adding “`0+`” to the fitting command.

`m2 <- lm(y~0+x, data=d)`

`t(broom::glance(m2))`

`##                        [,1]`
`## r.squared      7.524811e-01`
`## adj.r.squared  7.474297e-01`
`## sigma          3.028515e-01`
`## statistic      1.489647e+02`
`## p.value        1.935559e-30`
`## df             2.000000e+00`
`## logLik        -2.143244e+01`
`## AIC            4.886488e+01`
`## BIC            5.668039e+01`
`## deviance       8.988464e+00`
`## df.residual    9.800000e+01`

`d$pred2 <- predict(m2, newdata = d)`

Uh oh. That appeared to vastly improve the reported `R-squared` and the significance (“`p.value`”)!

Read on to learn why this happens and how you can prevent this from tricking you in the future.
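As a rough illustration of the mechanism, here is a minimal sketch on simulated data. The excerpt does not show the original `d`, so the data frame below and the helper `r2_centered` are assumptions made up for this example. For a model fit without an intercept, `summary.lm` (which `broom::glance` reads from) measures R-squared against a "predict zero" baseline, using the uncentered sum of squares of `y`, rather than the usual "predict the mean of `y`" baseline. When `y` has a mean far from zero, that change of baseline alone can make a worse fit report a higher R-squared.

```r
# Hypothetical data (not the data from the post): y has a mean far from zero,
# so a line forced through the origin fits poorly.
set.seed(2017)
d <- data.frame(x = runif(100))
d$y <- 5 + 0.5 * d$x + rnorm(100, sd = 0.3)

m1 <- lm(y ~ x, data = d)      # with intercept
m2 <- lm(y ~ 0 + x, data = d)  # intercept removed

# R-squared as reported by summary.lm for each model
r2_reported_m1 <- summary(m1)$r.squared
r2_reported_m2 <- summary(m2)$r.squared

# Hypothetical helper: R-squared against the "predict mean(y)" baseline,
# applied the same way to both models.
r2_centered <- function(model, y) {
  rss <- sum((y - fitted(model))^2)
  1 - rss / sum((y - mean(y))^2)
}

# The no-intercept report uses a "predict zero" baseline instead:
# total sum of squares is sum(y^2), not sum((y - mean(y))^2).
r2_uncentered_m2 <- 1 - sum(residuals(m2)^2) / sum(d$y^2)

print(c(reported_m1 = r2_reported_m1, reported_m2 = r2_reported_m2))
print(c(centered_m1 = r2_centered(m1, d$y), centered_m2 = r2_centered(m2, d$y)))
```

Comparing both models against the same baseline, as `r2_centered` does, is one way to avoid being misled: the no-intercept model can never score higher than the intercept model on that common scale, because it is fit over a smaller set of candidate lines.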

Kevin Feasel

2017-06-19

Data Science, R