Learning R

Kevin Feasel



Grant Fritchey is learning R:

Awesome. Fixed that algorithm problem, right?


That’s because algorithms are not the problem… the only problem. The real problem is data preparation. A lot of the examples you’ll read online are very straightforward with nice neat data sets. That’s because they were carefully groomed and prepared. Here I am looking at the wooly wild real data and I’m utterly lost in how to properly prepare this so that it’s appropriately set up as a continuous distribution (or a distribution at all). WOOF! The reason this is so hard is because I actually don’t understand the data fundamentals of the problem I’m trying to solve in exactly the way needed to solve the problem. More cogitation is necessary.

Just because you can write R code doesn’t mean you are a data scientist. Grant has the right mindset, but this post is fair warning that R’s complexity lies less in its being a DSL than in the domain itself.



December 2015