The Assumptive Nature Of R

Kevin Feasel



Tim Sweetser and Kyle Schmaus explain some of the less-obvious bits of R that make it harder to use as a production language:

For us, the biggest surprise when using an R data.frame is what happens when you try to access a nonexistent column. Suppose we wanted to do something with the prices of our diamonds. price is a valid column of diamonds, but say we forgot the name and thought it was title case. When we ask for diamonds[["Price"]], R returns NULL rather than throwing an error! This is the behavior not just for tibble, but for data.tableand data.frame as well. For production jobs, we need things to fail loudly, i.e. throw errors, in order to get our attention. We’d like this loud failure to occur when, for example, some upstream data change breaks our script’s assumptions. Otherwise, we assume everything ran smoothly and as intended. This highlights the difference between interactive use, where R shines, and production use.

Read on for several good points along these lines.

Related Posts

Linear Discriminant Analysis

Jake Hoare explains Linear Discriminant Analysis: Linear Discriminant Analysis takes a data set of cases (also known as observations) as input. For each case, you need to have a categorical variable to define the class and several predictor variables (which are numeric). We often visualize this input data as a matrix, such as shown below, with each case being a row and each variable a column. In this […]

Read More

Azure Data Lake Store File Management With httr

Leila Etaati shows how to generate RESTful statements in R using httr: In this post, I am going to share my experiment in how to do file management in ADLS using R studio, to do this you need to have below items 1. An Azure subscription 2. Create an Azure Data Lake Store Account 3. […]

Read More


June 2017
« May Jul »