Press "Enter" to skip to content

The Assumptive Nature Of R

Tim Sweetser and Kyle Schmaus explain some of the less-obvious bits of R that make it harder to use as a production language:

For us, the biggest surprise when using an R data.frame is what happens when you try to access a nonexistent column. Suppose we wanted to do something with the prices of our diamonds. price is a valid column of diamonds, but say we forgot the name and thought it was title case. When we ask for diamonds[["Price"]], R returns NULL rather than throwing an error! This is the behavior not just for tibble, but for data.tableand data.frame as well. For production jobs, we need things to fail loudly, i.e. throw errors, in order to get our attention. We’d like this loud failure to occur when, for example, some upstream data change breaks our script’s assumptions. Otherwise, we assume everything ran smoothly and as intended. This highlights the difference between interactive use, where R shines, and production use.

Read on for several good points along these lines.