The Assumptive Nature Of R

Kevin Feasel

2017-06-16

R

Tim Sweetser and Kyle Schmaus explain some of the less-obvious bits of R that make it harder to use as a production language:

For us, the biggest surprise when using an R data.frame is what happens when you try to access a nonexistent column. Suppose we wanted to do something with the prices of our diamonds. price is a valid column of diamonds, but say we forgot the name and thought it was title case. When we ask for diamonds[["Price"]], R returns NULL rather than throwing an error! This is the behavior not just for tibble, but for data.tableand data.frame as well. For production jobs, we need things to fail loudly, i.e. throw errors, in order to get our attention. We’d like this loud failure to occur when, for example, some upstream data change breaks our script’s assumptions. Otherwise, we assume everything ran smoothly and as intended. This highlights the difference between interactive use, where R shines, and production use.

Read on for several good points along these lines.

Related Posts

Building An Image Recognizer With R

David Smith has a post showing how to build an image recognizer with R and Microsoft’s Cognitive Services Library: The process of training an image recognition system requires LOTS of images — millions and millions of them. The process involves feeding those images into a deep neural network, and during that process the network generates […]

Read More

Checkpointing Code For Reproduction

David Smith tells an interesting story about a reproducibility problem with data analysis: Timo Grossenbacher, data journalist with Swiss Radio and TV in Zurich, had a bit of a surprise when he attempted to recreate the results of one of the R Markdown scripts published by SRF Data to accompany their data journalism story about vested interests of Swiss […]

Read More

Categories

June 2017
MTWTFSS
« May Jul »
 1234
567891011
12131415161718
19202122232425
2627282930