For us, the biggest surprise when using an R data.frame is what happens when you try to access a nonexistent column. Suppose we wanted to do something with the prices of our diamonds. price is a valid column of diamonds, but say we forgot the name and thought it was title case. When we ask for diamonds[["Price"]], R returns NULL rather than throwing an error! This is the behavior not just for tibble, but for data.table and data.frame as well. For production jobs, we need things to fail loudly, i.e., throw errors, in order to get our attention. We’d like this loud failure to occur when, for example, some upstream data change breaks our script’s assumptions. Otherwise, we assume everything ran smoothly and as intended. This highlights the difference between interactive use, where R shines, and production use.
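As a rough illustration of the point above, here is a minimal sketch in base R. The two-row diamonds data frame is a stand-in for the real dataset, and get_col is a hypothetical helper, not something from the original post; it shows one possible way to make a missing column fail loudly.

```r
# Stand-in for the diamonds data (assumption: only the column names matter here).
diamonds <- data.frame(carat = c(0.23, 0.21), price = c(326L, 326L))

# Silent failure: a misspelled column name gives NULL instead of an error.
missing_is_null <- is.null(diamonds[["Price"]])

# Hypothetical helper: stop with an error when the column does not exist.
get_col <- function(df, name) {
  if (!name %in% names(df)) {
    stop("Column '", name, "' not found", call. = FALSE)
  }
  df[[name]]
}

# The correct name returns the column as usual.
prices <- get_col(diamonds, "price")

# The wrong name now raises an error; capture its message for inspection.
result <- tryCatch(
  get_col(diamonds, "Price"),
  error = function(e) conditionMessage(e)
)
```

In a production script you would typically let the error propagate rather than catch it, so that the job stops as soon as an upstream change breaks its assumptions.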
Read on for several good points along these lines.