Press "Enter" to skip to content

Using drop = FALSE On Data Frames

John Mount explains why you might want to add drop = FALSE to your data.frame operations:

We were merely trying to re-order the rows and the result was converted to a vector. This happened because the rules for [ , ] change if there is only one result column. This happens even if the there had been only one input column. Another example is: d[,] is also vector in this case.

The issue is: if we are writing re-usable code we are often programming before we know complete contents of a variable or argument. For a data.frame named “g” supplied as an argument: g[vec, ] can be a data.frame or a vector (or even possibly a list). However we do know if g is a data.frame then g[vec, , drop = FALSE] is also a data.frame(assuming vec is a vector of valid row indices or a logical vector, note: NA induces some special cases).

We care as vectors and data.frames have different semantics, so are not fully substitutable in later code.

Definitely read the comments on this one as well, as John extends his explanation and others chime in with very useful notes.