Eric Cai demonstrates the difference between levels() and unique() when dealing with factors in R:
The new data set “iris2” does not have any rows containing “setosa” as a possible value of “Species”, yet the levels() function still shows “setosa” in its output.
According to the user G5W in Stack Overflow, this is a desirable behaviour for the levels() function. Here is my interpretation of the intent behind the creators of base R: The possible values of a factor are fundamental attributes of that variable, which should not be altered because of changes in the data.
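To see the behaviour for yourself, here is a minimal sketch using the built-in iris data set. I am assuming "iris2" is simply iris with the setosa rows filtered out; Eric's actual code may build it differently. The droplevels() call at the end is my own addition, showing one way to discard unused levels when you do want them gone.

```r
# Assumption: iris2 is iris with the setosa rows removed
# (the original post may construct it another way).
iris2 <- iris[iris$Species != "setosa", ]

# levels() reports every value the factor is allowed to take,
# including "setosa", which no longer appears in any row.
levels(iris2$Species)

# unique() reports only the values actually present in the data.
as.character(unique(iris2$Species))

# droplevels() returns a copy of the factor without the unused levels.
levels(droplevels(iris2$Species))
```

Note that unique() on a factor still returns a factor carrying the full set of levels, which is why the sketch wraps it in as.character() to show only the observed values.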
There’s some back-and-forth in the comments; my takeaway is that both are useful functions depending upon what, exactly, you want to learn.