Sebastian Sauer shows us a pitfall of brute-force conversion of factors to integers:
Oh no! That’s not what we wanted! R has messed the thing up (?). The reason is that R sees the first factor level internally as the number 1 and the second level as the number 2. What’s the first factor level in our case? Let’s see:
factor(tips$sex) %>% head()
#> [1] Female Male Male Male Female Male
#> Levels: Female Male

factor(tips$sex_r) %>% head()
#> [1] 1 0 0 0 1 0
#> Levels: 0 1
That’s confusing: “0” is the first level of sex_r, which R represents internally as 1. The second level of sex_r is “1”, which R represents internally as 2.
Fortunately, we get the easy answer at the end of the post.
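To make the pitfall concrete, here is a minimal sketch in base R. It uses a small made-up vector (`sex_r` below is a hypothetical stand-in for `tips$sex_r`, not the real data), and the workaround shown is the idiom documented in `?factor`, which may or may not be the answer the post itself lands on.

```r
# Hypothetical toy data standing in for tips$sex_r: 0/1 values stored as a factor
sex_r <- factor(c(1, 0, 0, 0, 1, 0))

# The levels are the labels "0" and "1", in sorted order
levels(sex_r)

# Pitfall: as.integer() returns the internal level codes (1, 2), not the labels
codes <- as.integer(sex_r)

# Converting to character first recovers the labels, which then parse as numbers
values <- as.integer(as.character(sex_r))

# Equivalent idiom from ?factor: convert the levels once, then index by the factor
values2 <- as.integer(levels(sex_r))[sex_r]

print(codes)
print(values)
print(values2)
```

With this toy vector, `codes` holds the shifted values 2 and 1, while `values` and `values2` hold the original 1s and 0s. The second idiom converts only the handful of levels rather than every element, which can matter for long vectors.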