The Tidyverse Curse

Kevin Feasel

2017-03-27

R

Bob Muenchen notes a structural conflict between R and its most common set of packages:

There’s a common theme in many of the sections above: a task that is hard to perform using a base R function is made much easier by a function in the dplyr package. That package, and its relatives, are collectively known as the tidyverse. Its functions help with many tasks, such as selecting, renaming, or transforming variables, filtering or sorting observations, combining data frames, and doing by-group analyses. dplyr is such a helpful package that Rdocumentation.org shows that it is the single most popular R package (as of 3/23/2017). As much of a blessing as these commands are, they’re also a curse to beginners as they’re more to learn. The main packages of dplyr, tibble, tidyr, and purrr contain a few hundred functions, though I use “only” around 60 of them regularly. As people learn R, they often comment that base R functions and tidyverse ones feel like two separate languages. The tidyverse functions are often the easiest to use, but not always; its pipe operator is usually simpler to use, but not always; tibbles are usually accepted by non-tidyverse functions, but not always; grouped tibbles may help do what you want automatically, but not always (i.e. you may need to ungroup or group_by higher levels). Navigating the balance between base R and the tidyverse is a challenge to learn.

Interesting read. As Bob notes in the comments, he’s still a fan of the tidyverse, but it’s important to recognize that it has pain points.
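To make the "two separate languages" point concrete, here is a minimal sketch of the same by-group summary written in base R and in dplyr, followed by a case where a grouped tibble has to be ungrouped before the next step. The `scores` data frame and its column names are hypothetical, invented for illustration; they do not come from Bob's post. It assumes the dplyr package is installed.

```r
library(dplyr)

# Hypothetical toy data, invented for this illustration
scores <- data.frame(
  team  = c("a", "a", "b", "b"),
  score = c(10, 20, 30, 50)
)

# Base R: mean score per team using the formula interface of aggregate()
base_means <- aggregate(score ~ team, data = scores, FUN = mean)

# dplyr: the same by-group summary, written as a pipeline
dplyr_means <- scores %>%
  group_by(team) %>%
  summarise(mean_score = mean(score))

# Grouping persists through mutate(), so sum(score) is computed per team
# until ungroup() is called; after that, sum(score) covers all rows
shares <- scores %>%
  group_by(team) %>%
  mutate(share_in_team = score / sum(score)) %>%
  ungroup() %>%
  mutate(share_overall = score / sum(score))
```

Leaving out the `ungroup()` call would make `share_overall` identical to `share_in_team`, which is the kind of surprise the quote alludes to.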

