I’d summarize the two “competing” curricula as follows:
- Base R first: teach syntax such as `[]`, loops and conditionals, data types (numeric, character, data frame, matrix), and built-in functions like `tapply`. Possibly follow up by introducing dplyr or data.table as alternatives.
- Tidyverse first: start from scratch with the dplyr package for manipulating a data frame, and introduce others like ggplot2, tidyr and purrr shortly afterwards. Introduce the `%>%` operator from magrittr immediately, but skip syntax like `$`, or leave it for late in the course. Keep a single-minded focus on data frames.
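To make the contrast concrete, here is a rough sketch of one small task written in each style. It is only an illustration, not material from any particular course: the built-in `mtcars` dataset, the weight cutoff of 3, and the variable names are all assumptions chosen for the example.

```r
# Illustrative sketch only: mtcars, the wt > 3 cutoff, and all variable
# names are assumptions, not taken from an actual curriculum.

# Base R first: subset rows with [] and $, then take a grouped mean
# with tapply().
heavy <- mtcars[mtcars$wt > 3, ]
base_result <- tapply(heavy$mpg, heavy$cyl, mean)

# Tidyverse first: the same steps written as a dplyr pipeline.
# library(dplyr) also makes the %>% operator available.
library(dplyr)

tidy_result <- mtcars %>%
  filter(wt > 3) %>%
  group_by(cyl) %>%
  summarize(mean_mpg = mean(mpg))
```

Both versions compute the mean miles per gallon per cylinder count for the heavier cars. The base R version returns a named vector, while the dplyr version returns a data frame, which fits the tidyverse-first focus on data frames.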
I’ve come to strongly prefer the “tidyverse first” educational approach. This isn’t a trivial decision, and this post is my attempt to summarize my opinions and arguments for this position. Overall, they mirror my opinions about ggplot2: packages like dplyr and tidyr are not “advanced”; they’re suitable as a first introduction to R.
I think tidyverse first is the better of the two approaches, particularly for people who already have some experience with languages like SQL.