Better Grouping With dplyr

Kevin Feasel

2017-07-13

R

John Mount builds a function to improve upon the group-by to mutate model in dplyr:

The advantages of the shorthand are:

  • The analyst only has to specify the grouping column once.
  • The data (mtcars) enters the pipeline only once.
  • The analyst doesn’t have to start thinking about joins immediately.

Frankly I’ve never liked the shorthand. I feel it is a “magic extra” that a new user would have no way of anticipating from common use of group_by() and summarize(). I very much like the idea of wrapping this important common use case into a single verb. Adjoining “windowed” or group-calculated columns is a common and important step in analysis, and well worth having its own verb.

Below is our attempt at elevating this pattern into a packaged verb.

Click through for the script.  I’d like to see something like this make its way into dplyr.

Related Posts

Installing R From Powershell

Tomaz Kastrun shows us how to install R and RStudio via Powershell: For the brevity of this post, I will only download couple of R packages from CRAN repository, but this list is indefinite.There are ways many ways to retrieve the CRAN packages for particular R version using powershell. I will just demonstrate this by […]

Read More

Solving The Monty Hall Problem With R

Miroslav Rajter builds a Monty Hall problem simulator using R: The original and most simple scenario of the Monty Hall problem is this: You are in a prize contest and in front of you there are three doors (A, B and C). Behind one of the doors is a prize (Car), while behind others is […]

Read More

Categories

July 2017
MTWTFSS
« Jun Aug »
 12
3456789
10111213141516
17181920212223
24252627282930
31