Useful dplyr Functions

Kevin Feasel

2017-07-12

R

S. Richter-Walsh explains seven important dplyr functions with plenty of examples:

There are many useful functions contained within the dplyr package. This post does not attempt to cover them all but does look at the major functions that are commonly used in data manipulation tasks. These are:

select()
filter()
mutate()
group_by()
summarise()
arrange()
join()

The data used in this post are taken from the UCI Machine Learning Repository and contain census information from 1994 for the USA. The dataset can be used for classification of income class in a machine learning setting and can be obtained here.

That’s probably the bare minimum you should know about dplyr, but knowing just these seven can make data analysis in R much easier.

Related Posts

Economic Articles With Data Included

Sebastian Kranz has a Shiny app to help you find economic papers with included data: One gets some information about the size of the data files and the used code files. I also tried to find and extract a README file from each supplement. Most README files explain whether all results can be replicated with […]

Read More

Giving A Name To The R Pipe

John Mount noodles an idea from Hadley Wickham: I’d say this fails on at least two counts, the first “%then%” doesn’t seem grammatical (as d is a noun), and magrittr pipes can’t be associated with a new name (as they are implemented by looking for theirselves by name in captured unevaluated code). However, the wrapr dot arrow pipe can take on new names. […]

Read More

Categories

July 2017
MTWTFSS
« Jun Aug »
 12
3456789
10111213141516
17181920212223
24252627282930
31