Useful dplyr Functions

Kevin Feasel

2017-07-12

R

S. Richter-Walsh explains seven important dplyr functions with plenty of examples:

There are many useful functions contained within the dplyr package. This post does not attempt to cover them all but does look at the major functions that are commonly used in data manipulation tasks. These are:

select()
filter()
mutate()
group_by()
summarise()
arrange()
join()

The data used in this post are taken from the UCI Machine Learning Repository and contain census information from 1994 for the USA. The dataset can be used for classification of income class in a machine learning setting and can be obtained here.

That’s probably the bare minimum you should know about dplyr, but knowing just these seven can make data analysis in R much easier.

Related Posts

Vectors for Programmers

John Mount has a couple of videos available: We have just released two new free video lectures on vectors from a programmer’s point of view. I am experimenting with what ideas do programmers find interesting about vectors, what concepts do they consider safe starting points, and how to condense and present the material. Click through […]

Read More

Defining Tidy Data

John Mount shares thoughts about the concept of tidy data: A question is: is such a data set “tidy”? The paper itself claims the above definitions are “Codd’s 3rd normal form.” So, no the above table is not “tidy” under that paper’s definition. The the winner’s date of birth is a fact about the winner […]

Read More

Categories

July 2017
MTWTFSS
« Jun Aug »
 12
3456789
10111213141516
17181920212223
24252627282930
31