Useful dplyr Functions

Kevin Feasel

2017-07-12

R

S. Richter-Walsh explains seven important dplyr functions with plenty of examples:

There are many useful functions contained within the dplyr package. This post does not attempt to cover them all but does look at the major functions that are commonly used in data manipulation tasks. These are:

select()
filter()
mutate()
group_by()
summarise()
arrange()
join()

The data used in this post are taken from the UCI Machine Learning Repository and contain census information from 1994 for the USA. The dataset can be used for classification of income class in a machine learning setting and can be obtained here.

That’s probably the bare minimum you should know about dplyr, but knowing just these seven can make data analysis in R much easier.

Related Posts

Reasons For Using Docker With R

Jeroen Ooms gives us a few reasons why we might want to containerize our R-based products: The flagship of the OpenCPU system is the OpenCPU server: a mature and powerful Linux stack for embedding R in systems and applications. Because OpenCPU is completely open source we can build and ship on DockerHub. A ready-to-go linux server […]

Read More

Linear Discriminant Analysis

Jake Hoare explains Linear Discriminant Analysis: Linear Discriminant Analysis takes a data set of cases (also known as observations) as input. For each case, you need to have a categorical variable to define the class and several predictor variables (which are numeric). We often visualize this input data as a matrix, such as shown below, with each case being a row and each variable a column. In this […]

Read More

Categories

July 2017
MTWTFSS
« Jun Aug »
 12
3456789
10111213141516
17181920212223
24252627282930
31