Getting Started With dplyr

Kevin Feasel



Abdul Majed Raja has a dplyr tutorial:

dplyr is one of the most popular r-packages and also part of tidyverse that’s been developed by Hadley Wickham. The mere fact that dplyr package is very famous means, it’s one of the most frequently used. Being a data scientist is not always about creating sophisticated models but Data Analysis (Manipulation) and Data Visualization play a very important role in BAU of many us – in fact, a very important part before any modeling exercise since Feature Engineering and EDA are the most important differentiating factors of your model and someone else’s.
Hence, this post aims to bring out some well-known and not-so-well-known applications of dplyr so that any data analyst could leverage its potential using a much familiar – Titanic Dataset.

This covers the main pieces  of dplyr, including its pipeline.  dplyr is a key part of the tidyverse, and knowing it well makes R so much easier.  H/T R-Bloggers

Related Posts

Solving The Monty Hall Problem With R

Miroslav Rajter builds a Monty Hall problem simulator using R: The original and most simple scenario of the Monty Hall problem is this: You are in a prize contest and in front of you there are three doors (A, B and C). Behind one of the doors is a prize (Car), while behind others is […]

Read More

Control Table Keys In cdata

John Mount announces a new feature in the cdata package: In our cdata R package and training materials we emphasize the record-oriented thinking and how to design a transform control table. We now have an additional exciting new feature: control table keys.The user can now control which columns of a cdata control table are the keys, including now using composite keys (that is […]

Read More


December 2017
« Nov Jan »