Getting Started With dplyr

Kevin Feasel

2017-12-29

R

Abdul Majed Raja has a dplyr tutorial:

dplyr is one of the most popular r-packages and also part of tidyverse that’s been developed by Hadley Wickham. The mere fact that dplyr package is very famous means, it’s one of the most frequently used. Being a data scientist is not always about creating sophisticated models but Data Analysis (Manipulation) and Data Visualization play a very important role in BAU of many us – in fact, a very important part before any modeling exercise since Feature Engineering and EDA are the most important differentiating factors of your model and someone else’s.
Hence, this post aims to bring out some well-known and not-so-well-known applications of dplyr so that any data analyst could leverage its potential using a much familiar – Titanic Dataset.

This covers the main pieces  of dplyr, including its pipeline.  dplyr is a key part of the tidyverse, and knowing it well makes R so much easier.  H/T R-Bloggers

Related Posts

The Theory Behind cdata

John Mount has a video explaining the concepts behind cdata: We also have two really nifty articles on the theory and methods: Fluid data reshaping with cdata Coordinatized Data: A Fluid Data Specification Please give it a try! Click through for the video, which I found very helpful in tying together a number of data […]

Read More

Microsoft R Open 3.4.3

David Smith announces Microsoft R Open 3.4.3: Microsoft R Open (MRO), Microsoft’s enhanced distribution of open source R, has been upgraded to version 3.4.3 and is now available for download for Windows, Mac, and Linux. This update upgrades the R language engine to the latest R (version 3.4.3) and updates the bundled packages (specifically: checkpoint, curl, doParallel, foreach, and iterators) to new versions. MRO is 100% compatible with […]

Read More

Categories

December 2017
MTWTFSS
« Nov Jan »
 123
45678910
11121314151617
18192021222324
25262728293031