Basic Data Tidying

Kevin Feasel



Sarah Dutkiewicz tidies up a data set in R:

Looking at this data, the first thing I thought was untidy. There has to be a better way. When I think of tidy data, I think of the tidyr package, which is used to help make data tidy, easier to work with. Specifically, I thought of the spread() function, where I could break things up. Once data was spread into appropriate columns, I figure I can operate on the data a bit better.
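To make the idea concrete, here is a minimal sketch of what spread() does. It does not use Sarah's data set: the `long` data frame and its column names (`id`, `attribute`, `value`) are hypothetical and exist only for illustration.

```r
library(tidyr)

# Hypothetical long-format data (not Sarah's data set):
# one row per (id, attribute) pair.
long <- data.frame(
  id = c(1, 1, 2, 2),
  attribute = c("height", "weight", "height", "weight"),
  value = c(170, 65, 180, 80),
  stringsAsFactors = FALSE
)

# spread() turns each distinct value in the key column (`attribute`)
# into its own column and fills the cells from the value column (`value`).
wide <- spread(long, key = attribute, value = value)

print(wide)
```

The result should have one row per `id`, with `height` and `weight` as separate columns. In tidyr 1.0.0 and later, spread() still works but is superseded by pivot_wider(), so new code may prefer that function.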

Sarah has also made the data set available in case you’re interested in following along.
