Moving From reshape2 To tidyr

Kevin Feasel

2017-12-19

R

Martin Johnsson talks about a couple of tricky bits when moving from reshape2 to tidyr:

In practice, I don’t think people always take their data frames all the way to tidy. For example, to make a scatterplot, it is convenient to keep a couple of variables as different columns. The key is that we need to move between different forms rapidly (brain time-rapidly, more than computer time-rapidly, I might add).

And not everything should be organized this way. If you’re a geneticist, genotypes are notoriously inconvenient in normalized form. Better keep that individual by marker matrix.

The first serious piece of R code I wrote for someone else was a function to turn data into long form for plotting. I suspect plotting is often the gateway to tidy data. The function was like what you’d expect from R code written by a beginner who comes from C-style languages: It reinvented the wheel, and I bet it had nested for loops, a bunch of hard bracket indices, and so on. Then I discovered reshape2.

I’d not used reshape2 before, having started with tidyr, so it was interesting to see the contrast.
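
For readers who have only used one of the two packages, here is a minimal sketch of the same wide-to-long-and-back round trip in each. The data frame and its column names (id, height, weight, trait) are made up for illustration and do not come from Martin's post; gather() and spread() are the tidyr verbs that were current when this was written.

```r
library(reshape2)
library(tidyr)

# Hypothetical wide data: one row per individual, one column per measurement
wide <- data.frame(
  id = c("a", "b", "c"),
  height = c(10, 12, 11),
  weight = c(3.1, 3.4, 2.9),
  stringsAsFactors = FALSE
)

# reshape2: wide to long with melt()
long_melt <- melt(wide, id.vars = "id",
                  variable.name = "trait", value.name = "value")

# tidyr: wide to long with gather()
long_gather <- gather(wide, key = "trait", value = "value", height, weight)

# reshape2: long back to wide with dcast() and a formula
wide_dcast <- dcast(long_melt, id ~ trait, value.var = "value")

# tidyr: long back to wide with spread()
wide_spread <- spread(long_gather, key = trait, value = value)
```

One difference that shows up even in this small case: melt() returns the trait column as a factor, while gather() returns it as a character vector by default.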

