Data Manipulation In R

Kevin Feasel



Casimir Saternos has an article on matrix operations and other data transformations in R:

Operations that are conceptually simple can be difficult to perform using SQL. Consider the common requirements to pivot or transpose a dataset. Each of these actions is conceptually straightforward but complex to implement using SQL. The examples that follow are somewhat verbose, but the details are not significant. The main point to illustrate is that, by using specialized functions outside of SQL, R makes trivial some of those operations that would otherwise require complex SQL statements. The contrast in the amount of code required is striking. The simpler approach allows you to focus attention on the scientific or business problem at hand, rather than expending energy reading documentation or laboriously testing complex statements.

I consider this to be where the second-order value of R comes in.  The initial “wow” factor is in how easily you can plot things, and this ease of data cleansing is the next big time-saver.
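
To make the pivot and transpose point concrete, here is a minimal sketch in base R. It is not code from Casimir's article; the `sales` data frame, its column names, and its values are made up for illustration.

```r
# Hypothetical long-format data: one row per (region, quarter) pair.
sales <- data.frame(
  region  = c("East", "East", "West", "West"),
  quarter = c("Q1", "Q2", "Q1", "Q2"),
  amount  = c(100, 120, 90, 110)
)

# Pivot: one row per region, one column per quarter.
# xtabs() sums `amount` within each region/quarter combination.
pivoted <- xtabs(amount ~ region + quarter, data = sales)

# Transpose: swap rows and columns with a single call to t().
transposed <- t(pivoted)

print(pivoted)
print(transposed)
```

Each operation is a single function call here. The T-SQL equivalent would typically involve a `PIVOT` clause or a set of conditional aggregates, with the output column list spelled out by hand.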

January 2016