Slicing In R

Kevin Feasel



John Mount recommends learning about the array slicing system in R:

R has a very powerful array slicing ability that allows for some very slick data processing.

Suppose we have a data.frame “d“, and for every row where d$n_observations < 5 we wish to “NA-out” some other columns (mark them as not yet reliably available). Using slicing techniques this can be done quite quickly as follows.

d[d$n_observations < 5, qc(mean_cost, mean_revenue, mean_duration)] <- NA

Read on for more.  In general, I prefer the pipeline mechanics offered with the Tidyverse for readability.  But this is a good example of why you should know both styles.

Related Posts

Timing R Function Calls

Colin Gillespie shows off an R package for benchmarking: Of course, it’s more likely that you’ll want to compare more than two things. You can compare as many function calls as you want with mark(), as we’ll demonstrate in the following example. It’s probably more likely that you’ll want to compare these function calls against more […]

Read More

Exploratory Data Analysis with inspectdf

Laura Ellis continues a dive into Exploratory Data Analysis, this time using the inspectdf package: I like this package because it’s got a lot of functionality and it’s incredibly straightforward to use. In short, it allows you to understand and visualize column types, sizes, values, value imbalance & distributions as well as correlations. Better yet, […]

Read More


April 2018
« Mar May »