Slicing In R

Kevin Feasel



John Mount recommends learning about the array slicing system in R:

R has a very powerful array slicing ability that allows for some very slick data processing.

Suppose we have a data.frame “d”, and for every row where d$n_observations < 5 we wish to “NA-out” some other columns (mark them as not yet reliably available). Using slicing techniques this can be done quite quickly as follows.

d[d$n_observations < 5, qc(mean_cost, mean_revenue, mean_duration)] <- NA

Read on for more. In general, I prefer the pipeline mechanics the Tidyverse offers for readability, but this is a good example of why you should know both styles.
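To make the quoted one-liner concrete, here is a minimal sketch in base R. The data frame contents are made up for illustration, and c() with quoted names stands in for qc(), which comes from the wrapr package rather than base R.

```r
# Hypothetical example data; the column names follow the quoted snippet.
d <- data.frame(
  n_observations = c(2, 10, 4, 7),
  mean_cost      = c(1.5, 2.0, 3.5, 4.0),
  mean_revenue   = c(10, 20, 30, 40),
  mean_duration  = c(5, 6, 7, 8)
)

# Columns to blank out; c() with quoted names stands in for wrapr's qc().
cols <- c("mean_cost", "mean_revenue", "mean_duration")

# Logical row index: TRUE where there are too few observations.
too_few <- d$n_observations < 5

# Assign NA to the selected rows and columns in a single slicing step.
d[too_few, cols] <- NA

print(d)
```

The row index is a logical vector and the column index is a character vector, so one assignment covers every matching cell. If n_observations could itself contain NA, the logical index would need handling first, for instance by wrapping it in which().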


