Parallelization With Rcpp

Kevin Feasel



Blazej Moska demonstrates how to use Rcpp to parallelize R code:

One of the frustrating moments while working with data is when you need results urgently, but your dataset is large enough to make it impossible. This happens often when we need to use algorithm with high computational complexity. I will demonstrate it on the example I’ve been working with.

Suppose we have large dataset consisting of association rules. For some reasons we want to slim it down. Whenever two rules consequents are the same and one rule’s antecedent is a subset of second rule’s antecedent, we want to choose the smaller one (probability of obtaining smaller set is bigger than probability of obtaining bigger set).

Read the whole thing.

Related Posts

Packages For Testing R Packages

Maelle Salmon shows us how to test our R packages within R: If you’re brand-new to unit testing your R package, I’d recommend reading this chapter from Hadley Wickham’s book about R packages. There’s an R package called RUnit for unit testing, but in the whole post we’ll mention resources around the testthat package since it’s the one we use in […]

Read More

Reshaping Data Frames With tidyr

Anisa Dhana shows off some of the data reshaping functionality available in the tidyr package: As it is shown above, the variable agegp has 6 groups (i.e., 25-34, 35-44) which has different alcohol intake and smoking use combinations. I think it would be interesting to transform this dataset from long to wide and to create a column for each […]

Read More


January 2018
« Dec Feb »