Parallelism In R

Kevin Feasel

2017-09-06

R

Florian Prive shows off a few methods for parallelizing code in R:

Parallelize with foreach

You need to do at least two things:

  • replace %do% by %dopar%. Basically, always use %dopar% because you can use registerDoSEQ() is you really want to run the foreach sequentially.

  • register a parallel backend using one of the packages that begin with do (such as doParalleldoMCdoMPI and more). I will list only the two main parallel backends because there are too many of them.

Check it out.  Florian spends a lot of time with foreach and doParallel, a little bit of time with flock, and mentions Microsoft R Open.  H/T R-Bloggers

Related Posts

Setting Up SparklyR In Azure

David Smith shows how you can spin up a Spark cluster in Azure and install SparklyR on top of it: The SparklyR package from RStudio provides a high-level interface to Spark from R. This means you can create R objects that point to data frames stored in the Spark cluster and apply some familiar R paradigms (like dplyr) […]

Read More

Zippy Base R

John Mount defends the honor of base R: The graph summarizes the performance of four solutions to the “scoring logistic regression by hand” problem: Optimized Base R: a specialized “pre allocate and work with vectorized indices” method. This is fast as it is able to express our particular task in a small number of purely […]

Read More

Categories

September 2017
MTWTFSS
« Aug Oct »
 123
45678910
11121314151617
18192021222324
252627282930