Loops Versus Apply: Speed Comparison

Kevin Feasel

2018-02-19

R

Mike Spencer compares lapply (single core and its multi-core version) versus a for loop in R:

But how fast were they? Can we get faster? Thankfully R provides `system.time()` for timing code execution. In order to get faster, it makes sense to use all the processing power our machines have. The ‘parallel’ library has some great tools to help us run our jobs in parallel and take advantage of multicore processing. My favourite is `mclapply()`, because it is very very easy to take an `lapply` and make it multicore. Note that mclapply doesn’t work on Windows. The following script runs the `read_clean_write()` function in a for loop (boo, hiss), lapply and mclapply. I’ve run these as list elements to make life easier later on.

It’s interesting reading, particularly because I had expected lapply to do a little bit better.  Also interesting is the relative overhead cost of mclapply in this scenario:  going from 1 core to 4 cut the time to approximately 1/3, not 1/4.

Related Posts

The Intuition Behind Principal Component Analysis

Holger von Jouanne-Diedrich gives us an intuition behind how principal component analysis (PCA) works: Principal component analysis (PCA) is a dimension-reduction method that can be used to reduce a large set of (often correlated) variables into a smaller set of (uncorrelated) variables, called principal components, which still contain most of the information.PCA is a concept […]

Read More

Plotting Diagrams In R With nest() And map()

Sebastian Sauer shows how to display multiple ggplot2 diagrams together using facets as well as a combination of the nest() and map() functions: One simple way is to plot several facets according to the grouping variable: d %>% ggplot() + aes(x = hp, y = mpg) + geom_point() + facet_wrap(~ cyl) Faceting is great, but it’s good to know […]

Read More

Categories

February 2018
MTWTFSS
« Jan Mar »
 1234
567891011
12131415161718
19202122232425
262728