Parallelizing Linear Regression With MapReduce

Kevin Feasel

2018-06-25

R

Arthur Charpentier shows us the math behind using MapReduce to parallelize a linear regression:

Sometimes, with big data, matrices are too big to handle, and it is possible to use tricks to numerically still do the map. Map-Reduce is one of those. With several cores, it is possible to split the problem, to map on each machine, and then to aggregate it back at the end.

Arthur gives us an interesting example in R to boot.
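To make the split, map, and aggregate idea concrete, here is a minimal sketch in plain Python. It is not Arthur's R code and may not follow the same route as his post. It assumes the simplest version of the trick, the normal equations: because X'X and X'y are sums over rows, each chunk can compute its own partial sums (map), the partial sums can be added together (reduce), and the small combined system can be solved once at the end. The function names, the toy data, and the hand-rolled solver are all made up for illustration.

```python
from functools import reduce


def map_chunk(rows):
    """Map step: compute one chunk's share of X'X and X'y.

    Each row is (features, target); an intercept column of 1s is prepended.
    """
    p = len(rows[0][0]) + 1
    xtx = [[0.0] * p for _ in range(p)]
    xty = [0.0] * p
    for features, target in rows:
        x = [1.0] + list(features)
        for i in range(p):
            xty[i] += x[i] * target
            for j in range(p):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def combine(a, b):
    """Reduce step: add two partial (X'X, X'y) pairs element-wise."""
    xtx = [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a[0], b[0])]
    xty = [u + v for u, v in zip(a[1], b[1])]
    return xtx, xty


def solve(a, b):
    """Solve a * beta = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    beta = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * beta[j] for j in range(i + 1, n))
        beta[i] = (m[i][n] - s) / m[i][i]
    return beta


def fit(chunks):
    """Map every chunk, reduce the partial sums, then solve once."""
    xtx, xty = reduce(combine, map(map_chunk, chunks))
    return solve(xtx, xty)


# Hypothetical toy data generated from y = 1 + 2*x1 + 3*x2 (no noise).
data = [((x1, x2), 1 + 2 * x1 + 3 * x2) for x1 in range(4) for x2 in range(3)]
chunks = [data[0:4], data[4:8], data[8:12]]  # pretend each lives on its own core

beta_chunked = fit(chunks)   # coefficients from three separate chunks
beta_single = fit([data])    # coefficients from the full data in one pass
```

The two fits should agree up to floating-point error, since adding the per-chunk sums gives the same X'X and X'y as a single pass over all rows. In a real setting the `map` call would be handed to a pool of workers, and a more numerically stable decomposition would likely be a better choice than forming X'X directly.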
