In this note we will show how to speed up work in R by partitioning data and using process-level parallelization.
We will show the technique with three different R packages: rqdatatable, data.table, and dplyr.
The methods shown will also work with base-R and other packages.
For each of the above packages we speed up work by using wrapr::execute_parallel, which in turn uses wrapr::partition_tables to partition unrelated data.frame rows and then distributes them to different processors to be executed.
rqdatatable::ex_data_table_parallel conveniently bundles all of these steps together when working with rquery pipelines.
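
To make the partition-then-distribute idea concrete, here is a minimal sketch that uses only base R's parallel package. It is a stand-in for the idea, not the wrapr implementation: it does not call wrapr::execute_parallel or wrapr::partition_tables, and it simplifies by making one partition per group. The data frame d, the column key_group, and the function per_group_work are hypothetical names invented for illustration.

```r
library(parallel)

# Hypothetical example data: rows that share a key_group value belong together;
# rows in different groups are unrelated and can be processed independently.
d <- data.frame(
  key_group = rep(c("a", "b", "c", "d"), each = 3),
  value = 1:12,
  stringsAsFactors = FALSE
)

# Hypothetical per-partition work: each row's share of its group's total.
per_group_work <- function(part) {
  part$share <- part$value / sum(part$value)
  part
}

# Step 1: partition the rows so that no group is split across pieces.
parts <- split(d, d$key_group)

# Step 2: distribute the pieces to separate worker processes.
cl <- makeCluster(2)
results <- tryCatch(
  parLapply(cl, parts, per_group_work),
  finally = stopCluster(cl)  # always shut the workers down
)

# Step 3: reassemble the per-partition results into one data.frame.
res <- do.call(rbind, results)
rownames(res) <- NULL
```

The key requirement is that the split never separates rows that depend on each other. Here the per-group total is only correct because every row of a group lands in the same piece. Process-level parallelism also has a cost: each piece is serialized and sent to a worker, so the approach tends to pay off only when the per-partition work is substantial.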
There were some interesting results. I expected data.table to be fast, but did not expect dplyr to parallelize so well.