John Mount explains the motivation behind rqdatatable and puts together a performance test:
rquery is already one of the fastest and most teachable (due to deliberate conformity to Codd’s influential work) tools to wrangle data on databases and big data systems. And now rquery is also one of the fastest methods to wrangle data in-memory in R (thanks to data.table, via a thin adaption supplied by rqdatatable).
Teaching rquery and fully benchmarking it is a big task, so in this note we will limit ourselves to a single example and benchmark. Our intent is to use this example to promote rquery and rqdatatable, but frankly the biggest result of the benchmarking is how far out of the pack data.table itself stands at small through large problem sizes. This is already known, but it is a much larger difference and at more scales than the typical non-data.table user may be aware of.
Click through for the benchmark and information on how to grab the package before it goes into CRAN.