rqdatatable — Wrangling Lots Of Data, Fast

John Mount explains the motivation behind rqdatatable and puts together a performance test:

rquery is already one of the fastest and most teachable (due to deliberate conformity to Codd’s influential work) tools to wrangle data on databases and big data systems. And now rquery is also one of the fastest methods to wrangle data in-memory in R (thanks to data.table, via a thin adaption supplied by rqdatatable).

Teaching rquery and fully benchmarking it is a big task, so in this note we will limit ourselves to a single example and benchmark. Our intent is to use this example to promote rquery and rqdatatable, but frankly the biggest result of the benchmarking is how far out of the pack data.tableitself stands at small through large problem sizes. This is already known, but it is a much larger difference and at more scales than the typical non-data.table user may be aware of.

Click through for the benchmark and information on how to grab the package before it goes into CRAN.

Related Posts

wrapr 1.5.0 Now On CRAN

John Mount announces wrapr 1.5.0: wrapr includes a lot of tools for writing better R code: let() (let block) %.>% (dot arrow pipe) build_frame() / draw_frame() ( data.frame builders and formatters ) qc() (quoting concatenate) := (named map builder) %?% (coalesce) NEW! %.|% (reduce/expand args) NEW! uniques() (safe unique() replacement) NEW! partition_tables() / execute_parallel() NEW! DebugFnW() (function debug wrappers) λ() (anonymous function builder) John also includes an example using the coalesce operator %?%.

Read More

Methods For Detecting Anomalies In Business Metrics

Sergey Bryl’ gives us four methods for detecting anomalies in business data: In this article, by  business metrics, we mean numerical indicators we regularly measure and use to track and assess the performance of a specific business process. There is a huge variety of business metrics in the industry: from conventional to unique ones. The latter […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.

Categories

June 2018
MTWTFSS
« May  
 123
45678910
11121314151617
18192021222324
252627282930