John Mount shows us that `data.table` is way faster for sorting than `dplyr`'s `arrange` function:

Notice on the above semi-log plot the run time ratio is growing roughly linearly. This makes sense: `data.table` uses a radix sort, which has the potential to perform in near linear time (faster than the `n log(n)` lower bound known for comparison sorting) for a range of problems (also, we are only showing example sorting times, not worst-case sorting times). In fact, if we divide the `y` in the above graph by `log(rows)` we get something approaching a constant.
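To see the shape of that argument for yourself, here is a minimal sketch of a timing comparison. It is not John's benchmark: the helper name `time_sorts`, the row counts, and the single uniform-random column are all assumptions made for illustration, and a single `system.time` call per size is noisy. It sorts the same data with `dplyr::arrange` and `data.table::setorder`, then divides the run time ratio by `log(rows)`.

```r
library(data.table)
library(dplyr)

# Hypothetical helper (not from the original post): time one sort of
# n random doubles with each package and return the elapsed seconds.
time_sorts <- function(n) {
  df <- data.frame(x = runif(n))
  dt <- as.data.table(df)  # as.data.table() copies, so df stays unsorted

  t_dplyr <- system.time(arrange(df, x))[["elapsed"]]
  t_dt    <- system.time(setorder(dt, x))[["elapsed"]]  # sorts dt by reference

  data.frame(rows = n, dplyr = t_dplyr, data.table = t_dt)
}

set.seed(2018)
sizes   <- c(1e6, 2e6, 4e6)  # assumed sizes; raise them if memory allows
timings <- do.call(rbind, lapply(sizes, time_sorts))

# Run time ratio, and the same ratio divided by log(rows).
# Very fast sorts can report 0 elapsed seconds, which makes the ratio Inf.
timings$ratio         <- timings$dplyr / timings$data.table
timings$ratio_per_log <- timings$ratio / log(timings$rows)

print(timings)
```

If the reasoning above holds on your machine, `ratio` should drift upward with `rows` while `ratio_per_log` stays comparatively flat. Actual numbers depend on hardware, package versions, and data, so treat any one run as an example rather than a result.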

John has also provided us with a markdown document for comparison.

Kevin Feasel

2018-08-14

R