John Mount shares some performance measures pitting data.table against various dplyr methods for calculating grouped means:
In this reproduction attempt we see:
– The dplyr time being around 0.05 seconds. This is about 5 times slower than claimed.
– The dplyr sum()/n() time is about 0.2 seconds, about 5 times faster than claimed.
– The data.table time being around 0.004 seconds. This is about three times as fast as the dplyr claims, and over ten times as fast as the actual observed dplyr behavior.
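For readers unfamiliar with the three approaches being timed, here is a minimal sketch of what a grouped-mean comparison of this kind might look like. It is not John's benchmark: the data frame `d`, its columns `g` and `x`, the row count, and the `time_it` helper are all illustrative assumptions, and it uses base R's `system.time()` rather than whatever timing harness the original post used.

```r
library(dplyr)
library(data.table)

# Hypothetical data: 100,000 rows in 26 groups (not the original post's data).
set.seed(2018)
n_rows <- 1e5
d <- data.frame(
  g = sample(letters, n_rows, replace = TRUE),
  x = runif(n_rows),
  stringsAsFactors = FALSE
)
dt <- as.data.table(d)

# 1. dplyr, using mean() directly.
dplyr_mean <- function() {
  d %>% group_by(g) %>% summarize(x_mean = mean(x))
}

# 2. dplyr, computing the mean as sum()/n().
dplyr_sum_n <- function() {
  d %>% group_by(g) %>% summarize(x_mean = sum(x) / n())
}

# 3. data.table grouped mean, sorted by group for comparison.
dt_mean <- function() {
  dt[, .(x_mean = mean(x)), by = g][order(g)]
}

# Hypothetical helper: average elapsed seconds over a few repetitions.
time_it <- function(f, reps = 10) {
  system.time(for (i in seq_len(reps)) f())[["elapsed"]] / reps
}

timings <- c(
  dplyr_mean  = time_it(dplyr_mean),
  dplyr_sum_n = time_it(dplyr_sum_n),
  data.table  = time_it(dt_mean)
)
print(timings)
```

Absolute numbers from a sketch like this will depend heavily on data size, group count, package versions, and hardware, so they should not be expected to match the figures quoted above.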
Read the whole thing. If you want to replicate it yourself, check out the RMarkdown file.