John Mount helps us understand writing R code like a native:

This sort of difference, scalar oriented `C++` being so much faster than scalar oriented `R`, is often distorted into “`R` is slow.”

This is just not the case. If we adapt the algorithm to be vectorized we get an `R` algorithm with performance comparable to the `C++` implementation!

Not all algorithms can be vectorized, but this one can, and in an incredibly simple way. The original algorithm itself (`xlin_fits_R()`) is a bit complicated, but the vectorized version (`xlin_fits_V()`) is literally derived from the earlier one by crossing out the indices. That is: in this case we can move from working over very many scalars (slow in `R`) to working over a small number of vectors (fast in `R`).

This is akin to writing set-based SQL instead of cursor-based SQL: you’re thinking in terms which make it easier for the interpreter (or optimizer, in the case of a database engine) to operate quickly over your inputs. It’s also one of a few reasons why I think learning R makes a lot of sense when you have a SQL background.
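
To make "crossing out the indices" concrete, here is a minimal sketch in R. It is not Mount's `xlin_fits_R()` or `xlin_fits_V()`; the functions `sq_diff_loop()` and `sq_diff_vec()` are hypothetical stand-ins chosen only to show the shape of the rewrite on a much simpler calculation.

```r
# Hypothetical example, not the code from Mount's post.

# Scalar-oriented version: the interpreter handles one element per iteration.
sq_diff_loop <- function(x, y) {
  n <- length(x)
  out <- numeric(n)
  for (i in seq_len(n)) {
    out[i] <- (x[i] - y[i])^2
  }
  out
}

# Vectorized version: the same expression with the [i] indices crossed out,
# so each arithmetic step is a single call over whole vectors.
sq_diff_vec <- function(x, y) {
  (x - y)^2
}

x <- c(1, 2, 3, 4)
y <- c(0, 2, 5, 1)

# Both versions should agree on the same inputs.
stopifnot(isTRUE(all.equal(sq_diff_loop(x, y), sq_diff_vec(x, y))))
```

On long inputs, wrapping each call in `system.time()` is one way to see the gap for yourself; the size of the difference will depend on your machine and R version.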
