T-SQL And R Performance Comparisons

Kevin Feasel

2016-10-10

R, T-SQL

Tomaz Kastrun does several performance comparisons between various R packages and T-SQL constructs:

Couple of packages I will mention for data manipulations are plyr, dplyr and data.table and compare the execution time, simplicity and ease of writing with general T-SQL code and RevoScaleR package. For this blog post I will use R package dplyr and T-SQL with possibilities of RevoScaleR computation functions.

My initial query will be. Available in WideWorldImportersDW database. No other alterations have been done to underlying tables (fact.sale or dimension.city).

Read on for code and conclusions. I don’t think there are any shocking conclusions: the upshot is to filter data as early as possible.
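To make "filter early" concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server or R. The `city` and `sale` tables, their columns, and the sample rows are hypothetical stand-ins loosely modeled on Dimension.City and Fact.Sale; they are not the queries from Tomaz's post. Both statements return the same aggregate, but the second narrows the city rows before the join instead of after it.

```python
import sqlite3

# Hypothetical miniature of a fact/dimension pair; names and data are
# illustrative assumptions, not taken from WideWorldImportersDW.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city (city_key INTEGER PRIMARY KEY, region TEXT);
    CREATE TABLE sale (sale_key INTEGER PRIMARY KEY, city_key INTEGER, profit REAL);
    INSERT INTO city VALUES (1, 'Americas'), (2, 'Europe');
    INSERT INTO sale VALUES (1, 1, 10.0), (2, 1, 5.0), (3, 2, 7.5);
""")

# Filter late: join everything first, then discard the rows we do not want.
filter_late = """
    SELECT joined.region, SUM(joined.profit)
    FROM (
        SELECT c.region, s.profit
        FROM sale AS s
        JOIN city AS c ON c.city_key = s.city_key
    ) AS joined
    WHERE joined.region = 'Americas'
    GROUP BY joined.region
"""

# Filter early: restrict the dimension rows first, then join only those.
filter_early = """
    SELECT c.region, SUM(s.profit)
    FROM (SELECT city_key, region FROM city WHERE region = 'Americas') AS c
    JOIN sale AS s ON s.city_key = c.city_key
    GROUP BY c.region
"""

late_rows = conn.execute(filter_late).fetchall()
early_rows = conn.execute(filter_early).fetchall()
print(late_rows, early_rows)
```

A database optimizer will often push the predicate down on its own, so the two forms may end up with the same plan. The habit matters more when the work is split across tools, for example pulling rows out of SQL Server and then filtering them in R, where nothing moves the filter for you.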

