Descriptive Statistics With SQL Server And R

Kevin Feasel



Mala Mahadevan digs into descriptive statistics:

With R integration into SQL Server 2016 we can pull an R script and integrate it rather easily. I will be covering all 3 approaches. I am using a small dataset – a single table with 915 rows, with a SQL Server 2016 installation and R Studio. The complexities of doing this type of analysis in the real world with bigger datasets involve setting various options for performance and dealing with memory issues – because R is very memory intensive and single threaded.

My table and the data it contains can be created with scripts here. For this specific post I used just one column in the table – age. For further posts I will be using the other fields such as country and gender.

Mala compares T-SQL versus R for calculating minimum, maximum, mean, and mode. She wraps the post up by showing how to call her R code via T-SQL using SQL Server R Services.
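
For readers who want a quick reference point for those four statistics, here is a minimal, language-neutral sketch in Python. It is not Mala's T-SQL or R code: the `ages` list is made-up sample data standing in for the age column, and `describe` is a hypothetical helper name. Mode is the interesting one, because base R's `mode()` function reports an object's storage type rather than its most frequent value, so mode usually has to be computed by counting frequencies.

```python
from statistics import mean, multimode

# Hypothetical sample standing in for the post's age column (not the real data).
ages = [23, 31, 31, 45, 52, 31, 45, 67]


def describe(values):
    """Return minimum, maximum, mean, and mode(s) for a list of numbers."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        # multimode returns every most-frequent value, so ties are not hidden.
        "modes": multimode(values),
    }


summary = describe(ages)
print(summary)
```

Returning a list of modes rather than a single value is a deliberate choice in this sketch: a column such as age can easily have ties, and picking one winner silently would hide that.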



July 2016