Descriptive Statistics With SQL Server And R

Kevin Feasel

2016-07-19

R

Mala Mahadevan digs into descriptive statistics:

With R integration into SQL Server 2016 we can pull an R script and integrate it rather easily. I will be covering all 3 approaches. I am using a small dataset – a single table with 915 rows, with a SQL Server 2016 installation and R Studio. The complexities of doing this type of analysis in the real world with bigger datasets involve setting various options for performance and dealing with memory issues – because R is very memory intensive and single threaded.

My table and the data it contains can be created with scripts here. For this specific post I used just one column in the table – age. For further posts I will be using the other fields such as country and gender.

Mala compares T-SQL and R for calculating the minimum, maximum, mean, and mode of a single column, then wraps up the post by showing how to call her R code from T-SQL using SQL Server R Services.
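For readers who want to see the four statistics side by side, here is a minimal sketch in Python rather than the T-SQL and R used in the linked post. The `describe` helper and the sample `ages` list are hypothetical and stand in for the post's age column; this is not Mala's code. Mode is returned as a list because a column can have more than one most frequent value.

```python
from collections import Counter
from statistics import mean


def describe(values):
    """Return the min, max, mean, and mode(s) of a non-empty list of numbers."""
    if not values:
        raise ValueError("values must be non-empty")
    counts = Counter(values)
    highest = max(counts.values())
    # Keep every value that ties for the highest frequency.
    modes = sorted(v for v, c in counts.items() if c == highest)
    return {
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "modes": modes,
    }


# Hypothetical sample data, not the 915-row table from the post.
ages = [23, 35, 35, 41, 52, 35, 23]
summary = describe(ages)
print(summary)
```

Returning all tied values avoids silently picking one mode when the data is multimodal, a choice any implementation of this statistic has to make.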

Related Posts

Interactive ggplot Plots with plotly

Laura Ellis takes us through ggplotly: As someone very interested in storytelling, ggplot2 is easily my data visualization tool of choice. It is like the Swiss army knife for data visualization. One of my favorite features is the ability to pack a graph chock-full of dimensions. This ability is incredibly handy during the data exploration […]


Goodbye, gather and spread; Hello pivot_long and pivot_wide

John Mount covers a change in tidyr which mimics Mount and Nina Zumel’s pivot_to_rowrecs and unpivot_to_blocks functions in the cdata package: If you want to work in the above way we suggest giving our cdata package a try. We named the functions pivot_to_rowrecs and unpivot_to_blocks. The idea was: by emphasizing the record structure one might eventually internalize what the transforms […]

