Mala Mahadevan discusses how to perform a Chi Square test:

For any dataset to lend itself to the Chi Square test it has to fit the following conditions –

1 Both variables are categorical (in this case – exposure to smoking – yes/no, and health condition – sick/not sick are both categorical).

2 Researchers used a random sample to collect data.

3 Researchers had an adequate sample size. Generally the sample size should be at least 100.

4 The number of respondents in each cell should be at least 5.

This is an easy case for using R over T-SQL: the Chi Square test is built into R, whereas in T-SQL you have to write the calculation yourself. Mala also shows how to run the test from within SQL Server R Services.
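To give a sense of what "rolling your own" involves, here is a minimal sketch of the Pearson Chi Square calculation for a 2x2 table, written in plain Python rather than the R or T-SQL from Mala's post. The counts are made up for illustration, and the function names (`chi_square_2x2`, `meets_size_conditions`) are hypothetical. The sketch applies no continuity correction, so its result can differ from R's `chisq.test`, which applies Yates' correction to 2x2 tables by default.

```python
from math import erfc, sqrt


def meets_size_conditions(table, min_total=100, min_cell=5):
    """Check the sample-size rules of thumb quoted above (conditions 3 and 4)."""
    total = sum(sum(row) for row in table)
    smallest_cell = min(count for row in table for count in row)
    return total >= min_total and smallest_cell >= min_cell


def chi_square_2x2(table):
    """Pearson Chi Square test of independence for a 2x2 table of counts.

    Returns (statistic, p_value, expected_counts). No continuity correction.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    # Expected count under independence: row total * column total / n
    expected = [[r * c / n for c in col_totals] for r in row_totals]

    statistic = sum(
        (obs - exp) ** 2 / exp
        for obs_row, exp_row in zip(table, expected)
        for obs, exp in zip(obs_row, exp_row)
    )

    # A 2x2 table has 1 degree of freedom, and for 1 degree of freedom
    # the Chi Square upper tail probability equals erfc(sqrt(x / 2)).
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value, expected


# Hypothetical counts, NOT data from the post.
# Rows: exposed to smoking (yes, no); columns: (sick, not sick).
observed = [[30, 20],
            [10, 40]]

size_ok = meets_size_conditions(observed)
statistic, p_value, expected = chi_square_2x2(observed)
print(f"size conditions met: {size_ok}")
print(f"chi-square = {statistic:.3f}, p = {p_value:.6f}")
```

A small p-value here would suggest that exposure and health condition are not independent in the made-up sample. The shortcut for the p-value only holds for one degree of freedom, so larger tables would need a general Chi Square distribution function.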

Kevin Feasel

2016-09-14

Data Science, R, T-SQL