Measuring Correlation In SQL

Phil Factor shows how to calculate Kendall’s Tau and Spearman’s Rho in SQL:

Kendall’s Tau rank correlation is a handy way of determining how correlated two variables are, and whether this is more than chance. If you just want a measure of the correlation then you don’t have to assume very much about the distribution of the variables. Kendall’s Tau is popular with calculating correlations with non-parametric data. Spearman’s Rho is possibly more popular for the purpose, but Kendall’s tau has a distribution with better statistical properties (the sample estimate is close to a population variance) so confidence levels are more reliable, but in general, Kendall’s tau and Spearman’s rank correlation coefficient are very similar. The obvious difference between them is that, for the standard method of calculation,  Spearman’s Rank correlation required ranked data as input, whereas the algorithm to calculate Kendall’s Tau does this for you.  Kendall’s Tau consumes any non-parametric data with equal relish.

Kendall’s Tau is easy to calculate on paper, and makes intuitive sense. It deals with the probabilities of observing the agreeable (concordant) and non-agreeable (discordant) pairs of rankings. All observations are paired with each of the others, A concordant pair is one whose members of one observation are both larger than their respective members of the other paired observation, whereas discordant pairs have numbers that differ in opposite directions. Kendall’s Tau-b takes tied rankings into account.

I appreciate Phil putting this series together.  I’d probably stick with R, but it’s good to have options.

Related Posts

Using RAISERROR For Debug Info

Doug Lane exhorts people to use RAISERROR instead of PRINT when printing messages: It wasn’t until a few years ago, when I started contributing to the First Responder Kit at Brent Ozar Unlimited, that I noticed every status message in the kit scripts was thrown with something other than PRINT. Strange, I thought, since those scripts like […]

Read More

So You Want To Wait…

If you need your queries to be slower, Kenneth Fisher has you covered: And in case you run into a development team that complains that when they time their code the duration is all over the place, this little gem will make sure their query will always take the same amount of time (assuming normal […]

Read More

Categories

April 2017
MTWTFSS
« Mar May »
 12
3456789
10111213141516
17181920212223
24252627282930