Measuring Correlation In SQL

Phil Factor shows how to calculate Kendall’s Tau and Spearman’s Rho in SQL:

Kendall’s Tau rank correlation is a handy way of determining how correlated two variables are, and whether this is more than chance. If you just want a measure of the correlation then you don’t have to assume very much about the distribution of the variables. Kendall’s Tau is popular with calculating correlations with non-parametric data. Spearman’s Rho is possibly more popular for the purpose, but Kendall’s tau has a distribution with better statistical properties (the sample estimate is close to a population variance) so confidence levels are more reliable, but in general, Kendall’s tau and Spearman’s rank correlation coefficient are very similar. The obvious difference between them is that, for the standard method of calculation,  Spearman’s Rank correlation required ranked data as input, whereas the algorithm to calculate Kendall’s Tau does this for you.  Kendall’s Tau consumes any non-parametric data with equal relish.

Kendall’s Tau is easy to calculate on paper, and makes intuitive sense. It deals with the probabilities of observing the agreeable (concordant) and non-agreeable (discordant) pairs of rankings. All observations are paired with each of the others, A concordant pair is one whose members of one observation are both larger than their respective members of the other paired observation, whereas discordant pairs have numbers that differ in opposite directions. Kendall’s Tau-b takes tied rankings into account.

I appreciate Phil putting this series together.  I’d probably stick with R, but it’s good to have options.

Related Posts

Computing Time to Payment on Invoices

Daniel Hutmacher has a painful but realistic problem to solve: Here’s an example customer. You’ll notice right off the bat that we’re sending this customer an invoice every day on the 20th of the month. To add some complexity, the customer will arbitrarily pay parts of the invoiced amount over time, and to add insult […]

Read More

Tracking Who Changed Data

Bert Wagner is on a quest to find out who moved his cheese: Have you ever wondered who was the last person (or process) to modify a piece of data in your database? SQL Server offers plenty of system views and functions that provide insight into how your server is running and how your queries are performing. […]

Read More


April 2017
« Mar May »