Kathi Kellenberger takes us through the PERCENTILE_CONT
window function:
I was recently playing with the analytical group of windowing functions, and I wanted to understand how they worked “under the covers.” I ran into a little logic puzzle with PERCENTILE_CONT
by trying to write a query that returned the same results using pre-2012 functionality.
Given a list of ranked values, you can use the PERCENTILE_CONT
function to find the value at a specific percentile. For example, if you have the grades of 100 students, you can use PERCENTILE_CONT
to locate the score in the middle of the list, the median, or at some other percent such as the grade at 90%. This doesn’t mean that the score was 90%; it means that the position of the score was at the 90th percentile. If there is not a value at the exact location, PERCENTILE_CONT
interpolates the answer.
I’m a bit disappointed with how poorly PERCENTILE_CONT
performs against large data sets, especially if you need multiple percentiles. It’s bad enough that going into ML Services and getting percentiles with R is usually faster for me. But for datasets of less than 100K or so rows, it’s the easiest non-CLR method to get the median (with the easiest CLR method being SQL#).