Calculating Median In SQL Server 2019

Tomaz Kastrun shows that batch aggregation mode on window functions allow PERCENTILE_CONT finally to become useful:

Next query, for median calculation was a window function query.


SELECT DISTINCT

PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY c1)

OVER (PARTITION BY (SELECT 1)) AS MedianCont

FROM t1


To my surprise, the performance was even worse, and at this time, I have to say, I was running this on SQL Server 2017 with CU7. But luckily, I had a SQL Server 2019 CTP 2.0 also installed and here, with no further optimization the query ran little over 1 second.

I’ve warned people away from this function for all but tiny data sets because of how poorly it performs. With SQL Server 2019, I might be able to recommend it.

Related Posts

Logical Equivalence and Query Tuning

Erik Darling didn’t warn us that there would be math: Often when query tuning, I’ll try a change that I think makes sense, only to have it backfire. It’s not that the query got slower, it’s that the results that came back were wrong different. Now, this can totally happen because of a bug in previously used […]

Read More

Pitfalls with Window Functions

Itzik Ben-Gan takes us through two issues you might run into when using window functions: There are two common pitfalls involving window functions, both of which are the result of counterintuitive implicit defaults that are imposed by the SQL standard. One pitfall has to do with calculations of running totals where you get a window […]

Read More

1 Comment

  • […] As Lukas implies, SQL Server is a step behind in terms of calculating percentiles, and calculating several percentiles over a large data set will be slow. Very slow. Though batch mode processing in 2019 does help here. […]

Comments are closed

Categories

January 2019
MTWTFSS
« Dec Feb »
 123456
78910111213
14151617181920
21222324252627
28293031