When Window Functions are Too Slow

Bert Wagner shows a scenario where a window function ends up performing poorly:

If you’ve used FIRST_VALUE before, this query should be easy to interpret: for each badge Name, return the first UserId sorted by Date (earliest date to receive the badge) and UserId (pick the lowest UserId when there are ties on Date).

This query was easy to write and is simple to understand. However, the performance is not great: it takes 46 seconds to finish returning results on my machine.

Bert’s response is to rewrite the query using a correlated subquery. My first shot would look at using APPLY though needing to aggregate the “parent” could lead to an awful result there if the join happened before aggregation.

The moral of the story here is to know different ways to write a query, as you can nudge the optimizer to better (or worse) behavior.

Related Posts

Table Variables and Parallelism

Erik Darling shows off a neat trick for inserting with parallelism into a table variable: One of the many current downsides of @table variables is that modifying them inhibits parallelism, which is a problem #temp tables don’t have. While updating and deleting from @table variables is fairly rare (I’ve seen it, but not too often), you at minimum need an insert […]

Read More

Adaptive Joins and Index Width

Erik Darling wants to pump your indexes up: Now, there’s an Extended Event that… Used to work. These days it just stares blankly at me. But since I’ve worked with this before, I know the problem. It’s that Key Lookup — I’ll explain more in a minute. Adaptive joins won’t do all the work for you, […]

Read More

Categories

April 2019
MTWTFSS
« Mar May »
1234567
891011121314
15161718192021
22232425262728
2930