Bert Wagner shows a scenario where a window function ends up performing poorly:
If you’ve used
FIRST_VALUE
before, this query should be easy to interpret: for each badgeName
, return the firstUserId
sorted byDate
(earliest date to receive the badge) andUserId
(pick the lowest UserId when there are ties onDate
).This query was easy to write and is simple to understand. However, the performance is not great: it takes 46 seconds to finish returning results on my machine.
Bert’s response is to rewrite the query using a correlated subquery. My first shot would look at using APPLY
though needing to aggregate the “parent” could lead to an awful result there if the join happened before aggregation.
The moral of the story here is to know different ways to write a query, as you can nudge the optimizer to better (or worse) behavior.