Poorly-Performing Parallel Queries

Joe Obbish shows off how skewed data can cause SQL Server parallelism to perform poorly in certain scenarios:

The query above is designed to not be able to take advantage of parallelism. The useless repartition streams step and the spill to tempdb suggest that the query might perform better with a MAXDOP 1 hint. With a MAXDOP 1 hint the query runs with an average time of 2473 ms. There is no longer a spill to tempdb.

What happens if the query is run with MAXDOP 3? Earlier I said that the hashing function or thread boundaries can change based on DOP. With MAXDOP 3 I get a much more even row distribution on threads:

I think the number of cases where it makes sense to use a specific, non-1 MAXDOP hint is pretty small, but here’s one of them.  The problem is that if this data changes regularly, the skewness of the data could change along with it, making your brilliant optimization unnecessary or even harmful.

Related Posts

Waiting on Stats Refreshes

Erik Darling looks at a new wait type: I mean ever really wondered just how long a query of yours waited on stats to automatically update before running? Have you ever been absolutely, positively sure that those gosh dang darn stats updates were putting your query on the trolley to Slowsville? Your wishes will be […]

Read More

Statistics and Multiple Single-Column Indexes

Erik Darling is fusing together queries like Dr. Frankenstein in his lab: You may have noticed that both queries get pretty bad estimates. You might even be thinking about leaving me a comment to update stats. The thing is that I created these indexes, which means they get stats built with a full scan, and […]

Read More

Categories

September 2017
MTWTFSS
« Aug Oct »
 123
45678910
11121314151617
18192021222324
252627282930