Poorly-Performing Parallel Queries

Joe Obbish shows off how skewed data can cause SQL Server parallelism to perform poorly in certain scenarios:

The query above is designed to not be able to take advantage of parallelism. The useless repartition streams step and the spill to tempdb suggest that the query might perform better with a MAXDOP 1 hint. With a MAXDOP 1 hint the query runs with an average time of 2473 ms. There is no longer a spill to tempdb.

What happens if the query is run with MAXDOP 3? Earlier I said that the hashing function or thread boundaries can change based on DOP. With MAXDOP 3 I get a much more even row distribution on threads:

I think the number of cases where it makes sense to use a specific, non-1 MAXDOP hint is pretty small, but here’s one of them.  The problem is that if this data changes regularly, the skewness of the data could change along with it, making your brilliant optimization unnecessary or even harmful.

Related Posts

Collecting Statistics Usage Info

Grant Fritchey shows us how (safely) to collect data on statistics usage: Years ago I was of the opinion that it wasn’t really possible to see the statistics used in the generation of a query plan. If you read the comments here, I was corrected of that notion. However, I’ve never been a fan of using […]

Read More

Understanding DBCC SHOW_STATISTICS Outputs

Bill Wolf continues his series on statistics by looking at what DBCC SHOW_STATISTICS gives you: When I was putting together the lesson plans for this, I wanted to make my own query for the comparisons, not borrow one from another site or blog.  Yes, I borrow plenty, but I wanted this to be mine.  When […]

Read More

Categories

September 2017
MTWTFSS
« Aug Oct »
 123
45678910
11121314151617
18192021222324
252627282930