For the purposes of this post the TSQL shown is elementary (don’t be surprised by that), the point is really about SHUFFLE. So, I select the estimated plan for the following code.
SELECT SOD.[SalesOrderID],SOD.[ProductID], SOH.[TotalDue] FROM [SalesLT].[SalesOrderDetail] SOD JOIN [SalesLT].[SalesOrderHeader] SOH ON SOH.[SalesOrderID] = SOD.[SalesOrderID] WHERE SOH.[TotalDue] > 1000Shuffle me once, why not shuffle me twice. If you REALLY want to see the EXPLAIN command output, then it looks like this snippet below.
The DSQL operation clearly states SHUFFLE_MOVE. Why am I getting this? What does it mean?
Shuffling data isn’t the worst thing in the world, but it is a fairly expensive operation all things considered. Ideally, your warehouse architecture limits the number of shuffle operations, but considering that you can only hash on one key, sometimes it’s inevitable.