The Shuffling Operator And Azure SQL DW

Arun Sirpal is ready to deal:

For the purposes of this post the TSQL shown is elementary (don’t be surprised by that), the point is really about SHUFFLE. So, I select the estimated plan for the following code.

SELECT SOD.[SalesOrderID],SOD.[ProductID], SOH.[TotalDue]
FROM [SalesLT].[SalesOrderDetail] SOD
JOIN [SalesLT].[SalesOrderHeader] SOH ON
SOH.[SalesOrderID] = SOD.[SalesOrderID]
WHERE SOH.[TotalDue] > 1000

Shuffle me once, why not shuffle me twice. If you REALLY want to see the EXPLAIN command output, then it looks like this snippet below.

The DSQL operation clearly states SHUFFLE_MOVE. Why am I getting this? What does it mean?

Shuffling data isn’t the worst thing in the world, but it is a fairly expensive operation all things considered.  Ideally, your warehouse architecture limits the number of shuffle operations, but considering that you can only hash on one key, sometimes it’s inevitable.

Related Posts

Automating Azure SQL Database Scaling

Arun Sirpal shows how to use Azure Logic Apps to auto-scale Azure SQL Database: When I was presenting my Azure SQL Database session at DataRelay (used to be SQLRelay) I was asked (over coffee) about auto scaling capabilities. Quite simply there is nothing out of the box to achieve this. The idea of auto scaling […]

Read More

Deploying An Azure Container Within A Virtual Network

Andrew Pruski shows us that you can now deploy an Azure container running SQL Server within an Azure virtual network: Up until now Azure Container Instances only had one option to allow us to connect. That was assigning a public IP address that was directly exposed to the internet. Not really great as exposing SQL […]

Read More

Categories

April 2018
MTWTFSS
« Mar May »
 1
2345678
9101112131415
16171819202122
23242526272829
30