SQL Data Warehouse Distribution Keys

Simon Whiteley explains the different distribution key options available in Azure SQL Data Warehouse and SQL Server APS:

Each record that is inserted goes onto the next available distribution. This guarantees that you will have a smooth, even distribution of data, but it means you have no way of telling which data is on which distribution. This isn’t always a problem!

If I wanted to perform a count of records, grouped by a particular field, I can perform this on a round-robin table. Each distribution will run the query in parallel and return it’s grouped results. The results can be simply added together as a second part of the query, and adding together 60 smaller datasets shouldn’t be a large overhead. For this kind of single-table aggregation, round-robin distribution is perfectly adequate!

However, the issues arise when we have multiple tables in our query. In order to join two tables. Let’s take a very simple join between a fact table and a dimension. I’ve shown 6 distributions for simplicity, but this would be happening across all 60.

Figuring out which distribution key to use can make a huge difference in performance.

Related Posts

Azure Dedicated Hosts in Preview

Mine Tokus covers the benefit of Azure Dedicated Hosts: Recently introduced, Azure Dedicated Host Preview provides single-tenant physical servers that can host one or more virtual machines. With this new hosting model, physical server is dedicated to your organization and capacity isn’t shared with other customers. Physical server-level isolation helps to address security and compliance requirements, brings visibility […]

Read More

Connecting to Redshift from Azure Analysis Services

Gilbert Quevauvilliers shows how we can connect to Amazon Redshift from Azure Analysis Services: I am busy working with a customer and had a challenge when using Azure Analysis Services to connect to Amazon Redshift via an ODBC connection. The first issue that I encountered was the following error: OLE DB or ODBC error: [Microsoft][ODBC […]

Read More

Categories