Azure SQL Data Warehouse Patterns

Murshed Zaman shows us a couple of patterns and anti-patterns for Azure SQL Data Warehouse:

Azure SQL DW is a Massively Parallel Processing (MPP) data warehousing service. It is a service because Microsoft maintains the infrastructure and software patching to make sure it’s always on up to date hardware and software on Azure. The service makes it easy for a customer to start loading their tables on day one and start running queries quickly and allows scaling of compute nodes when needed.

In an MPP database, table data is distributed among many servers (known as compute or slave nodes), and in many MPP systems shared-nothing storage subsystems are attached to those servers. Queries come through a head (or master) node where the location metadata for all the tables/data blocks resides. This head node knows how to deconstruct the query into smaller queries, introduce various data movement operations as needed, and pass smaller queries on to the compute nodes for parallel execution. Data movement is needed to align the data by the join keys from the original query. The topic of data movement in an MPP system is a whole another blog topic by itself, that we will tackle in a different blog. Besides Azure SQL DW, some other examples of a MPP data warehouses are Hadoop (Hive and Spark), Teradata, Amazon RedShift, Vertica, etc.

The opposite of MPP is SMP (Symmetric Multiprocessing) which basically means the traditional one server systems. Until the invention of MPP we had SMP systems. In database world the examples are traditional SQL Server, Oracle, MySQL etc. These SMP databases can also be used for both OLTP and OLAP purposes.

Murshed spends the majority of this blog post covering things you should not do, which is probably for the best.

Related Posts

Azure Data Lake Alerting

Jose Lara shows how to send alerts if you hit a utilization threshold: If you want to see the step-by-step guide to create a new Log Analytics alert, check out our recent blog post on creating Log Analytics Alerts. For the alert signal logic, use the following values: Use the query from the previous step Set […]

Read More

Running The Azure DTU Calculator On An Older Server

Jim Donahoe shows us how to get the Azure DTU calculator running on an older server without Powershell: I recently had to do an analysis of a client’s database workload using the Azure DTU Calculator(DTU Calculator) and thought it might be interesting to share just how I did that.  I have run this tool numerous […]

Read More


September 2017
« Aug Oct »