Azure SQL Data Warehouse Patterns

Murshed Zaman shows us a couple of patterns and anti-patterns for Azure SQL Data Warehouse:

Azure SQL DW is a Massively Parallel Processing (MPP) data warehousing service. It is a service because Microsoft maintains the infrastructure and software patching to make sure it’s always on up to date hardware and software on Azure. The service makes it easy for a customer to start loading their tables on day one and start running queries quickly and allows scaling of compute nodes when needed.

In an MPP database, table data is distributed among many servers (known as compute or slave nodes), and in many MPP systems shared-nothing storage subsystems are attached to those servers. Queries come through a head (or master) node where the location metadata for all the tables/data blocks resides. This head node knows how to deconstruct the query into smaller queries, introduce various data movement operations as needed, and pass smaller queries on to the compute nodes for parallel execution. Data movement is needed to align the data by the join keys from the original query. The topic of data movement in an MPP system is a whole another blog topic by itself, that we will tackle in a different blog. Besides Azure SQL DW, some other examples of a MPP data warehouses are Hadoop (Hive and Spark), Teradata, Amazon RedShift, Vertica, etc.

The opposite of MPP is SMP (Symmetric Multiprocessing) which basically means the traditional one server systems. Until the invention of MPP we had SMP systems. In database world the examples are traditional SQL Server, Oracle, MySQL etc. These SMP databases can also be used for both OLTP and OLAP purposes.

Murshed spends the majority of this blog post covering things you should not do, which is probably for the best.

Related Posts

Data Lakes Aren’t New

Shannon Lowder reveals one of the deep, dark data lake secrets: Turns out there are three basic zones or areas to a data lake. Raw, Managed, and Presentation. The raw zone should be optimized for fast storage.  The goal is to get the data in as quickly as possible.  Don’t make any changes to this […]

Read More

Master Data In Azure

Matt How explains why Master Data Services isn’t a great cloud-based master data management solution and offers up an alternative: Excel is easy to use, but not user friendly Excel is on nearly every desktop in any Windows based organisation and with the Master Data Services Add-in, it puts the data well within the reach […]

Read More

Categories

September 2017
MTWTFSS
« Aug Oct »
 123
45678910
11121314151617
18192021222324
252627282930