Scaling Azure Data Warehouse

Kevin Feasel

2016-08-01

Cloud

Vincent-Philippe Lauzon looks at how Azure Data Warehouse scales:

Which data gets stored in which database?

As long as you are doing simple select on one table and that your data is distributed evenly, you shouldn’t care, right?  The query will flow to the compute nodes, they will perform the query on each database and the result will be merged together by the control node.

But once you start joining data from multiple tables, ADW will have to swing data around from one database to another in order to join the data.  This is called Data Movement.  It is impossible to avoid in general but you should strive to minimize it to obtain better performance.

This is a look primarily at the underlying mechanics rather than testing a particular load.  Check it out.

Related Posts

Sizing Azure SQL Database

Arun Sirpal takes us through finding the right size for Azure SQL Database: Do you want to identify the correct Service Tier and Compute Size ( was once known as performance level) for your Azure SQL Database? How would you go about it? Would you use the DTU (Database Transaction Unit) calculator? What about the […]

Read More

Cleaning Up After Yourself in Azure Data Factory

Rayis Imayev shows how you can automatically delete old files in Azure Data Factory: File management may not be at the top of my list of priorities during data integration projects. I assume that once I learn enough about sourcing data systems and target destination platform, I’m ready to design and build a data integration […]

Read More

Categories

August 2016
MTWTFSS
« Jul Sep »
1234567
891011121314
15161718192021
22232425262728
293031