Tips for Optimizing Dedicated SQL Pools in Synpase Analytics

Tsuyoshi Matsuzaki shares some tips for improving query performance when using Dedicated SQL Pools in Azure Synapse Analytics:

By above BROADCAST_MOVE operation, the rows in dimension_City table are all copied in a temporary table (called TEMP_ID_3) on all distributed database. (See below.)
Since the size of dimension_City is small, then all rows in this table is duplicated in all database before joining. This time, we join only 2 tables, however, if a lot of tables are needed to join, this data movement will become large overhead for query execution.

The short version is, replicate smaller dimensions and align distribution keys for large tables which get joined together. Both of these minimize the changes of the engine needing to shuffle data between nodes. These sorts of things can make a huge difference when working with Dedicated SQL Pools, cutting query time down by an order of magnitude in some extreme cases.

M	T	W	T	F	S	S
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29	30	31