Combinatorics With Joins

Dmitry Zaytsev explains the math behind why query plans can be so inefficient when dealing with a large number of joins:

Let’s talk about the sequence of table joins in detail. It is very important to understand that the possible number of table joins grows exponentially, not linearly. Fox example, there are only 2 possible methods to join 2 tables, and the number can reach 12 methods for 3 tables. Different join sequences can have different query cost, and SQL Server optimizer must select the most optimal method. But when the number of tables is high, it becomes a resource-intensive task. If SQL Server begins going over all possible variants, such query may never be executed. That is why, SQL Server never does it and always looks for a quite good plan, not the best plan. SQL Server always tries to reach compromise between execution time and plan quality.

There are ways you can help the optimizer, and one of my favorite query tuning books was all about table selection.

Related Posts

Non-Blocking Aggregations

Daniel Hutmacher tilts at windmills: It’s not entirely uncommon to want to group by a computed expression in an aggregation query. The trouble is, whenever you group by a computed expression, SQL Server considers the ordering of the data to be lost, and this will turn your buttery-smooth Stream Aggregate operation into a Hash Match […]

Read More

When Stream Aggregates Require Sorting

Itzik Ben-Gan continues his series on grouping and aggregation: Let’s try and figure out the costing formula for the Sort operator. Remember, our focus is the estimated cost and scaling because our ultimate goal is to figure out optimization thresholds where the optimizer changes its choices from one strategy to another. The I/O cost estimate […]

Read More


June 2017
« May Jul »