SQL Server avoids joining to the Sales.Invoices table because it trusts the referential integrity maintained by the foreign key constraint defined on InvoiceID between Sales.InvoiceLines and Sales.Invoices; if a row exists in Sales.InvoiceLines, a row with the matching value for InvoiceID must exist in Sales.Invoices. And since we are only returning data from the Sales.InvoiceLines table, SQL Server doesn’t need to read any pages from Sales.Invoices at all.
We can verify that SQL Server is using the foreign key constraint to eliminate the join by dropping the constraint and running our query again:
ALTER TABLE [Sales].[InvoiceLines] DROP CONSTRAINT [FK_Sales_InvoiceLines_InvoiceID_Sales_Invoices];
Rob Farley has my favorite material on it. There’s an incredible amount of
lazinessingenuity built into the optimizer to keep your servers from doing unnecessary work.
That’s why I’d expect a query like this to throw away the join:
1234 SELECT COUNT(u.Id)FROM dbo.Users AS uJOIN dbo.Users AS u2ON u.Id = u2.Id;
After all, we’re joining the Users table to itself on the PK/CX. This doesn’t stand a chance at eliminating rows, producing duplicate rows, or producing NULL values. We’re only getting a count of the PK/CX, which isn’t NULLable anyway and…
So don’t do that.