Prioritizing Rows In A Union

Daniel Hutmacher shows several ways to combine data from multiple sources, prioritized by source:

For the sake of simplicity, I’ll assume that our example data has a clustered index on the “primary key” that we want to use to determine which rows have already been included in the set – in our example, the primary key is (FirstName, LastName). The data I’m using is a fictional example, but here’s the jist:

  • #Employees has about 33 000 rows.
  • #Customers has about 44 000 rows.
  • #Passengers has about 500 000 rows.

The data is constructed in a way that these queries should return 530 000 rows, so we’re looking at some overlap but far from totally overlapping rows.

Example #492,806 that T-SQL is not a true 4th Generation Language, that how you write the query can greatly matter for performance.

Related Posts

Recently Added String Functions

Lori Brown covers a few string functions added to SQL Server in the past two versions: STRING_ESCAPE (https://docs.microsoft.com/en-us/sql/t-sql/functions/string-escape-transact-sql) This function is available starting with SQL 2016 and is currently only able to escape JSON characters. To me it’s not super useful just yet but hopefully they will add more types soon. 1 SELECT STRING_ESCAPE(‘SQLRX’‘s beginning was […]

Read More

Nested Loops And Implicit Reordering

Dmitry Piliugin shows how the SQL Server optimizer can end up reordering data in a nested loops join to improve performance: The purpose is to minimize random access impact. If we perform an Index Seek (with a partial scan, probably) we read the entries in the index order, in our case, in the order of […]

Read More

Categories

July 2017
MTWTFSS
« Jun Aug »
 12
3456789
10111213141516
17181920212223
24252627282930
31