Prioritizing Rows In A Union

Daniel Hutmacher shows several ways to combine data from multiple sources, prioritized by source:

For the sake of simplicity, I’ll assume that our example data has a clustered index on the “primary key” that we want to use to determine which rows have already been included in the set – in our example, the primary key is (FirstName, LastName). The data I’m using is a fictional example, but here’s the jist:

  • #Employees has about 33 000 rows.
  • #Customers has about 44 000 rows.
  • #Passengers has about 500 000 rows.

The data is constructed in a way that these queries should return 530 000 rows, so we’re looking at some overlap but far from totally overlapping rows.

Example #492,806 that T-SQL is not a true 4th Generation Language, that how you write the query can greatly matter for performance.

Related Posts

Dealing With String Parsing In T-SQL

Andy Mallon has written a T-SQL function to parse file paths from strings: Writing & reading code is easier if you understand the logic before attacking the code. I find this to be particularly important when you anticipate complicated code. SQL Server sucks at parsing strings, so I anticipate complicated code. How do you identify […]

Read More

The Joy Of OUTPUT

Mark Wilkinson shows off some fun stuff you can do with the OUTPUT clause: A common command in the Linux world is the tee command. What tee allows you to do is pipe the output of a command to a file as well as the console. This same functionality can be implemented using multiple OUTPUT clauses in a T-SQL statement. In […]

Read More

Categories

July 2017
MTWTFSS
« Jun Aug »
 12
3456789
10111213141516
17181920212223
24252627282930
31