Prioritizing Rows In A Union

Daniel Hutmacher shows several ways to combine data from multiple sources, prioritized by source:

For the sake of simplicity, I’ll assume that our example data has a clustered index on the “primary key” that we want to use to determine which rows have already been included in the set – in our example, the primary key is (FirstName, LastName). The data I’m using is a fictional example, but here’s the jist:

  • #Employees has about 33 000 rows.
  • #Customers has about 44 000 rows.
  • #Passengers has about 500 000 rows.

The data is constructed in a way that these queries should return 530 000 rows, so we’re looking at some overlap but far from totally overlapping rows.

Example #492,806 that T-SQL is not a true 4th Generation Language, that how you write the query can greatly matter for performance.

Related Posts

Default Schemas In SQL Server

Daniel Hutmacher looks at specifying default schemas on a database: If your user is a database owner, (i.e. is a member of the db_owner group or has CONTROL permissions on the database) the default schema will always be dbo. This is something you can’t change. So if your legacy application needs quasi-administrative privileges in the database, you can’t make it […]

Read More

System Objects And Helper Functions

Kenneth Fisher compares and contrasts querying system tables versus using built-in helper functions which query the system tables: So which should you use? Well, like most things in the database world, it depends. Do you need to cross databases? Do you need to use NOLOCK for a diagnostic query? Do you need additional information available […]

Read More

Categories

July 2017
MTWTFSS
« Jun Aug »
 12
3456789
10111213141516
17181920212223
24252627282930
31