Change Detection With Hashes

Nigel Meakins shows how to use HashBytes to roll your own change detection:

So this all sounds very promising as a way of tracking changes to our Data Warehouse data, for purposes such as extracting deltas, inserts and updates to Type I and II dimensions and so forth. It doesn’t have any show-stopping overhead for the hashing operations for the sizes of data typically encountered and storage isn’t going to be an issue. It is native to T-SQL so we can rerun our hash value generation in the engine where our data resides rather than having to push through SSIS or some other tool to generate this for us. Algorithms are universal and as such will give us the same values wherever used for the same bytes of input. Let’s go back to the basic idea for a minute and consider how we implement this.

This is particularly useful in cases where you have metadata columns you don’t much care about (e.g., last modified time).  I do recommend using CONCAT or CONCAT_WS (if you’re on SQL Server 2017) to do string concatenation, though; it’d remove the need for util.CastAsNVarchar and possibly more.

Related Posts

Re-Shaping Data Flows

Maneesh Varshney explains some methods to trim the fat out of analytical data flows: Big data comes in a variety of shapes. The Extract-Transform-Load (ETL) workflows are more or less stripe-shaped (left panel in the figure above) and produce an output of a similar size to the input. Reporting workflows are funnel-shaped (middle panel in […]

Read More

Mapping File Shenanigans With The Import/Export Wizard

Angela Henry ran into problems copying a boatload of data from a mainframe-hosted DB2 server and has lived to tell the tale: This post talks about the issue I ran into with SSIS Mapping Files. We currently run DB2 on an IBM iSeries AS400 for our ERP system.  I was tasked with copying data from […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

Categories

June 2017
MTWTFSS
« May  
 1234
567891011
12131415161718
19202122232425
2627282930