I walk through a scenario which underscores the importance of normalization:

This joins my records to a tally table, which gives one row for each character in RemovedValue (that is, the numbers without recordset separators).  I then retain only the values which start a sequence, and use SUBSTRING to snatch up four digits. What I’m left with is a column named SplitVersion, which has one row for each customer, campaign, and 4-digit value (which is equivalent to my normalized table’s structure).

If that wasn’t exciting enough, we now need to slam this back together into our denormalized format, and that’s what tallyjoin does. It uses the FOR XML PATH trick to concatenate my four-digit values into one string, separated by commas. You might be wondering why I use comma instead of CHAR(30), and the answer is that converting CHAR(30) to XML returns a nasty result, so instead of trying to handle that, I use a character which is copacetic and translate it back using the REPLACE function after casting my “XML” result to varchar.

The implicit story here is, you can find someone who knows how to use tally tables, how to concatenate strings of data (quickly!), who knows how to tie various pieces of the puzzle together, and so on…or design the database the right way and avoid this pain later.

Related Posts

Polybase And HDInsight

I have a post up on trying to integrate Polybase with HDInsight: But now we run into a problem:  there are certain ports which need to be open for Polybase to work.  This includes port 50010 on each of the data nodes against which we want to run MapReduce jobs.  This goes back to the issue […]

Read More

Using Temporal Tables For SCD2

I have a post on pain that I experienced with temporal tables: This query succeeds but returns results we don’t really want: This brings back all 9 records tied to products 1 and 2 (because product 3 didn’t exist on July 2nd at 8 AM UTC). But it gives us the same start and end […]

Read More


June 2016
« May Jul »