Using The ROWVERSION Type For ETL

Max Vernon shows us how to use the ROWVERSION data type to tell how much work you have to do to ETL data over from one table to another:

The OLTP table implements a rowversion column that is automatically updated whenever a row is updated or inserted. The rowversion number is unique at the database level, and increments monotonically for all transactions that take place within the context of that database. The dbo.OLTP_Updates table is used to store the minimum row version available inside the transaction used to copy data from the OLTP table into the OLAP table. Each time this code runs it captures incremental changes. This is far more efficient than comparing all the rows in both tables using a hashing function since this method doesn’t require reading any data other than the source data that is either new, or has changed.

I think this is the first time I’ve seen someone use ROWVERSION types successfully.

Related Posts

Arrays in Azure Data Factory

Rayis Imayev takes us through arrays in Azure Data Factory: Currently, there are 3 data types supported in ADF variables: String, Boolean, and Array. The first two are pretty easy to use: Boolean for logical binary results and String for everything else, including the numbers (no wonder there are so many conversion functions in Azure Data Factory that we can […]

Read More

Building an ARM Template for Azure Data Factory

Andy Leondard takes the first steps to building an Azure Data Factory pipeline using Azure Resource Manager Templates: Azure Resource Manager, or ARM, “allows you to provision your applications using a declarative template.” So says the Azure Quickstart Templates page. ARM templates are JSON and allow administrators to import and export Azure resources using varying management patterns. […]

Read More

Categories

February 2019
MTWTFSS
« Jan Mar »
 123
45678910
11121314151617
18192021222324
25262728