Handling Late Arrivals In SSIS

Peter Schott shows us a pattern for dealing with late-arriving dimension members in SQL Server Integration Services ETL packages:

The general steps are

  1. Set up your source query.

  2. Pass the data through a Lookup for your Dimension with the missing results routed to a “No Match” output.

  3. Insert those “No Match” rows into your Dimension using a SQL task – checking to make sure that this particular row hasn’t already been inserted (this is important).

  4. Do another lookup using a “Partial Cache” to catch these newly-inserted members.

  5. Use a UNION ALL transform to bring the existing and late-arriving members together.

Click through for more information and a helpful package diagram.

Related Posts

Parallel Execution With SSIS Framework Community Edition

Andy Leonard has a great post on parallel execution with SSIS: Sit a spell and let Grandpa Andy tell yall a story about some data integratin’. Suppose for a minute that you’ve read and taken my advice about writing small, unit-of-work SSIS packages. I wouldn’t blame you for taking this advice. It’s not only online, […]

Read More

The Year Of The Data Engineer

Alex Woodie points out that data science also requires data engineers: The shortage of data scientists – those triple-threat types who possess advanced statistics, business, and coding skills – has been well-documented over the years. But increasingly, businesses are facing a shortage of another key individual on the big data team who’s critical to achieving […]

Read More

Categories

October 2017
MTWTFSS
« Sep Nov »
 1
2345678
9101112131415
16171819202122
23242526272829
3031