Handling Late Arrivals In SSIS

Peter Schott shows us a pattern for dealing with late-arriving dimension members in SQL Server Integration Services ETL packages:

The general steps are

  1. Set up your source query.

  2. Pass the data through a Lookup for your Dimension with the missing results routed to a “No Match” output.

  3. Insert those “No Match” rows into your Dimension using a SQL task – checking to make sure that this particular row hasn’t already been inserted (this is important).

  4. Do another lookup using a “Partial Cache” to catch these newly-inserted members.

  5. Use a UNION ALL transform to bring the existing and late-arriving members together.

Click through for more information and a helpful package diagram.

Related Posts

Data Transformation Tools In The Azure Space

James Serra gives us an overview of the major tools you would use for ETL and ELT in Azure: If you are building a big data solution in the cloud, you will likely be landing most of the source data into a data lake.  And much of this data will need to be transformed (i.e. […]

Read More

Apache Airflow Now A Top-Level Project

Fokko Driesprong announces that Apache Airflow is now a top-level Apache project: Today is a great day for Apache Airflow as it graduates from incubating status to a Top-Level Apache project. This is the next step of maturity for Airflow. For those unfamiliar, Airflow is an orchestration tool to schedule and orchestrate your data workflows. From […]

Read More

Categories

October 2017
MTWTFSS
« Sep Nov »
 1
2345678
9101112131415
16171819202122
23242526272829
3031