Press "Enter" to skip to content

Grouping Sets Of Tables In Biml ETL Loads

Ben Weissman puts together clusters of tables for data loads:

The table meta.containers could technically also be a temporary table. We’ve decided against that so you can see what’s happening behind the scenes.

Let’s focus on the meta.tables table for now. It has three columns:

– TableName – guess what we’ll store in there
– Container – this one will hold the information, which container we want this table to be loaded it, which will be automatically populated by our stored procedure
– Cost – this column will hold the load cost of this specific table. In our opinion, this should ideally be the average time it took SSIS to load this table in the recent past. If you don’t have that information available, it might as well something like the size of this table in either Gigabytes or Rows. The more accurate this column is, the better your results will be.

The only tricky part in Ben’s code is figuring out appropriate values for Cost, but if you’ve got rough timing measures or even good priors, you can get to a reasonable solution quickly.  And if time is of the essence, you can model, simulate, and apply results as part of an analytics project.