Traversing Foreign Keys Using Biml

Kevin Feasel


Biml, ETL

Ben Weissman has a two-part series on loading a set of tables based on foreign key constraints.  Part 1 is linear loads:

All our previous posts were running data loads in parallel, ignoring potential foreign key constraints. But in real life scenarios, your datawarehouse may actually have tables refering to each other using such, meaning that it is crucial to create and populate them in the right order.

In this blog post, we’ll actually solve 2 issues at once: We’ll provide a list of tables, will then identify any tables that our listed tables rely on (recursively) and will then create and load them in the right order.

In this sample, we’ll use AdventureWorksDW2014 as our source and transfer the FactInternetSales-table as well as all tables it is connected to through foreign key constraints. Eventually, we will create all these tables including the constraints in a new database AdventureWorksDW2014_SalesOnly (sorting them so we get no foreign key violations) and eventually populate them with data.

Part 2 is parallel loads:

After the first excitment about how easy it actually was to take care of that topology, you might ask yourself: Why does it have to run linear? That takes way too long. And you’re right – and it doesn’t have to.

All we need to do is:

– Create a list of all the tables that we’ve already loaded (which will be empty at that point)
– Identify all tables that do not reference any other tables
– Load these tables, each followed by all tables that only reference this single table – recursively and add them to list of loaded tables
– Once that is done, load all tables that are referencing multiple tables where all required tables have been loaded before – and again, add them to the list
– Repeat this until no table is left to load (or for a maximum of 10 times in this example)
– If, for whichever reason, any tables are left, load them sequentially using the TopoSort function:

This is a very interesting way of using Biml to traverse the foreign key tree.  I’ve normally used recursive CTEs in T-SQL to do the same, but I’ll have to play around with this method.

Related Posts

Using Biml With Azure Data Factory v2

Ben Weissman shows how you can use BimlStudio to build ADF v2 flows: As you may have seen at PASS Summit 2017 or another event, with the announcement of Azure Data Factory v2 (adf), Biml will natively support adf objects. Please note, that the native support is currently only available in BimlStudio 2018. If you’re […]

Read More

Building A Biml Project: Adding Tasks

Kevin Feasel



Martin Schoombee continues his series on building a Biml project by adding SSIS tasks: The <DirectInput> tag is pretty self-explanatory and executes the stored procedure (in this case). It could also have been a SQL query. The <Results> element contains a <Result> child element for each field returned in our query. I found this a little confusing at first, but the Name property’s […]

Read More


September 2016
« Aug Oct »