
Category: Integration Services

Looping Over Files In SSIS

Tim Mitchell shows us how to use the foreach loop component to iterate over a set of files:

The SSIS foreach loop is configured to allow you to easily ingest multiple data files in a single data flow task. For this to work, all of the files would need to reside in the same directory structure (either locally or on the network), and they must all have the same structure and metadata.
In this design, the data flow is contained within the foreach container, which will execute the contents of that data flow task once for each file found in the specified directory.

This gives us a good pattern for loading a bunch of text files, such as monthly extracts from a different system.


Tips For Migrating SSISDB

Kenneth Fisher shares some thoughts on SSISDB:

We’ve been doing a lot of upgrading recently and at one point had to move an instance from one 2016 server to another. In the process, we found out (the hard way) that it’s not that easy to move SSISDB (the SSIS Catalog that may or may not be named SSISDB). I mean it’s not hard, but it’s definitely not a basic backup/restore. The full BOL instructions on how to do this are here. That said, here are the elements that are involved.

Read on for the list as well as an order of operations.
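The crypto is the tricky part: SSISDB's sensitive data is protected by a database master key, so that key has to move along with the database. A minimal sketch of those steps (paths and passwords are placeholders; the full BOL procedure also covers the catalog logins, the SSIS maintenance job, and enabling CLR):

    -- On the source instance: back up the catalog and its database master key.
    USE SSISDB;
    BACKUP MASTER KEY TO FILE = 'C:\Backup\SSISDB_key'
        ENCRYPTION BY PASSWORD = 'KeyBackupP@ssw0rd!';
    BACKUP DATABASE SSISDB TO DISK = 'C:\Backup\SSISDB.bak';

    -- On the destination instance (Integration Services already installed):
    RESTORE DATABASE SSISDB FROM DISK = 'C:\Backup\SSISDB.bak' WITH REPLACE;

    -- Restore the master key and re-encrypt it for the new instance.
    USE SSISDB;
    RESTORE MASTER KEY FROM FILE = 'C:\Backup\SSISDB_key'
        DECRYPTION BY PASSWORD = 'KeyBackupP@ssw0rd!'  -- password used for the key backup
        ENCRYPTION BY PASSWORD = 'NewCatalogP@ssw0rd!'
        FORCE;
    OPEN MASTER KEY DECRYPTION BY PASSWORD = 'NewCatalogP@ssw0rd!';
    ALTER MASTER KEY ADD ENCRYPTION BY SERVICE MASTER KEY;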


Testing Package Properties With ssisUnit

Bartosz Ratajczyk shows how you can test certain properties on an Integration Services package using ssisUnit:

The command is simple. You can get or set the property using the value for a given property path. As usual – when you get the value, you leave the value blank. The path – well – is the path to the element in the package or the project. You use backslashes to separate elements in the package tree, and at the end, you use .Properties[PropertyName] to read the property. If you use the elements collection – like connection managers – you can pick a single element using square brackets and the name of this element.

Read on for more, including limitations and useful testing scenarios.


Validating SSIS Packages Using T-SQL

Annie Xu shows us how to validate SSIS packages in the SSISDB catalog using T-SQL:

Recently, I needed to do a data warehouse migration for a client. Since there might be some differences between the Dev environment source databases and the Prod environment source databases, the migrated SSIS packages for building the data warehouse might have some failures because of the changes. So the challenge was: how could I validate all of my DW packages (100+) at once?

Click through for the script.
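The workhorse here is SSISDB's built-in catalog.validate_project procedure. As a sketch of how you might drive it for every project in a folder (the folder name is a placeholder, and this is not necessarily Annie's exact approach):

    -- Kick off a full validation for each project in one SSISDB folder.
    DECLARE @folder nvarchar(128) = N'DW',   -- placeholder folder name
            @project nvarchar(128),
            @validation_id bigint;

    DECLARE project_cursor CURSOR FAST_FORWARD FOR
        SELECT p.name
        FROM SSISDB.catalog.projects AS p
        JOIN SSISDB.catalog.folders AS f ON f.folder_id = p.folder_id
        WHERE f.name = @folder;

    OPEN project_cursor;
    FETCH NEXT FROM project_cursor INTO @project;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        EXEC SSISDB.catalog.validate_project
            @folder_name   = @folder,
            @project_name  = @project,
            @validate_type = N'F',            -- full validation
            @validation_id = @validation_id OUTPUT;
        FETCH NEXT FROM project_cursor INTO @project;
    END;
    CLOSE project_cursor;
    DEALLOCATE project_cursor;

    -- Review the results (status 7 = succeeded, 4 = failed).
    SELECT folder_name, project_name, status, start_time, end_time
    FROM SSISDB.catalog.validations
    ORDER BY start_time DESC;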


Contrasting Integration Services And Pentaho Data Integration

Koen Verbeeck contrasts SQL Server Integration Services with Pentaho Data Integration:

For generating SSIS packages, you need to rely on Biml (much about that can be found on this blog or on the net), or older frameworks such as ezApi. Or you need 3rd party tools such as BimlStudio or TimeXtender. Using Biml means writing XML and .NET. Don’t get me wrong, I love Biml and I use it a lot in my SSIS projects.

But generating transformations in PDI is so much easier. First, you create a template (you create a transformation, but you leave certain fields empty, such as the source SQL statement and the destination table). Then you have another transformation reading metadata. This metadata is pushed to the template using the Metadata Injection Transformation. In this transformation, you point to the template and you map those empty fields to your metadata fields.

It’s interesting to see where each product stands out or falls flat compared to the other, and Koen’s comparison is definitely not a one-sided bout.
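Whichever tool does the generating, the pattern is the same: the template has holes, and a metadata set fills them. A purely hypothetical example of the kind of table such a process might read, with one generated load per row:

    -- Hypothetical metadata table driving package/transformation generation;
    -- each row fills the template's empty source-SQL and target-table fields.
    CREATE TABLE dbo.EtlMetadata (
        LoadName     sysname       NOT NULL,
        SourceQuery  nvarchar(max) NOT NULL,  -- the source SQL statement
        TargetTable  sysname       NOT NULL   -- the destination table
    );

    INSERT dbo.EtlMetadata (LoadName, SourceQuery, TargetTable)
    VALUES (N'Customer', N'SELECT * FROM src.Customer', N'stg.Customer'),
           (N'Orders',   N'SELECT * FROM src.Orders',   N'stg.Orders');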


Using Biml To Read Excel Files Without Excel

Bill Fellows follows up on his prior post and shows how you can write BimlScript to parse an Excel file without having Microsoft Office installed:

My resources are quite simple: an Excel spreadsheet containing metadata, a driver program, and a package template.

The template is your standard truncate and reload pattern with the target table being specified by a parameter. The client validates data by running processes in parallel, so the existing mainframe process delivers data to the Billing table while ours delivers to a Billing_NEW table. Once they accept the new process, the target table becomes Billing and the NEW table is dropped. I decided the most native SSIS route would be to specify the target table as a parameter. We originally had a boolean parameter indicating whether we were loading the new table or the production one, but that was more logic and overhead than just specifying which table to load. I force their queries to be dirty reads, as some of these queries can be rather messy.

Click through for the script and explanation.
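As an aside, the same no-Office trick is available from T-SQL if the ACE OLE DB redistributable is installed on the server: OPENROWSET can read a worksheet directly. A sketch with a placeholder file path and sheet name (it needs the Ad Hoc Distributed Queries setting, which has security implications):

    -- One-time setup: allow ad hoc OPENROWSET queries.
    EXEC sp_configure 'show advanced options', 1;
    RECONFIGURE;
    EXEC sp_configure 'Ad Hoc Distributed Queries', 1;
    RECONFIGURE;

    -- Read a worksheet as a rowset; HDR=YES treats row 1 as column headers.
    SELECT *
    FROM OPENROWSET(
        'Microsoft.ACE.OLEDB.12.0',
        'Excel 12.0;Database=C:\Temp\PackageMetadata.xlsx;HDR=YES',
        'SELECT * FROM [Sheet1$]'
    );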


Azure Data Factory Or Integration Services?

Teo Lachev contrasts use cases for Integration Services versus Azure Data Factory V2:

So, ADF was incorrectly positioned as “SSIS for the Cloud” and unfortunately once that message made it out there was a messaging problem that Microsoft has been fighting ever since. Like Azure ML, on the glory road to the cloud things that were difficult with SSIS (installation, projects, deployment) became simple, and things that were simple became difficult. Naturally, Microsoft took a lot of criticism from the customers and community, including from your humble correspondent. ADF, of course, has nothing to do with SSIS, thus leaving many data integration practitioners with a difficult choice: should you take the risk and take the road less traveled with ADF, or continue with the tried-and-true SSIS for data integration on Azure?

To Microsoft’s credit, ADF v2 has made significant enhancements in features, usability, and maintainability. There is also a “lift and shift” option to run SSIS inside ADF, but since this architecture requires a VM, I consider it a narrow-case scenario, such as when you need to extend ADF with SSIS features that it doesn’t have. Otherwise, why would you start new development with SSIS hosted under ADF, if you could provision and license the VM yourself and have full control over it?

All in all, Teo is not the biggest fan of ADF at this point and leans heavily toward SSIS; read on for the reasoning.


Azure Data Factory V2 Dependencies

Meagan Longoria has important notes on how Azure Data Factory V2 dependencies differ from SQL Server Integration Services precedence constraints:

This sounds similar to SSIS precedence constraints, but there are a couple of big differences.

  1. SSIS allows us to define expressions to be evaluated to determine if the next task should be executed.
  2. SSIS allows us to choose whether we handle multiple constraints as a logical AND or a logical OR. In other words, do we need all constraints to be true, or just one?

ADF V2 activity dependencies are always a logical AND. While we can design control flows in ADF similar to how we might design control flows in SSIS, this is one of several differences. Let’s look at an example.

Meagan gives us three methods of replicating SSIS functionality using ADF V2, so check it out.


An Update To ssisUnit

Bartosz Ratajczyk has added some functionality to ssisUnit:

Second – you can get and set the properties of the project and its elements, like overwriting project connection managers (I designed it with this particular need in mind). You can now set the connection string to a different server (or database) – in the PropertyPath of the PropertyCommand, use \Project\ConnectionManagers, write the name of the connection manager with the extension, and use one of the Properties. You can do it during the test setup (or all-tests setup), but not during the test suite setup, as ssisUnit is not aware of the project until it loads it into memory.

Good on Bartosz for resurrecting a stable but moribund project and adding some enhancements.


Azure Data Factory V2 Or SSIS?

Merrill Aldrich explains the differences between Azure Data Factory V2 and SQL Server Integration Services:

The next consideration is a bit more involved if you are new to data integration. Both of these tools excel at transporting data from place to place, but they have important differences in terms of what you can do to modify the data in transit. As a matter of emphasis, ADF has more features geared toward moving the data than performing any complex transformation along the way. SSIS, on the other hand, was built with a large library of transformations that you can chain together to make elaborate data flows including lookups, matching, splitting data, and more.

The tools also overlap quite a lot. In projects this seems to lead to the question of whether you’ll transform the data “in flight” using Extract Transform Load (ETL), or instead move the data to a destination where it’ll be transformed using Extract Load Transform (ELT).

These are not “pretty much the same thing” and Merrill does a good job of explaining what those differences in design mean for the products.
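To make the ELT half concrete: once the tool has landed the raw data, the transformation is just set-based SQL at the destination. A toy sketch with hypothetical staging and warehouse tables:

    -- ELT: data was extracted and loaded raw into stg.Sales; the
    -- transform (a dimension-key lookup) runs in the destination engine.
    INSERT INTO dw.FactSales (CustomerKey, SaleDate, Amount)
    SELECT c.CustomerKey,
           s.SaleDate,
           s.Amount
    FROM stg.Sales AS s
    JOIN dw.DimCustomer AS c
        ON c.CustomerId = s.CustomerId;  -- the lookup, done in SQL rather than in-flight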
