Category: Integration Services

SSIS Term Extraction

Tim Mitchell takes us through term extraction in SQL Server Integration Services:

The SSIS term extraction component is a data flow transformation that will aggregate the frequency of words found in a specified column supplied by an upstream data source. This component expects a Unicode text or text stream field as an input, and calculates as an output the frequency of the nouns and/or noun phrases in the specified source column. As shown below on the data flow surface, the term extraction component (circled) is always used as a transformation, accepting exactly one input and expecting either one or two outputs (the second being an optional error output).

This is one component I’ve never used before.
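
To get a feel for the kind of output described above, here is a minimal Python sketch of term frequency counting. It is not how the SSIS component works internally: the real transformation keeps only nouns and noun phrases, while this sketch just drops a small, made-up stop-word list. The function name and sample rows are hypothetical.

```python
import re
from collections import Counter

# Hypothetical stop-word list. The SSIS component identifies nouns and
# noun phrases; this sketch only filters a few common words instead.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to", "was"}

def term_frequencies(rows):
    """Count how often each non-stop-word appears across all input rows."""
    counts = Counter()
    for text in rows:
        words = re.findall(r"[a-z']+", text.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts

# Made-up input standing in for the upstream text column.
rows = [
    "The package failed in the data flow",
    "The data flow task loaded the package",
]
freq = term_frequencies(rows)
```

Calling `freq.most_common()` would list the terms ordered by frequency, which is roughly the shape of result the transformation produces: a term and a score.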

SSIS Project Connections

Tim Mitchell shows how we can use project connections in SQL Server Integration Services:

In most use cases, the same connection will be used across multiple packages in the same project. In early versions of SSIS (pre-2012), each package would have its own connection manager for every connection used in that package. Creating and maintaining all those connection managers could be time-consuming as the number of packages grows. Starting with SQL Server 2012, Microsoft added project connections to SSIS, allowing for the creation of connections that were accessible across all packages in a project. Instead of having to create a copy of each connection manager in every package, developers can now simply create the connection at the project level. Project connections will automatically show up in the connection manager tray for all packages in that project.

Click through to see how you can create one and get rid of per-package connections.

Databricks versus Mapping Data Flows

Helge Rege Gardsvoll contrasts Azure Databricks, Azure Data Factory Mapping Data Flows, and SQL Server Integration Services:

Mapping Data Flows
One of the many data flows from Microsoft these days, this one provides data transformation capabilities within Data Factory for the first time. It is not a U-SQL script or Databricks notebook orchestrated from Data Factory, but an integrated tool. This means that you can reuse (many of) the datasets you have defined in Data Factory, which you cannot do in Databricks.

Mapping Data Flows runs on top of Databricks, but the cluster is handled for you and you don’t have to write any of that Scala code yourself.

Read on for the full comparison.

SSIS Design Preferences

Meagan Longoria systematizes a set of preferences regarding Integration Services package and ETL process design:

– Every table should have InsertDateTime and UpdateDateTime columns. The UpdateDateTime column should be populated with the same value as the InsertDateTime column upon creation of the row, rather than being left null.
– Whatever you use to create tables, include primary keys, foreign keys, and indexes with your table definitions. Provide explicit constraint names to simplify database comparisons. You can disable your foreign keys, but they need to be there to provide that metadata.
– Separate your final dimensional/reporting tables from audit tables and staging tables. This can be done with separate schemas or even separate databases.
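
A small sketch of the first two preferences, using Python's built-in sqlite3 module so it runs anywhere. The table, constraint, and index names are made up for illustration, and SQLite syntax differs from SQL Server's, so treat it as a shape rather than a template.

```python
import sqlite3
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables: explicit constraint names, keys, and an index are
# all declared together with the table definitions.
conn.executescript("""
CREATE TABLE DimCustomer (
    CustomerKey    INTEGER NOT NULL,
    CustomerName   TEXT    NOT NULL,
    InsertDateTime TEXT    NOT NULL,
    UpdateDateTime TEXT    NOT NULL,
    CONSTRAINT PK_DimCustomer PRIMARY KEY (CustomerKey)
);
CREATE TABLE FactOrder (
    OrderKey       INTEGER NOT NULL,
    CustomerKey    INTEGER NOT NULL,
    InsertDateTime TEXT    NOT NULL,
    UpdateDateTime TEXT    NOT NULL,
    CONSTRAINT PK_FactOrder PRIMARY KEY (OrderKey),
    CONSTRAINT FK_FactOrder_DimCustomer
        FOREIGN KEY (CustomerKey) REFERENCES DimCustomer (CustomerKey)
);
CREATE INDEX IX_FactOrder_CustomerKey ON FactOrder (CustomerKey);
""")

def insert_customer(conn, key, name):
    """Insert a row with UpdateDateTime set to the same value as InsertDateTime."""
    now = datetime.now(timezone.utc).isoformat()
    conn.execute(
        "INSERT INTO DimCustomer VALUES (?, ?, ?, ?)", (key, name, now, now)
    )

insert_customer(conn, 1, "Contoso")
row = conn.execute(
    "SELECT InsertDateTime, UpdateDateTime FROM DimCustomer"
).fetchone()
```

Because both audit columns get the same value at creation, a query for rows changed since a given time needs to check only UpdateDateTime and never has to handle nulls.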

People have added some more thoughts in the comments as well.

Maintaining SSISDB

John McCormack was in a jam:

I made 2 unsuccessful attempts at running the SSIS Server Maintenance Job. However, after several hours of processing and still no available free space in the database, I knew the job wasn’t coping with the sheer number of rows it had to delete. The deletes all happen from the parent table (internal.operations) and then all child tables using cascading deletes. This approach maintains referential integrity but is not great for performance.

Due to this, I needed a new approach to the maintenance of SSISDB. As we hadn’t maintained these tables for 13/14 months, I was asking too much of SQL Server to let me delete everything at once. 

Read on for the solution.
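
The excerpt does not show John's fix, so the following is not his solution. It is only a generic sketch of one common way to avoid deleting everything at once: remove parent rows in small batches and commit after each one. It uses Python's sqlite3 with made-up tables that loosely mimic a parent table with a cascading child; the names, the `created_day` column, and the batch size are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite needs this for cascades

# Hypothetical stand-ins for a parent table and one cascading child table.
conn.executescript("""
CREATE TABLE operations (
    operation_id INTEGER PRIMARY KEY,
    created_day  INTEGER NOT NULL
);
CREATE TABLE operation_messages (
    message_id   INTEGER PRIMARY KEY,
    operation_id INTEGER REFERENCES operations (operation_id) ON DELETE CASCADE
);
""")
conn.executemany(
    "INSERT INTO operations VALUES (?, ?)", [(i, i) for i in range(1, 1001)]
)
conn.executemany(
    "INSERT INTO operation_messages (operation_id) VALUES (?)",
    [(i,) for i in range(1, 1001)],
)
conn.commit()

def purge_in_batches(conn, cutoff_day, batch_size=100):
    """Delete old parent rows a batch at a time, committing after each batch."""
    batches = 0
    while True:
        cur = conn.execute(
            "DELETE FROM operations WHERE operation_id IN ("
            " SELECT operation_id FROM operations"
            " WHERE created_day < ? LIMIT ?)",
            (cutoff_day, batch_size),
        )
        conn.commit()  # keep each transaction small
        if cur.rowcount == 0:
            break
        batches += 1
    return batches

batches = purge_in_batches(conn, cutoff_day=901)
remaining_ops = conn.execute("SELECT COUNT(*) FROM operations").fetchone()[0]
remaining_msgs = conn.execute(
    "SELECT COUNT(*) FROM operation_messages"
).fetchone()[0]
```

Each batch is its own short transaction, so the log and locks stay small even when the backlog is large. The cascade still removes child rows, just a slice at a time.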

Drawing SSIS Packages

Bartosz Ratajczyk continues a quest to draw SSIS packages as SVGs:

To get the Value and Expression properties I need to find the precedence constraint in the .dtsx file during the XSL transformations. It requires three changes in the package2svg.xsl:

– I have to pass the name of the .dtsx file
– I have to read the XML from the .dtsx file
– I have to use the DTS namespace because it’s the namespace of the .dtsx file
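
Bartosz does this in XSLT. Purely as an illustration of the namespace point, here is a Python sketch using the standard library's ElementTree instead. The sample XML is a heavily simplified, made-up fragment rather than a real package, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# The DTS namespace URI as it is declared in .dtsx files.
DTS_NS = {"DTS": "www.microsoft.com/SqlServer/Dts"}

# Simplified, made-up fragment; a real .dtsx file has far more content.
SAMPLE = r"""<DTS:Executable xmlns:DTS="www.microsoft.com/SqlServer/Dts">
  <DTS:PrecedenceConstraints>
    <DTS:PrecedenceConstraint DTS:From="Package\Load" DTS:To="Package\Archive"
        DTS:Value="1" DTS:Expression="@[User::RowCount] &gt; 0" />
  </DTS:PrecedenceConstraints>
</DTS:Executable>"""

def precedence_constraints(xml_text):
    """Return From/To/Value/Expression for each precedence constraint."""
    root = ET.fromstring(xml_text)

    def attr(name):
        # ElementTree stores namespaced attributes as "{uri}name".
        return "{%s}%s" % (DTS_NS["DTS"], name)

    return [
        {
            "from": pc.get(attr("From")),
            "to": pc.get(attr("To")),
            "value": pc.get(attr("Value")),
            "expression": pc.get(attr("Expression")),
        }
        for pc in root.iterfind(".//DTS:PrecedenceConstraint", DTS_NS)
    ]

constraints = precedence_constraints(SAMPLE)
```

Without the namespace mapping, the search for `PrecedenceConstraint` would find nothing, which is the same reason the stylesheet has to declare the DTS namespace.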

Read on for more. So far, Bartosz has covered the control flow.

Drawing SSIS Packages as SVGs

Bartosz Ratajczyk continues a series on taking SSIS packages and generating SVGs from their control flows:

To make things harder, the layout of the sequences and tasks is not some nested XML structure. All of the elements have the same parent – <GraphLayout>, meaning all of them are at the same tree level. Also – there is no attribute showing where a particular object belongs. Almost. In the example with the sequences, I see two regularities:
– the outer container is placed later in the XML than the inner container
– the @Id attributes show the nesting of the objects
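
A rough sketch of the second regularity, in Python. It assumes the @Id values are backslash-separated paths such as `Package\Outer\Inner\Task`; that format is an assumption on my part, not something shown in the excerpt, and the helper names and sample ids are made up.

```python
def parent_of(object_id):
    """Return the containing object's id, or None for the top-level object."""
    head, sep, _ = object_id.rpartition("\\")
    return head if sep else None

def nesting_depth(object_id):
    """Depth 0 is the package itself; each backslash adds one level."""
    return object_id.count("\\")

# Made-up ids, listed inner-first to mirror the first regularity.
ids = [
    r"Package\Outer\Inner\Task",
    r"Package\Outer\Inner",
    r"Package\Outer",
]
parents = {i: parent_of(i) for i in ids}
depths = {i: nesting_depth(i) for i in ids}
```

With a parent lookup like this, a flat list of layout elements can be regrouped into a tree before drawing.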

I’m not sure how often I’d use this in practice, but if you want to understand some of the internals of SSIS, this is an interesting series to follow.

Executing Azure SSIS Packages from Blob Storage

Andy Leonard cranks it to the next level:

I confess: I have been waiting for this feature since I first learned of Azure-SSIS.

When I first saw Azure-SSIS, which creates an Azure Data Factory Integration Runtime and SSIS Catalog in the cloud, my first thought was a paraphrase of Ferris Bueller’s question about dyeing the river green: “If we can execute SSIS packages from the SSIS Catalog in Azure Data Factory, why can’t we execute SSIS packages from Azure Blob Storage?” Today, we can.

Read on to see how you can do it.

Converting Existing SSIS Packages to Biml

David Stein shows off a conversion tool built into BimlExpress:

BimlExpress is a free Visual Studio add-in created by the good folks at Varigence. It’s a full-featured Biml editor which allows you to dynamically create SSIS packages. It was first released back in 2017, and the latest version is 2019 (of course). The current version supports Visual Studio 2010 through 2019 as well as SQL Server 2005 through 2019.

Prior to its release, Biml was written with BIDS Helper, now known as BI Developer Extensions. While BI Developer Extensions has many nice features, you should no longer use it to work with Biml as it is no longer being updated/supported.

I’m pleasantly surprised by this. The conversion tool used to be limited to BimlStudio (née Mist) and BimlOnline.
