Press "Enter" to skip to content

Category: Integration Services

Using the SSIS Hadoop Components

Hadi Fadlallah walks us through the HDFS file source and destination components:

To test these components, we will create an SSIS package and add three connection managers:

1. Hadoop Connection Manager: to connect to the Hadoop cluster (check the previous article)
2. OLE DB Connection Manager: to connect to the SQL Server instance where the AdventureWorks2017 database is stored
3. Flat File Connection Manager: to export data from the HDFS Source.

I wonder if they ever fixed the 4K screen resolution problem (kind of tells you how often I use SSIS anymore…). That was one of the things which made these components unusable for me on any modern screen.
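
If you want to sanity-check the Hadoop side before wiring up these connection managers, the Hadoop Connection Manager talks to the cluster over WebHDFS (or WebHCat), so a quick REST call tells you whether the endpoint is even reachable. A minimal PowerShell sketch, with a made-up host name and the classic Hadoop 2.x NameNode HTTP port:

```powershell
# Hypothetical smoke test before configuring the Hadoop Connection Manager.
# Host is a placeholder; port 50070 is the Hadoop 2.x NameNode HTTP default (9870 on 3.x).
$hdfsHost = "my-hadoop-namenode"

# WebHDFS LISTSTATUS on the root path; a JSON listing back means the endpoint is reachable.
$response = Invoke-RestMethod -Uri "http://$($hdfsHost):50070/webhdfs/v1/?op=LISTSTATUS" -Method Get
$response.FileStatuses.FileStatus | Select-Object pathSuffix, type, length
```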

Deleting Packages from the SSIS Catalog

Mala Mahadevan performs important cleanup work:

I will be blogging on a few things I learn on my journey to SSIS expertise. This is the first one. This came about as a result of wanting to delete a few packages from the SSIS catalog. We do not use these packages anymore, and I wanted to clean them out of the project, which resides on a few servers. I looked into a few ways of doing it.

Click through for three methods, including an in-depth discussion of the third (and least obstructive).
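
Her three approaches are in the post itself; for a sense of the bluntest instrument available, here is a hedged sketch of clearing out an entire project with catalog.delete_project (server, folder, and project names are placeholders, and this is not necessarily one of her methods). As far as I know, the project deployment model doesn't expose a supported package-level delete, which is exactly why the gentler options she walks through are worth reading.

```powershell
# Blunt-instrument cleanup: drop a whole project from the SSIS catalog.
# Names are placeholders; requires the SqlServer module for Invoke-Sqlcmd.
Invoke-Sqlcmd -ServerInstance "MySqlServer" -Database "SSISDB" -Query @"
EXEC catalog.delete_project
     @folder_name  = N'MyFolder',
     @project_name = N'MyEtlProject';
"@
```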

Scheduling SSIS Packages in Azure

Magi Naumova takes us through the process of running SSIS in Azure Data Factory, including the scheduling of jobs to run our SSIS packages:

The main purpose of these tools is to force the Lift and Shift approach of migrating and running existing SSIS Packages in Azure. I wouldn’t say that this is the most effective approach to transferring the ETL to Azure, but it could be a good start on the road to a Modern Azure Data Warehouse architecture. If you have already deployed SSIS packages in the Azure SSIS Catalog, then SSMS 18 helps you to put them on a schedule very quickly.

Running SSIS Packages in Azure requires provisioning an SSIS Runtime Engine, an Azure Data Factory instance, and a SQL Database which hosts the SSIS catalog. Scheduling SSIS Packages in Azure requires creating a data flow pipeline in ADF which has a trigger defined for scheduled execution. While describing all those concepts is far beyond the scope of this chapter, a short description would be useful.

Read on for a good amount of detail and a demo which walks through the process.
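
The chapter does this through SSMS 18 and the Data Factory portal; as a rough PowerShell sketch of the same moving parts (Az.DataFactory module, with the pipeline and schedule trigger assumed to already exist, and every name below made up):

```powershell
# Assumes "MyDataFactory" already contains an Azure-SSIS integration runtime, a pipeline
# with an Execute SSIS Package activity, and a schedule trigger wired to that pipeline.
# All names are placeholders.
Start-AzDataFactoryV2IntegrationRuntime -ResourceGroupName "MyRG" `
    -DataFactoryName "MyDataFactory" -Name "Azure-SSIS-IR" -Force

# Run the package once on demand...
Invoke-AzDataFactoryV2Pipeline -ResourceGroupName "MyRG" `
    -DataFactoryName "MyDataFactory" -PipelineName "Run-SSIS-Package"

# ...and enable the schedule trigger for recurring executions.
Start-AzDataFactoryV2Trigger -ResourceGroupName "MyRG" `
    -DataFactoryName "MyDataFactory" -Name "DailySchedule"
```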

Formatting SSIS Packages

Tim Mitchell shows us different options available when formatting SSIS packages:

Most folks I know go to one extreme or the other when it comes to the visual layout of SQL Server Integration Services packages: either they don’t care a tiny bit about the appearance, or they insist on an easy-to-read layout before a project is considered complete. I am definitely in the second group, and will almost always spend the time to make sure my packages are properly laid out.

Regardless of which of these groups you are in, it is very easy to apply some simple visual formatting to your packages. In this quick tip, I’ll show you how to use the visual formatting options in SSIS to automate the layout.

I consider it quite important: it gives a future viewer an idea of your vision as a developer. If I can’t read what you’re doing, that does not bode well.

Build and Deploy SSIS Projects with Azure DevOps

Joost van Rossum has a pair of posts on Azure DevOps updates. First, Azure DevOps supports building SSIS projects:

This new task is much easier to use than the PowerShell code and also easier than most of the third-party tasks. With a little practice, you can now easily create a build task in under two minutes, which is probably faster than the build itself.

If your build fails with the following error message, then you are probably using a custom task or component (like the Blob Storage Download Task). These tasks are not installed on the build agents hosted by Microsoft. The solution is to use a self-hosted agent where you can install all custom components.

Second, Azure DevOps supports deploying SSIS projects:

Microsoft just released the SSIS Deploy task (public preview) which makes it much easier to deploy an SSIS project. Below you will find the codeless steps to deploy artifacts created by the SSIS Build task.

Click through for the step-by-step instructions for each.
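
For context on what the new tasks replace, the pre-task route Joost alludes to usually meant scripting the deployment yourself. A hedged sketch of that older approach, calling ISDeploymentWizard.exe in silent mode (every path and name is a placeholder, and the wizard's install path varies by SQL Server version):

```powershell
# Pre-"SSIS Deploy task" approach: push a built .ispac to the catalog from a script.
# Paths, server, and folder names are placeholders; requires the SQL Server client tools
# (ISDeploymentWizard.exe) on the agent.
& "C:\Program Files\Microsoft SQL Server\150\DTS\Binn\ISDeploymentWizard.exe" `
    /Silent `
    /SourcePath:"C:\drop\MyEtlProject.ispac" `
    /DestinationServer:"MySqlServer" `
    /DestinationPath:"/SSISDB/MyFolder/MyEtlProject"
```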

Embedding SSIS Packages in Azure Data Factory Pipelines

Andy Leonard shows us how to embed an SSIS package inside Azure Data Factory pipelines:

The Azure-SSIS Team has done it again; they’ve added more cool SSIS execution functionality to Azure Data Factory!

Click through to see what has Andy excited. I think this is a big thing for ADF as well, especially in shops which have dedicated a lot of time and energy to building SSIS packages for ETL work over the years.

The SSIS Error Output

Tim Mitchell explains how to use the error output on data flow components in SQL Server Integration Services:

SSIS error outputs are a secondary path through which the data flow can send rows that do not conform to data type, length, or transformation standards defined by the ETL developer. That’s a lengthy way to say that it’s where you can send your junk data. In the SSIS designer, clicking on a source or transformation will often show not one but two possible outputs: the primary output (the “good” data, indicated by the blue line) and the error output (identified by the red line). As shown on the flat file source below, when selecting a source or transformation, those that have an available error output will appear with both output connectors ready for selection.

Tim elaborates quite a bit on what you can do with this output.

Documenting SSIS Catalogs

Dave Mason continues a series on documenting Integration Services:

In the last post, we looked at query options for documenting SSIS packages that are deployed via the legacy package deployment model. What about SSIS packages deployed to a catalog via the project deployment model? Is the package XML accessible so that we can query and shred it? I don’t know with certainty, but I think it is not.

However, there are .NET Framework interfaces we can use to programmatically obtain SSIS catalog metadata. So for this post, I’ll soldier on without my dear friend T-SQL and proceed with PowerShell. Here’s a look at the projects and packages I was working with as I developed and debugged the script code. It’s a short list with just two projects, having one package each. The “DailyETLMain.dtsx” package is a sample from Microsoft’s GitHub repository.

Click through for Dave’s explanation and a link to the script.
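
Dave's actual script is linked from his post; just to show the shape of the approach, here is a minimal sketch (not his script) of walking the catalog with the same management object model, using a placeholder server name:

```powershell
# Minimal sketch: enumerate folders, projects, and packages in the SSIS catalog
# via Microsoft.SqlServer.Management.IntegrationServices. Server name is a placeholder.
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.Management.IntegrationServices") | Out-Null

$conn = New-Object System.Data.SqlClient.SqlConnection "Data Source=MySqlServer;Initial Catalog=master;Integrated Security=SSPI;"
$ssis = New-Object Microsoft.SqlServer.Management.IntegrationServices.IntegrationServices $conn

foreach ($folder in $ssis.Catalogs["SSISDB"].Folders) {
    foreach ($project in $folder.Projects) {
        foreach ($package in $project.Packages) {
            [pscustomobject]@{
                Folder      = $folder.Name
                Project     = $project.Name
                Package     = $package.Name
                Description = $package.Description
            }
        }
    }
}
```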

Documenting Integration Services Packages

Dave Mason continues a quest for documentation:

The output is a tabular result showing each SSIS package, its Name and Description, and the top-level task Names and Descriptions of each subplan. (In addition to the “DailyETLMain” package, we see metadata for some other plans related to the Management Data Warehouse Data Collector.) Note that the 16 rows of metadata for the “DailyETLMain” package correspond to the 16 top-level objects of the package; the query doesn’t recurse into containers to obtain their object metadata. I decided not to attempt that, since it seemed like overkill for documentation purposes. Another caveat: the results order may not match the order that’s mandated by Precedence Constraints in the Visual Studio designer.

Click through for the full example.
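
For packages deployed the legacy way (stored in msdb), the general recipe is to cast packagedata to XML and shred out the top-level executables. This is not Dave's query, just a sketch of that idea in PowerShell (server name is a placeholder; it assumes the 2012+ package XML format and the SqlServer module for Invoke-Sqlcmd):

```powershell
# Sketch only: list top-level task names and descriptions for packages stored in msdb
# (legacy package deployment model). No recursion into containers.
$rows = Invoke-Sqlcmd -ServerInstance "MySqlServer" -Database "msdb" -Query @"
SELECT [name],
       CAST(CAST(packagedata AS varbinary(max)) AS xml) AS PackageXml
FROM dbo.sysssispackages;
"@

$dtsNs = "www.microsoft.com/SqlServer/Dts"
foreach ($row in $rows) {
    $xml = [xml]$row.PackageXml
    $ns  = New-Object System.Xml.XmlNamespaceManager($xml.NameTable)
    $ns.AddNamespace("DTS", $dtsNs)

    foreach ($task in $xml.SelectNodes("/DTS:Executable/DTS:Executables/DTS:Executable", $ns)) {
        [pscustomobject]@{
            Package     = $row.name
            TaskName    = $task.GetAttribute("ObjectName", $dtsNs)
            Description = $task.GetAttribute("Description", $dtsNs)
        }
    }
}
```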

SSIS Term Extraction

Tim Mitchell takes us through term extraction in SQL Server Integration Services:

The SSIS term extraction component is a data flow transformation that will aggregate the frequency of words found in a specified column supplied by an upstream data source. This component expects a Unicode text or text stream field as an input, and calculates as an output the frequency of the nouns and/or noun phrases in the specified source column. As shown below on the data flow surface, the term extraction component (circled) is always used as a transformation, accepting exactly one input and expecting either one or two outputs (the second being an optional error output).

This is one component I’ve never used before.
