Press "Enter" to skip to content

Category: Integration Services

Understanding Data Integration Lifecycle Management

Andy Leonard explains DILM:

Data Integration Lifecycle Management (DILM) is not about data integration development.

DILM is about everything else:

  • Configurations Management
  • Version Management
  • Deployment
  • Execution

Although DILM is not about development, implementing DILM will impact the design of SSIS solutions.

This is the first part in a multi-part series and covers some of the conceptual basics behind DILM.

Comments closed

REPLACENULL

Louis Davidson shows a quick SSIS function to replace NULL values:

Which I looked up every..single..time I used it. “?” means THEN…not IF? “:” means ELSE? Huh?  I know this comes from one of those cool languages that I have never mastered, but as I was searching for the syntax again a few days ago, I found REPLACENULL. I had never seen this function before, so I figured I might not be the only one. And perhaps if a commenter feels like telling me how dumb I am to not know about other new expression features I will not be offended. REPLACENULL won’t replace every use of these and the other symbols one must use for SSIS expressions, but it does replace one of the more common ones.

Click through for usage.  It’s a bit easier to understand than the ternary operator.  To answer Louis’s question, a ? b : c comes from C# syntax.
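
To make the comparison concrete, here are two equivalent SSIS expressions (Name is a hypothetical string column): the first uses the ternary syntax, the second the REPLACENULL function, which returns its second argument whenever the first is NULL.

    ISNULL(Name) ? "Unknown" : Name
    REPLACENULL(Name, "Unknown")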

Comments closed

Broken References In SSISDB

Andy Leonard explains how broken environment references can come into being within the SSIS Catalog:

If the reference was broken after the SSIS package execution was scheduled, we may see an error similar to that shown below in the SQL Agent log for the job step that attempted to execute the SSIS package:

Failed to execute IS server package because of error 0x80131904. Server: vmSql16\Test, Package path: \SSISDB\Test\ParametersTest\SensitiveTest.dtsx, Environment reference Id: 35.  Description: The environment ‘env2’ does not exist or you have not been granted the appropriate permissions to access it.

Andy has an explanation of what these are, how you might find them, and how to fix them.
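
If you just want a quick list of suspect references, a rough T-SQL sketch against the SSISDB catalog views might look like the following. This is my own approximation rather than Andy's method: it matches each reference to an environment by name and folder, and flags references with no match. Run it with enough permission to see all catalog objects, since these views filter by permissions.

    SELECT er.reference_id,
           p.name AS project_name,
           er.environment_name,
           er.reference_type
    FROM SSISDB.catalog.environment_references AS er
        INNER JOIN SSISDB.catalog.projects AS p
            ON p.project_id = er.project_id
        LEFT OUTER JOIN SSISDB.catalog.environments AS e
            ON e.name = er.environment_name
           AND e.folder_id = CASE er.reference_type
                                 -- 'R' = relative reference: environment lives in the project's folder
                                 WHEN 'R' THEN p.folder_id
                                 -- otherwise absolute: the folder is named on the reference itself
                                 ELSE (SELECT f.folder_id
                                       FROM SSISDB.catalog.folders AS f
                                       WHERE f.name = er.environment_folder_name)
                             END
    WHERE e.environment_id IS NULL;  -- no matching environment = broken reference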

Comments closed

Data Flow Sequence Containers

Todd McDermid is excited about data flow groups in Integration Services:

Data Flow Groups

Data Flow Groups is what they’re calling it, and it’s deceptively simple to use.  One of the reasons I’m sure I (and SSIS people I talk to who DID NOT LET ME KNOW IT WAS THERE) missed it is because I was expecting it to be a component in the toolbox.  Not so.
Code up your Data Flow as you normally would.  Then go and select the components that you want to group together – via clicking and dragging a selection window, or click-selecting components.  Any component combinations you want.  Then right-click and select Group.

I admit that I didn’t know it existed either.  This does seem rather useful.

Comments closed

SSIS Fast Load

Chris Taylor runs into an issue with the OLE DB Destination’s fast load option in Integration Services:

What I do want to bring to your attention is the difference between the two when it comes to redirecting error rows, specifically rows that are truncated. One of the beauties of SSIS is the ability to output rows that fail to import through the error pipeline and push them into an error table, for example. With fast load there is a downside to this: the whole batch will be output even if only 1 row fails. There are ways to handle this, and a tried and tested method is to push those rows into another OLE DB Destination where you can run them in progressively smaller batches, or simply run that batch row by row, to eventually output the 1 error you want. Take a look at Marco Schreuder’s blog for how this can be done.

One of the issues we have experienced in the past is that any truncation of a column’s data in fast load will not force the package to fail. What? So a package can succeed when in fact the data itself could potentially not be complete!?! Yes, this is certainly the case; let’s take a quick look with an example.

Read on for details and potential workarounds.
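
As one quick post-load sanity check (my own suggestion, with hypothetical table and column names, not from Chris’s post), you can compare the staged source against the destination and flag any rows whose values were silently shortened:

    -- Flag rows whose value was silently truncated during the fast load
    SELECT s.id, s.wide_col AS source_value, d.wide_col AS loaded_value
    FROM staging.SourceRows AS s
        INNER JOIN dbo.DestinationRows AS d
            ON d.id = s.id
    WHERE LEN(s.wide_col) > LEN(d.wide_col);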

Comments closed

Test Connection With HDInsight

I have a post trying to test a connection using HDInsight:

WebHCat is a web-based REST API for HCatalog, a management layer for dealing with files in HDFS.  If you’re looking for configuration settings for WebHCat, you’ll generally want to look for “templeton” in config files, as Templeton was the project name before WebHCat.  In Ambari, you can go to the Hive configs and look at webhcat-site.xml for configuration settings.  For WebHCat, the default port in HDInsight is 30111, which you should find in the templeton.port configuration setting.

I don’t like the fact that WebHDFS is blocked, but at least WebHCat is functional.
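
For reference, a quick way to exercise the endpoint is to hit WebHCat’s status resource with curl (CLUSTERNAME and the admin account are placeholders for your own values):

    # From outside the cluster, via the HDInsight gateway over HTTPS
    curl -u admin "https://CLUSTERNAME.azurehdinsight.net/templeton/v1/status"

    # From the headnode running WebHCat, directly against templeton.port
    curl "http://localhost:30111/templeton/v1/status"

    # A healthy response looks like: {"status":"ok","version":"v1"}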

Comments closed

Alert On SQL Jobs Missing Schedules

Brian Hansen wraps up a three-part series on scheduled job alerts:

The first two parts of this series addressed the general approach that I use in an SSIS script task to discover and alert on missed SQL Agent jobs. With apologies for the delay in producing this final post in the series, here I bring these approaches together and present the complete package.

To create the package, start with an empty SSIS package and add a data flow task. In the task, add the following transformations.

Regardless of how you do it, knowing when jobs fail is important enough to build some infrastructure around answering this question.
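
If you want a simpler starting point than a full SSIS package, a rough T-SQL sketch over msdb can surface enabled, scheduled jobs whose most recent outcome is missing or stale. This is my own simplification, not Brian’s data flow, and the one-day threshold is an arbitrary placeholder to adjust for your schedules:

    SELECT j.name AS job_name,
           MAX(msdb.dbo.agent_datetime(h.run_date, h.run_time)) AS last_run
    FROM msdb.dbo.sysjobs AS j
        INNER JOIN msdb.dbo.sysjobschedules AS js
            ON js.job_id = j.job_id
        LEFT OUTER JOIN msdb.dbo.sysjobhistory AS h
            ON h.job_id = j.job_id
           AND h.step_id = 0  -- step 0 is the job outcome row
    WHERE j.enabled = 1
    GROUP BY j.name
    -- A scheduled job with no recorded run, or a stale last run, deserves a closer look
    HAVING MAX(msdb.dbo.agent_datetime(h.run_date, h.run_time)) IS NULL
        OR MAX(msdb.dbo.agent_datetime(h.run_date, h.run_time)) < DATEADD(DAY, -1, GETDATE());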

Comments closed

Finding Packages Which Use Configurations

Bill Fellows explains how to find SSIS packages still using the Configuration option in the classic deployment model:

Create an SSIS package. Add a Variable to your package called FolderSource and assign it the path to your SSIS packages. Add a Script Task to the package and then add @[User::FolderSource] to the ReadOnly parameters.

Double click the script, assuming C#, and when it opens up, use the following script as your Main

Bill continues on with the contents of his script task, so click through for more.
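
As a rough sketch of the approach (this is not Bill’s actual script; see his post for that), a Main along these lines loads each .dtsx file under the folder and reports any package with the classic package configurations feature enabled:

    public void Main()
    {
        // Folder to scan, supplied via the ReadOnly variable on the Script Task
        string folderSource = Dts.Variables["User::FolderSource"].Value.ToString();
        Microsoft.SqlServer.Dts.Runtime.Application app =
            new Microsoft.SqlServer.Dts.Runtime.Application();
        bool fireAgain = true;

        foreach (string path in System.IO.Directory.EnumerateFiles(
            folderSource, "*.dtsx", System.IO.SearchOption.AllDirectories))
        {
            Package package = app.LoadPackage(path, null);

            // EnableConfigurations plus at least one entry means the classic
            // package configurations feature is in use
            if (package.EnableConfigurations && package.Configurations.Count > 0)
            {
                Dts.Events.FireInformation(0, "ConfigScan",
                    path, string.Empty, 0, ref fireAgain);
            }
        }

        Dts.TaskResult = (int)ScriptResults.Success;
    }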

Comments closed

SSIS And SSRS Practices

Chris Seferlis has a list of practices which he’s learned over the years:

Use Source Control

  1. For anyone who was a developer in their past life, or is one now, this is a no-brainer, no-alternative best practice.  In my case, because I come from a management and systems background, I’ve had to learn this the hard way.  If this is your first foray into development, get ready, because you’re in for some mistakes, and you’re going to delete or change some code you really wish you didn’t.  Whether it be for reference purposes on something you want to change, or something you do by accident, you’re going to need that code you just got rid of yesterday, and we both know you didn’t back up your Visual Studio jobs… Hence, source control.  GitHub and Microsoft offer great solutions for Visual Studio, and Redgate offers a great solution for SSMS.  I highly recommend checking them out and using the tools!  There are some other options out there that are free, or will save your code to local storage locations, but the cloud is there for a reason, and many of us are on the go, so having it available from all locations is very helpful.

Regarding source control for Integration Services packages, that’s a good reason to learn Biml—it works much better for source control than the native packages (which change every time you open the package and contain a lot of noise).
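
For a sense of why, here is roughly what a minimal Biml file looks like (a sketch; the package and connection names are hypothetical, and real solutions add connection definitions and more tasks).  A package is a few lines of declarative XML, so diffs in source control stay readable:

    <Biml xmlns="http://schemas.varigence.com/biml.xsd">
        <Packages>
            <Package Name="LoadCustomers" ConstraintMode="Linear">
                <Tasks>
                    <ExecuteSQL Name="Log Start" ConnectionName="Audit">
                        <DirectInput>EXEC dbo.LogPackageStart;</DirectInput>
                    </ExecuteSQL>
                </Tasks>
            </Package>
        </Packages>
    </Biml>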

Comments closed

Control Flow Package Parts

Todd McDermid explains a feature new to Integration Services 2016:

The basic idea behind package parts makes complete sense to a coder – they’re macros.  You take code you’ve used in several places, put it in a separate file that you then include and “expand” in multiple other files.
If you have multiple packages with parts of the Control Flow that are identical – setting up a database in a certain way, sending emails, calling a set of stored procedures, … – then Control Flow Package Parts can help.
The assistance isn’t just limited to the initial coding, either.  Yes – creating a new package with the “duplicate” code is much easier.  But the real gain of Control Flow Package Parts is when your “standard” code needs changes.  Instead of having to edit multiple packages to address the modifications – you only have to alter the package part.  Deploying the project(s) that depend on this part automatically incorporates those improvements.

I’d be a lot more interested in this if Biml weren’t already a better option.  Read on for Todd’s rundown.

Comments closed