Category: Integration Services

Calculating Commute Distance

Koos van Strien puts together some code to figure out commuting distance using Google Maps data:

Think well about this third step: you can save quite a few dollars annually by just keeping a local cache of travel distances, and only querying when the distance is unknown. My ETL process for this part consists of three global steps:

  1. Add unknown departure/destination pairs to the ‘to be queried’ table (PK of this table is start & end point address in less-structured format, ensuring uniqueness of commutes)

  2. Query Maps API for unknown travel distances. Add retrieved distances (or known unknowns) to the local cache table of travel distances

  3. Use the local cache of travel distance (as complete as it gets at this moment) as the primary lookup for travel distance

The Google Maps API allows free tier users to make about 1000 requests per day.  If you don’t need to pull more than that many data points back (or can queue them to run over the necessary time frame), there’s no marginal cost to calls.  Otherwise, it ends up being a few dollars per thousand calls, so that shouldn’t break your company’s budget.

You Should Use Biml

Meagan Longoria explains why you should use Biml if you’re building Integration Services packages:

Biml provides a way to automate SSIS design patterns. This reduces the time required to complete a data integration project, and it helps employ consistent design patterns within and across projects. Re-generating multiple packages after making a change to a design pattern takes just a few minutes, so small changes to several similar packages are no longer a significant effort.

Automating SSIS design patterns allows teams to work more efficiently. Senior developers can stop solving the same problems over and over again. Instead, they can solve them once, automate the solution, and move on to new and interesting challenges. Junior developers still learn good development practices with Biml, but they require less supervision to create quality output in a shorter amount of time. SSIS developers that prefer typing code over the drag-and-drop interface of SQL Server Data Tools now get a better way to work in addition to the automation capabilities.

If there’s one piece of advice I can give ETL developers, it’s “learn Biml.”

SSIS Method Not Found

Regis Baccaro ran into a rather lengthy error when trying to create an SSISDB catalog:

The error I got was:

Method not found: ‘Void Microsoft.SqlServer.Management.IntegrationServices.EnableSsisSupportAlwaysOnSqmHelper.Initialize()’. (Microsoft.SqlServer.IntegrationServices.UITasks)

Looking at the documentation for the namespace Microsoft.SqlServer.Management.IntegrationServices I quickly figured out that I would be able to create the SSIS Catalog manually using PowerShell.

But then I couldn’t locate the Microsoft.SqlServer.Management.IntegrationServices dll anywhere except in the GAC, so I had to load it in a somewhat cumbersome way (with help from Remo). Below is the script I used for doing that.

It’s a strange error, but Regis does provide a workaround.

SSIS And NUMA

SQL Sasquatch has some SSIS package issues stemming from a lack of NUMA awareness:

So the server had plenty of free RAM.  But NUMA node 1 was in a pinch.  And SSIS spooled its buffers to disk.  Doggone it.

I guess I’d figured that notifications were sent based on server-wide memory state.  But I guess maybe memory state on each NUMA node can lead to a memory notification?

The target SQL Server instance, a beefy one, was also on this physical server.  There’s 1.5 TB of RAM on the server.  🙂

It also looks like the easiest fix is something which was deprecated in Windows Server 2012 R2.

SSIS Parameterization

Slava Murygin shows how to use project parameters and expressions to make SSIS packages a bit more dynamic:

Being at an SSIS presentation recently, I realized that a lot of people who have been working with SSIS for years still do not know what “Parameterizing” is or how to do it.

SSIS changed a lot in SQL Server 2012, when Microsoft introduced the “Project Deployment Model”. Since then you can deploy a Project, and you can assign Parameters to that project, which can be passed to it for execution. Before that, developers used Configurations to supply values for internal variables and connections.

Adding parameters to packages grants you a huge level of flexibility when moving between environments or reusing components.

Create An SSIS Catalog

Andy Leonard shows how to create an SSIS catalog:

Check the “Enable CLR Integration” checkbox to enable the other controls on the form.

I recommend you also check the “Enable automatic execution of Integration Services stored procedure at SQL Server startup” checkbox. This feature causes a stored procedure to execute whenever SQL Server starts. The stored procedure will identify any SSIS packages in a running (or other “active”) status and mark them as “Ended Unexpectedly.” You want this. Trust me. (As my friend Kevin Boles (LinkedIn | @thesqlguru) says, “Push the trust me button and let’s move on,” paraphrased.)

You cannot alter the name of the SSIS Catalog database. It is SSISDB. And, as in Highlander, there can be only one SSIS Catalog per instance of SQL Server.

This post is full of helpful notes if you’ve never used the SSISDB database before.

Learning By Doing

Matt Cushing gives us some notes on learning SSIS:

Send Mail tasks won’t work unless you have things configured properly.  I was trying to find things on Google and all I kept coming across was how to configure the task or how to install SSIS and configure it to run, not how to configure the server to send it properly.  Thankfully John took pity on me and helped me realize that using an Execute SQL task and sp_send_dbmail works more easily and cleanly – SQL Server Central

I’ve used Send Mail a few times, but have always had somebody else around to configure Exchange or whatever other mail server we were using at the time.

Early Thoughts On SQL Server 2016

Koen Verbeeck has some initial thoughts on using SQL Server 2016 in a POC:

  • AutoAdjustBufferSize property of the SSIS data flow. Done with manually setting the Buffer Size and Buffer Max Rows. Just set this property to true and the data flow takes care of its own performance.

  • Custom logging levels in the SSIS Catalog. Now I can finally define a logging level that only logs errors and warnings AND set it as the server-wide default level.

  • The DROP TABLE IF EXISTS syntax. The shorter the code, the better 🙂
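To illustrate the last bullet, here is a small sketch of why the shorter syntax is convenient. It runs against SQLite through Python's `sqlite3` module purely as a stand-in, since SQLite happens to accept the same `DROP TABLE IF EXISTS` statement; the `staging_orders` table is a made-up example. Before SQL Server 2016, the T-SQL equivalent needed an explicit existence check (for example, testing `OBJECT_ID` first) ahead of the `DROP`.

```python
import sqlite3


def rebuild_staging(conn: sqlite3.Connection) -> None:
    """Drop and recreate a (hypothetical) staging table.

    Safe to call whether or not the table already exists, because
    DROP TABLE IF EXISTS does nothing when the table is missing.
    """
    conn.execute("DROP TABLE IF EXISTS staging_orders")
    conn.execute(
        "CREATE TABLE staging_orders (order_id INTEGER PRIMARY KEY, amount REAL)"
    )


conn = sqlite3.connect(":memory:")
rebuild_staging(conn)  # first run: nothing to drop, no error raised
conn.execute("INSERT INTO staging_orders VALUES (1, 9.99)")
rebuild_staging(conn)  # second run: table is dropped and recreated empty
row_count = conn.execute("SELECT COUNT(*) FROM staging_orders").fetchone()[0]
```

The same script can be rerun any number of times without a guard clause, which is the point of the one-line form.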

I was initially a bit concerned with AutoAdjustBufferSize because I figured I could do a better job of selecting buffer size.  Maybe on the margin I might be able to, but I think I’m going to give it a try.

XML Includes Tabs And Spaces

Sander Stad ran into an error creating a Biml script:

Apparently SSIS doesn’t agree with my code. So opening the editor of the raw file connection, changing the access mode to “File name” showed me this:

There are spaces and tabs in front of the path! SSIS doesn’t work well with spaces and that’s one of the reasons why you should not use spaces in file names in the first place.

This is one of the trickier bits of XML-based languages (like Biml):  spacing inside tags can matter…sometimes…
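A quick way to see the effect is to parse a snippet where the value sits on its own indented line. The sketch below uses Python's standard `xml.etree.ElementTree` on generic XML; the `Connection` and `FilePath` element names and the path are made up for illustration and are not the Biml schema.

```python
import xml.etree.ElementTree as ET

# The path is written on its own line, indented with a tab and spaces,
# as often happens when markup is formatted for readability.
XML = "<Connection><FilePath>\n\t    C:\\data\\output.raw\n</FilePath></Connection>"

root = ET.fromstring(XML)

# The parser keeps the whitespace inside the element as part of its text.
raw_path = root.find("FilePath").text

# Trimming recovers the value the author intended.
clean_path = raw_path.strip()

print(repr(raw_path))
print(repr(clean_path))
```

Keeping the value on the same line as its opening and closing tags avoids the stray whitespace in the first place.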

SSIS: Error Loading From XML

Matt Smith ran into an SSIS error on a new laptop:

So today I went to run an SSIS package on my new laptop and bam, error message.

Microsoft.SqlServer.Dts.Runtime.DtsRuntimeException: The package failed to load due to error 0xC0011008 “Error loading from XML. No further detailed error information can be specified for this problem because no Events object was passed where detailed error information can be stored.”. This occurs when CPackage::LoadFromXML fails.

This feels like one of those types of errors that you spend 3 hours trying to figure out.  Gotta love machine rebuild errors…
