Press "Enter" to skip to content

Category: Integration Services

SSIS + Google Distance Matrix API

Terry McCann shows how to use the Google Distance Matrix API to calculate the distance from a starting point to an ending point:

The most basic use is to calculate the distance from A to B. This is what we will look at using SSIS. The API takes in an origin and a destination. These can either be a lat & lng combo or an address, which will be translated into lat & lng when executed. To demonstrate the process I have created a database and loaded it with some sample customer and store data. We have 2 customers and 4 stores. These customers are linked to different stores. You can download the example files at the bottom of the page.

I like that this shows just how easy it is to hit an API with an SSIS component.  If I had one wish with this article, I’d use Biml to generate the package rather than talking through the tasks and components.  Regardless, check this out; it’s a great use of the script component.
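
For a sense of what the web call inside a script component looks like, here's a minimal sketch. The method and parameter names are placeholders (not Terry's actual code); the URL parameters follow Google's public documentation.

```csharp
// Minimal sketch: call the Distance Matrix API from an SSIS script component.
// Origin and destination can be "lat,lng" pairs or addresses; Google geocodes
// addresses for you. Method and parameter names here are placeholders.
using System;
using System.Net;

public static string GetDistanceMatrixJson(string origin, string destination, string apiKey)
{
    string url = string.Format(
        "https://maps.googleapis.com/maps/api/distancematrix/json?origins={0}&destinations={1}&key={2}",
        Uri.EscapeDataString(origin),
        Uri.EscapeDataString(destination),
        Uri.EscapeDataString(apiKey));

    using (var client = new WebClient())
    {
        // The response JSON contains rows[].elements[].distance and .duration,
        // which you can parse out and send down the data flow.
        return client.DownloadString(url);
    }
}
```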

Scraping And Importing Web Data

Jon Morisi shows how to scrape a website and load the result into a SQL Server table:

Next, save this as a CSV file.

Now jump into SQL Server Management Studio, drill down to your database (you may want to create a new, empty database for your snarfing), right-click and start the Import and Export wizard, via “Import Data”:

This is the one-off solution.  If you need to do it regularly, read up on creating scrapers and use Integration Services to load.
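
If you do automate it, the "grab the page" step fits naturally in a script task. Here's a minimal sketch; the URL and landing path are made up, and real scraping would add an HTML/CSV parsing step on top.

```csharp
// Minimal sketch: download the source page to a landing file that the rest of
// the package (or a proper parser) can then pick up. URL and path are hypothetical.
using System.Net;

public static void DownloadSourcePage()
{
    const string url = "http://example.com/report";                // hypothetical source
    const string target = @"C:\SSISData\Landing\report.html";      // hypothetical landing file

    using (var client = new WebClient())
    {
        client.DownloadFile(url, target);
    }
}
```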

AcquireConnection Failures

Ginger Grant explains what to do when you get AcquireConnection errors:

AcquireConnection method call to the connection manager failed with error code 0xC0202009/0xC020801C

Try as I might, the only thing I was able to do after an hour was periodically change the error code from 0xC0202009 to 0xC020801C. Nothing I did worked. I created a new connection, created a new OleDB Source, changed the Run64BitRuntime to False in Configuration Properties in the Debugging Section of the project execution, set the Data Flow task DelayValidation from False to True. None of these various suggestions that I got from various websites worked at all.  I thought about changing the SSIS Service ID’s execution properties, but since I was running in Debug mode I determined that this would not make any difference, so I abandoned that idea. Nothing worked. The only thing I was able to do was change the error code, not eliminate it. I could log into SQL Server with the same ID and password in my package and run the simple query in the data flow task and return data.  I could preview the data, what I couldn’t do is execute the SSIS package.  Out of desperation I rebooted, which also did nothing.

SSIS security issues.  Gotta love ’em.

Clustered Columnstore Index Load With SSIS

Koen Verbeeck looks at loading a clustered columnstore index using SSIS:

I stumbled upon this MSDN blog post: SQL Server 2016 SSIS Data Flow Buffer Auto Sizing capability benefits data loading on Clustered Columnstore tables (catchy title). It explains how you can set the buffer properties of the data flow to try to insert data directly into compressed row groups instead of into the delta store. They fail to achieve this using SSIS 2014 and then they explain how using the new AutoAdjustBufferSize property of SSIS 2016 works miracles and everything is loaded directly into compressed row groups. Hint: you want to avoid loading data into the delta store, as it is row storage and you need to wait for the tuple mover to load the data to the CCI in the background.

However, it’s still possible to achieve the same using SSIS 2014 (or earlier). Niko Neugebauer (blog | twitter) shows this in his post Clustered Columnstore Indexes – part 51 (“SSIS, DataFlow & Max Buffer Memory”). It still depends on the estimated row size, but using these settings you should get better results.

This advice is a bit different from loading standard rowstore-based tables, but serves to pack as many rows into each columnstore row group as possible.
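
If you want to verify where your rows actually landed after a load, the row group DMVs will tell you. Here's a rough sketch; the connection string and table name are placeholders, and the DMV shown is the SQL Server 2016 one (sys.column_store_row_groups is the 2014 equivalent).

```csharp
// Quick check after a load: COMPRESSED row groups are what you want;
// OPEN/CLOSED row groups are the delta store.
using System;
using System.Data.SqlClient;

class RowGroupCheck
{
    static void Main()
    {
        const string connectionString = "Server=.;Database=TestDb;Integrated Security=true";  // placeholder

        const string query = @"
SELECT state_desc, COUNT(*) AS row_groups, SUM(total_rows) AS total_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.FactSales')   -- placeholder table name
GROUP BY state_desc;";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(query, connection))
        {
            connection.Open();
            using (SqlDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine("{0}: {1} row group(s), {2} rows",
                        reader[0], reader[1], reader[2]);
                }
            }
        }
    }
}
```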

Echoing Variable Values

Bill Fellows shows how to send informational messages in SSIS and gives an example in Biml:

To aid in debugging, it’s helpful to have a “flight recorder” running to show you the state of variables. When I was first learning to program, the debugger I used was a lot of PRINT statements. “Verify your inputs before you assume anything” is a distillation of my experience debugging.

While some favor using MessageBox, I hate finding the popup window, closing it and then promptly forgetting what value was displayed. In SSIS, I favor raising events, FireInformation specifically, to emit the value of variables. This approach allows me to see the values in both the Progress/Execution Results tab as well as the Output window.

There’s value in putting in code like this as part of generic processing.  Flip the debug bit to true whenever you need this detailed information.  You can also think about calling the method multiple times, such as before and after an expected change block.
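
As a rough sketch of the approach inside a Script Task’s Main method, assuming a Boolean User::Debug variable and a couple of placeholder variable names (all of which would need to be listed in the task’s ReadOnlyVariables):

```csharp
// Minimal sketch: raise FireInformation events for the variables you care about,
// gated behind a debug flag. Variable names here are placeholders.
public void Main()
{
    bool fireAgain = true;

    if ((bool)Dts.Variables["User::Debug"].Value)
    {
        foreach (string name in new[] { "User::RowCount", "User::SourceFilePath" })
        {
            Dts.Events.FireInformation(
                0,                              // informationCode
                "Flight recorder",              // subComponent
                string.Format("{0} = {1}", name, Dts.Variables[name].Value),
                string.Empty,                   // helpFile
                0,                              // helpContext
                ref fireAgain);
        }
    }

    Dts.TaskResult = (int)ScriptResults.Success;
}
```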

SSIS RC2 Issues

Andy Leonard walks through his experience trying SSIS 2016 RC2:

The SSMS 2016 RC2 link works just fine. The SSDT 2016 RC2 link does not work.

Fear not! The SSIS team has provided a set of updated links for SSIS 2016 SSDT for RC2. There’s other good information in that post. If you want to tinker with SSIS 2016 RC2, I encourage you to read it.

But Wait, There’s More

Once I’d done all this, I could create an SSIS project and add a Script Task to a package. But I could not open the Visual Studio Tools for Applications (VSTA) code editor. When I clicked the “Edit Script…” button in the Script Task Editor, nothing happened.

I contacted the SSIS Development Team (we hang out), and let them know what I was seeing. They are aware of the issue and sent a screenshot.

Sounds like there are still some kinks to work out before release.

Adding Multiple Packages To A Project

Koen Verbeeck shows a quick way to add multiple existing packages to a project:

You select the package you want to import and you’re done. But here’s the problem: you can select only a single object. What if you want to import 20 packages to your project? Kind of annoying to repeat the same process 20 times, isn’t it?

Luckily there’s an easier way. Instead of going for the obvious Add Existing Package, right-click on the project itself. In the context-menu, choose Add, ignore Existing Package and click on Existing Item.

If you have a large number of packages to import, this will save you a few minutes of tedium (or hand-editing a project file).

File Processing Pattern

Bill Fellows describes a pattern for processing files in an ETL scenario:

All ETL processing will use a common root node/directory. I call it SSISData to make it fairly obvious what is in there but call it as you will. On my dev machine, this is usually sitting right off C:. On servers though, oh hell no! The C drive is reserved for the OS. Instead, I work with the DBAs to determine where this will actually be located. Heck, they could make it a UNC path and my processing won’t care because I ensure that root location is an externally configurable “thing.” Whatever you’re using for ETL, I’m certain it will support the concept of configuration. Make sure the base node is configurable.

A nice thing about anchoring all your file processing to a head location is that if you are at an organization that is judicious with handing out file permissions, you don’t have to create a permission request for each source of data. Get your security team to sign off on the ETL process having full control to a directory structure starting here. The number of SSIS-related permissions issues I answer on StackOverflow is silly.

It comes down to consistency and cleanliness.  Plan ahead and you’ll have a much nicer go of things.  Bill also provides a Biml POC, so check that out as well.
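
As a small illustration of the “everything hangs off one configurable root” idea, something like the sketch below keeps every path derived from a single value. The folder layout and names are my own placeholders, not Bill’s exact structure.

```csharp
// Minimal sketch: derive every working path from one configurable root so that
// only the root differs between dev, test, and prod. Names are placeholders.
using System.IO;

public static class EtlPaths
{
    public static string For(string rootNode, string sourceSystem, string stage)
    {
        // e.g. For(@"\\fileserver\SSISData", "AdventureWorks", "Input")
        //      -> \\fileserver\SSISData\AdventureWorks\Input
        string path = Path.Combine(rootNode, sourceSystem, stage);
        Directory.CreateDirectory(path);   // no-op if the folder already exists
        return path;
    }
}
```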

Ragged Right Files

Sifiso Ndlovu walks us through ragged right formatted files in Integration Services:

The configuration of columns is perhaps a critical part of the entire ETL process as it helps us build mapping metadata for your ETL. In fact, regardless of whether or not SSIS/SSMS can detect delimiters, if you skip the Column Mapping section, your ETL will fail validation. In order to clarify how ragged right formatted files work, I have gone a step back and used Figure 4 to display a preview of our fictitious Fruits transaction dataset from Notepad++. It can already be seen from Notepad++ that the file only has a row delimiter in the form of CRLF.
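
If the format is new to you, here is a rough sketch of what “ragged right” means in practice: every column is fixed-width except the last, which simply runs until the CRLF. The column widths and file name below are made up, and the sketch assumes each line is at least as long as the fixed columns.

```csharp
// Hypothetical illustration of parsing a ragged right file: fixed-width columns,
// with the final column running from the last fixed offset to the end of the line.
using System;
using System.IO;

class RaggedRightDemo
{
    static void Main()
    {
        int[] widths = { 10, 8 };   // fixed-width columns (hypothetical)

        foreach (string line in File.ReadLines("Fruits.txt"))
        {
            int offset = 0;
            var fields = new string[widths.Length + 1];

            for (int i = 0; i < widths.Length; i++)
            {
                fields[i] = line.Substring(offset, widths[i]).Trim();
                offset += widths[i];
            }

            // The last column is "ragged": whatever remains before the CRLF.
            fields[widths.Length] = line.Substring(offset).Trim();

            Console.WriteLine(string.Join(" | ", fields));
        }
    }
}
```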

Read the whole thing.

Parameterizing Attunity Queries In SSIS

Melissa Coates shows us how to parameterize queries if you’re using the Attunity connector for Oracle in SSIS:

The Attunity connector for Oracle used inside of the SSIS data flow looks a little different than the typical OLE DB connector we most commonly use. Specifically, the Attunity source has two options: “Table Name” and “SQL command.” What the Attunity Oracle Source doesn’t have in this dialog box is “SQL command from variable” (like we see for an OLE DB source).

Expressions are your friend. Since the source dialog doesn’t offer “SQL command from variable,” the workaround is to build the SQL command with an expression on the data flow task instead.
