Previously we successfully prepared tests for variables and parameters using
ParameterCommand. Now it’s time to communicate with the database, and for that, I will use connection manager defined on the project level. I know from the ssisUnit tutorials it works perfect with package connection managers, so it’s time to verify it against the projects. I will test the package
10_ProjectCM.dtsx– it is just getting a single value from the table in a database and storing it in a variable. All the packages and unit tests are on my GitHub.
The package contains three SQL Tasks: the first just checks if we can communicate with the database using
SELECT 1statement, the second gets the information from the table, and the third repeats the second on the container level.
Click through for the tests.
I started answering a question on SQL Community Slack’s #ssis channel and I realized this would be better served as a blog post. The question was about three SSIS Data Flow properties: DefaultBufferSize, Engine Thread and DefaultBufferMaxRows.
I rarely change the EngineThreads property.
DefaultBufferSize and DefaultBufferMaxRows are two ways of managing the size limits of a Data Flow buffer. The two Data Flow Task properties can – and should – be treated as a single property. DefaultBufferSize is the number of bytes per buffer. DefaultBufferMaxRows is the number of rows per buffer. The defaults are 10,485,760 (10M) and 10,000, respectively.
Click through to learn more about these properties.
The result shows 1 test run, 1 test passed, 2 asserts run, 2 asserts passed.
Wait, what? We have prepared only one assert, why does it show two?
The second assert is: “Task Completed: Actual result (Success) was equal to the expected result (Success).“. Great. Where does it come from? Let’s find out.
This is a nice introduction to the topic; if you fuss about with SSIS packages, you should check this out.
My SQL SSIS package leverages the Translator Text API service. For those who want to learn the secret sauce then I suggest to check here – https://azure.microsoft.com/en-us/services/cognitive-services/translator-text-api/
essentially this API is pretty simple;
It accepts source text, source language and target language. (The API can translate to/from over 60 different languages.)
You call the API with your request parameters + API Key
The API will respond with the language translation of the source text you sent in
So Simple, so fast, so effective!
Click through for the full post. It really is simple.
In both of these scenarios you will not be able to save the package. So what the heck are you supposed to do?! Here’s where my tunnel vision (and panic) sets in. How was I supposed to get my SSAS objects processed?
I could always script out my processing tasks using SSMS and drop them in a SQL Agent job step. But I have multiple environments and multiple cubes so each one would have to be hard coded. Not a great idea, so scratch that.
Click through to learn the best way to fix this.
Sit a spell and let Grandpa Andy tell yall a story about some data integratin’.
Suppose for a minute that you’ve read and taken my advice about writing small, unit-of-work SSIS packages. I wouldn’t blame you for taking this advice. It’s not only online, it’s written in a couple books (I know, I wrote that part of those books). One reason for building small, function-y SSIS packages is that it promotes code re-use. For example, SSIS packages that perform daily incremental loads can be re-used to perform monthly incremental loads to a database that serves as a data mart by simply changing a few parameters.
Change the parameter values and the monthly incremental load can load both quarterly and yearly data marts.
You want better performance out of the daily process, so you read and implement the parallel execution advice* you’ve found online. For our purposes let’s assume you’ve designed a star schema instead of one of those pesky data vaults (with their inherent many-to-many relationships and the ability to withstand isolated and independent loads and refreshes…).
You have dependencies. The dimensions must be loaded before the facts. You decide to manage parallelism by examining historical execution times. Since you load data in chronological order and use a brute-force change detection pattern, the daily dimension loads always complete before the fact loads reach the latest data. You decide to fire all packages at the same time and your daily execution time drops by half, monthly executions time drops to 40% of its former execution time, and everyone is ecstatic…
…until the quarterly loads.
This is a great post.
The main part of the UI is a graphical map of the objects in a package. I have spent quite a lot of time trying to think of a better way of showing the elements that make up a package, and it is hard to argue with the workflow approach, all the ETL tools I found do the same thing and visualizing packages as a series of interconnected tasks is useful but maybe not everything and the only way.
Because of all these issues I have really tried to stay away from SSIS development and looked for other solutions such as using biml to write packages or writing code in other languages but I keep coming back to the fact that the SSIS engine is great at what it does and if you have a lot of data or a lot of files/tables then writing a similar framework would be a lot of wasted effort. So, how can we use the SSIS engine but make better, more efficient use of our development time?
Click through to see a couple of screenshots as well as a link to download K-SSIS.
During the December 2017 delivery of Expert SSIS, I was asked if there is an SSIS Catalog Logging Mode that will display the value of variables. I responded that I wasn’t aware of a logging level that accomplishes this, but then – as sometimes happens – I could not let it go. It is an excellent question. So I dug in…
I second his notion that you should learn how to use script tasks and script components. They’re not very difficult to pick up and even a basic knowledge will benefit you greatly.
Looking at the times of the upgrade (it takes milliseconds) we can live with automatic version migrations during thousands of executions. So – is there any gain if we retain it?
Let’s take a closer look at the SSIS Toolbox. We are migrating to SSIS 2017 from the lower version, let’s say the source is SSIS 2012. Open SQL Server Data Tools (for Visual Studio 2015 or 2017, does not matter for now) and load your project. I will use SSDT for VS 2017 with sample project created for SSIS 2012. See the SSIS Toolbox for the project in version SSIS 2012? There is a Script Task following an FTPTask.
I will upgrade the SSIS project to the latest version (and write more about it in few lines) and take a look at the SSIS Toolbox now.
Now we can see additional tasks for Hadoop. Upgrading the project does at least two things that are interesting to us: it uses the latest versions of the tasks and components, but also introduces the new elements to use.
There are some benefits, but those come with a little bit of risk.
I recently worked on a project where a client wanted to use Biml to create SSIS packages to stage data from Dynamics 365 CRM. My first attempt using a script component had an error, which I think is related to a bug in the Biml engine with how it currently generates script components, so I had to find a different way to accomplish my goal. (If you have run into this issue with Biml, please comment so I know it’s not just me! I have yet to get Varigence to confirm it.) This client owned the Pragmatic Works Task Factory, so we used the Dynamics CRM source to retrieve data.
Meagan has the code as well as some important notes, so read the whole thing.