Mapping Data Flows
One of the many data flows from Microsoft these days providing, for the first time, data transformation capabilities within Data Factory. This is not a U-SQL script or Databricks notebook that is orchestrated from Data Factory, but a tool integrated. This means that you can reuse (many of) the datasets you have defined in Data Factory, while in Databricks you don’t.
Mapping Data Flows runs on top of Databricks, but the cluster is handled for you and you don’t have to write any of that Scala code yourself.
Read on for the full comparison.
– Every table should have InsertDateTime and UpdateDateTime columns. The UpdateDateTime column should be populated with the same value as the InsertDateTime column upon creation of the row, rather than being left null.
– Whatever you use to create tables, include primary keys, foreign keys, and indexes with your table definitions. Provide explicit constraint names to simplify database comparisons. You can disable your foreign keys, but they need to be there to provide that metadata.
– Separate your final dimensional/reporting tables from audit tables and staging tables. This can be done with separate schemas or even separate databases.
People have added some more thoughts in the comments as well.
I made 2 unsuccessful attempts at running the SSIS Server Maintenance Job. However, after several hours of processing and still no available free space in the database, I knew the job wasn’t coping with the sheer number of rows it had to delete. The deletes all happen from the parent table (internal.operations) and then all child tables using using cascading deletes. This approach maintains referential integrity but is not great for performance.
Due to this, I needed a new approach to the maintenance of SSISDB. As we hadn’t maintained these tables for 13/14 months, I was asking too much of SQL Server to let me delete everything at once.
Read on for the solution.
To get the
Expressionproperties I need to find the precedence constraint in the
.dtsxfile during the XSL transformations. It requires three changes in the
– I have to pass the name of the
– I have to read the XML from the
– I have to use the
DTSnamespace because it’s the namespace of the .dtsx file
Read on for more. Bartosz to this point has covered the control flow.
To make things harder, the layout of the sequences and tasks is not some nested XML structure. All of the elements have the same parent –
<GraphLayout>, meaning all of them are at the same tree level. Also – there is no attribute showing where a particular object belongs. Almost. In the example with the sequences, I see two regularities:
– the outer container is placed later in the XML, than the inner container
@Idattributes show the nesting of the objects
I’m not sure how often I’d use this in practice, but if you want to understand some of the internals of SSIS, this is an interesting series to follow.
I confess: I have been waiting for this feature since I first learned of Azure-SSIS.
When I first saw Azure-SSIS – which creates an Azure Data Factory Integration Runtime and SSIS Catalog in the cloud, my first thought was a paraphrase Ferris Bueller’s question about dying the river green: “If we can execute SSIS packages from the SSIS Catalog in Azure Data Factory, why can’t we execute SSIS packages from Azure Blob Storage?” Today, we can.
Read on to see how you can do it.
BimlExpress is a free Visual Studio add-in created by the good folks at Varigence. Its a full featured Biml editor which allows you to dynamically create SSIS packages. It was first released back in 2017, and the latest version is 2019 (of course). The current version supports Visual Studio 2010 through 2019 as well as SQL Server 2005 through 2019.
Prior to it’s release, Biml was written with Bids Helper, now known as BI Developer Extensions. While BI Developer Extensions has many nice features, you should no longer use it to work with Biml as it is no longer being updated/supported.
I’m pleasantly surprised by this. It used to be limited to BimlStudio (nee Mist) and BimlOnline.
You’ll notice with the code below, that we’re able to extract that Project-object into a variable named $Proj, but when we pipe that to the Get-Member cmdlet we do not see a method called Export.
Read on to learn what the name of this mysterious method actually is and how you can make use of it.
It is not a joke: SSIS is available for Visual Studio 2019 as a preview. Whoa, hold on. SQL Server 2019 hasn’t been released yet? But there’s already an SSIS 2019? Didn’t we have to wait months after the release of SQL Server 2017 before we had an SSIS version for Visual Studio 2017?
Yes, we did, you can read all about there here.
But times have changed apparently. The SSIS team caught up with the rest of the BI tools: SSIS projects are now available from the Visual Studio market place.
Read on to see what this means for SQL Server Data Tools.
In my last post, I showed how you can use the SSIS PowerShell Provider to execute an SSIS package with PowerShell. Of course, in order to execute that SSIS package, it has to get deployed first. In Part 5 of Andy Leonard’s “SSIS, Docker, and Windows Containers” series he used some PowerShell code from Matt Masson’s blog post to deploy an .ISPAC file to the SSIS catalog.
Click through for the code.