Press "Enter" to skip to content

Curated SQL Posts

Speeding Up Azure Data Factory Pipelines

Hiram Fleitas doesn’t have all day to wait for that pipeline to finish:

His issue was pretty much as mentioned on the tile. Our bank’s Azure Data Factory pipeline is running slow moving data from on-prem, we’re copying all tables in a SQL Server database, files from ftp sites and network share drives to Azure SQL DB Managed Instance and to blob storage (our datalake) , do you have some recommendations how to make it go faster? Its around 300GBs and takes over 8 hrs.

So I replied with the following and figured to post it here as it may help others.

Hiram has a video, as well as specific advice to offer.

1 Comment

What to Do with Temp Tables in Stored Procedures

Chad Callihan invokes Betteridge’s Law of Headlines:

Generally speaking, it’s best to put things away that aren’t being used. Don’t keep indexes that aren’t getting utilized because they are taking up disk space and still have to be kept up to date with changes. A table is still loaded up with old data that’s not being used but needs kept? Maybe it’s time for options like an archive database or partitioning.

While it’s not on the same level of importance, one related argument I’ve seen and been in is how to handle temp tables in stored procedures. Do you drop them at the end of a stored procedure or do you leave them to be cleaned up by SQL Server? Is one way better for performance than the other? Let’s do some testing and see what we find out.

Read on for the answer.

Comments closed

Automate Availability Group Failover for SSISDB 2012 and 2014

Alex Stuart shows how to fail over SSISDB in SQL Server 2012 or 2014:

Hopefully not many people are still configuring SSIS instances on SQL 2012 or 2014 – especially HA instances – but if you are, this post is for you.

If you’re running SQL Server 2016 or above, having the SSIS catalog function correctly in an AG is supported by built-in functionality to manage the DMK (database master key). In 2012/2014 however there is no such support. Without intervention this makes SSISDB unable to be opened after a failover, because the DMK isn’t open – leading to errors such as “Please create a master key in the database or open the master key in the session before performing this operation.

Read on to see how to resolve this error, and then how to do this automatically.

Comments closed

Querying Delta Lake Files with T-SQL in Azure Synapse Analytics

Jovan Popovic shows off a new feature in public preview for Azure Synapse Analytics serverless SQL pools:

You can use Azure Synapse and Azure Databricks to prepare and modify your Delta Lake data sets placed in the Azure Data Lake storage. Once your data engineers have prepared the data, your data analysts can create reports using the tools such as Power BI.

Using the serverless query endpoint in Azure Synapse, you can create a relational layer on top of your Delta Lake files that directly references the location where Azure Synapse and Azure Databricks are used to modify data. This way, you can get the real-time analytics on top of the Delta Lake data set without any need to wait for a pipeline to copy and prepare data.

Read on to see how this works.

Comments closed

Understanding the STUFF() Function

Chad Callihan explains an important function:

I used to always see the STUFF function in passing when reading blogs and kind of move past it without understanding what it was doing. I would see it used and think it’s doing something with a bunch of stuff and kind of skip over it since it wasn’t entirely relevant to what else I was reading. When I read about what the STUFF function actually does, it made a lot more sense as a name. More than dealing with “a bunch of stuff” the STUFF function is used for stuffing a string into another string. Let’s check out a few examples of stuffing data.

Granted, 99% of its importance is in combination with FOR XML PATH() but that’s still important. And we get to see a few other use cases for it as well.

Comments closed

Finding Cmdlets in Powershell

Jack Vamvas helps us with cmdlet lookup:

Question: I need to interrogate the module for a Powershell cmdlet but I can’t find the actual physical location of the file on the host server.  I can execute the cmd through the Powershell command line OK , but can’t find the powershell cmd file. Is there a Powershell method of locating the Powershell module files?

Click through to see the answer.

Comments closed

Restoring Databases from Blob Storage Files

Stuart Moore talks us through a (rare) gap in dbatools:

In the comments here I was asked about using Restore-DbaDatabase when all you have is blobs in an Azure Storage account. This can be done, involves a couple of non dbatools steps.

Restore-DbaDatabase doesn’t natively ‘talk’ Azure, nor do any of the other dbatools commands. The reason for this is that we didn’t want to force dbatools users to have to install the AzureRM/Az powershell modules just to use our module. We’ve gone to a lot of effort to make sure that dbatools is acceptable to Security Admins and that it has a small(ish) footprint, and adding those large modules as prerequisites would have broken that.

Read on for how you can get around that.

Comments closed

Securing Databricks on AWS

Andrew Weaver, et al, take us through security practices for running Databricks on AWS:

In this article, we will share a list of cloud security features and capabilities that an enterprise data team can use to harden their Databricks environment on AWS as per their risk profile and governance policy. For more information about how Databricks runs on Amazon Web Services (AWS), view the AWS web page and Databricks security on AWS page for more specific details on security and compliance.

Click through for that list.

Comments closed