Press "Enter" to skip to content

Curated SQL Posts

Simplifying Azure Authorization with Managed Identities

MIka Sutinen doesn’t need a password:


Managed Identities
 in Azure provide an efficient and secure way for managing credentials when accessing databases, or other Azure resources. In this post, we build on my recent post about using Service Principal for database authentication, further exploring how we can leverage Managed Identities to simplify the process.

While Service Principals can make some parts of the access management simpler in Azure, you’re still left with several responsibilities in their management.

These responsibilities include rotating secrets and managing the lifecycle of the Service Principal, to name maybe the two most crucial ones. And if you have multiple applications, each with a separate Service Principal, this can be complex to manage.

Click through to learn more about managed identities and service principals, and how they work to link together Azure resources behind the scenes.

Comments closed

Backup Types and Recovery Models in SQL Server

I have a new video:

In this video, I cover the three recovery models available to SQL Server, describe (most of) the types of backups you can take, and spend a bit of time covering Recovery Point Objective & Recovery Time Objective.

The advice for video length that people tend to give is 8-10 minutes. I typically average closer to 15 minutes per video. This one is nearly 30 minutes long because there’s just so much information to cover, even with me repeatedly saying “Don’t worry, I’ll cover this bit in a future video.”

Comments closed

Writing Data to an Unattached Lakehouse via Fabric Notebook

Prathy Kamasani does a bit of movement:

Regardless of which architecture we follow, during stages of data integration and transformation there’s always a step to move data from one location to another. And, we work with multiple tables, schemas, and even lake houses.Same goes with Fabric Notebooks. I often find myself in scenarios where I don’t want to attach Lakehouse to my notebook, but I do want to read or write data from various bakehouses.

I recently blogged about a way to achieve this as part of documenting your workspaces. In that post, I described how to write data to a workspace that was not attached to the notebook. I used MsSparkUtil(renamed to NotebookUtils) to mount and then write data in the Lakehouse as Delta tables.

Read on for the answer.

Comments closed

Converting Audit Files to CSV via Read-SqlXEvent

Patrick Keisler finds a work-around:

Reading the contents of an audit event file can be accomplished using either Management Studio or the T-SQL function, sys.fn_get_audit_file. However, sometimes a customer may want to use a third-party tool to read and aggregate audit records, and some of those tools do not have the ability to read the binary audit file. In that case, the customer will just use sys.fn_get_audit_file to write the audit records to a database and then use the third-party tool to ingest those records from the database.

What if that third-party tool cannot even read from SQL Server? That happened to me recently where the only option was to read from a text-based file.

Read on for Patrick’s solution to the problem and a real pain point you’ll find along the way.

Comments closed

Thoughts on Combining UPDATE Operations

Brent Ozar thinks about consolidation:

I’ve got a BEGIN TRAN in there before the updates just so I can test the same queries repeatedly, and roll them back each time. The execution plan for the updates is quite nice: SQL Server divebombs into the supporting indexes:

Relatively few rows match, so our query does less than 1,000 logical reads – way less than there are pages in the table. In this case, separate UPDATE statements make sense.

Brent then continues with, but what do we do when we need to perform multiple independent scans of the same table? Read on for Brent’s answer, but definitely check out Thomas Franz’s comments, which indicate a potential complicating factor.

Comments closed

Ingesting Blob Storage Data into SQL Server

Andy Brownsword brings in some data:

We may associate consuming data from Azure Storage with tools like Data Factory or even SSIS as we saw recently. We don’t always need the middle man though.

Here we’ll demonstrate how to use an External Data Source to perform the ingestion directly into SQL Server.

Click through for the solution. As a quick note, the TYPE attribute that Andy uses in CREATE EXTERNAL DATA SOURCE was necessary from SQL Server 2016 through SQL Server 2019, but no longer exists for SQL Server 2022. Instead, for SQL Server 2022, you’d switch the LOCATION to start with abs:// for Azure Blob Storage and PolyBase would infer the type from the protocol.

Comments closed

Functions in JOIN and Performance Problems

Steve Stedman lays out a warning:

When writing SQL queries, it’s easy to focus on getting the right results without thinking too much about performance. One common mistake that can lead to significant slowdowns is using functions in the JOIN ON clause of a SELECT statement.

While SQL Server supports a wide range of built-in functions, using them incorrectly—especially in the join conditions—can severely impact performance. Let’s break down why this happens, how it can affect your query speed, and what you can do to avoid it.

Read on for the answer.

Comments closed

Restarting Failed Control Flows in Azure Data Factory

Meagan Longoria doesn’t want to repeat good work:

I presented at SQL Saturday Pittshburgh this past weekend about populating your data warehouse with a metadata-driven, pattern-based approach. One of the benefits I mentioned is that it’s easy to employ this pattern for restartability.

For instance, let’s say I am loading data from 30 tables and 5 files into the staging area of my data mart or data warehouse, and one of table loads fails. I don’t want to reload the other tables I just loaded. I want to load the ones that have not been recently loaded. Or let’s say I have 5 dimensions and 4 facts, and I had a failure loading a fact table. I don’t want to reload my dimensions, and I only want to reload the failed facts. How do we accomplish this?

Read on to learn how.

Comments closed