Press "Enter" to skip to content

Curated SQL Posts

Checking Key Vault Access in Microsoft Fabric Spark Notebooks

Marc Lelijveld has clearance:

Working with sensitive data in Microsoft Fabric requires careful handling of secrets, especially when collaborating externally. In a recent customer engagement, I needed to validate access to Azure Key Vault from within a Fabric Notebook, without ever exposing the actual secret values. With only read access granted and no need to manage or update secrets, I focused on confirming that the connection was working as expected.

In this blog, I’ll walk you through the approach, including the setup, code snippets, and logic behind this quick but crucial verification step.

Click through for the full story.

Comments closed

Checking SQL Server for tempdb Performance Issues

Jeff Iannucci announces a new stored procedure:

Most database folks might know the tempdb database in SQL Server is used for temporary tables, but many folks don’t realize all the other things that use this critical system database.

It’s also used by table variables, cursors, aggregations, joins, and sorts. And by memory spills when you don’t have enough memory. And by integrity checks that you run regularly to check for corruption. And more.

Optimal performance of tempdb is vital, as it is involved constantly with all your database queries. So…is your tempdb configured and running optimally? And if it isn’t, can you tell why?

Click through to see how it works and where you can get a free copy.

Comments closed

Mirroring vs Shortcuts in Microsoft Fabric

Nikola Ilic compares and contrasts:

A few days ago, I shared a short LinkedIn post in which I condensed key differences and use cases for both mirroring and shortcuts in Microsoft Fabric. Since the post sparked some nice conversation and opened a few additional question, I decided to cover the same topic here, so that it doesn’t get lost in LinkedIn’s “jungle”:)

But, before we move to the final showdown, let’s take one step back and explain what mirroring and shortcuts are.

Click through for the overview and explanation of how the two differ, as well as where each fits.

Comments closed

Restoring Multiple Differential Backup Files

Tim Radney violates Betteridge’s Law of Headlines:

I was recently asked if you can restore multiple differential backups in preparation for a migration. I responded that yes, technically you can restore multiple differential backups, however it will not speed up your cutover.

As soon as I read the first sentence, the answer in my head was “Yes, but why?” Tim explains the person’s reasoning and then demonstrates that this reasoning doesn’t quite work.

Comments closed

Working through an Azure SQL DB Vulnerability Assessment

Josephine Bush digs in:

But do you always need them? Not necessarily. A vulnerability assessment might confirm what you already know. On the other hand, if you’re managing multiple environments or collaborating across teams, these tools can catch things you might miss, like forgotten test accounts or relaxed security rules.

They’re not a silver bullet, but can be a smart way to stay proactive. Ensure you understand what you are trying to achieve, not just blindly accept what they tell you.

Click through for an example and some solid advice.

Comments closed

Caching Database Calls in Python with Redis

Levi Masonde stands up a Redis instance:

Databases play a vital role in software applications—they need to keep updated data or state, which is served by the database acting as the source of truth for the application. How the database performs affects how the entire application performs. Besides obvious factors that affect the performance of a database (hardware, database type, networking infrastructure), there are techniques designed to help you improve the performance of your database and ultimately, your applications. One way to do this is to add caching to your database. But which cache technique works best for which application requirements? This article sheds light on one strategy to implement a cache system.

Click through for one pattern of interaction between cache and database. My preference with the cache-aside pattern is to hide the two data platforms from the calling application as much as possible. In a classic object-oriented language like C#, the actual database + cache calls would be in a separate project and would expose methods on classes that were database-agnostic. With Python, I’d use different .py files in the same project unless I wanted to build a wheel file and deploy it to multiple projects, but the concept would still be the same. I’m not the biggest fan of the way that Levi did it, forcing API developers to have knowledge of both the cache and the database, as that increases the risk of a future developer messing something up.

Comments closed

Review those Logs

Kevin Hill has a public service announcement:

Most SQL Server crashes don’t come out of nowhere.
They leave breadcrumbs – red flags that something’s not right. The problem? If you don’t know where to look, you miss the signs…until it’s 2am and your CEO’s calling.

Let’s talk about how to listen for those whispers before they turn into full-blown alarms.

Click through for some advice on the topic. I’ll also note that everything Kevin mentions, you can automatically retrieve and centralize in a monitoring system, and once you have more than a couple of SQL Server instances, I’d recommend doing so.

Comments closed

Working with the JSON Data Type in Azure SQL DB

Dennes Torres tries out the JSON data type in Azure SQL Database:

Before this new field type, JSON data was typically stored in varchar(max) columns. There are many features to use with JSON values stored in varchar(max) columns and variables, but storing JSON as regular strings is still limited.

The built-in JSON type expands the possibilities. Using an actual JSON column, it becomes easier to build constraints related to JSON columns, for example.

Dennes also spends a lot of the article covering the JSON_ARRAYAGG() and JSON_OBJECTAGG() functions.

Comments closed

The Dual Perils of Overfitting and Data Leakage

John Mount shares notes on a theme:

One of the bigger risks of iterative statistical or machine learning fitting procedures is over-fit or the dreaded data leak.

Over-fit is when: a model performs better on training data than on future data. Some degree of over-fit is expected. A data leak is when: the model learns things about the evaluation set that it would not know about the future data the model will be applied on. This can drive models that look great on training and (supposedly) held-out data, but don’t work in practice.

Click through for the rest of the story, and be sure to check out the comments for a notebook digging further into one of the topics.

Comments closed

The APPLY Operator in T-SQL

I have a new video:

In this video, I explain the history behind the APPLY operator in T-SQL. I also demonstrate one use case for it: executing table-valued functions.

As I mention at the end of the video, this is by far the least interesting use case for the APPLY operator. Subsequent videos in the series get a lot more exciting. But we have to start at the beginning, unless we’re going full Pulp Fiction.

Comments closed