Press "Enter" to skip to content

Curated SQL Posts

Accelerated Database Recovery and NOLOCK

Brent Ozar has a new episode of Customers Say the Darndest Things:

I have never seen a T-SQL feature that people love as much as NOLOCK.

I keep thinking I’ve written enough blog posts about it, but a client came up with a new one:

We use SQL Server 2022’s Accelerated Database Recovery, which keeps copies of versions inside the table. Plus, we don’t use transactions – our inserts, updates, and deletes are done one table at a time, and your demos always have transactions in them, so we’re not affected.

That’s not how this works. That’s not how any of this works.

It might be because it’s early in the morning when I type this out, but I’m having a hard time even conceptualizing what the customer could be thinking here. Brent does lay it out in the comments below, however, and yeah, I’d still raise an eyebrow if someone said that out loud to me.

But the moral of the story is, find someone who loves you like T-SQL developers love NOLOCK.

Leave a Comment

Substring Search with Regular Expressions in SQL Server

Louis Davidson continues a series on regular expressions:

The REGEXP_SUBSTR function extracts parts of a string based on a regular expression pattern. It has some similarieties with the SUBSTRING function, but with some important (and interesting) differences. This function returns Nth occurrence of a substring that matches the regex pattern.

Read on to see how it compares to the traditional SUBSTRING() function.

Leave a Comment

Sparse Columns and Space Utilization

Steve Jones gins up a demo:

I saw this as a question submitted at SQL Server Central, and wasn’t sure it was correct, but when I checked, I was surprised. If you choose to designate columns as sparse, but you have a lot of data, you can use more space.

This post looks at how things are stored and the impact if much of your data isn’t null.

I consider sparse columns a relic of the mid-aughts era, when storage was a lot more expensive and compression was an Enterprise Edition-only feature. Given that you can use page compression in any edition of SQL Server nowadays, I don’t think there’s a viable reason ever to have a sparse column.

Also, definitely check out the comments, where Jeff Moden has a great one.

Leave a Comment

K-Means Clustering in SQL Server

Sebastiao Pereira implements k-means clustering in T-SQL:

K-means clustering is an unsupervised machine learning algorithm used to group data into k distinct clusters based on their similarity, allowing for customer segmentation, anomaly detection, trend analysis, etc. The most common machine learning tutorials focus on Python or R. Normally, data is stored in SQL Server, and it is necessary to move data out of the database to apply clustering algorithms and then, if necessary, to update the original data with the cluster numbers. Is it possible to do it directly in SQL Server?

Given the work you have to do to implement this, I can’t imagine that it would be particularly fast. But it is neat to see that it’s possible.

Leave a Comment

Collations and citext in PostgreSQL

Umut Tekin picks up on a thread:

Recently, I read Laurenz Albe’s blog about case insensitive string search. He recommended case insensitive collations saying, “it won’t be any more expensive (and maybe more intuitive) than the current techniques of using upper() or the citext extension are today”. In other words, a custom collation can be used instead of citext to achieve case insensivity with a lower performance penalty. Today we will be testing it.

Click through for the results of those tests.

Leave a Comment

Comparing Write-Back Options for Power BI

Jon Vöge compares two options:

We’ve previously on this blog covered Power Apps write-back for Power BI/Fabric comprehensively, and in the past months we’ve taken a stab at the Fabric Native solution: Translytical Task Flows.

However, when comparing the different options, which solution actually comes out on top?

Read on as Jon contrasts the two options and explains when you might want to use each.

Leave a Comment

Modifying Power BI Page Visibility and Active Status via Semantic Link Labs

Meagan Longoria hides (or shows) a page:

Setting page visibility and the active page are often overlooked last steps when publishing a Power BI report. It’s easy to forget the active page since it’s just set to whatever page was open when you last saved the report. But we don’t have to settle for manually checking these things before we deploy to a new workspace (e.g., from dev to prod). If our report is in PBIR format, we can run Fabric notebooks to do this for us.

Click through for a notebook and an explanation.

Leave a Comment

SQL Server Auditing Bug Allows Data Exfiltration without Logging

Andreas Wolter describes a bug in SQL Server’s auditing capabilities:

Last week, I was contacted by an IT Leader from Saudi-Arabia who previously found several CVE’s in Oracle and Microsoft SQL Server. He wanted my opinion on a newly discovered security issue in SQL Server Auditing.

Interestingly, his findings directly overlap with a topic I wrote about just last month: Using Data Classification to Audit Data Access.

Emad Al-Mousa identified two vulnerabilities in the SENSITIVE_BATCH_COMPLETED Audit Action Group. Microsoft Security Response Center (MSRC) acknowledged the issue but classified it as low priority – meaning it may not be addressed until a major release, if at all.

Read on to see what the issue is and how you can trigger it today. Andreas also includes a workaround that will work in the meantime.

Leave a Comment