Press "Enter" to skip to content

Curated SQL Posts

New Syntax in SQL Server 2022

Steve Jones talks syntax:

At the recent 2024 PASS Data Community Summit, I saw a lightning talk from Mikey Bronowski on the New T-SQL Functions in SQL Server 2022.  Before the talk, I made a joke with him that none of these were new because it’s 2024. They’d been out for 2 years.

Mikey did a nice job, given that he only had 10 minutes, but it was enough to give attendees an idea of some new things they might investigate to use in their own code. If you missed the talk, or you don’t have access to the recordings, we have a series on SQL Server Central that covers these (Part 1Part 2Part 3) and helps you understand the new options. The list of language changes is also in the MS Docs under What’s New in the Language.

Steve asks about how much utilization there is of these. I think the problem is that so few organizations have adopted SQL Server 2022 that knowledge that these things even exist is quite low.

By the way, I do have a script covering many of these new bits of syntax as part of a talk on getting up to speed with core T-SQL changes since SQL Server 2012.

Comments closed

The Difficulty of Deletion in PostgreSQL

Radim Marek takes us through the process of deleting rows:

Your database is ticking along nicely – until a simple DELETE brings it to its knees. What went wrong? While we tend to focus on optimizing SELECT and INSERT operations, we often overlook the hidden complexities of DELETE. Yet, removing unnecessary data is just as critical. Outdated or irrelevant data can bloat your database, degrade performance, and make maintenance a nightmare. Worse, retaining some types of data without valid justification might even lead to compliance issues.

Read on to learn about the process, some challenges, and a common pattern for resolving these challenges. The solution is pretty similar in SQL Server as well: batching delete operations, ideally with a supporting index.

Comments closed

AWS DMS and a LOB Bug

Richard O’Riordan fixes an issue:

The table over in our Postgres cluster is similar except for the data type “text” being used instead of “varchar”. All kind of boring so far, but what we noticed that on some very rare occasions the “largevalue” column was empty over in the PostgreSQL database even though for that row it was populated in SQL Server.

This seemed odd to me, like you would expect if there was some error inserting the row on the PostgreSQL side then since it is all done within a transaction that it would either all succeed or all fail, how would the row be partially inserted, i.e. missing this text value.

Read on for the story and several tests of how things work.

Comments closed

Query Hash Value Changes

Kendra Little digs into a problem:

This is the worst bug I’ve found in SQL Server to date. Previously, my top find was SQL Server Online Index Rebuild sometimes happens offline without warning. This one has taken top slot because it makes my life more difficult on a daily basis.

Background: SQL Server generates a query_hash for each query. This is stored in sys.query_store_query and it’s one of the primary ways you can identify what a query is across different Query Stores, or even the same Query Store over time, as surrogate query_id values get reset if Query Store is cleared or data ages on. The query_hash is a “Zobrist hash over the shape of the individual query, based on the bound (input) logical query tree. Query hints aren’t included as part of the hash.” (Source)

Except that’s wrong.

Read the whole thing.

Comments closed

Microsoft Fabric Shapes for Excalidraw

Miles Cole has an announcement:

I’m thrilled to announce that my Microsoft Fabric Shape Library for Excalidraw has been published!

Documenting and diagramming solution architectures is critical for both conceptualizing and building maintainable data solutions. Whether you are whiteboarding ideas for review with your customer or technical lead, or documenting an existing solution so that new or future team members can understand your spaghetti architecture, high-quality diagrams are a must.

Having spent time in both web and graphic design at various points in my career, I’ve developed a stumbling block: I have a hard time feeling good about a diagram unless it looks stellar. While I’ve successfully used a few diagramming tools in the past, I’ve never fallen in love with one—until now. Enter Excalidraw.

Click through to learn more about Excalidraw and try out the icons. I’ve always been a fan of draw.io (which was also diagrams.net for a little while but appears to be back to its original name) and mermaid, but this also looks interesting.

Comments closed

Converting an Excel Workbook to CSV via Microsoft Fabric

Jared Westover builds a Data Factory job:

After a two-year break, I started working with Azure Data Factory again, now part of the Fabric family. I quickly adapted to Data Factory since it closely resembled SQL Server Integration Services (SSIS), a tool with which I had a love-hate relationship. For my new mission, I set out to convert a list of files from Excel to comma-separated values (CSV). We upload the original Excel files to a Data Lake in Fabric. We then need to convert a specific worksheet and move the CSV files to a different folder in Data Lake.

Read on to see what Jared came up with.

Comments closed

Obtaining VisualIDs for Visuals in a Power BI Report

Sandeep Pawar checks for ID:

Log Analytics and Workspace Monitoring in Fabric logs all the activities of datasets in a workspace. These logs contain dataset, report, visual IDs which the user has to decipher to get the full picture. Dataset, report ids are straightforward but it’s not easy to get visual IDs programmatically. Chris Webb already has a blog on couple of different ways to get the visual IDs. That blog was published in 2022 and in the Fabric world we now have a couple of more options.

Read on for two additional methods you can use.

Comments closed

Comparing Collation Speed in Postgres

Cristophe Pettus has the need for speed:

In previous installments in this series about locales and collations in PostgreSQL, we’ve made some vague allusions to the speed of the various collation functions. Let’s be a bit more analytical now.

The data here was gathered on a 4GB Linode instance running Ubuntu 24.04 and PostgreSQL 17.1. The test data was 1,000,000 records, each one a string of 64 random 7-bit ASCII characters. For each of the configurations, the test data was loaded into a table:

It’s a fairly simple test, but the results are quite interesting.

Comments closed

Working with the Microsoft Fabric API for GraphQL

Nikola Ilic parses some data:

“We are creating a custom dashboard using code, and we need the data stored inside Microsoft Fabric. Can we access it in another way than via SQL Analytics Endpoint?”

This is a real-life customer requirement we’ve encountered recently. And the short answer is: Yes, you can! For the longer answer, we encourage you to read this article and understand how to leverage the Fabric API for GraphQL feature for enhanced data retrieval experience compared to the traditional REST API approach.  

Click through for an excerpt from a book that Nikola and Ben Weissman are writing.

Comments closed

CPU Monitoring in SQL Server with Datadog

Kendra Little has a recommendation:

What makes me a raving fan is the flexibility of Datadog’s notebooks and dashboards, combined with the ability to create all sorts of custom metrics and monitors. There are always things in SQL Server monitoring packs that I have strong opinions about. Datadog lets me take what I want, build what I need that isn’t contained in that, and ignore the rest. For a team that has the budget to afford Datadog paired with dedicated database staff with the time and resources to do this work, this can be a great fit.

One of the weirdest and worst parts of the Datadog SQL Server monitoring tooling, though, is how it handles wait stats. In my opinion, it’s a case of someone reinventing a wheel that didn’t need to be reinvented, and then not documenting what they did clearly (at least not in a way I can find).

Two of the most confusing Datadog “waits” are labeled “CPU” and “Waiting on CPU”. I opened a support ticket with Datadog a while back to ask what these are, because I couldn’t find any way they correspond to actual wait stats in SQL Server. I learned they aren’t wait stats at all. In fact, I think you should largely ignore them. Here’s why.

Read on for the full story.

Comments closed