Curated SQL Posts

Thoughts on AI-Driven Database Development in 2026

Brent Ozar shares some thoughts:

In the PollGab question queue for Office Hours, MyRobotOverlordAsks asked a question that merited a full blog post answer:

My company announced during some AI training that within the next 12 months we won’t be writing any of our own code. Instead, we’ll be babysitting agents. What’s your opinion on this from a DB dev / DBA POV? MSSQL Dev tends to lag, so I’d personally be surprised.

If this sounds completely alien to you, check out this blog post by developer Armin Ronacher. In it, he discusses how 2025 was the year when he reluctantly shifted his development process to the point where now he spends most of his time doing exactly what MyRobotOverlordAsks’ company is proposing: rather than writing the code directly, he now asks AI tools to build and debug things for him, and he spends his time tweaking what they produce. (Update 2025/01/07: for another example, check out Eugene Meidinger’s post on his uses of AI.)

Brent is generally bullish on the idea. I agree that a lot of companies will move in this direction, but am not at all bullish that it’ll work well. I think this is mostly the latest iteration of Stack Overflow-driven development, except with less copy and paste of bad code and more generation of bad code.

If you want the really spicy version of this take, you’ll have to talk to me in person.

Randomly Moving the Mouse Cursor in R

Tomaz Kastrun has been so busy, his screensaver never comes on, even when he’s out at lunch:

A new R package called LazyMouse, with a single function for randomly moving the mouse cursor in your favorite R IDE.

For every R developer, R data scientist, and everyday R user who also needs a break and does not want the computer to go into sleep mode.

Read on to see how it works. And jokes aside, there have been times when I’ve wanted something like this to keep the screen from locking or drives from going to sleep when running heavy work overnight on a device I can physically control (i.e., not a workstation I’m leaving on at the office).

An Overview of Fabric Security Insights

Yael Biss lays out some changes:

We want to update on a strategic evolution in how you can access and utilize security insights within Microsoft Fabric! The powerful reports you’ve relied on in the Microsoft Purview Hub are officially transitioning and being significantly enhanced within the new Admin Report in the Govern tab of the OneLake Catalog, as was announced at Ignite in November and explained in the Govern in OneLake Catalog for Fabric admins (Preview) blog post.

This isn’t just a relocation; it’s a consolidation and elevation of your most critical governance data. This move is a direct response to your feedback and a key step in providing a more unified, intuitive, and action-oriented governance experience right where your data lives.

There’s a bit too much marketing hype in the blog post for me not to roll my eyes a bit, but the security insights themselves are useful.

Creating a Variable Library in Microsoft Fabric

Laura Graham-Brown opens a library:

This post is to help you get started creating a variable library. When multiple dataflows, notebooks, and pipelines use the same details to perform tasks, it helps if those values are stored in one place. When you move to using deployment pipelines and those values change from your development workspace to your test workspace to your production workspace, it helps if that is easy. The solution in Microsoft Fabric is a variable library to store those common values.

Click through for step-by-step instructions on the process.

Tracing Trigger Executions in SQL Server

Kendra Little does some investigating:

Triggers can be tricky to observe in SQL Server. When you need to understand exactly what a trigger is doing and what it impacts, Query Store and Extended Events both have something to offer, but using them effectively requires navigating some confusing nuances. Query Store tracks query-level executions aggregated by query_id, but has a habit of generating a lot of query_ids with different context settings for triggers. Extended Events can capture trigger module starts and individual statement completions within triggers.

This post walks through setting up both observation methods on a test trigger, then compares what each one shows. You’ll see how Query Store data gets spread across multiple query_ids and context_settings_id values, a variety of ways to trace triggers in XEvents, and why the numbers don’t always match up exactly between traces and Query Store.

This is definitely worth the read. Kendra does a great job laying things out in a thoughtful manner and provides a lot of additional insight.
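
For orientation, a minimal Extended Events session along the lines Kendra describes might look like this. This is a sketch, not her actual setup: the session name, database name, and file target are all made up here.

```sql
-- Sketch: capture trigger module starts plus the statements that
-- complete inside modules. All names below are placeholders.
CREATE EVENT SESSION TraceTriggers ON SERVER
ADD EVENT sqlserver.module_start
    (WHERE sqlserver.database_name = N'TestDb'),
ADD EVENT sqlserver.sp_statement_completed
    (ACTION (sqlserver.sql_text)
     WHERE sqlserver.database_name = N'TestDb')
ADD TARGET package0.event_file (SET filename = N'TraceTriggers');

ALTER EVENT SESSION TraceTriggers ON SERVER STATE = START;
```

The `module_start` event fires when the trigger begins, and `sp_statement_completed` shows the individual statements running inside it, which is what lets you line XEvents output up against the per-query_id view Query Store gives you.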

Table Renames and Views in PostgreSQL

Deepak Mahto runs into a common issue:

Instead of doing a straight:

ALTER TABLE ... ADD COLUMN ... DEFAULT ... NOT NULL;

we chose the commonly recommended performance approach:

  • Create a new table (optionally UNLOGGED),
  • Copy the data,
  • Rename/swap the tables.

This approach is widely used to avoid long-running locks and table rewrites but it comes with hidden gotchas. This post is about one such gotcha: object dependencies, especially views, and how PostgreSQL tracks them internally using OIDs.

This also happens in SQL Server, though we do have a helpful sp_refreshview procedure that means we don’t need to drop and recreate views.
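
To make the gotcha concrete, here is a minimal PostgreSQL sketch of the swap pattern; all table and view names are made up for illustration.

```sql
-- Hypothetical names throughout.
CREATE TABLE orders (id int);
CREATE VIEW v_orders AS SELECT id FROM orders;

-- The recommended swap: build a new table, copy the data, rename.
CREATE TABLE orders_new (
    id int,
    created_at timestamptz NOT NULL DEFAULT now()
);
INSERT INTO orders_new (id) SELECT id FROM orders;

ALTER TABLE orders RENAME TO orders_old;
ALTER TABLE orders_new RENAME TO orders;

-- Gotcha: v_orders still reads from orders_old. PostgreSQL recorded
-- the view's dependency on the table's OID, and a rename does not
-- change the OID, so the view silently follows the original table
-- under its new name rather than the replacement.
```

The fix is to recreate (or otherwise rebind) the dependent views as part of the swap, which is exactly the kind of step that is easy to forget until something downstream starts reading stale data.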

JSON Data and Columnstore Indexes

Niko Neugebauer continues a series on columnstore:

Not since SQL Server 2008 has Microsoft added a new base data type to SQL Server, but in SQL Server 2025 they have added not one but two new data types: Vector and JSON. The first one (Vector) and the corresponding index (Vector Index) are described in detail in Columnstore Indexes – part 134 (“Vectors and Columnstore Indexes”), and this post is dedicated to the new JSON data type and the new JSON Index and their compatibility with Columnstore Indexes and the Batch Execution mode.

One common trait for the Vector & JSON Indexes is that both come with a large number of limitations, and they are all enabled under a “Preview” option, making them unsuitable for most production environments.

Niko has a somewhat-humorous and somewhat-infuriating table at the beginning describing just how much support columnstore indexes have for JSON data types.

And it is another example of the frustrating way in which Microsoft will release something before it’s even half-baked, demand consumer adoption to continue working on it, and then can the feature because people can’t use the not-even-half-baked feature in its current state. There’s a fine line between rapid prototyping and quick market feedback versus strangling products in the crib, and I think they’re pretty far onto the wrong side of things when it comes to most SQL Server functionality.
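
For context, the new type itself is straightforward to use. A quick sketch, assuming the SQL Server 2025 preview behavior; the table and column names here are invented:

```sql
-- Hypothetical table using the new native json type.
CREATE TABLE dbo.Telemetry (
    Id  int  NOT NULL,
    Doc json NOT NULL
);

INSERT INTO dbo.Telemetry (Id, Doc)
VALUES (1, N'{"sensor": "a1", "reading": 42}');

-- Existing JSON functions work against the native type.
SELECT JSON_VALUE(Doc, '$.sensor') AS Sensor
FROM dbo.Telemetry;
```

It’s the interaction of columns like `Doc` with columnstore indexes and batch mode, not the type itself, where Niko’s table of limitations comes into play.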

From Conjecture to Hypothesis and the Failure of Data-Driven

Alexander Arvidsson does some research:

I’ve spent the last few weeks diving deep into something that’s been bothering me for years. Everyone talks about being “data-driven,” but when you actually look at what that means in practice, something doesn’t add up. Companies are knee-deep in data, wading in dashboards, drowning in reports, and yet… nothing changes.

So I went looking for examples. Real examples. Not “we implemented analytics and it was amazing” marketing fluff, but concrete cases where data actually improved outcomes. What I found was fascinating, and not at all what the analytics vendors want you to hear.

This is an interesting article and starts to get at the reason why “data-driven” companies fail to deliver on their promise. It also gets to one of my nag points around dashboards: the purpose of a dashboard is to give relevant parties enough information, at a glance, to take whatever action is necessary. To develop a good dashboard, you need to understand all of that: who the relevant parties are, what decision points exist, under what circumstances an individual should take action, and (ideally) what action the individual could take. That’s a lot of information and a lot of effort to tease out the right answers.

Data Corruption in PostgreSQL

Josef Machytka explains a new (for PostgreSQL) solution to an old problem:

PostgreSQL 18 made one very important change – data block checksums are now enabled by default for new clusters at cluster initialization time. I already wrote about it in my previous article. I also mentioned that there are still many existing PostgreSQL installations without data checksums enabled, because this was the default in previous versions. In those installations, data corruption can sometimes cause mysterious errors and prevent normal operational functioning. In this post, I want to dissect common PostgreSQL data corruption modes, to show how to diagnose them, and sketch how to recover from them.

Read on to learn more, as well as what this new change means.
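
One of the standard diagnosis tools in this area is the amcheck extension. A minimal sketch (the table and index names are hypothetical, and `verify_heapam` requires PostgreSQL 14 or later):

```sql
CREATE EXTENSION IF NOT EXISTS amcheck;

-- Scan a heap table for corruption; returns one row per problem found.
SELECT * FROM verify_heapam('orders');

-- Check a B-tree index's internal consistency.
SELECT bt_index_check(index => 'orders_pkey'::regclass);
```

These checks read data without modifying it, so they are a reasonable first step before reaching for more invasive recovery options like `zero_damaged_pages`.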
