Press "Enter" to skip to content

Curated SQL Posts

Logging and Error Handling with MicrosoftFabricMgmt

Rob Sewell continues a series on the MicrosoftFabricMgmt Powershell module. First up is structured logging:

If you have ever come back to a script the next morning and thought “what on earth happened last night?”, you understand why logging matters. Write-Host and Write-Verbose are fine for interactive use, but in automation — scheduled tasks, CI/CD pipelines, long-running jobs — you need something more structured. Something you can query, filter, and persist across sessions, something that you can provide to your team or support or auditors.

MicrosoftFabricMgmt uses PSFramework for all its internal logging, and that capability is available directly to you.

Then comes retry logic and dealing with long-running operations:

REST APIs fail. Networks are unreliable. Cloud services have rate limits. If your automation script does not account for this, it will eventually break at the worst possible moment. This is not pessimism — it is production experience.

MicrosoftFabricMgmt has a lot of error handling built in, so you do not have to write it all yourself. Today I want to show you what the module handles automatically, and how to add your own handling on top for the scenarios you care about.

Leave a Comment

An Introduction to pg_duckpipe

Yuwei Xiao needs a way to move data:

When we released pg_ducklake, it brought a columnar lakehouse storage layer to PostgreSQL: DuckDB-powered analytical tables backed by Parquet, with metadata living in PostgreSQL’s own catalog. One question kept coming up: how do I keep these analytical tables in sync with my transactional tables automatically?

This is a real problem. If you manage DuckLake tables by hand, running periodic ETL jobs or batch inserts, you end up with stale data, extra scripts to maintain, and an operational surface area that grows with every table. For teams that want fresh analytical views of their OLTP data, this quickly becomes painful.

pg_duckpipe addresses this. It is a PostgreSQL extension (and optionally a standalone daemon) that streams changes from regular heap tables into DuckLake columnar tables in real time. No Kafka, no Debezium, no external orchestrator. Just PostgreSQL.

Click through to learn more about how it works.

Leave a Comment

The Transaction Log as a Circle

Paul Randal explains to the officer that the transaction log is a flat circle:

In the second part of this series, I described the structural hierarchy of the transaction log. As this post is chiefly concerned with the Virtual Log Files (VLFs) I described, I recommend you read the second part before continuing.

When all is well, the transaction log will endlessly loop, reusing the existing VLFs. This behavior is what I call the circular nature of the log. Sometimes, however, something will happen to prevent this, and the transaction log grows and grows, adding more and more VLFs. In this post, I’ll explain how all this works, or sometimes doesn’t.

Read on for a depiction of the transaction log and what can cause it to foul up.

Leave a Comment

What’s New for tempdb in SQL Server 2025

Johan Deardurff lists the updates:

TempDB has consistently been regarded as one of the most essential and, historically, overlooked components within SQL Server. When TempDB goes wrong, it rarely fails quietly. A single runaway query, poorly designed report, or unexpected workload spike can consume TempDB space and bring an entire instance to its knees.

With SQL Server 2025, Microsoft has made meaningful investments to change that story. TempDB is no longer just something you monitor and hope behaves; it’s now something you can govern, recover quickly from, and observe with far greater clarity.

Click through for a breakdown of how Microsoft has tackled some classic tempdb problems in SQL Server 2025.

Leave a Comment

Deploy Microsoft Fabric Items with fabric-cicd in Azure DevOps

Kevin Chant announces a new Azure DevOps extension:

This post covers how you can simplify Microsoft Fabric deployments with “Deploy Microsoft Fabric items with fabric-cicd”. Which is an Azure DevOps extension that I recently published.

To manage expectations, this post shows how to start working with the extension and its associated task within the GUI-based classic release pipelines in Azure DevOps. Like in the below screenshot.

Read on to see how the extension works.

Leave a Comment

PostgreSQL Business Continuity as Layers

Umair Shahid explains that it’s like an onion:

High availability for PostgreSQL is often treated as a single, big, dramatic decision: “Are we doing HA or not?”

That framing pushes teams into two extremes:

  • a “hero architecture” that costs a lot and still feels tense to operate, or
  • a minimalistic architecture that everyone hopes will just keep running.

A calmer way to design this is to treat HA and DR as layers. You start with a baseline, then add specific capabilities only when your RPO/RTO and budget justify them.

My thing I would point out is that the first few layers are actually disaster recovery, and that high availability first comes into the picture with Layer 3. But if you think of it in terms of Business Continuity (High Availability + Disaster Recovery), then the approach is a good one.

Leave a Comment

Data Type Precedence in SQL Server

Louis Davidson has a type:

There is one topic in query and equation writing that is constantly a minor issue for SQL programmers: implicit data type conversions. Whenever you don’t specifically state the datatype of an expression, like when you write SELECT 1;, it can feel a bit of a mystery what the datatypes of your literal values are. Like in this case, what is 1 ? You probably know from experience that this is an integer, but then what happens when you compare CAST(1 as bit) to the literal 1. Is that literal 1 now a bit? Or is it still an integer?

Perhaps even more importantly, why does this query succeed?

Click through to learn more.

Leave a Comment

Porting Statistics in PostgreSQL

Radim Marek imports production statistics:

In the previous article we covered how the PostgreSQL planner reads pg_class and pg_statistic to estimate row counts, choose join strategies, and decide whether an index scan is worth it. The message was clear: when statistics are wrong, everything else goes with it.

But there was one thing we didn’t talk about. Statistics are specific to the database cluster that generated them. The primary way to populate them is `ANALYZE` which requires the actual data.

Click through to see how Postgres handles this. It’s quite similar to SQL Server’s DBCC CLONEDATABASE in practice, it seems.

Leave a Comment

Avoid Hard-Coding Linked Server Names

Greg Low provides some good advice:

I’m not a great fan of linked servers in SQL Server but they are often necessary. If I’m working with the latest version of SQL Server, I really prefer to use External Data Sources and External Tables. But not everyone is on the latest version. In the meantime, what I see all the time, is people hardcoding server names like this:

SDUPROD2022.WWIDB.Payroll.Employees

That makes your code really hard to manage.

Read on for several options. At a prior company quite a while ago, we went with DNS entries and they worked reasonably well.

Leave a Comment

Preview-Only Steps in Microsoft Fabric Dataflows

Chris Webb covers a new feature:

I have been spending a lot of time recently investigating the new performance-related features that have rolled out in Fabric Dataflows over the last few months, so expect a lot of blog posts on this subject in the near future. Probably my favourite of these features is Preview-Only steps: they make such a big difference to my quality of life as a Dataflows developer.

The basic idea (which you can read about in the very detailed docs here) is that you can add steps to a query inside a Dataflow that are only executed when you are editing the query and looking at data in the preview pane; when the Dataflow is refreshed these steps are ignored. This means you can do things like add filters, remove columns or summarise data while you’re editing the Dataflow in order to make the performance of the editor faster or debug data problems. It’s all very straightforward and works well.

First up, that feature is pretty interesting, though I could see things break if you only do your testing in the preview pane. Second, what Chris does with this is quite interesting.

Leave a Comment