Press "Enter" to skip to content

Curated SQL Posts

Log Truncation in Distributed Availability Groups

Paul Randal phones a friend:

This is a question that came in through email from a prior student, which I’ll summarize as: with a distributed availability group, what are the semantics of log truncation?

As I’m not an AG expert (I just know enough to be dangerous ha-ha), I asked Jonathan to jump in, as he is most definitely an AG expert! I’ll use AG for availability group and DAG for distributed availability group in the explanation below. Whether the AGs in question have synchronous or asynchronous replicas is irrelevant for this discussion.

Read on for the answer.

Leave a Comment

Orchestration Options in Microsoft Fabric

Reitse Eskens moves some data:

Well, unless you enjoy waking up every night to start your Extract-Transform-Load (ETL) process and manually running each process to do some work, it’s a smart move to automate this. Also, make sure everything always runs in the correct order. Additionally, there are situations where processes need to run in different configurations.

All these things can be done with what we call orchestration. It may sound a bit vague now, but we’ll get to the different moving parts of this, like parameterisation and pipelines.

Read on for a primer on the topic.

Leave a Comment

Idempotence and Durable Execution

Jack Vanlightly does some thinking:

Determinism is a key concept to understand when writing code using durable execution frameworks such as Temporal, Restate, DBOS, and Resonate. If you read the docs you see that some parts of your code must be deterministic while other parts do not have to be.  This can be confusing to a developer new to these frameworks. 

This post explains why determinism is important and where it is needed and where it is not. Hopefully, you’ll have a better mental model that makes things less confusing.

Some of the examples Jack includes are pretty tricky, showing just how difficult it can be to ensure that multiple, independent systems are all on the same page.

Leave a Comment

Generating Exponential Random Numbers in T-SQL

Sebastiao Pereira generates more artificial data:

Generating random numbers from an exponential distribution is essential for queuing theory, reliability engineering, physics, finance modeling, failure analysis, Poisson process, simulation and Monte Carlo methods, computer graphics, and games. Is it possible to have a Random Exponential Gaussian Numbers function in SQL Server without use of external tools?

As always, I love this series because these examples are complex enough not to be trivial, yet perform well enough to work in real-world environments.

Leave a Comment

Tips for the Import Data Option in SQL Server

Andy Brownsword doesn’t trust wizards, with their pointy caps and long beards:

If you need to create a copy of a table in another database, the ‘Import Data’ option may seem convenient. If you’ve used this method to copy to your dev environment and found things break, this post is for you.

Click through for some solid advice on how to import that data. Another thing I would sometimes do is coerce all of the input columns to long strings and load it into a staging table. Then, I could use T-SQL to re-shape the data however I needed it rather than trying to get a finicky SSIS flow to translate this date and time combination (or whatever) appropriately.

Leave a Comment

Thoughts on Data Modeling

Steve Jones has a two-fer. First up, he asks an opinion question about data modeling:

Recently, I had a few questions on database modeling. One was posted in the SQL Server Central forums, and a customer asked about ERD tooling on the same day. This came shortly after Redgate acquired Vertabelo (now Redgate Data Modeler). This stood out to me as very rarely in the last few years have I found people consulting and updating a diagram while performing database development.

Second, he takes a peek at a tool Redgate purchased:

Redgate acquired a data modeling tool from Vertabelo recently and I wanted to explore how it works. This is a short look at this tool and how it might be useful in working with databases.

My experience with data modeling has been that only the really large companies did a lot of work with upfront data modeling and keeping logical models up to date. It’s still quite useful for data warehouses, and that’s where the people I know who do a lot of data modeling make their living. But I find it’s too much of a hassle in fast-paced environments, especially when I can keep most or all of the data model in my head and I’m the person managing it all.

Essentially, data models are useful to the extent that they’re approximately true. But because they quickly get out of sync with reality, they quickly go from “quite useful” to “dirty lies.”

Leave a Comment

Copy Job in Fabric Data Factory Pipelines now GA

Jianlei Shen makes an announcement:

Copy Job Activity allows you to run Copy jobs as native activities inside Data Factory pipelines.

Copy jobs are created and managed independently in Data Factory for quick data movement between supported sources and destinations. With Copy job Activity, that same fast, lightweight experience is now embedded within pipelines, making it easier to automate, schedule, and chain Copy jobs as part of broader data workflows.

Read on for an overview of what’s in the activity and a few links on how to get started with it.

Leave a Comment

When Decision Trees Fail

Ivan Palomares Carrascosa builds an explanation:

In this article, you will learn why decision trees sometimes fail in practice and how to correct the most common issues with simple, effective techniques.

Topics we will cover include:

  • How to spot and reduce overfitting in decision trees.
  • How to recognize and fix underfitting by tuning model capacity.
  • How noisy or redundant features mislead trees and how feature selection helps.

Read on for some of the perils of CART and some ways to resolve them.

Leave a Comment

Using the PRODUCT() Function in T-SQL

Rajendra Gupta uses a reducer function:

SQL Server 2025 includes new features and enhancements. In the previous SQL Server 2025 tips, we have explored many new features. Have you explored the new Product() function? If not, this will walk you through the Product() function with several examples.

Read on to see how PRODUCT() works and how thoughtful the development team was around supporting window functions as an aggregate function.

Leave a Comment