Press "Enter" to skip to content

Curated SQL Posts

The Basics of Log Shipping

Kevin Hill explains why log shipping is still a viable disaster recovery approach, 25 years later:

In a world where shiny new HA/DR features get all the press, there’s one SQL Server technology that just keeps doing its job.

Log Shipping has been around since SQL Server 2000. It doesn’t make headlines, it doesn’t have fancy dashboards, and it’s not going to win you any architecture awards. But for the right environment and use case, it’s rock solid and can save your bacon job in a disaster.

Read on for a briefing on the topic.

Leave a Comment

Replacing Text in SQL Server 2025 via Regular Expression

Louis Davidson continues a series on regular expressions in SQL Server 2025:

Okay, we have gone through as much of the RegEx filtering as I think is a a part of the SQL Server 2025 implementation. Now it is time to focus on the functions that are not REGEXP_LIKE. We have already talked about REGEXP_MATCHES, which will come in handy for the rest of the series.

I will start with REGEXP_REPLACE, which is like the typical SQL REPLACE function. But instead of replacing based on a static delimiter, it can be used to replace multiple (or a specific) value that matches the RegEx expression. All of my examples for this entry will simply use a variable with a value we are working on, so no need to create or load any objects.

Read on to see how it works, including plenty of examples.

Leave a Comment

Time Series Helpers in NumPy

Bala Priya C shares some one-liners:

NumPy’s array operations can help simplify most common time series operations. Instead of thinking step-by-step through data transformations, you can apply vectorized operations that process entire datasets at once.

This article covers 10 NumPy one-liners that can be used for time series analysis tasks you’ll come across often. Let’s get started!

Click through to see the ten in action.

Leave a Comment

Using VALUES in Iterators in DAX

Marco Russo and Alberto Ferrari answer a question:

In a previous article, Choosing between DISTINCT and VALUES in DAX, we explained how to choose the proper function to iterate the unique values visible in a column in the current filter context. We suggest reading that article before starting this one, because it describes the same underlying problem. Here, we discuss whether to use VALUES in an iterator. This choice depends on the answer to this question: Do you want to include or exclude the blank row generated by an invalid relationship when iterating over the rows of a table reference?

Read on to see how the answer affects your decision.

Leave a Comment

Migrating Azure Data Studio SQL Notebooks to VS Code Polyglot Notebooks

Haroon Ashraf gives us a somewhat unwieldy process:

As a SQL/BI developer, I want to run and store my SQL scripts and documentation efficiently in a Notebook as an alternative to using Azure Data Studio SQL Notebooks since Azure Data Studio is retiring soon. Read on to learn more about Visual Studio Code Polyglot Notebooks.

I liked the simplicity of having a SQL kernel in Azure Data Studio. Haroon shows how to work around it and get to roughly the same spot, but I do hope the SQL Server tools team is able to migrate that SQL kernel over to VS Code prior to Azure Data Studio’s ultimate demise.

Leave a Comment

Materializing Lake Views in Microsoft Fabric

Sairam Yeturi reduces ETL and ELT requirements:

Organizations often face challenges when trying to scale analytics across large volumes of data stored in centralized SQL databases. As business teams demand faster, more tailored insights, traditional reporting pipelines can become bottlenecks. By adopting Lakehouse architecture with Microsoft Fabric, business groups can mirror their SQL data into OneLake and organize it using the Medallion architecture—Bronze, Silver, and Gold layers. Materialized lake views play a crucial role in this setup, enabling automated, declarative transformations that clean and enrich data in the Silver layer. This empowers teams to build reliable dashboards and AI-driven insights on top of curated data, all while maintaining performance, governance, and security on a scale.

In this post, we will cover how enterprises can use materialized lake views to streamline data orchestration and enhance data quality, monitoring across silver and gold layers, while mirroring their SQL DB tables to Fabric in the Bronze layer.

The best use case for this is a scenario in which your underlying data is already essentially in a star schema or at least easily transformable into one, and you have no interest in modifying the data in the view directly. Do read the limitations before digging in, though, as there are some big ones.

Leave a Comment

A Primer on ACID Compliance

Erik Darling takes an academic concept and explains what it means in practice for SQL Server. Erik does a good job describing the concepts of atomicity, consistency, isolation, and durability. I do agree with Erik’s take on consistency, which tends to be the property that database platforms minimize in return for scalability. The descriptions of all four are good, though Erik has a lot more content that digs into consistency and isolation.

Leave a Comment

Contrasting Three Classification Algorithms for Small Datasets

Jayita Gulati compares a few mechanisms to classify data:

When you have a small dataset, choosing the right machine learning model can make a big difference. Three popular options are logistic regression, support vector machines (SVMs), and random forests. Each one has its strengths and weaknesses. Logistic regression is easy to understand and quick to train, SVMs are great for finding clear decision boundaries, and random forests are good at handling complex patterns, but the best choice often depends on the size and nature of your data.

In this article, we’ll compare these three methods and see which one tends to work best for smaller datasets.

All three are quite reasonable algorithms to compare, though I’d want to add in gradient descent or XGBoost, as I’d expect it to perform better than random forest with small datasets.

Leave a Comment

Tracking Time Series Rates of Change in SQL Server

Rick Dobson wants a measure of variation:

This tip presents a brief introduction to Common Table Expressions (CTE), along with a few references for those seeking additional details on CTEs beyond those described and demonstrated here. We will examine CTEs that are defined by either one or two SELECT statements. Additionally, we will provide a demonstration of a recursive CTE. All the examples illustrate how to process time series datasets with CTEs.

Click through for the tip.

Leave a Comment

Microsoft Fabric Pipeline Copy Job Activity in Preview

Connie Xu makes an announcement:

We’re thrilled to announce that the Copy job Activity is now in Preview! 

This new orchestration activity brings the simplicity of the Copy job item directly into your Microsoft Fabric Data Factory pipelines, enabling you to manage data movement alongside transformations, notifications, and more; all in one place. 

Read on to learn more about it, including how it differs from the Copy activity and the Copy job item.

Leave a Comment