Curated SQL – Page 18 – A Fine Slice Of SQL Server

Data Platform Announcements from Microsoft Build

Published 2025-06-02 by Kevin Feasel

Once again there were a number of Microsoft Build announcements related to data and AI, and some were very impressive. Below are my favorites.

Everything announced at Build can be found in the Microsoft Build 2025 Book of News.

Here you go:

Out of 15 items that James lists, 14 are either Fabric or related to some Copilot.

Comments closed

HA/DR in Oracle with Data Guard

Published 2025-05-30 by Kevin Feasel

Kellyn Gorman takes a peek at Oracle Data Guard:

In its traditional, (and free) configuration, Oracle Data Guard operates in an active/passive architecture. This incredibly well-designed and valuable solution from Oracle which comes included with the Enterprise Edition has as part of its architecture:

A primary database, which is an active, accessible database system.

One or more standby databases, which are passive replicas that continuously receive redo data from the primary.

Click through for an overview of the product.

Comments closed

The Challenges with Sharing Power BI Reports with External Users

Published 2025-05-30 by Kevin Feasel

Soheil Bakhshi begins a new series:

Are you a Power BI developer or someone in a BI or finance team who needs to share reports with customers, partners, or vendors? If they are not part of your Microsoft 365 tenant, things get a bit more complex than just clicking the “Share” button.

This is a common need, especially in consulting scenarios, but doing it securely and correctly takes more than people often think. It involves both technical setup and a clear understanding of roles and terminology.

This post lays out the groundwork for the rest of the series, so stay tuned for more.

Comments closed

The Case against Database Mocks

Published 2025-05-30 by Kevin Feasel

Brandur Leach lays out the argument:

The textbook example of this is the database mock. Here’s a rough articulation of the bull case for this idea: CPUs are fast. Memory is fast. Disks are slow. Why should tests have to store data to a full relational database with all its associated bookkeeping when that could be swapped out for an ultra-fast, in-memory key/value store? Think of all the time that could be saved by skipping that pesky fsync, not having to update that plethora of indexes, and foregoing all that expensive WAL accounting. Database operations measured in hundreds of microseconds or even *gasp*, milliseconds, could plausibly be knocked down to 10s of microseconds instead.

Prior to reading the article, my stance was as follows: use database mocks for unit test libraries, in which you aren’t testing the actual data processing or retrieval. Those should be able to run on an isolated build server with no access to a database. But you also need proper integration tests that cover how you interact with the database, and those tests should be a majority of your test suite. You should have a known state database before each test run (which is where Docker containers or database snapshots become extremely helpful), and passing the database tests should be a gate early on in the CI/CD process.

After reading the article, my priors remain the same. I think there’s still scope for database mocks, but not as a replacement for proper integration testing with the database.

Comments closed

Automated Table Statistics on Delta Tables in Microsoft Fabric

Published 2025-05-30 by Kevin Feasel

Santhosh Kumar Ravindran makes an announcement:

We’re thrilled to introduce Automated Table Statistics in Microsoft Fabric Data Engineering — a major upgrade that helps you get blazing-fast query performance with zero manual effort.

Whether you’re running complex joins, large aggregations, or heavy filtering workloads, Fabric’s new automated statistics will help Spark make smarter decisions, saving you time, compute, and money.

Click through to see what’s included, as well as the limitations associated with this. You can still create manual statistics if you’d like, so on the whole, I approve.

Comments closed

Performing PostgreSQL Version Upgrades

Published 2025-05-29 by Kevin Feasel

Warda Bibi upgrades a server:

Upgrading to the latest version ensures long-term stability, access to new features, and better support. Recently, I worked on upgrading a critical production PostgreSQL environment from version 11 to 15. Version 15 was chosen because the client’s application had only been tested up to that release. The system supported large batch workloads and live applications, so we had to be meticulous. While this article draws from that specific project, the steps are broadly applicable to anyone planning a major PostgreSQL upgrade, especially when crossing several versions.

This guide outlines a generalized, production-ready approach for performing major version upgrades using the pg_dump/pg_restore method.

Click through for two upgrade methods and some of the things to keep in mind during the process.

Comments closed

OneLake Security and Shortcuts

Published 2025-05-29 by Kevin Feasel

Aaron Merrill explains how OneLake security works when you introduce shortcuts:

OneLake allows for security to be defined once and enforced consistently across Microsoft Fabric. One of its standout features is its ability to work seamlessly with shortcuts, offering users the flexibility to access and organize data from different locations while maintaining robust security controls. In this blog post, we will look at how OneLake security is integrated with shortcuts, explain the distinction between passthrough and delegated auth modes for shortcuts, and look at an example use case.

Read on for an overview of OneLake shortcuts, as well as different security models around them.

Comments closed

Explaining Logins vs Users in SQL Server

Published 2025-05-29 by Kevin Feasel

Kevin Hill explains that there are two wolves inside your SQL Server:

“We added them to the database, but they still can’t connect.”

Sound familiar? That’s the kind of confusion SQL Server’s two-layer security model creates when people don’t understand the difference between a login and a user.

Let’s clear that up, because getting it wrong causes broken access, orphaned users, and frustrated help desk calls.

Click through for Kevin’s explanation.

Comments closed

Explaining Cost in Oracle

Published 2025-05-29 by Kevin Feasel

Kellyn Gorman knows the cost:

When you first begin your journey into Oracle performance tuning, you quickly discover that you’re not just working with databases, you’re deciphering a complex system of rules, statistics, and estimations. I was reminded by Connor McDonald in a post he made on LinkedIn about one of the most influential turning points in my career as a database professional came when I encountered Wolfgang Breitling’s 2003 paper – A Look Under the Hood of the CBO: the 10053 Event. At a time when I was focused on tuning queries for speed, Breitling’s work redefined how I thought about cost, and more importantly, how Oracle thinks about cost.

Click through for more information. Although SQL Server’s cost estimation process is different, the underlying reasoning and many of the mechanisms involved are the same.

2 Comments

Querying Multiple Lakehouse Tables via Python Notebook

Published 2025-05-29 by Kevin Feasel

Gilbert Quevauvilliers builds a data frame:

In this blog post I am going to explain how to query multiple Lakehouse tables into a data frame.

The example I am going to use is when you want to load new data into your staging tables, but you need to know the max date from your previous data load.

Read on to see how. The answer, as you might suspect, involves DuckDB.

Comments closed

M	T	W	T	F	S	S
	1	2	3	4	5	6
7	8	9	10	11	12	13
14	15	16	17	18	19	20
21	22	23	24	25	26	27
28	29	30	31

Curated SQL Posts