Invoking REST API Endpoints in SQL Server 2025

Hristo Hristov makes a call:

One highly anticipated new feature in SQL Server 2025 is the ability to call an external REST API endpoint from the database server itself. This new feature opens the door to new data integration scenarios and delivers on the promise to “bring AI closer to data.” What are the steps to follow if you want to use this new feature?

I expect to see two things from this. First, some percentage of developers will abuse it and cause performance problems in the database. Second, some percentage of database administrators will panic about this and try to prevent its use even when it makes sense.

But hey, at least this time, they didn’t use the term “unsafe” to describe something DBAs don’t understand and thus cause widespread panic.
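
Hristo covers the setup steps in the article; once everything is enabled, the call itself goes through sp_invoke_external_rest_endpoint, the same procedure Azure SQL Database has had for a while. As a rough sketch (the endpoint URL below is just a placeholder), a call looks something like this:

DECLARE @response nvarchar(max);

-- Hypothetical endpoint; in practice this would be your own API
EXEC sys.sp_invoke_external_rest_endpoint
    @url = N'https://api.example.com/v1/status',
    @method = N'GET',
    @headers = N'{"Accept":"application/json"}',
    @response = @response OUTPUT;

-- The response comes back as a JSON document containing status information and the payload
SELECT @response AS raw_response;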

Eventhouse Endpoint for Fabric Data Warehouse

Tzvia Gitlin Troyna announces a new feature:

The new Eventhouse Endpoint for Fabric Data Warehouse extends this same architecture to structured data sources, allowing users to:

  • Query Fabric Data Warehouse tables in real-time using KQL.
  • Leverage schema mirroring for warehouse tables.
  • Unify analytics across Lakehouse and Fabric Data Warehouse without duplicating data.

Even if I don’t expect many data platform practitioners to use KQL and even though I’m morally opposed to the Fabric Data Warehouse (short version: Lakehouses and Warehouses in Fabric should be the same thing, not two separate things), I’d still consider this a step forward. It does provide a new integration point between two services that have been annoyingly isolated.

Power BI Projects and PBIR Format Admin Settings

Koen Verbeeck notes a new default:

The Power BI Enhanced Report Format (PBIR) will soon become the default, and that’s a good thing because it makes git integration significantly easier. You can already enable it in the preview features of Power BI Desktop (also enable PBIP and TMDL to make git integration of the model itself much easier).

Read on to see the administrative setting associated with this, as well as reasons why you should keep it on.

Slimming Down Batch Deletion in SQL Server

Matt Gantz deletes a batch at a time:

In previous articles I showed patterns for working with large amounts of data on big tables while keeping locking at a minimum. These processes can allow migrations and maintenance without requiring downtime but, in environments with unpredictable database workloads, there is a risk of heavy traffic starting at any time and disrupting a once smooth operation. In this article, I’ll demonstrate how to augment these processes to allow dynamic adjustment of the configuration.

For most systems, the main limitation these techniques run into is the speed and throughput of I/O (input/output). During periods of low traffic, a large batch size may perform great with no impact to production, but as traffic increases, the storage subsystem may not be able to keep up.

Read on for two mechanisms to make batch operations a little less stressful on the server.

A consulting customer of mine has a fairly clever mechanism for this as well: track the number of non-trivial active processes before each batch begins. If that number is above a certain threshold (say, 10 or 15 or whatever), pause for a pre-defined period of time before running. That way, if the server isn’t very active, batches can keep processing willy-nilly; but once things get busy, the batch process throttles itself.
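
That’s not their actual code, but the shape of the idea in T-SQL looks something like the following sketch (the table, filter, and thresholds are all made up):

DECLARE @BatchSize int = 5000,
        @BusyThreshold int = 15,
        @ActiveRequests int,
        @RowsDeleted int = 1;

WHILE @RowsDeleted > 0
BEGIN
    -- Count active requests from other sessions (session_id > 50 is a rough heuristic for user sessions)
    SELECT @ActiveRequests = COUNT(*)
    FROM sys.dm_exec_requests AS r
    WHERE r.session_id <> @@SPID
          AND r.session_id > 50;

    IF @ActiveRequests > @BusyThreshold
    BEGIN
        -- Server is busy: pause for a pre-defined period and check again
        WAITFOR DELAY '00:00:30';
        CONTINUE;
    END;

    -- Otherwise, delete the next batch
    DELETE TOP (@BatchSize)
    FROM dbo.SomeLargeTable
    WHERE CreatedDate < DATEADD(DAY, -90, GETUTCDATE());

    SET @RowsDeleted = @@ROWCOUNT;
END;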

Database and Fabric Ignite Roundup

Victoria Holt lays out the news:

Real-Time Analytics with Synapse Real-Time Intelligence

  • The new Synapse Real-Time Intelligence in Fabric allows streaming data from operational databases to be analyzed instantly.
  • This supports use cases like fraud detection, predictive maintenance, and personalized recommendations.

Ah, thank goodness they’re re-using the name Synapse for something that has nothing to do with Azure Synapse Analytics. Makes sense when you’re running Microsoft Service Fab–, err, App Fab–, err, Fabric.

Binding Power BI Thin Reports to a Local Model

Ed Hansberry cuts the fat:

You may have an issue though where it [the model] isn’t in the service and you need to connect your report to a local model on your desktop. This can be useful if you need to do some testing and don’t want to connect it to a model in the service, or you temporarily don’t have access to the service. Let’s see how this works.

What you need:

  • Your thin PBIX report file. You can download this from the service if necessary.
  • Your local model in Power BI Desktop. It must be up and running.
  • Windows File Explorer

Read on for the process and a demonstration.

Multiple Filters with Regular Expressions

Louis Davidson shows off some more of the power of regular expressions:

One of the practical uses of RegEx is more powerful filtering. One of the projects I am working on (very slowly) is sharing some SQL utilities on GitHub: utilities like looking at the metadata of a table, searching for columns, database sizes, etc. I usually use LIKE to filter data, which lets me simply use an equality search, or I can also do a partial value search when I don’t know exactly what I am looking for.

LIKE is quite useful but, as Louis points out, it does have its limits. And it’s exactly where LIKE hits those limits that regular expressions shine.
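
For example, assuming the REGEXP_LIKE() syntax as documented for SQL Server 2025 (a pattern plus an optional flags argument, with 'i' for case-insensitive matching), here is the kind of filter that is painful to express with LIKE alone:

-- LIKE: any column whose name contains "date"
SELECT c.name
FROM sys.columns AS c
WHERE c.name LIKE N'%date%';

-- REGEXP_LIKE: column names ending in _id or _key, case-insensitively
SELECT c.name
FROM sys.columns AS c
WHERE REGEXP_LIKE(c.name, N'_(id|key)$', 'i');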

What’s New in OneLake

Kim Manis shares an update:

In this blog post, I’ll highlight the new zero-ETL, zero-copy sources in OneLake, deeper interoperability between OneLake and Microsoft Foundry, and new tools to help admins manage capacity, security, and governance at scale. Together, these updates further cement Fabric as the ideal data platform for your mission-critical workloads—open, integrated, secure, and built to connect every part of your data estate to the intelligence your business needs. 

Read on to see some of the latest from Ignite.

DAX Functions and Model Dependency

Marco Russo and Alberto Ferrari describe model dependency:

In a previous article, Introducing user-defined functions in DAX, we introduced the syntax and the capabilities of this feature that was introduced in preview in September 2025. While there could be many ways to categorize functions, every function can be one of the following:

  • Model-independent function – Has no references to specific tables, columns, calendars, or measures within a semantic model. A model-independent function can have dependencies on other model-independent functions, but it cannot depend on a model-dependent function.
  • Model-dependent function – Has one or more direct or indirect references to specific tables, columns, calendars, measures, or other model-dependent functions within a semantic model.

Read on to learn more about each of these categories.

Sales Tables and Normalization

Hans-Jürgen Schönig lays out an argument:

When creating data models in PostgreSQL (or any other powerful relational database), several common problems can arise that hinder effective database design. One frequent issue is the incorrect normalization of tables, leading to redundant data, inefficient querying capabilities, or even simply plain wrong semantics. 

First, I agree with the argument Hans-Jürgen is making regarding modeling sales data: it is important to understand what information the company is actually collecting and the domain and business rules associated with that data collection. The real world should shape and define your database model, not the other way around.

But where I fervently disagree is that this is “too much” or “over-optimistic” normalization because this is not normalization. Normalization is the lossless decomposition of data across entities, and normal forms 1-5 break down along two major lines: decompose functional dependencies (Boyce-Codd Normal Form), and decompose join dependencies (4th & 5th normal forms). In the example, there is clearly a loss of information from the wider model—the table definition Hans-Jürgen shows later on—so the split into product + sales tables does not adhere to 5th normal form.

The sales table does need some variant of product_unit_price because that was the price as of the time of sale, and prices can change for a variety of reasons: discounts because the product is on sale, adjustments due to inflation, increases in demand, supply chain issues, adjustments based on new marketing models, etc. For historical purposes, it is important that we capture the price as of the time of sale, as that will tell us how much to refund if the customer returns the product. It is also important for bookkeeping reasons, as a variety of taxes and reports depend upon that number.

It’s also clear that the product table is reasonable: a product number (following the first rule of database architecture: anything labeled “number” is actually a string with non-numeric characters in it), a product name, and a product price. But that price is the current sales price, which is different from the price at the time of sale. Those are two separate attributes, and labeling them both “product_unit_price” leads to some of that confusion. Both should be in the data model.

Then, if I’m looking critically at the data model, I would ask why product name is in the sales data. Do product names change so frequently that we need to capture the name on every sales record? Do managers have the authority to change the name of a product on the fly during a sale? Probably not, and unless managers do have that discretion or the product name changes on a per-sale basis, it doesn’t belong there. There could be a functional dependency from product_number to product_name, such that each product_number has one and only one product_name. In that case, the product table should have product_name and the sales table should not.

But what if the product name does change over time and we need to track those changes? Then I’d (potentially) relabel product_name on the product table to current_product_name and create a 6th normal form table called something like product_name_history. This table would include the product_number, the product_name, and a time range over which that name was valid for that product. These timeframes should not overlap and, as long as the product exists, there should be an entry that corresponds with the appropriate name at any arbitrary time.

My final model would look something like:

CREATE TABLE product (
    product_number text PRIMARY KEY,
    product_name text NOT NULL,
    product_price numeric(10, 2) NOT NULL
);

CREATE TABLE product_name_history (
    -- Primary key would be on the combo of product_number and begin_time/interval
    product_number text NOT NULL,
    product_name text NOT NULL,
    -- Architected as an interval, implemented as start and end timestamps
    begin_time timestamptz NOT NULL,
    end_time timestamptz NOT NULL
);

CREATE TABLE sales (
    sales_time timestamptz NOT NULL,
    product_number text NOT NULL,
    sale_unit_price numeric(10, 2) NOT NULL,
    sale_units numeric(10, 2) NOT NULL
);
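
As a quick illustration of how the history table earns its keep, getting the product name in effect at the time of each sale is just a range join:

-- Look up the product name that was valid when each sale happened
SELECT
    s.sales_time,
    s.product_number,
    pnh.product_name,
    s.sale_unit_price,
    s.sale_units
FROM sales AS s
    JOIN product_name_history AS pnh
        ON pnh.product_number = s.product_number
        AND s.sales_time >= pnh.begin_time
        AND s.sales_time < pnh.end_time;

And if you want the database to enforce the “timeframes should not overlap” rule rather than trusting the application, PostgreSQL can do that with an exclusion constraint over product_number plus tstzrange(begin_time, end_time), given the btree_gist extension.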

I did also rename “product_unit_price” and “product_units” to “sale_unit_price” and “sale_units” to make it clearer that these were associated with the sale itself. We could also get into things like sales orders versus sale order line items, additional attributes, etc. but this post is already long enough as it is. And again, this assumes that product name changes frequently enough and the different names are relevant enough for us to care about what the product name was at the time of a given sale. If not, we could remove the product_name_history table without otherwise changing the model.

Long story short, I agree with Hans-Jürgen that database design is important and understanding the business domain is critical for designing a good data model. But where I disagree is with the concept of over-normalization. That’s not a real thing. Improper attempts at normalization are certainly a real thing, as we see in this post, and it is important to understand the business rules as well as the rules of defining normal forms in order to prevent this.
