Press "Enter" to skip to content

Author: Kevin Feasel

What’s New for Columnstore Indexes in SQL Server 2025

Ed Pollack gives us the lowdown:

Columnstore indexes are a powerful tool for storing analytic data directly in SQL Server. This feature has improved in every version of SQL Server since their inception over ten years ago, and SQL Server 2025 is no exception! 

The newest enhancements are laser-focused on business continuity and performance. Ordered clustered columnstore indexes, ordered non-clustered columnstore indexes, and database/file shrink operations are all given significant boosts that are worth the time to introduce and learn. 

In this article we will dive into each of these changes, how they impact columnstore workloads in SQL Server, and demonstrate their operation. 

Read on to see what we’ve got. Nothing in here is ground-breaking, but it’s a set of nice quality-of-life improvements.
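For reference, an ordered clustered columnstore index is declared with an ORDER clause along these lines (a minimal sketch; the table and column names are hypothetical):

```sql
-- Hypothetical fact table. The ORDER clause (introduced for clustered columnstore
-- indexes in SQL Server 2022) sorts data on the listed column(s) as rowgroups are
-- built, which improves segment elimination for range predicates on those columns.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
    ON dbo.FactSales
    ORDER (OrderDateKey);

-- SQL Server 2025 brings similar ordering to nonclustered columnstore indexes;
-- Ed's article demonstrates the syntax and behavior for those.
```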


DISTINCT vs VALUES in DAX

Marco Russo and Alberto Ferrari compare two keywords:

When you begin modelling in DAX, DISTINCT and VALUES often appear interchangeable: both return the list of unique values for a column in the current filter context. In a clean development model, they behave the same, so it is easy to pick one at random – or worse, swap between them without thinking.

However, they are not identical. The subtle difference is crucial in production models that may one day contain invalid relationships or bad data.

Read on to see how each works and how they differ in practice.


The Small Data Showdown in Microsoft Fabric

Miles Cole does a bit of testing:

First, let’s revisit the purpose of the benchmark: The objective is to explore data engineering engines available in Fabric to understand whether Spark with vectorized execution (the Native Execution Engine) should be considered in small data architectures.

Beyond refreshing the benchmark to see if any core findings have changed, I do want to expand in a few areas where I got great feedback from the community:

I really appreciate the approach behind this, both in terms of sticking to more realistic data sizes for many operations and in performing this test given all of the recent improvements in each engine.


Enumerating Template Types in Power BI

Oscar Martinez lays out the list:

A Power BI Template can mean very different things depending on who you ask. Are we talking about a .PBIT shell, a JSON theme, a turnkey Template App, or merely a thin report wired to a central model?

In this post, we cut through the ambiguity and lay each option side‑by‑side. You’ll learn what’s inside every “template” type and the trade‑offs that matter in real‑world projects—so the next time someone says “just use a template” you’ll know exactly which one fits the bill.

Click through for the post. Also, note that each section is in a drill-through div, so you might accidentally miss some information if you haven’t expanded each topic.


Using Barman to Back Up HA-Enabled PostgreSQL Clusters

Semab Tariq reminds us that high availability is not disaster recovery:

Barman is a popular tool in the PostgreSQL ecosystem for managing backups, especially in High Availability (HA) environments. It’s known for being easy to set up and for offering multiple types and modes of backups. However, this flexibility can also be a bit overwhelming at first. That’s why I’m writing this blog to break down each backup option in a simple and clear way, so you can choose the one that best fits your business needs.

Click through for the available options, as well as some recommendations.


Why Not Use VARCHAR(MAX) for Everything?

David Fowler explains:

When I mentioned to the developer that it’s probably not the best idea, he turned around and asked me, ‘why not?’

It was a genuine question. Why shouldn’t we just spam VARCHAR(MAX) over all of our columns? On the upside, it would get rid of all those annoying issues that crop up when we try to insert a value that overflows the datatype.

Click through for a video as well as a blog post laying out some of the problems with using VARCHAR(MAX) all willy-nilly.
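To make the trade-off concrete, here is a minimal T-SQL sketch (the tables are hypothetical) of the kind of declaration in question, with comments on a few of the costs David covers in more detail:

```sql
-- Hypothetical tables for illustration only.
CREATE TABLE dbo.Customer_Sized
(
    CustomerId int          NOT NULL PRIMARY KEY,
    Email      varchar(320) NOT NULL,   -- sized to the data it holds
    PostalCode varchar(10)  NOT NULL
);

CREATE TABLE dbo.Customer_Max
(
    CustomerId int          NOT NULL PRIMARY KEY,
    Email      varchar(max) NOT NULL,   -- "never overflows," but...
    PostalCode varchar(max) NOT NULL
);

-- A few of the downsides of declaring everything as varchar(max):
--   * varchar(max) columns cannot be key columns in a regular nonclustered index.
--   * Large values can be pushed off-row, adding I/O to read them.
--   * Query memory grants are estimated from declared sizes, so oversized
--     declarations tend to inflate grants and hurt concurrency.
CREATE INDEX IX_Customer_Sized_Email ON dbo.Customer_Sized (Email);   -- works
-- CREATE INDEX IX_Customer_Max_Email ON dbo.Customer_Max (Email);    -- fails: invalid key column type
```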


Performance Testing the pg_tde Extension

Transparent data encryption is now available for PostgreSQL via Percona’s pg_tde extension, and Andreas Scherbaum has some performance measures:

The performance impact of pg_tde by Percona for encrypted data pages is measurable across all tests, but not very large. The performance impact of encrypting WAL pages is about 20% for write-heavy tests. The tests were run with an extension RC (Release Candidate), however the WAL encryption feature is still in Beta stage.

Andreas also has a post on the testing specifics:

This test was run on a dedicated physical server, to avoid external influences and fluctuations from virtualization.

The server has an Intel(R) Xeon(R) Gold 5412U CPU with 48 cores, 256 GB RAM, and a 2 TB SAMSUNG MZQL21T9HCJR NVram disk dedicated for the tests (OS was running on a different disk).


Incremental Copy Job in Microsoft Fabric Now GA

Ye Xu has an announcement:

Copy job has been a go-to tool for simplified data ingestion in Microsoft Fabric, offering a seamless data movement experience from any source to any destination. Whether you need batch or incremental copying, it provides the flexibility to meet diverse data needs while maintaining a simple and intuitive workflow.

We continuously refine Copy job based on customer feedback, enhancing both functionality and user experience. In this update, we’re introducing several key improvements designed to streamline your workflow and boost efficiency.

Click through to see what’s new.


PostgreSQL and Covering Indexes

Kendra Little does a bit of index management:

Dear Postgres, Why won’t you use my covering index?

Lately I’ve been learning to tune queries running against PostgreSQL, and it’s pretty delightful. One fun question that I worked through struck me as something that most Postgres users probably encounter one way or another: sometimes you may create the perfect index that covers a given query, but the query planner will choose to ignore it and scan the base table.

Why in the world would it do that?

Read on to learn why.
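For context, this is the sort of setup the post digs into: a covering index built with INCLUDE that the planner is nonetheless free to skip (a sketch; table and column names are made up):

```sql
-- PostgreSQL (11+): a "covering" index that stores extra columns in the leaf pages.
CREATE INDEX idx_orders_customer_covering
    ON orders (customer_id)
    INCLUDE (order_date, total_amount);

-- The query the index covers:
EXPLAIN (ANALYZE, BUFFERS)
SELECT order_date, total_amount
FROM   orders
WHERE  customer_id = 42;

-- Even so, the planner may pick a sequential scan or a plain index scan instead of
-- an index-only scan: for example, when the predicate matches a large share of the
-- table, or when the visibility map is mostly unset so heap fetches would be needed
-- anyway (running VACUUM can change the plan).
```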


Optimizing Multiple Lookup Transformations in SSIS

Andy Brownsword doesn’t want to keep hitting the database:

Lookup transformations provide us a way to access related values from another source, such as retrieving surrogate keys in data warehousing. When we need multiple lookups to the same reference data we can improve performance through the use of a Cache.

If we consider data warehousing, a prime example of this would be an order table which has values for Order Date, Dispatch Date, Delivery Date, etc. All of these would require a lookup to a calendar dimension.

This is a perfect use case for a cache.

Read on to see how the cache connector works.
