
Day: July 17, 2025

Choosing a Good Split for a Decision Tree

Ivan Palomares Carrascosa continues a series on decision trees:

But what are the underlying mechanisms that make decision trees so well-suited for various predictive tasks? And what criteria are internally used to construct them? Specifically, how are nodes recursively split as the tree-shaped structure is formed? This article takes a closer look at the inner workings of decision trees, focusing on how branches are created through deliberate, data-driven splitting (spoiler: it certainly doesn’t happen at random).

One of the main principles of CART is finding efficient splits for trees, and this article digs into some of those details.
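
As a rough illustration of what "finding an efficient split" can mean, here is a minimal sketch that scans one numeric feature for the threshold with the lowest weighted Gini impurity, a criterion commonly used by CART for classification. The function names (`gini`, `best_split`) and the toy data are assumptions made for this example, not code from the linked article, which may cover other criteria as well.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best split on one numeric feature.

    Candidate thresholds are midpoints between consecutive distinct sorted values.
    Returns (None, inf) when every value is identical and no split is possible.
    """
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_threshold, best_score = None, float("inf")
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # identical values cannot be separated by a threshold
        threshold = (low + high) / 2
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        # Weight each child's impurity by its share of the rows.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score


if __name__ == "__main__":
    ages = [22, 25, 30, 35, 40, 52]          # hypothetical feature values
    bought = ["no", "no", "no", "yes", "yes", "yes"]  # hypothetical labels
    print(best_split(ages, bought))
```

A real tree builder would repeat this search over every feature and then recurse into each child node; this sketch covers only the single-feature step.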


Result Set Chaining in Snowflake

Kevin Wilkie tries out a new operator:

In a recent Snowflake release, a slick new operator quietly entered the scene: ->>. This little guy can make certain query workflows both more readable and more efficient, especially when you’re dealing with multi-step commands like SHOW, LIST, or DESCRIBE.

Click through to see how it works. Seems that this operator has some pretty strict limitations, but for certain use cases, it’s quite nice.


Private Endpoints in Fabric Eventstream now GA

Alex Lin makes an announcement:

We’re excited to announce the General Availability of Managed Private Endpoints (MPE) in Fabric Eventstream. This network security feature allows you to stream data from Azure resources to Fabric over a private and secure network without the complexity of manual network configurations.

Read on to see what private endpoints give you and what’s new for general availability.


Summer 2025 SQL ConstantCare Population Report

Brent Ozar shares the numbers:

In this quarter’s update of our SQL ConstantCare® population report, showing how quickly (or slowly) folks adopt new versions of SQL Server, the data is very similar to last quarter. SQL Server 2019 still rules the market:

Click through to see where people are at in Brent’s sample of the market. Alan Cranfield has some numbers for SQL Server on AWS and those come pretty close to what Brent’s sample shows as well.


Zone Redundancy in Azure SQL Managed Instance

Arun Sirpal explains what zone redundancy is in Azure:

Do you know what happens when you enable zonal redundancy for your SQL managed instance?

Let’s define it first (in the context of Business-Critical tier) – zonal redundancy is achieved by placing compute and storage replicas in different availability zones (3) and then using an underlying Always On availability group to replicate data changes from the primary instance to standby replicas in other availability zones.

Availability zones are in the same Azure region, so it works well for high availability but isn’t as good for disaster recovery: if an entire region goes down, zone redundancy won’t help you very much. Also, be aware that you’re paying for what’s running in those three zones because TANSTAAFL.


GUID Hunting for Power BI Performance Load Testing

Gilbert Quevauvilliers finds some UUIDs:

When completing the Power BI performance load testing, you will need to get details from your Power BI report and App Workspace, which will later be used in the PBIReport.JSON file.

In this blog post I will show you how to find those details, so that when it comes time to add it to the PBIReport.JSON file, it will be easy to plug the values in.

The reason for a separate blog post is because you will have to find the GUIDs that are used, which takes a bit of time and knowledge to find the correct GUID for the right value.

Click through for the most unsatisfying Easter egg hunt you could imagine. Gilbert then continues to pull out slider and filter data values.
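
For a sense of what the hunt involves, here is a minimal sketch that pulls the workspace and report GUIDs out of a report URL. It assumes the URL has the shape `.../groups/<workspace GUID>/reports/<report GUID>/...`; the helper name `extract_ids` and the sample GUIDs are made up for illustration, and Gilbert's post may locate these values a different way.

```python
import re

# Standard 8-4-4-4-12 hexadecimal GUID layout.
GUID = r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"

# Assumed URL shape: /groups/<workspace GUID>/reports/<report GUID>
URL_PATTERN = re.compile(
    rf"/groups/(?P<workspace>{GUID})/reports/(?P<report>{GUID})"
)


def extract_ids(url):
    """Return (workspace_id, report_id) from a report URL, or None if absent."""
    match = URL_PATTERN.search(url)
    if match is None:
        return None
    return match.group("workspace"), match.group("report")


if __name__ == "__main__":
    # Hypothetical URL with invented GUIDs.
    sample = (
        "https://app.powerbi.com/groups/"
        "11111111-2222-3333-4444-555555555555/reports/"
        "aaaaaaaa-bbbb-cccc-dddd-eeeeeeeeeeee/ReportSection"
    )
    print(extract_ids(sample))
```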
