Month: August 2025

Reasons Regression Models Under-Perform

Ivan Palomares Carrascosa has a list:

In regression models, failure occurs when the model produces inaccurate predictions — that is, when error metrics like MAE or RMSE are high — or when the model, once deployed, fails to generalize well to new data that differs from the examples it was trained or tested on. While model failure typically shows up in one or both of these forms, the root causes can be more diverse and subtle.

This article explores some common reasons why regression models may underperform and outlines how to detect these issues. It is also accompanied by practical code excerpts using XGBoost — a robust and highly tunable ensemble-based regression model. Despite its popularity and power, XGBoost can also fail if not trained or evaluated properly!

These are high-level reasons, but they’re good to keep in mind.
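
As a rough illustration of the two failure modes in the excerpt (high error, and error that grows on data the model has not seen), here is a minimal sketch in plain Python. It does not use XGBoost or any code from the article. The helper names and the numbers are made up for the example, and the predictions stand in for whatever a real model would produce.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def generalization_gap(train_err, test_err):
    """Held-out error divided by training error (hypothetical helper).

    A ratio well above 1 hints that the model does not generalize.
    """
    return test_err / train_err if train_err else float("inf")

# Made-up targets and predictions, purely for illustration.
y_train, pred_train = [10.0, 12.0, 14.0, 16.0], [10.5, 11.5, 14.5, 15.5]
y_test, pred_test = [11.0, 13.0, 15.0, 17.0], [13.0, 10.0, 18.0, 15.0]

train_rmse = rmse(y_train, pred_train)
test_rmse = rmse(y_test, pred_test)
gap = generalization_gap(train_rmse, test_rmse)

print(f"train RMSE={train_rmse:.3f}  test RMSE={test_rmse:.3f}  gap={gap:.1f}x")
```

What counts as a worrying gap depends on the problem, so treat any threshold as a judgment call rather than a rule.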

Using a Child Pipeline Variable in a Parent Pipeline in Fabric Data Factory

Justin Bird passes back some information:

I answered a question on the Fabric community on return variables recently and thought I would expand upon it in a blog post. The question was how to use a variable derived in a child pipeline downstream in the parent pipeline. The person was specifically deriving a json object and wanted to iterate on the values in the parent pipeline.

Click through for the solution.

Regular Expression-Based String Splitting in SQL Server 2025

Aaron Bertrand splits a string:

SQL Server users have been asking for native regular expression support for over two decades. There are third-party Common Language Runtime (CLR) modules that offer this functionality, but these can be complicated to install and simply aren’t possible in some environments. I want to split a string using a regular expression instead of a static string. Will that be possible in SQL Server 2025, without CLR?

Must not rant about CLR. Must not rant about CLR. Must not rant about CLR. (By the way, if you ever catch me in person, get me going about how CLR got the short end of the stick and how the ‘modern’ forms of the Common Language Runtime in SQL Server are not great.)

Aaron tries out a function built into SQL Server that allows you to split strings into result sets using a regular expression to perform the splitting, and shows off some of the more complicated scenarios that this can solve over a normal STRING_SPLIT() function call.
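
To see why a regular expression delimiter matters, here is a small sketch using Python's re module rather than the SQL Server function Aaron covers. The sample string is made up, and regex flavors differ between engines, so take this as an analogy and not as T-SQL behavior.

```python
import re

text = "apples, oranges;bananas |  pears"

# Static delimiter: splits only on the exact string ",".
static_parts = text.split(",")

# Regex delimiter: a comma, semicolon, or pipe, plus any surrounding whitespace.
regex_parts = re.split(r"\s*[,;|]\s*", text)

print(static_parts)
print(regex_parts)
```

The static split leaves everything after the first comma in one piece, while the regex split handles all three delimiters and trims the whitespace around them in one pass.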

Performance Optimizing PostgreSQL for RTABench Q0

Andrei Lepikhov gets under the hood:

I wanted to explore whether Postgres could be improved by thoroughly utilising all available tools, and for this, I chose the RTABench benchmark. RTABench is a relatively recent benchmark that is described as being close to real-world scenarios and highly selective. One of its advantages is that the queries include expressions involving the JSONB type, which can be challenging to process. Additionally, the Postgres results on RTABench have not been awe-inspiring.

Ultimately, I decided to review all of the benchmark queries (fortunately, there aren’t many) to identify possible optimisations. However, the very first query, Q0, already had enough nuances to be worth a separate discussion.

Click through for a dive into this particular query, what Andrei did, and some of the lessons you can draw from it.

No More TLS 1.1 in Microsoft Fabric

Nisha Sridhar makes an announcement:

We have officially ended the support for TLS 1.1 and earlier on the Fabric platform. As previously announced, starting July 31, 2025, all outbound connections from Fabric to customer data sources must use TLS 1.2 or later.

This update follows our earlier announcement in the TLS Deprecation for Fabric blog, where we outlined the rationale and timeline for this transition.

Read on to see what you might need to do to keep up to date.
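
If you want a quick check on whether one of your own endpoints will negotiate TLS 1.2 or later, a sketch along these lines with Python's standard ssl module may help. This is not from the announcement. The function names are made up for illustration, and you would supply your own host name.

```python
import socket
import ssl

def make_tls12_context():
    """Client context that refuses anything older than TLS 1.2."""
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx

def negotiated_tls_version(host, port=443, timeout=5.0, context=None):
    """Connect to host:port and return the negotiated protocol, e.g. 'TLSv1.3'.

    Raises ssl.SSLError if the server cannot meet the context's minimum version.
    """
    ctx = context or make_tls12_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with ctx.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()

# Building the context needs no network access.
context = make_tls12_context()
print(context.minimum_version)
```

Calling negotiated_tls_version with the host name of a data source you control shows what that server actually negotiates. A handshake failure suggests the endpoint still needs an upgrade.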

Learning RegEx with Louis Davidson

Louis Davidson has a few blog posts for us to catch up on. So far, this is a four-part series on regular expressions and SQL Server.

Part 1 covers simple pattern matching:

I have never once written a regular expression prior to a couple of articles on this blog. And truth be told, when I published those blogs, I got the expression wrong because it seemed to work, and it was what Copilot told me would work. If you are new like me and/or your code is important, test with lots of cases. I obviously fixed that code (thankfully the conclusions were right).

So no, I have never. LIKE does 99% of what I need in a simple manner, and .8% of the time in a complex way, so I never really thought about it too much. I suspect that will be the case even now in SQL, but like any good student, it is time to change my knowledge of regular expressions.

Part 2 covers repeating patterns:

In this blog, I want to look for strings that have 1 or more instances of a repeating pattern. For example, say you want to look for something like the following:

LIKE '%FredFredFred%'

--(or any fixed or unlimited length of a, and only a)
LIKE '%aaaaaaaaaaaaaaa%' or '%aaaaaaa%'
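
For a feel of what repetition looks like in regex form, here is a hedged sketch using Python's re module instead of the SQL Server functions from the series. The patterns and sample strings are my own, and syntax details can vary by engine.

```python
import re

# One or more consecutive copies of "Fred", anywhere in the string.
fred_run = re.compile(r"(?:Fred)+")

# Exactly three consecutive copies of "Fred".
fred_three = re.compile(r"(?:Fred){3}")

# The whole string is "a" repeated one or more times, and nothing else.
only_a = re.compile(r"^a+$")

match = fred_run.search("xxFredFredFredxx")
print(match.group())
print(bool(fred_three.search("FredFred")))
print(bool(only_a.search("aaaaaaa")), bool(only_a.search("aaabaaa")))
```

The + and {3} quantifiers cover what would otherwise take one LIKE predicate per repetition count.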

Part 3 looks at matching sets of characters:

In this article, we are going to take an initial look at what are referred to as “character classes” or “character sets” in Regular Expressions. They are commonly used when looking for data to be in a certain format. For example:

We are going to look at how to set a filter for 'lll-ll-lln' and/or 'lll-ll-lll' (where l is letter and n is numeric).
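
One way to express that format filter, sketched in Python's re module as a stand-in for the SQL Server syntax Louis uses, is below. The pattern is my own reading of 'lll-ll-lln' and 'lll-ll-lll', and it assumes "letter" means ASCII letters only.

```python
import re

# Three letters, dash, two letters, dash, two letters,
# then a final character that is either a letter or a digit.
code_format = re.compile(r"[A-Za-z]{3}-[A-Za-z]{2}-[A-Za-z]{2}[A-Za-z0-9]")

samples = ["abc-de-fg1", "abc-de-fgh", "ab1-de-fgh", "abc-de-fg"]

# fullmatch requires the entire string to fit the pattern.
results = {s: bool(code_format.fullmatch(s)) for s in samples}
print(results)
```

Merging the two formats works because they differ only in the last character, so a single class of letters and digits covers both.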

And part 4 deals with negation:

In Part 3, I covered some of the basics of using character classes/sets. (I do tend to say sets.) This allowed us to do things like find words that start with a, b, c, d or e. This is done using: ^[a-e] or ^[abcde]. Now I want to look at two new things (one of which looks really similar to the previous classes but does things very differently):

  • Negated character classes – Look for strings that don’t have a particular character in them
  • Perl character classes – shorthand for certain types of characters
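
Both ideas can be tried out quickly in Python's re module, which I am using here as a stand-in for the SQL Server functions in the series. The patterns and sample words are my own, and shorthand classes such as \d can behave a little differently across engines.

```python
import re

# Character class: first character is a through e.
starts_a_to_e = re.compile(r"^[a-e]")

# Negated class: the same brackets with a leading ^ inside them,
# meaning the first character is anything except a through e.
starts_other = re.compile(r"^[^a-e]")

# Whole string contains no digit at all.
no_digits = re.compile(r"^[^0-9]*$")

# Perl-style shorthand: \d matches a digit.
has_digit = re.compile(r"\d")

print(bool(starts_a_to_e.search("banana")), bool(starts_other.search("banana")))
print(bool(no_digits.search("hello")), bool(no_digits.search("he11o")))
print(bool(has_digit.search("he11o")))
```

Note the two meanings of ^ in these patterns: outside the brackets it anchors to the start of the string, and as the first character inside them it negates the class.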

Regular expressions can be very challenging to learn and even more challenging to troubleshoot and ensure there are no missing corner cases. But they offer an enormous amount of power and that makes it all worthwhile.

Working with Microsoft’s First-Party Python Driver

Sebastiao Pereira takes a look at mssql-python:

Python can connect to SQL Server using drivers like pyodbc and pymssql. However, Microsoft recently released a new Python driver called Python Driver for SQL Server or mssql-python. Currently in preview, Microsoft describes it as “the only first-party driver.” So, what’s this new driver all about, and how do you use it? Learn how to configure Python to connect to SQL Server with this new driver.

My standard caveat applies: this looks pretty neat, assuming that Microsoft actually continues to support it. Sebastiao mentions that it requires Python 3.13, but the docs say 3.10 or later. If the former is true, it might be a while before a lot of shops actually use it. But if the latter is true, most Python installations should support the driver out of the box.

What-If Analysis in Power BI

Ben Richardson takes us through a what-if analysis:

What If Analysis is a modelling technique used to evaluate different outcomes by changing key input variables.

In Power BI, it uses What If parameters and dynamic DAX measures that recalculate outputs based on user input. Users can ask questions like:

  • “What if sales increase by 10%?”
  • “What if production costs drop by 5%?”

The parameters are created in the Modelling tab, where you define value ranges. Power BI automatically generates a slicer and a measure, which can then be used in DAX calculations to dynamically adjust metrics like revenue, cost, or profit.

Read on to see how it works, understanding that you have to provide the formulas for behavior. In other words, if your what-if parameter is around the unit price of some product, there is no built-in concept of price elasticity for the product. That’s something you’d have to implement yourself.
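
To make the elasticity point concrete, here is a minimal Python sketch of the sort of formula you would have to supply yourself. It assumes a constant-elasticity demand curve, which is my choice for illustration and not something from Ben's post. The function name and all of the numbers are hypothetical.

```python
def what_if_revenue(base_price, base_units, price_change_pct, elasticity=-1.5):
    """Revenue after a price change under a constant-elasticity demand model.

    Assumed model: units = base_units * (new_price / base_price) ** elasticity
    """
    new_price = base_price * (1 + price_change_pct / 100)
    new_units = base_units * (new_price / base_price) ** elasticity
    return new_price * new_units

# Hypothetical product: 10.00 per unit, 1,000 units sold at that price.
for pct in (-10, 0, 10):
    revenue = what_if_revenue(10.0, 1000, pct)
    print(f"price change {pct:+d}% -> revenue {revenue:,.2f}")
```

With an elasticity of 0, the result matches the naive view where a 10% price increase means 10% more revenue. With elastic demand (below -1), the same increase lowers revenue instead.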

The CU+GDR Path in SQL Server’s Service Model

Jon Russell clarifies the situation:

SQL Server administrators often encounter Microsoft updates labeled as “CU + GDR”, and understandably, this can cause confusion — especially when trying to stay on a consistent CU-based servicing path. This post clarifies what “CU + GDR” really means and why it’s not something to worry about.

Read on for an overview of the different security models, as well as the odd duck in SQL Server 2016.
