Curated SQL Posts

Learning RegEx with Louis Davidson

Louis Davidson has a few blog posts for us to catch up on. So far, this is a four-part series on regular expressions and SQL Server.

Part 1 covers simple pattern matching:

I have never once written a regular expression prior to a couple of articles on this blog. And truth be told, when I published those blogs, I got the expression wrong because it seemed to work, and it was what Copilot told me would work. If you are new like me and/or your code is important, test with lots of cases. I obviously fixed that code (thankfully the conclusions were right).

So no, I have never. LIKE does 99% of what I need in a simple manner, and .8% of the time in a complex way, so I never really thought about it too much. I suspect that will be the case even now in SQL, but like any good student, it is time to change my knowledge of regular expressions.

Part 2 covers repeating patterns:

In this blog, I want to look for strings that have 1 or more instances of a repeating pattern. For example, say you want to look for something like the following:

LIKE '%FredFredFred%'

--(or any fixed or unlimited length of a, and only a)
LIKE '%aaaaaaaaaaaaaaa%' or '%aaaaaaa%'
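
As a rough illustration of the idea in that excerpt, here is a minimal Python sketch using the standard `re` module rather than SQL Server's regex functions. The helper names `has_repeated` and `is_only` are hypothetical and do not come from Louis's posts; the point is only that a quantifier replaces a hand-typed run of copies.

```python
import re


def has_repeated(text: str, unit: str, min_count: int = 1) -> bool:
    """True if text contains at least min_count back-to-back copies of unit."""
    # (?:...) groups the unit without capturing it; {n,} means "n or more".
    pattern = rf"(?:{re.escape(unit)}){{{min_count},}}"
    return re.search(pattern, text) is not None


def is_only(text: str, unit: str) -> bool:
    """True if the whole string is one or more copies of unit and nothing else."""
    # fullmatch anchors at both ends, so stray characters cause a miss.
    return re.fullmatch(rf"(?:{re.escape(unit)})+", text) is not None


if __name__ == "__main__":
    print(has_repeated("xxFredFredFredxx", "Fred", 3))
    print(is_only("aaaaaaa", "a"))
    print(is_only("aaab", "a"))
```

The first helper behaves like the `%...%` wildcards in the LIKE version (the run can sit anywhere in the string), while the second covers the "a, and only a" case.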

Part 3 looks at matching sets of characters:

In this article, we are going to take an initial look at what are referred to as “character classes” or “character sets” in Regular Expressions. They are commonly used when looking for data to be in a certain format. For example:

We are going to look at how to set a filter for 'lll-ll-lln' and/or 'lll-ll-lll' (where l is letter and n is numeric).
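
To make the format concrete, here is a small sketch in Python (not the T-SQL from the series). It assumes "letter" means an ASCII letter of either case and "numeric" means a single digit 0-9; the name `matches_format` is made up for this example.

```python
import re

# l = letter, n = digit, following the quoted description.
#   [A-Za-z]{3}              three letters
#   -[A-Za-z]{2}             a dash, then two letters
#   -[A-Za-z]{2}[A-Za-z0-9]  a dash, two letters, then a letter OR a digit
CODE_FORMAT = re.compile(r"[A-Za-z]{3}-[A-Za-z]{2}-[A-Za-z]{2}[A-Za-z0-9]")


def matches_format(value: str) -> bool:
    """True if value is exactly 'lll-ll-lln' or 'lll-ll-lll'."""
    # fullmatch rejects leading or trailing extras such as spaces.
    return CODE_FORMAT.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["abc-de-fg1", "abc-de-fgh", "abc-de-f12", "ab1-de-fgh"]:
        print(candidate, matches_format(candidate))
```

Folding both formats into one pattern works because they differ only in the last position, which a single character class can cover.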

And part 4 deals with negation:

In Part 3, I covered some of the basics of using character classes/sets. (I do tend to say sets.) This allowed us to do things like find words that start with a, b, c, d or e. This is done using: ^[a-e] or ^[abcde]. Now I want to look at two new things (one of which looks really similar to the previous classes but does things very differently):

  • Negated character classes – Look for strings that don’t have a particular character in them
  • Perl character classes – shorthand for certain types of characters
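
Both ideas can be sketched in a few lines of Python. This uses Python's `re` module, not SQL Server, and the function names are hypothetical. One thing worth noticing: a negated class on its own only says "some character that is not in the set", so checking that a character is absent from the entire string means matching the class against every position.

```python
import re


def starts_outside_a_to_e(word: str) -> bool:
    """True if the first character exists and is NOT one of a through e."""
    # Inside brackets, a leading ^ negates the class; outside, ^ anchors.
    return re.search(r"^[^a-e]", word) is not None


def has_no(text: str, ch: str) -> bool:
    """True if ch never appears anywhere in text."""
    # Every character in the string must fall in the negated class.
    return re.fullmatch(rf"[^{re.escape(ch)}]*", text) is not None


def all_digits(text: str) -> bool:
    """True if text is one or more ASCII digits."""
    # \d is the Perl-style shorthand for a digit; re.ASCII limits it to 0-9.
    return re.fullmatch(r"\d+", text, flags=re.ASCII) is not None


if __name__ == "__main__":
    print(starts_outside_a_to_e("frog"), starts_outside_a_to_e("apple"))
    print(has_no("hello", "z"), has_no("hello", "l"))
    print(all_digits("12345"), all_digits("12a45"))
```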

Regular expressions can be very challenging to learn and even more challenging to troubleshoot and ensure there are no missing corner cases. But they offer an enormous amount of power and that makes it all worthwhile.

Working with Microsoft’s First-Party Python Driver

Sebastiao Pereira takes a look at mssql-python:

Python can connect to SQL Server using drivers like pyodbc and pymssql. However, Microsoft recently released a new Python driver called Python Driver for SQL Server or mssql-python. Currently in preview, Microsoft describes it as “the only first-party driver.” So, what’s this new driver all about, and how do you use it? Learn how to configure Python to connect to SQL Server with this new driver.

My standard caveat applies: this looks pretty neat, assuming that Microsoft actually continues to support it. Sebastiao mentions that it requires Python 3.13, but the docs say 3.10 or later. If the former is true, it might be a while before a lot of shops actually use it. But if the latter is true, most Python installations should support the driver out of the box.
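
If you want to see where a given environment stands against either claimed minimum, a quick check with the standard library is enough. This is only a sketch: `meets_minimum` is a made-up helper, and the two version tuples simply restate the numbers mentioned above rather than anything verified against the driver itself.

```python
import sys


def meets_minimum(version: tuple, minimum: tuple) -> bool:
    """True if the (major, minor) part of version is at least minimum."""
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    # The two candidate minimums discussed above (assumptions, not verified).
    for label, minimum in [("docs", (3, 10)), ("article", (3, 13))]:
        ok = meets_minimum(sys.version_info, minimum)
        print(f"{label} minimum {minimum[0]}.{minimum[1]}: {'ok' if ok else 'too old'}")
```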

What-If Analysis in Power BI

Ben Richardson takes us through a what-if analysis:

What If Analysis is a modelling technique used to evaluate different outcomes by changing key input variables.

In Power BI, it uses What If parameters and dynamic DAX measures that recalculate outputs based on user input. Users can ask questions like:

  • “What if sales increase by 10%?”
  • “What if production costs drop by 5%?”

The parameters are created in the Modelling tab, where you define value ranges. Power BI automatically generates a slicer and a measure, which can then be used in DAX calculations to dynamically adjust metrics like revenue, cost, or profit.

Read on to see how it works, understanding that you have to provide the formulas for behavior. In other words, if your what-if parameter is around the unit price of some product, there is no built-in concept of price elasticity for the product. That’s something you’d have to implement yourself.
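
To show what "implement it yourself" might look like, here is a minimal sketch in Python rather than DAX. It assumes a simple linear elasticity model, where the percentage change in units sold equals elasticity times the percentage change in price. The function name, parameters, and numbers are all hypothetical; in a report, the what-if slicer would supply `price_change_pct` and a measure would carry the equivalent formula.

```python
def projected_revenue(
    base_price: float,
    base_units: float,
    price_change_pct: float,
    elasticity: float,
) -> float:
    """Revenue after a price change under an assumed linear elasticity model."""
    new_price = base_price * (1 + price_change_pct)
    # Assumption: % change in quantity = elasticity * % change in price.
    # Clamp at zero so a large increase cannot produce negative units.
    new_units = max(0.0, base_units * (1 + elasticity * price_change_pct))
    return new_price * new_units


if __name__ == "__main__":
    # "What if unit price increases by 10%?" with two elasticity guesses.
    for elasticity in (0.0, -1.5):
        revenue = projected_revenue(10.0, 1000, 0.10, elasticity)
        print(f"elasticity {elasticity}: revenue {revenue:.2f}")
```

With an elasticity of zero, the projection is just the naive "same units, higher price" answer, which is effectively what you get if the measure only scales price and does nothing else.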

The CU+GDR Path in SQL Server’s Service Model

Jon Russell clarifies the situation:

SQL Server administrators often encounter Microsoft updates labeled as “CU + GDR”, and understandably, this can cause confusion — especially when trying to stay on a consistent CU-based servicing path. This post clarifies what “CU + GDR” really means and why it’s not something to worry about.

Read on for an overview of the different security models, as well as the odd duck in SQL Server 2016.

Storytelling with Time Series Scatter Charts in Power BI

Reza Rad takes us through data changes:

A Column or Bar chart can easily be used to show a single measure's insight across a category. Mixed charts, such as the Line and Column chart, can be used to show two measures and compare their values across a set of categories. However, there are some charts that can be used to show the values of three measures, such as the Scatter Chart. The Scatter chart not only shows the values of three measures across different categories, it also has a special Play axis that helps you tell the story behind the data. In this post you'll learn how easy it is to visualize something with a Scatter chart and tell a story with it. If you'd like to learn more about Power BI, read Power BI online book; from Rookie to Rock Star.

Read on for the blog post as well as a video version.

Getting beyond Pandas

Shittu Olumide recommends a few other packages:

If you’ve worked with data in Python, chances are you’ve used Pandas many times. And for good reason; it’s intuitive, flexible, and great for day-to-day analysis. But as your datasets start to grow, Pandas starts to show its limits. Maybe it’s memory issues, sluggish performance, or the fact that your machine sounds like it’s about to lift off when you try to group by a few million rows.

That’s the point where a lot of data analysts and scientists start asking the same question: what else is out there?

Read on for seven options, including six libraries and one built-in programming technique.

Animated Maps in R with gganimate

Osheen MacOscar looks at a new version of an old package:

In this blog post, we are going to use data from the {gapminder} R package, along with global spatial boundaries from ‘opendatasoft’. We are going to plot the life expectancy of each country in the Americas and animate it to see the changes from 1957 to 2007.

The {gapminder} package we are using is from the Gapminder foundation, an independent educational non-profit fighting global misconceptions. They cover issues like global warming, plastic in the oceans and life satisfaction.

There are several common gotchas that Osheen takes us through before building an animated map of the western hemisphere.

Oracle Password-Related Profile Settings

David Fitzjarrell takes a look at some settings:

Passwords expire, and, depending upon how various profiles are configured, accounts are either locked or provided a grace period during which the old password can be changed. In any recent enterprise password verification functions are provided to police new passwords to ensure a modicum of security. Let’s dig into what Oracle provides to assist in password security.

Basic elements of password security that Oracle provides start with the profile; listed below are the associated resources:

Read on for the available options you can set on a per-profile basis.

Auto-Scale Billing for Spark in Microsoft Fabric now GA

Santhosh Kumar Ravindran announces a feature in general availability:

We’re thrilled to announce the general availability (GA) of Autoscale Billing for Apache Spark in Microsoft Fabric — a serverless billing model designed to offer greater flexibility, transparency, and cost efficiency for running Spark workloads at scale.

With this model now fully supported, Spark Jobs can run independently of your Fabric capacity and are billed on a pay-as-you-go basis — similar to how Spark works in Azure Synapse. This gives teams the freedom to scale compute as needed without impacting other workloads running on your shared Fabric capacity.

I’m of two minds here. On the one hand, there is value to having this as an option. On the other hand, one of the talking points for Microsoft Fabric is that you have one billing model. But because it’s an optional thing you can enable rather than something you must use, I’m fine with it.
