Curated SQL – Page 144 – A Fine Slice Of SQL Server

Transmitting Printed Data in Notebooks

Published 2025-04-09 by Kevin Feasel

Marc Lelijveld provides a public service announcement:

When working with Notebooks in Microsoft Fabric, exporting and reusing them across environments or tenants might seem like a harmless, even convenient, task. Whether you’re sharing a template with a colleague, moving assets between workspaces, or contributing to the community — the last thing you’d expect is to accidentally include data along with your code.

But that’s exactly what can happen.

For people who have worked with Jupyter notebooks in the past, this is a fairly obvious result. But if you aren’t familiar with the platform, that idea may seem weird. Marc does provide some options for exporting notebook contents, and you can also clear the cell contents before exporting.

Comments closed

SELECT FOR UPDATE in PostgreSQL

Published 2025-04-09 by Kevin Feasel

Umair Shahid preps for an update:

When multiple transactions need to modify the same rows, ensuring data consistency can become tricky. A single wrong approach to locking can lead to suboptimal performance or even bring your application to a standstill as numerous transactions block one another. One tool in PostgreSQL’s arsenal to handle concurrency is SELECT FOR UPDATE. It allows you to lock specific rows before updating them, preventing other transactions from modifying those rows until your transaction completes.

In this blog, we will dive deep into SELECT FOR UPDATE in PostgreSQL. We will explore how it helps in reducing contention, avoiding deadlocks, and ultimately boosting performance when dealing with highly concurrent applications.

Click through to understand how this works and also some notes on when to use it and when not to use it.

Comments closed

Common Data Model Connector for Synapse Spark 3.4

Published 2025-04-08 by Kevin Feasel

Richard Swinbank deals with a totally-not-deprecated platform:

The underlying problem here appears to be that the Spark connector is simply not supported in v3.4. There’s very little I can find to officially confirm or deny this, but an answer to this question on Microsoft Q&A backs this up. The answer also suggests a few options, including:

downgrade to Spark 3.3 – this isn’t an option because it’s end-of-life

migrate to Fabric – long term this is good idea, but it’s not a quick fix for this problem.

use alternative data access methods, e.g. using the Azure Data Lake Storage Gen2 connector.

In this article, I take a look at option (3).

Click through for Richard’s workaround.

Comments closed

Handling NULL in T-SQL

Published 2025-04-08 by Kevin Feasel

Chad Callihan deals with non-existent data:

Handling NULL values can be a challenge and may lead to unexpected query results when mixed in with non-NULL values. Thankfully, there are a few SQL Server functions devoted to handling NULL values. Let’s look at two of them. The first, NULLIF, will help you to return a NULL value. The second, ISNULL, will help you with an alternative to a NULL value if you need another value in its place.

For the longest time, I strongly preferred COALESCE() over ISNULL() because of how it can handle multiple scenarios, as well as it being ANSI SQL syntax. But it turns out that, if you do only have two things to compare, ISNULL() is a little bit faster.

Comments closed

CI/CD Announcements during FabCon

Published 2025-04-08 by Kevin Feasel

Kevin Chant keeps us up to date:

Microsoft Fabric Variable libraries were announced during the keynote at the Microsoft Fabric community conference. Which caused some excitement.

Variable libraries are an upcoming preview item that will enable developers to manage configurations within a workspace. Reducing the need for custom development work after deployments.

You will be able to achieve this by creating a Variable library in each workspace. From there, configure the individual variables for that workspace. Improving your CI/CD experience.

Click through to see more about that, as well as several other interesting announcements.

Comments closed

Setting the Right shared_buffer Value in PostgreSQL

Published 2025-04-08 by Kevin Feasel

Federico Campoli deals with a lot of RAM:

Nowadays it’s not uncommon to deal with machines with hundreds of GB of RAM.

Abundant memory can give PostgreSQL a massive performance boost. However things work slightly different than you may expect.

Let’s find out!

Click through for a dive into how PostgreSQL handles memory management.

Comments closed

Fuzzy Matching in Power Query and Power BI

Published 2025-04-08 by Kevin Feasel

Reza Rad does a bit of a match:

Have you ever wanted to match two tables together but not on exact matches, but also on a threshold of similarity? if your answer to this question is yes, then this feature is built for you. Let’s explore in details how the fuzzy matching works in Power BI. To learn more about Power BI, read Power BI from Rookie to Rock Star.

As always, Reza has a video and a blog post for us, explaining how the fuzzy matching process works and some of the knobs you can control. In the comments, Reza even digs up the specific algorithm that Microsoft uses for fuzzy matching.

Comments closed

Filtering Weekdays using DAX

Published 2025-04-08 by Kevin Feasel

Marco Russo and Alberto Ferrari show that tracking weekdays is like an onion:

Computing time intelligence calculations in DAX is rather simple. However, as soon as the requirements are not trivial, the complexity of formulas skyrockets, and it is necessary to have a very good understanding of several details about DAX to obtain a good formula. In this article, we show a simple requirement: the need to maintain a filter on weekdays while computing time intelligence. As you are about to read, it will require several complex steps despite being a simple requirement; but let us start by clarifying what we want to obtain and what a filter-preserving column is.

Click through for the full article.

Comments closed

Setting TEXTSIZE in SQL server

Published 2025-04-08 by Kevin Feasel

Steve Jones limits results:

There is a SET command in SQL Server that changes how much data is returned from some fields. This short post shows what I learned about the SET TEXTSIZE command.

I’m pretty sure that, if I’ve ever used this command before, it was by accident. It’s kind of a neat concept in how it can limit the result size of larger blocks of text, but I think it’s too limiting to be a good solution save for some very specific circumstances.

Comments closed

Avoid aggregate in R on Wide Matrices

Published 2025-04-07 by Kevin Feasel

Ali Oghabian shares some hard-earned advice:

The aggregate function can be very useful in R, allowing one to run a function (e.g. mean) within groups of rows, in each column in a matrix/data-frame and organize the results in an easy-to-read table. However, the function takes long to run for very wide matrices and data frames, where the number of the columns are large. I this post I demonstrate the issue and show a couple of nice solutions that at least for the example cuts down the time to 15% and even less, compared to the run-time of the aggregate function.

Click through for a demo. Granted, this is a matrix with 10,000 columns, so I’m not sure how this applies to narrower matrices. H/T R-Bloggers.

Comments closed

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31

Curated SQL Posts