
Author: Kevin Feasel

Maintaining Statistics Information Post-Update in PostgreSQL 18

Laurenz Albe takes a peek at an upcoming feature:

Everybody wants good performance. When it comes to the execution of SQL statements, accurate optimizer statistics are key. With the upcoming v18 release, PostgreSQL will preserve the optimizer statistics during an upgrade with dump/restore or pg_upgrade (see commit 1fd1bd8710 and following). With the beta testing season for PostgreSQL v18 opened, it is time to get acquainted with the new feature.

It’s kind of wild to me that this wasn’t in place years ago for PostgreSQL.


First Thoughts on Rancher Desktop

Steve Jones tries something new:

I’ve been very happy with Docker Desktop for years, running it on both laptop and desktop. However, a corporate decision was made to move to Rancher Desktop, so I now have an unexpected “opportunity” to learn something new.

Here’s a short post on how things went on the desktop and laptop.

I’m guessing corporate made the switch because of Docker Desktop’s licensing costs. It is kind of funny how, in the Windows and even macOS world, “Docker” is synonymous with “container,” whereas in the Linux world, that’s not at all the case.


Multithreading and Multiprocessing in Python

Jessica Wachtel explains how two systems work in Python:

Let’s use a simple example to understand them: a mechanics shop. Concurrency happens when one mechanic works on several cars by switching between them. For example, the mechanic changes the oil in one car while waiting for a part for another. They don’t finish one car before starting the next, but they can’t do two tasks at exactly the same time. The tasks overlap in time but don’t happen simultaneously.

Click through for the analogy, how it applies to Python, and tips and tricks around each.
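
As a rough illustration of the "one mechanic, several cars" idea, here is a minimal sketch using threads from the standard library. It is not taken from Jessica's post: the function names and the `time.sleep` call standing in for "waiting for a part" are assumptions made for the example. In a standard CPython build, threads overlap their waiting time rather than executing Python code simultaneously, which matches the concurrency half of the analogy.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def wait_for_part(car: str, delay: float) -> str:
    # Hypothetical task: time.sleep stands in for waiting on a parts delivery.
    # While one thread sleeps, the others are free to make progress.
    time.sleep(delay)
    return f"{car}: repaired"


def service_cars(cars: list[str], delay: float = 0.2) -> list[str]:
    # One worker thread per car; max() guards against an empty list,
    # because ThreadPoolExecutor requires max_workers > 0.
    with ThreadPoolExecutor(max_workers=max(1, len(cars))) as pool:
        # pool.map returns results in the same order as the input.
        return list(pool.map(lambda car: wait_for_part(car, delay), cars))


if __name__ == "__main__":
    start = time.perf_counter()
    for line in service_cars(["sedan", "truck", "van"]):
        print(line)
    # The waits overlap, so total time is close to one delay, not three.
    print(f"elapsed: {time.perf_counter() - start:.2f}s")
```

For CPU-bound work, where the tasks compute rather than wait, separate processes are the usual alternative. That case is not shown here.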


IF Statements and DAX

Marco Russo and Alberto Ferrari talk in hypotheticals:

DAX is a functional language. This means that – no matter how complicated it is – a measure is just ONE function call. Then, functions call other functions, creating the intricacies of a sophisticated DAX expression. However, there is always just one function at the top level. This is, at the same time, beautiful and painful, elegant and complex to understand. It is fair to say that being functional is what makes DAX so fascinating.

However, when a DAX formula is executed, it loses its functional nature. Indeed, in the end it needs to be transformed into a set of simpler queries executed by one of the engines of DAX: either the storage engine or the formula engine. During this step, the function execution is transformed, and it becomes much simpler.

Click through to see how the IF() function works in such a world.


The Downsides of SELECT FOR UPDATE in PostgreSQL

Laurenz Albe explains why SELECT FOR UPDATE is rarely the right call:

Recently, while investigating a deadlock for a customer, I was again reminded how harmful SELECT FOR UPDATE can be for database concurrency. This is nothing new, but I find that many people don’t know about the PostgreSQL row lock modes. So here I’ll write up a detailed explanation to let you know when to avoid SELECT FOR UPDATE.

Click through for the full explanation.


Result Set Caching in Microsoft Fabric Data Warehouse

Emily Tehrani makes an announcement:

Result Set Caching is now available in preview for Microsoft Fabric Data Warehouse and Lakehouse SQL analytics endpoint. This performance optimization works transparently to cache the results of eligible T-SQL queries. When the same query is issued again, it directly retrieves the stored result, instead of recompiling and recomputing the original query. This operation drastically cuts execution time for complex queries. The cache is then automatically managed on the user’s behalf. This lightweight performance boost is most beneficial for workloads like reports, that issue many repetitive T-SQL queries to the DW and SQL analytics endpoint.

This is something I’ve wished we had on-premises for years and years, especially for data warehouses where you know the data only changes once every x hours or days. You can, of course, do this yourself with the cache-aside pattern and some caching solution, but that implies you have a layer between your end user and the data source that you fully control.
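
For anyone unfamiliar with cache-aside, here is a minimal sketch of the idea, assuming an in-process dictionary as the cache and a fixed time-to-live chosen to match how often the warehouse refreshes. The `CacheAside` class and the `loader` callable are hypothetical names for illustration, not part of Fabric or any particular caching product. A real deployment would more likely use a shared cache service.

```python
import time
from typing import Any, Callable, Dict, Hashable, Optional, Tuple


class CacheAside:
    """Hypothetical helper: check the cache first, fall back to the source on a miss."""

    def __init__(
        self,
        loader: Callable[[Hashable], Any],
        ttl_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader      # runs the real query against the data source
        self._ttl = ttl_seconds    # how long a cached result stays valid
        self._clock = clock        # injectable so expiry can be tested
        self._store: Dict[Hashable, Tuple[float, Any]] = {}

    def get(self, key: Hashable) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # cache hit: skip the data source entirely
        value = self._loader(key)  # cache miss or expired: go to the source
        self._store[key] = (now + self._ttl, value)
        return value

    def invalidate(self, key: Optional[Hashable] = None) -> None:
        # Call after a data load so stale results are not served.
        if key is None:
            self._store.clear()
        else:
            self._store.pop(key, None)


if __name__ == "__main__":
    def run_query(sql: Hashable) -> str:
        # Stand-in for an expensive warehouse query.
        print(f"querying source: {sql}")
        return f"rows for {sql}"

    cache = CacheAside(run_query, ttl_seconds=3600)
    cache.get("SELECT region, SUM(sales) FROM fact_sales GROUP BY region")
    cache.get("SELECT region, SUM(sales) FROM fact_sales GROUP BY region")  # served from cache
```

The catch is the one already noted: the application has to call `get` and `invalidate` itself, so this only works where you own the layer sitting between users and the database.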


Useful Query Store Metrics

Jared Poche gives us five:

Query Store is my favorite way to gather information about problem queries and plans, and I wanted to share some information on the useful metrics I use most.

The first two are obvious, but there’s a difference between them. The last two are not obvious but offer an unusual utility. I also wanted to explain why I use logical reads and mostly ignore physical reads.

Read on for Jared’s list.
