Press "Enter" to skip to content

Category: Internals

Inspecting the Postgres Write-Ahead Log

Henrietta Dombrovskaya digs into the write-ahead log:

First, when the users fixed one of the primary suspects jobs, the situation with WAL growth didn’t change. Second, the rate of the growth couldn’t be explained by these suboptimal jobs: the data volumes they were removing and reloading were still magnitudes smaller than the WAL size we were dealing with. Finally, I decided to do what I should have done from the start – to take a look at what exactly was in these super-fast growing WALs.

Read on to learn what Henrietta found. Also check out the comments for some additional context.

Leave a Comment

Generating and Recognizing Hash Collisions in SQL Server

Hugo Kornelis continues a deep dive into hash tables in SQL Server:

Welcome back! In the previous parts, I first showed how a Hash Match (Left Outer Join) can give insight in the order of data in a hash table, and then used that trick to obtain and verify some interesting insights into the internal structure of such a table. It’s now time to see if this same trick can also be used to find hash collisions.

Hugo finds some interesting results along the way, so check it out.

Leave a Comment

SLRUs and MultiXacts in Postgres

Andrew Atkinson digs in:

The main purpose of SLRUs is to track metadata about Postgres transactions.

SLRUs are a general mechanism used by multiple types. Like a lot of things in Postgres, the SLRU system is extensible which means extensions can create new types.

The “least recently used” aspect might be recognizable from cache systems. LRU refers to how the oldest items are evicted from the cache when it’s full, and newer items take their place. This is because the cache has a fixed amount of space (measured in 8KB pages) and thus can only store a fixed amount of items.

Read on to learn more about these two concepts and how things have changed in Postgres 17.

Leave a Comment

The Internals of a Hash Table

Hugo Kornelis digs deep:

In part 1 of this series, I laid the foundation to explore the structure of the hash table, as used by the Hash Match operator, by alleging and then proving that a Hash Match (Left Outer Join) returns unmatched rows from the build input in the order in which they are stored in the hash table. This means that we can create queries on carefully curated data to gain insight in the structure of that hash table.

It is now time to use that trick to actually start to explore the hash table. But not without also looking at available documentation and common sense.

Click through for a waltz down memory lane, a graphical interpretation of a hash table, and some tests to see if Hugo is correct.

Comments closed

Join Operator Properties and Query Performance

Andy Brownsword takes a closer look at the big three join operators in SQL Server:

In the previous post I explained Join Operators in SQL Server. Whilst compiling that I dug a little deeper and came across a few interesting points I thought were worth sharing.

Let’s look at behaviour of the operators which may occur under specific conditions. Hopefully you find them as interesting as I did:

Click through for an interesting tidbit about nested loops joins, merge joins, and hash joins.

Comments closed

A Primer on Join Operators

Andy Brownsword takes a peek at the three most common types of join operators, plus a bonus:

When reviewing our execution plans we’ll see joins executed using different operators. The type of operator is chosen based on the data that’s available to join and how the optimiser wants to execute it.

In this post we’ll take a look at what the operators are, when they are used, and how they work. These are the operators we’ll cover:

  • Nested Loop Joins
  • Merge Joins
  • Hash Match Joins
  • (Bonus) Adaptive Joins

Read on for a quick overview of which works best when.

Comments closed

Ordered Insert Optimization in OrioleDB

Alexander Korotkov deals with hot page issues:

When many sessions try to insert into the same B-tree leaf page, classic exclusive page locking serializes progress and wastes time on sleep/wake cycles. We’re introducing a batch page insertion path that lets the session holding the page lock insert for itself and its neighbors. The result: dramatically reduced lock waits, and big gains at high client counts (2X throughput boost starting from 64 clients in our benchmark).

Click through to see how it differs from normal PostgreSQL behavior on what is one of the more common performance problems on busy OLTP systems.

Comments closed

Diving into Hash Tables

Hugo Kornelis dives into the arcane:

But what you probably don’t know is how that hash table is structured. How is the data stored? Where are new rows added, how is the table accessed?

To be fair, none of this is useful knowledge, unless you work for the engine team at Microsoft. And if you do, then you have access to source code and documentation, so you won’t need me to explain this structure to you. So why do I even take the trouble to investigate and describe this structure? Because I am a geek, and geeks love to dig into technical stuff and uncover things they were never meant to uncover.

“Because I can” is a perfectly valid reason to dig into a topic.

Comments closed

Kafka: From ZooKeeper to KRaft

Phil Yang lays out how to make a migration:

Apache Kafka has made a landmark shift in KIP-500 with the introduction of Kafka Raft (KRaft) mode, eliminating the dependency on Apache ZooKeeper for metadata management. With KRaft, the Kafka nodes themselves can be configured as KRaft controllers – which allow for metadata management and leader elections to work all within just Kafka, resulting in significant performance improvements. This cemented KRaft’s status as the metadata management protocol for Kafka moving forward.

This blog will guide you through the importance of this transition, what migrating from ZooKeeper to KRaft entails, and how we, at NetApp Instaclustr, make this seamless with our automated, streamlined process that is built into our platform.

Click through to see how you can update your own clusters, whether you’re using the Instaclustr service or not.

Comments closed