Category: Kafka / Flink

Benchmarking Kafka

Jack Vanlightly continues a series on Dimster. First up is a benchmark of consumer groups versus share groups:

In this first share group benchmarking post, we’re going to use share groups as they are not intended to be used, but for a good reason. Share groups allow you to move past partitions as the unit of parallelism by allowing multiple consumers to read from the same partition, using message queue semantics. We’ll run those kinds of tests in the next post. In this post I just want to understand if the mechanics of how share groups work add any additional overhead compared to consumer groups. So we’ll use share groups as if they were consumer groups (by capping consumer count to partition count).

Objective: Use synthetic tests to measure the overhead of share groups compared to consumer groups in identical conditions.

After that, Jack simulates processing time:

In this post we’re going to simulate processing time in the consumers to make these benchmarks more realistic and show the utility of share groups (namely the ability to parallelize processing beyond the partition count).

We’ll see how the following two configurations play an important role in parallelizing consumption with share groups:

  • max.poll.records (consumer config)
  • group.share.partition.max.record.locks (broker-side config)

And there’s one more post in the series so far:

In the last post we used simulated consumer processing time to reveal how important it is to set an appropriate value for max.poll.records to ensure the consumer parallelism that we expect. With a uniform distribution of messages over partitions, the rule of thumb was a value somewhat lower than:

group.share.partition.max.record.locks / number of consumers per partition

But there’s more to parallel consumption than max.poll.records. The size of producer batches also plays a role when using the default share.acquire.mode (batch_optimized).
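To make that rule of thumb concrete, here is a minimal Python sketch of the arithmetic. The function names, the 80% headroom, and the lock limit of 2000 are my own illustrative assumptions, not values from Jack's posts; plug in whatever your broker is actually configured with.

```python
def max_poll_records_ceiling(max_record_locks: int, consumers_per_partition: int) -> int:
    """Ceiling from the rule of thumb:
    group.share.partition.max.record.locks / number of consumers per partition."""
    if max_record_locks <= 0 or consumers_per_partition <= 0:
        raise ValueError("both arguments must be positive")
    return max_record_locks // consumers_per_partition


def suggested_max_poll_records(
    max_record_locks: int,
    consumers_per_partition: int,
    headroom_pct: int = 80,  # assumption: "somewhat lower" is modeled as 80% of the ceiling
) -> int:
    """Pick a max.poll.records value somewhat below the ceiling, never below 1."""
    ceiling = max_poll_records_ceiling(max_record_locks, consumers_per_partition)
    return max(1, ceiling * headroom_pct // 100)


if __name__ == "__main__":
    locks = 2000  # hypothetical broker setting, for illustration only
    for consumers in (1, 2, 4, 8):
        ceiling = max_poll_records_ceiling(locks, consumers)
        suggestion = suggested_max_poll_records(locks, consumers)
        print(f"{consumers} consumers/partition: ceiling={ceiling}, suggested={suggestion}")
```

This only covers the uniform-distribution case the quote describes. As Jack notes, producer batch size also matters under the default acquire mode, so treat the result as a starting point for testing.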

Stay tuned for the next post in the series.

Dimensional Testing in Kafka

Jack Vanlightly announces a new tool:

Most of my career in distributed systems has been as a tester, performance engineer and formal verification specialist. I’ve written performance benchmarking tools in the past, for RabbitMQ and Apache Pulsar but in recent years I’ve used OpenMessagingBenchmark (OMB) to run benchmarks against Apache Kafka and other messaging systems. But OMB is hard to deploy and has several limitations compared to more sophisticated benchmarking systems I’ve developed in the past. With Claude becoming so much better since Christmas I decided to write a Kafka-centric performance benchmarking tool, with a lot of inspiration from OMB. I took the bits I like about OMB and the things I like about the tooling I’ve built in the past, to make a performance testing tool for testing Apache Kafka.

Click through for an overview of the tool and how it works.

Materialized Tables in Apache Flink

Robin Moffatt digs into a neat feature in Apache Flink:

Flink added support for what it calls Materialized Tables in 1.20, released in 2024. You can read about the design and motivations in FLIP-435. In a nutshell, Materialized Tables provide a way to include the SQL to populate and refresh a table as part of its definition.

Let’s take a look!

Robin takes a deep dive into it, figures out several issues you might run into along the way, and provides a verdict at the end of the post. In addition, the GitHub repo includes a Docker Compose file you can use to follow along.

What’s New in Kafka 4.1.0

Paul Brebner has a list:

Since then, Kafka 4.1.0 was released (September 2025, see detailed release notes), with around 472 Kafka Improvement (KIPs), including new features, improvements, bug fixes, tests, and more—well done to the Apache Kafka open source community! Kafka 4.1.1 (a bugfix release) was made available on the NetApp Instaclustr Managed Platform in December 2025.

So, what’s changed from 4.0 to 4.1.0? What are the most interesting improvements (for me at least)? In this blog, we focus on a new improvement, the Streams Rebalance Protocol.

Click through for that list.

Kafka Topic Management in Amazon MSK

Swapna Bandla, et al., dig into a managed service:

If you manage Apache Kafka today, you know the effort required to manage topics. Whether you use infrastructure as code (IaC) solutions or perform operations with admin clients, setting up topic management takes valuable time that could be spent on building streaming applications.

Amazon Managed Streaming for Apache Kafka (Amazon MSK) now streamlines topic management by supporting new topic APIs and console integration. You can programmatically create, update, and delete Apache Kafka topics using familiar interfaces including AWS Command Line Interface (AWS CLI), AWS SDKs, and AWS CloudFormation. With these APIs, you can define topic properties such as replication factor and partition count and configuration settings like retention and cleanup policies. The Amazon MSK console integrates these APIs, bringing all topic operations to one place. You can now create or update topics with a few selections using guided defaults while gaining comprehensive visibility into topic configurations, partition-level information, and metrics. You can browse for topics within a cluster, review replication settings and partition counts, and go into individual topics to examine detailed configuration, partition-level information, and metrics. A unified dashboard consolidates partition topics and metrics in one view.

In this post, we show you how to use the new topic management capabilities of Amazon MSK to streamline your Apache Kafka operations. We demonstrate how to manage topics through the console, control access with AWS Identity and Access Management (IAM), and bring topic provisioning into your continuous integration and continuous delivery (CI/CD) pipelines.

Read on to see what the experience looks like using the MSK console.

Diskless Topics in Apache Kafka

Paul Brebner extends a metaphor:

I’ve been tracking the progress of Apache Kafka “Diskless Topics” for a while now. It’s a topic that sparks curiosity—mostly because the name itself sounds like an oxymoron. How can a topic be diskless? Where does the data go? 

With the recent voting on KIP-1150, I decided it was time to dive deep into the architectural changes. There are several related Kafka Improvement Proposals (KIPs) floating around, but KIP-1150 is dependent on KIP-1163 and KIP-1164, and the designs are still in flux. Consider this blog post a “theory” in the true scientific sense: a best-guess model based on current evidence that will almost certainly evolve. 

Click through for your moment of zen.

Recent Updates to Apache Kafka

Jaisen Mathai lays out some changes:

Hey, fellow Apache Kafka® developers! Let’s look at the important updates across Confluent’s client ecosystem, from the core librdkafka to the wrappers for Python, Go, .NET, and JavaScript. The last couple of months have been focused on laying down some solid architectural foundations and adding key quality-of-life features.

I’m curious to see what changes with IBM purchasing Confluent.

IBM Acquires Confluent

Confluent has an announcement:

We are excited to announce that Confluent has entered into a definitive agreement to be acquired by IBM. After the transaction is closed (subject to customary closing conditions and regulatory approvals), together, IBM and Confluent will aim to provide a platform that unifies the world’s largest enterprises, unlocking data for cloud/microservices, accelerating time-to-value, and building the real-time data foundation required to scale AI across every organization. 

Whelp. I suppose it was bound to happen at some point, but I definitely can’t say this news pleases me.

The Downside of Zero-Copy Integration between Kafka and Iceberg

Jack Vanlightly lays out an argument:

Over the past few months, I’ve seen a growing number of posts on social media promoting the idea of a “zero-copy” integration between Apache Kafka and Apache Iceberg. The idea is that Kafka topics could live directly as Iceberg tables. On the surface it sounds efficient: one copy of the data, unified access for both streaming and analytics. But from a systems point of view, I think this is the wrong direction for the Apache Kafka project. In this post, I’ll explain why. 

Read on for an explanation of what “zero-copy” means here, as well as Jack’s position on the matter. I think it’s a solid argument and worth the read.

Cross-Cloud Data Replication with Confluent

Ahmed Saef Zamzam and Hannah Miao move some data:

Cross-cloud replication over private networks is powered by Cluster Linking, Confluent’s fully managed, offset-preserving replication service that mirrors topics across clusters. Cluster Linking already makes it simple to connect environments across regions, clouds, and hybrid deployments with near-zero data loss. Now, with private cross-cloud replication, the possibilities expand even further—enabling secure multicloud data sharing, disaster recovery, and compliance use cases that many organizations, particularly those in regulated industries, have struggled to solve for years.

Click through to see how it works and how it can beat mechanisms that existed prior to it.
