Kafka Consumer Groups

David Brinegar discusses consumer groups and lag in Apache Kafka:

While the Consumer Group uses the broker APIs, it is more of an application pattern or a set of behaviors embedded into your application.  The Kafka brokers are an important part of the puzzle but do not provide the Consumer Group behavior directly.  A Consumer Group based application may run on several nodes, and when they start up they coordinate with each other in order to split up the work.  This is slightly imperfect because the work, in this case, is a set of partitions defined by the Producer.  Each Consumer node can read a partition and one can split up the partitions to match the number of consumer nodes as needed.  If the number of Consumer Group nodes is more than the number of partitions, the excess nodes remain idle. This might be desirable to handle failover.  If there are more partitions than Consumer Group nodes, then some nodes will be reading more than one partition.

Read the whole thing.  It’s part one of a series.
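
To make the partition-splitting point concrete, here is a minimal sketch in plain Python. It is not Kafka's actual assignment logic and does not talk to a broker; the function name `assign_partitions`, the node names, and the simple round-robin rule are all assumptions chosen for illustration. It only shows the two cases Brinegar describes: surplus nodes sitting idle, and nodes reading more than one partition.

```python
def assign_partitions(partitions, consumers):
    """Hypothetical round-robin split: partition i goes to consumer i % n.

    This is an illustration only, not the strategy Kafka clients use.
    """
    if not consumers:
        raise ValueError("at least one consumer node is required")
    # Every node starts with an empty assignment, so idle nodes are visible.
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


# More nodes than partitions: the extra node ends up with nothing to read.
more_consumers = assign_partitions([0, 1], ["node-a", "node-b", "node-c"])

# More partitions than nodes: each node reads several partitions.
more_partitions = assign_partitions([0, 1, 2, 3, 4], ["node-a", "node-b"])

print(more_consumers)
print(more_partitions)
```

In the first case `node-c` is idle, which is the spare capacity the excerpt mentions as potentially useful for failover. In the second case the five partitions are spread over two nodes, so the load is uneven by one partition.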


October 2016