One of the greatest advantages of Kafka is its ability to maintain high throughput of data. Unsurprisingly, high throughput starts with the producers. Prior to sending messages off to the brokers, individual records destined for the same topic-partition are batched together as a single compressed collection of bytes. These batches are then further aggregated before being sent to the destination broker.
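To make the two levels of grouping concrete, here is a minimal toy sketch in Python. It is not the Kafka client's actual code: the `batch_records` function, the `leader_for` lookup table, and the sample topics are all hypothetical, and compression, size limits, and timing are left out entirely. It only illustrates the idea that records are first collected per topic-partition and that those batches are then grouped by the broker they are headed to.

```python
from collections import defaultdict


def batch_records(records, leader_for):
    """Toy model of producer-side batching (illustrative only).

    records:    iterable of (topic, partition, value_bytes) tuples
    leader_for: hypothetical dict mapping (topic, partition) -> broker id
    Returns {broker_id: {(topic, partition): [value_bytes, ...]}}.
    """
    # Step 1: collect records destined for the same topic-partition.
    batches = defaultdict(list)
    for topic, partition, value in records:
        batches[(topic, partition)].append(value)

    # Step 2: group the per-partition batches by destination broker.
    requests = defaultdict(dict)
    for topic_partition, values in batches.items():
        broker = leader_for[topic_partition]
        requests[broker][topic_partition] = values

    return {broker: dict(grouped) for broker, grouped in requests.items()}


# Made-up sample data: four records across three topic-partitions.
records = [
    ("orders", 0, b"a"),
    ("orders", 1, b"b"),
    ("orders", 0, b"c"),
    ("clicks", 0, b"d"),
]
# Made-up partition leadership: broker 1 leads two partitions, broker 2 leads one.
leader_for = {("orders", 0): 1, ("orders", 1): 2, ("clicks", 0): 1}

requests = batch_records(records, leader_for)
for broker, grouped in sorted(requests.items()):
    print(broker, grouped)
```

In this sketch, the two `orders-0` records end up in one batch, and that batch travels with the `clicks-0` batch because both partitions are (by assumption) led by broker 1.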
Batching is a great thing, and we (generally) want it. But how do you know when it’s working well and when it’s not?
This first post covers message throughput; later posts in the series will cover several other topics.