
The Architecture of Apache Kafka

Michael Carter walks us through the key components and concepts behind Apache Kafka:

Despite its name’s suggestion of Kafkaesque complexity, Apache Kafka’s architecture actually delivers an easier-to-understand approach to application messaging than many of the alternatives. Kafka is essentially a commit log with a very simple data structure. It just happens to be an exceptionally fault-tolerant and horizontally scalable one.

The Kafka commit log provides a persistent, ordered data structure. Records cannot be directly deleted or modified, only appended to the log. The Kafka cluster creates and updates a partitioned commit log for each topic that exists. All messages sent to the same partition are stored in the order that they arrive, so the sequence of records within each partition is ordered and immutable. This ordering is guaranteed within a partition, not across the partitions of a topic. Kafka also assigns each record a sequential ID known as an “offset,” unique within its partition, which is used to retrieve data.
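
To make the idea concrete, here is a minimal in-memory sketch of a partitioned, append-only log in Python. It is a toy model only, not Kafka's client API or its storage format: the names `Record`, `PartitionLog`, and `TopicLog` are hypothetical and exist purely for illustration, and the sketch ignores persistence, replication, and retention.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)  # frozen: a stored record cannot be modified afterwards
class Record:
    offset: int
    value: str


@dataclass
class PartitionLog:
    """Hypothetical append-only log for a single partition."""
    _records: list = field(default_factory=list)

    def append(self, value: str) -> int:
        # The offset is simply the record's position within this partition.
        offset = len(self._records)
        self._records.append(Record(offset, value))
        return offset

    def read(self, offset: int, max_records: int = 10) -> list:
        # Reading starts at a given offset and never removes anything.
        return self._records[offset:offset + max_records]


class TopicLog:
    """Hypothetical topic: one independent log per partition."""

    def __init__(self, num_partitions: int):
        self.partitions = [PartitionLog() for _ in range(num_partitions)]

    def append(self, partition: int, value: str) -> int:
        return self.partitions[partition].append(value)


topic = TopicLog(num_partitions=2)
first = topic.append(0, "order-created")
second = topic.append(0, "order-paid")
other = topic.append(1, "user-signed-up")

# Replaying partition 0 from offset 0 returns records in arrival order.
replayed = [r.value for r in topic.partitions[0].read(0)]
```

In this sketch, offsets restart at 0 in each partition, which mirrors the point that an offset identifies a record only within its own partition.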

Read on to learn more about the key ideas, such as producers, consumers, and partitions.