What Is Kafka?

I start a new series on Apache Kafka:

The broker serves several purposes:

  1. Know who the producers are and who the consumers are.  This way, the producers don’t care who exactly consumes a message and aren’t responsible for the message after they hand it off.
  2. Buffer for performance.  If the consumers are a little slow at the moment but don’t usually get overwhelmed, that’s okay—messages can sit with the broker until the consumer is ready to fetch.
  3. Let us scale out more easily.  Need to add more producers?  That’s fine—tell the broker who they are.  Need to add consumers?  Same thing.
  4. What about when a consumer goes down?  That’s the same as problem #2:  hold their messages until they’re ready again.

So brokers add a bit of complexity, but they solve some important problems.  The nice part about a broker is that it doesn’t need to know anything about the messages, only who is supposed to receive it.

This is an introduction to the product and part one of an eight-part series.

Related Posts

Generating Load For Kafka With JMeter

Anup Shirolkar shows us a way to use JMeter to generate load for Apache Kafka clusters: The Anomalia Machina is going to require (at least!) one more thing as stated in the intro, loading with lots of data! Kafka is a log aggregation system and operates on a publish-subscribe mechanism. The Kafka cluster in Anomalia Machina […]

Read More

Data Science And Data Engineering In HDP 3.0

Saumitra Buragohain, et al, show off some of the things added to the Hortonworks Data Platform for data scientists and data engineers: We leverage the power of HDP 3.0 from efficient storage (erasure coding), GPU pooling to containerized TensorFlow and Zeppelin to enable this use case. We will the save the details for a different […]

Read More

Categories

October 2016
MTWTFSS
« Sep Nov »
 12
3456789
10111213141516
17181920212223
24252627282930
31