What Is Kafka?

I have started a new series on Apache Kafka:

The broker serves several purposes:

  1. Know who the producers are and who the consumers are.  This way, the producers don’t care who exactly consumes a message and aren’t responsible for the message after they hand it off.
  2. Buffer for performance.  If the consumers are a little slow at the moment but don’t usually get overwhelmed, that’s okay—messages can sit with the broker until the consumer is ready to fetch.
  3. Let us scale out more easily.  Need to add more producers?  That’s fine—tell the broker who they are.  Need to add consumers?  Same thing.
  4. Handle consumers that go down.  That’s the same as purpose #2:  the broker holds their messages until they’re ready again.

So brokers add a bit of complexity, but they solve some important problems.  The nice part about a broker is that it doesn’t need to know anything about the contents of the messages, only who is supposed to receive them.
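
To make those four purposes concrete, here is a minimal in-memory sketch of a broker.  It is not Kafka's API or design; the `ToyBroker` class, its method names, and the "orders" topic are all made up for illustration.  It only shows the idea of routing opaque messages to per-consumer buffers so that a slow or offline consumer can catch up later.

```python
from collections import defaultdict, deque


class ToyBroker:
    """Hypothetical in-memory broker; illustrative only, not how Kafka works."""

    def __init__(self):
        # topic -> names of the consumers subscribed to it
        self.subscribers = defaultdict(set)
        # consumer name -> messages waiting to be fetched
        self.buffers = defaultdict(deque)

    def subscribe(self, consumer, topic):
        # Adding a consumer only means telling the broker about it.
        self.subscribers[topic].add(consumer)

    def publish(self, topic, message):
        # The producer hands the message off and is done with it.
        # The broker never inspects the payload, only who should get it.
        for consumer in self.subscribers[topic]:
            self.buffers[consumer].append(message)

    def fetch(self, consumer, max_messages=10):
        # Consumers pull when they are ready; until then messages just wait.
        buffer = self.buffers[consumer]
        batch = []
        while buffer and len(batch) < max_messages:
            batch.append(buffer.popleft())
        return batch


broker = ToyBroker()
broker.subscribe("billing", "orders")
broker.subscribe("shipping", "orders")

broker.publish("orders", b"order-1")
broker.publish("orders", b"order-2")

# "billing" keeps up; "shipping" is slow or down for a while.
fast = broker.fetch("billing")
broker.publish("orders", b"order-3")

# When "shipping" comes back, everything it missed is still waiting.
late = broker.fetch("shipping")
```

The producer code never mentions "billing" or "shipping", and the broker treats each payload as opaque bytes, which is the decoupling the list above describes.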

This is an introduction to the product and part one of an eight-part series.
