Joins With Kafka

Florian Trossbach and Matthias J Sax show the various sorts of joins offered in Kafka, both streams and tables:

Apache Kafka’s Streams API provides a very sophisticated API for joins that can handle many use cases in a scalable way. However, some join semantics might be surprising to developers as streaming join semantics differ from SQL semantics. Furthermore, the semantics of changelog streams and tombstone messages (that are used for deletes) are a new concept in stream processing.

Kafka’s journey from Pub/Sub broker to distributed streaming platform is well underway, and our times as engineers are very exciting!

I didn’t know you could join streams together in Kafka, so that’s really cool.

Related Posts

Kafka Topic Reuse

Martin Kleppmann walks through the trade-offs of reusing Apache Kafka topics for different event types: The common wisdom (according to several conversations I’ve had, and according to a mailing list thread) seems to be: put all events of the same type in the same topic, and use different topics for different event types. That line of […]

Read More

Set Operations In Spark

Fisseha Berhane compares SparkSQL, DataFrames, and classic RDDs when performing certain set-based operations: In this fourth part, we will see set operators in Spark the RDD way, the DataFrame way and the SparkSQL way. Also, check out my other recent blog posts on Spark on Analyzing the Bible and the Quran using Spark and Spark […]

Read More

Categories

September 2017
MTWTFSS
« Aug Oct »
 123
45678910
11121314151617
18192021222324
252627282930