An Introduction To Kafka

Kevin Feasel

2017-07-14

Hadoop

Prashant Sharma explains the basics of Apache Kafka:

Apache describes Kafka as a distributed streaming platform that lets us:

  1. Publish and subscribe to streams of records.

  2. Store streams of records in a fault-tolerant way.

  3. Process streams of records as they occur.

Kafka is probably the most generally interesting of the current Hadoop ecosystem, with Spark not too far behind.  By “generally interesting,” I mean in the sense that companies with no vested interest in Hadoop as a whole could still be excited by the prospect of Kafka.

Related Posts

Anomaly Detection With Kafka Streams

Ajmal Karuthakantakath shows us an application which performs fairly simple anomaly detection using Kafka Streams: The problem is in the banking loan payment domain, where customers have taken a loan and they need to make monthly payments to repay the loan amount. Assume there are millions of customers in the system and all these customers need […]

Read More

Crossing The Streams With Kafka

Himani Arora shows how to join two Kafka streams together: KStream-KStream Join It is a sliding window join, that means, all tuples close to each other with regard to time are joined. Time here is the difference up to size of the window. These joins are always windowed joins because otherwise, the size of the internal state […]

Read More

Categories

July 2017
MTWTFSS
« Jun Aug »
 12
3456789
10111213141516
17181920212223
24252627282930
31