Streaming Analytics With Kafka

Rathnadevi Manivannan shows how to use Kafka SQL to query streaming data:

Kafka SQL, a streaming SQL engine for Apache Kafka by Confluent, is used for real-time data integration, data monitoring, and data anomaly detection. KSQL is used to read, write, and process Citi Bike trip data in real-time, enrich the trip data with other station details, and find the number of trips started and ended in a day for a particular station. It is also used to publish trip data from the source to other destinations for further analysis.

In this article, let’s discuss enriching the Citi Bike trip data and finding the number of trips on a particular day to and from a particular station.

Read on for a nice tutorial.

Related Posts

Unit Testing Spark Streaming DStreams

Anuj Saxena shows how to create unit tests for DStreams in Spark Streaming: The method ‘ testOperation ‘ takes the output of the operation performed on the ‘inputPair’ and check whether it is equal to the ‘outputPair’ and just like this, we can test our business logic. This short snippet lets you test your business logic without […]

Read More

Kafka Topic Reuse

Martin Kleppmann walks through the trade-offs of reusing Apache Kafka topics for different event types: The common wisdom (according to several conversations I’ve had, and according to a mailing list thread) seems to be: put all events of the same type in the same topic, and use different topics for different event types. That line of […]

Read More

Leave a Reply

Your email address will not be published. Required fields are marked *

Categories

January 2018
MTWTFSS
« Dec  
1234567
891011121314
15161718192021
22232425262728
293031