Stream Processing with Kafka

Satish Sharma wraps up a series on Kafka and Kafka Streams:

Consider a hypothetical fleet management company that needs a dashboard to give insight into its day-to-day vehicle activity. Each vehicle in the fleet is fitted with a GPS-based geolocation emitter, which emits location data containing the following information:

1. Vehicle Id: a unique id assigned to each vehicle on registration with the company.
2. Latitude and Longitude: the vehicle's geolocation.
3. Availability: whether the vehicle is available to take a booking.
4. Current Status (Online/Offline): whether the vehicle is on duty.

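The event schema above might be modeled as a simple value class on the consumer side. A minimal sketch in plain Java (the field names and the `countByStatus` helper are assumptions for illustration, not the article's actual code), showing one dashboard-style aggregation — counting vehicles by status:

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical model of the location event described above; field names are assumptions.
public class VehicleLocation {
    final String vehicleId;
    final double latitude;
    final double longitude;
    final boolean available; // can the vehicle take a booking?
    final boolean online;    // is the vehicle on duty (Online/Offline)?

    VehicleLocation(String vehicleId, double latitude, double longitude,
                    boolean available, boolean online) {
        this.vehicleId = vehicleId;
        this.latitude = latitude;
        this.longitude = longitude;
        this.available = available;
        this.online = online;
    }

    // One aggregation a dashboard might show: how many vehicles are online vs. offline.
    static Map<String, Long> countByStatus(List<VehicleLocation> events) {
        return events.stream().collect(Collectors.groupingBy(
                e -> e.online ? "Online" : "Offline",
                Collectors.counting()));
    }

    public static void main(String[] args) {
        List<VehicleLocation> events = List.of(
                new VehicleLocation("V1", 28.61, 77.20, true, true),
                new VehicleLocation("V2", 19.07, 72.87, false, true),
                new VehicleLocation("V3", 12.97, 77.59, false, false));
        System.out.println(countByStatus(events)); // Online=2, Offline=1 (map order may vary)
    }
}
```

In the article itself, this kind of aggregation is what Kafka Streams does continuously over the event stream rather than over an in-memory list.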
Read through the article and then check out Satish’s GitHub repo for more.

