Tracking Kafka Consumer Lag

Simarpreet Kaur Monga has a Scala-based example showing how to calculate Kafka offset lag for consumers:

A consumer can subscribe to multiple topics; you pass the list of topics you want to consume from. For the sake of simplicity, I have passed just a single topic.

Now that the consumer has subscribed to the topic, it can consume from that topic.

The consumer maintains an offset to keep track of the next record it needs to read.
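To make the idea of "the next record it needs to read" concrete, here is a minimal sketch in plain Scala. It does not use the Kafka client library; `PartitionLog`, `Cursor`, and `OffsetDemo` are hypothetical names invented for illustration, and the in-memory vector only stands in for a single partition.

```scala
// Hypothetical in-memory stand-in for one partition's log; not the Kafka client API.
final case class PartitionLog(records: Vector[String]) {
  // Offset that the next appended record would receive (the "log end offset").
  def endOffset: Long = records.length.toLong
}

// Hypothetical consumer-side cursor: `position` is the offset of the next record to read.
final case class Cursor(position: Long) {
  // Read up to `max` records starting at `position`; return them with the advanced cursor.
  def poll(log: PartitionLog, max: Int): (Vector[String], Cursor) = {
    val start = position.toInt
    val batch = log.records.slice(start, start + max)
    (batch, Cursor(position + batch.length))
  }
}

object OffsetDemo {
  val log: PartitionLog = PartitionLog(Vector("a", "b", "c", "d", "e"))

  // Two polls of at most two records each, starting from offset 0.
  val (firstBatch, afterFirst) = Cursor(0L).poll(log, 2)
  val (secondBatch, afterSecond) = afterFirst.poll(log, 2)
}
```

After the two polls in `OffsetDemo`, the cursor sits at offset 4 while the log ends at offset 5, so one record is still unread.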

Now, let us see how we can find the consumer offsets.

The consumer offsets can be found using the offset method of the ConsumerRecord class. This offset points to the record in a Kafka partition. The consumer receives records from the topic as ConsumerRecord objects. A ConsumerRecord also carries the topic name and partition number from which the record was received, and a timestamp as marked by the corresponding ProducerRecord (the record sent by the producer).
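As a rough illustration of how those record fields could feed a lag calculation, the sketch below models a record in plain Scala and computes lag per partition. It is not the linked author's code and does not call the Kafka client: `Rec` and `LagSketch` are hypothetical names, the end offsets are supplied by hand, and lag is assumed to mean the log end offset minus the next offset to read (highest consumed offset plus one).

```scala
// Hypothetical stand-in mirroring the ConsumerRecord fields named above.
final case class Rec(topic: String, partition: Int, offset: Long, timestamp: Long)

object LagSketch {
  type TopicPartition = (String, Int)

  // Lag per (topic, partition): log end offset minus the next offset to read.
  // Assumption for this sketch: a partition with no consumed records is read
  // from offset 0, so its lag equals its end offset.
  def lag(
      consumed: Seq[Rec],
      endOffsets: Map[TopicPartition, Long]
  ): Map[TopicPartition, Long] = {
    val nextToRead: Map[TopicPartition, Long] =
      consumed
        .groupBy(r => (r.topic, r.partition))
        .map { case (tp, rs) => tp -> (rs.map(_.offset).max + 1L) }

    endOffsets.map { case (tp, end) =>
      tp -> math.max(0L, end - nextToRead.getOrElse(tp, 0L))
    }
  }
}
```

For example, with records consumed up to offset 1 on partition 0 and a hand-supplied end offset of 5, `LagSketch.lag` reports a lag of 3 for that partition. In a real consumer the end offsets would come from the broker rather than from a literal map.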

Click through for the rest of the story.
