To completely understand the problem, we will first go into detail how ingestion and processing occur by default in Kafka Streams. For example purposes, the
punctuatemethod is configured to occur every ten seconds, and in the input stream, we have exactly one message per second. The purpose of the job is to parse input messages, collect them, and, in the
punctuatemethod, do a batch insert in the database, then to send metrics.
After running the Kafka Stream application, the
Processorwill be created, followed by the
initmethod. Here is where all the connections are established. Upon successful start, the application will listen to input topic for incoming messages. It will remain idle until the first message arrives. When the first message arrives, the
processmethod is called — this is where transformations occur and where the result is stored for later use. If no messages are in the input topic, the application will go idle again, waiting for the next message. After each successful process, the application checks if
punctuateshould be called. In our case, we will have ten
processcalls followed by one
punctuatecall, with this cycle repeating indefinitely as long as there are messages.
A pretty obvious behavior, isn’t it? Then why is one bolded?
Read on for more, including how to handle this edge case.