Imagine that you have built an Apache Flink® job. It consumes records from an Apache Kafka® topic, performs a time-based aggregation on those records, and emits the results to a different topic. Excited, you run the job for the first time and are disappointed to discover that nothing happens. You check the input topic and see data flowing, but the output topic is empty.
In many cases, this indicates a problem with watermarks. But what is a watermark?
Read on for a primer on watermarks, followed by an explanation of the common solution to the problem Wade describes.