A classic interview question is: “How do you go about displaying large amounts of data in a performant way?” Most people (at least on the front end), usually come up with pagination first. An implementation for pagination might go something like this:
Out of a list of 100, request 10 items at a time until 100 items are reached. So you would do 9 requests, asking for 1–10, 11–20, etc., until the 100 are reached.
In Kafka’s case, there could be 1 million messages between successive requests, so a user can never see the “latest” message, only the range as requested by the browser. In addition, there is a fundamental problem with pagination as it relates to Kafka. Message ordering across partitions is non-deterministic, so what is displayed in the UI, a linear sequence from 1–100, would not represent the data as it is laid out inside of Kafka.
Very interesting reading.