Dmitry Tolpeko solves an interesting problem:
It would be nice to smooth S3 write operations between two checkpoints. How to do that?
You may have already noticed there are 3 single PUT operations above made at 37:02, 37:06 and 37:09 before the checkpoint. The write size gives you a clue: each one is a single part of a multipart upload to S3.
Some data sets were large enough that their data spilled before the checkpoint. Note that this is an internal spill in S3: the data will not be visible until it is committed upon a successful Flink checkpoint.
So how can we force more writes to happen before the checkpoint so we can smooth IOPS and probably reduce the overall checkpoint latency?
Read on for the answer.