Hong Liang Teoh speeds things up:
When designing a Flink data processing job, one of the key concerns is maximising job throughput. Sink throughput is a crucial factor because it can determine the entire job’s throughput. We generally want the highest possible write rate in the sink without overloading the destination. However, since the factors impacting a destination’s performance vary over the job’s lifetime, the sink needs to adjust its write rate dynamically. Which RateLimitingStrategy tunes the write rate best depends on the sink’s destination.
This post explains how you can optimise sink throughput by configuring a custom RateLimitingStrategy on a connector that builds on the AsyncSinkBase (FLIP-171). In the sections below, we cover the design logic behind the AsyncSinkBase and the RateLimitingStrategy, then we take you through two example implementations of rate limiting strategies, specifically the CongestionControlRateLimitingStrategy and TokenBucketRateLimitingStrategy.
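To make the congestion-control idea concrete, here is a minimal sketch of one common approach: additive increase, multiplicative decrease (AIMD). It is a standalone illustration and does not implement Flink's actual RateLimitingStrategy interface. The class name `AimdRateLimiter`, its methods `shouldBlock` and `onCompleted`, and all parameters are hypothetical names chosen for this example.

```java
// Hypothetical, self-contained sketch of an AIMD-style limiter.
// Not Flink's API: every name here is an assumption made for illustration.
final class AimdRateLimiter {
    private final int increaseStep;      // added to the limit after a fully successful request
    private final double decreaseFactor; // multiplied into the limit after a failure, in (0, 1)
    private final int maxInFlight;       // hard upper bound on the limit
    private int limit;                   // current cap on in-flight messages

    AimdRateLimiter(int initialLimit, int increaseStep, double decreaseFactor, int maxInFlight) {
        if (initialLimit < 1 || increaseStep < 1 || maxInFlight < initialLimit) {
            throw new IllegalArgumentException("limits and step must be positive and consistent");
        }
        if (decreaseFactor <= 0.0 || decreaseFactor >= 1.0) {
            throw new IllegalArgumentException("decreaseFactor must be in (0, 1)");
        }
        this.limit = initialLimit;
        this.increaseStep = increaseStep;
        this.decreaseFactor = decreaseFactor;
        this.maxInFlight = maxInFlight;
    }

    // True if sending batchSize more messages would exceed the current limit.
    boolean shouldBlock(int currentInFlight, int batchSize) {
        return currentInFlight + batchSize > limit;
    }

    // Adjust the limit once a request completes.
    void onCompleted(int failedMessages) {
        if (failedMessages > 0) {
            // Back off quickly when the destination shows signs of overload.
            limit = Math.max(1, (int) (limit * decreaseFactor));
        } else {
            // Probe for more capacity slowly while requests succeed.
            limit = Math.min(maxInFlight, limit + increaseStep);
        }
    }

    int currentLimit() {
        return limit;
    }
}
```

The asymmetry is the point of the design: the limit creeps up while the destination keeps up and drops sharply on the first sign of trouble, so the write rate tends to settle near whatever the destination can currently sustain.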
Read on for some tips on creating a rate limiting strategy for a sink.