Latency vs. Throughput

I have a post up on understanding latency versus throughput:

The primary way data moves from one process to another is through buffers: we break the data into smaller portions and push them to their destination.  In Integration Services, those portions are buffers; when passing data over TCP, they're packets.

Okay, so what’s the trade-off?  The trade-off is between latency and throughput.  Let’s take TCP packets as an example.  Say you have a series of 50-byte messages you want to send from a source to a destination.  There are two primary options: push each message as soon as it’s ready, or hold off until you’ve filled a packet and then send it along.  For simplicity’s sake, we’ll say that a packet holds about 1350 bytes, so we can fit 27 messages into a packet.  We’ll also assume that it takes 10 milliseconds to send a packet from the source to the destination (regardless of packet size, as we’re using powerful connections) and 1 millisecond to produce a message.
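To put rough numbers on that choice, here's a quick Python sketch (my own illustration, not something from the linked post).  The 40-byte TCP/IP header per packet is an assumption I'm adding to make the throughput side visible; the rest of the figures come straight from the scenario above.

```python
# A rough sketch of the scenario above: 50-byte messages, 27 messages per
# ~1350-byte packet, 1 ms to produce a message, 10 ms of transit per packet.
# The 40-byte per-packet header overhead is my assumption, not from the post.

MSG_BYTES = 50
MSGS_PER_PACKET = 27
PRODUCE_MS = 1        # time to produce one message
TRANSIT_MS = 10       # time for any packet to reach the destination
HEADER_BYTES = 40     # assumed TCP/IP overhead per packet

def immediate(n):
    """Send every message in its own packet as soon as it is produced."""
    worst_latency = TRANSIT_MS                 # no waiting before the send
    packets = n
    wire_bytes = n * (MSG_BYTES + HEADER_BYTES)
    return worst_latency, packets, wire_bytes

def batched(n):
    """Hold messages until a packet is full, then send the whole batch."""
    # The first message in a batch waits for the other 26 to be produced,
    # so its latency is 26 ms of waiting plus 10 ms of transit.
    worst_latency = (MSGS_PER_PACKET - 1) * PRODUCE_MS + TRANSIT_MS
    packets = -(-n // MSGS_PER_PACKET)         # ceiling division
    wire_bytes = n * MSG_BYTES + packets * HEADER_BYTES
    return worst_latency, packets, wire_bytes

for name, fn in (("immediate", immediate), ("batched", batched)):
    latency, packets, wire = fn(2700)
    print(f"{name:>9}: worst latency {latency} ms, {packets} packets, {wire:,} bytes on the wire")
```

With 2,700 messages, sending immediately keeps per-message latency at roughly 10 ms but costs 2,700 packets, while batching sends only 100 packets (and far fewer header bytes) at the price of up to 36 ms of latency for the first message in each batch.  That's the trade-off in miniature: batching buys throughput by spending latency.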

We use the pipe as a metaphor in IT, especially around data transfer, and I think it’s a solid metaphor because it intuitively covers most of the important concepts we need to worry about with data.  We have latency (how long it takes something to go from one end of the pipe to the other), throughput (how much we can move at any point in time, determined by things like the diameter of the pipe), back pressure (in the pipe scenario, resistance caused by changes in the pipe’s direction; in the data world, what happens when downstream operators are slower than upstream operators), and so on.
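As a side note on back pressure, here's a tiny sketch of my own (again, not from the post) showing the idea with a bounded queue: once the "pipe" is full, the fast producer is forced to wait on the slow consumer.

```python
# A small back-pressure illustration: a bounded queue acts as the pipe between
# a fast producer and a slow consumer.  When the queue is full, put() blocks,
# which pushes the producer back down to the consumer's pace.

import queue
import threading
import time

pipe = queue.Queue(maxsize=5)   # a narrow pipe: only five items fit at once

def producer():
    for i in range(20):
        pipe.put(i)             # blocks while the pipe is full -- this is back pressure
        print(f"produced {i}")
    pipe.put(None)              # sentinel: tell the consumer we're done

threading.Thread(target=producer).start()

while True:
    item = pipe.get()
    if item is None:
        break
    time.sleep(0.1)             # the downstream operator is slower than the upstream one
    print(f"consumed {item}")
```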
