Akka Streams backpressure: what actually happens when your sink is slow

Backpressure in Akka Streams is one of those concepts that sounds intuitive until you have to explain it to someone else. "The sink pulls from upstream, so if the sink is slow, everything upstream slows down." Correct in the simple case, but the real story gets interesting as soon as you introduce async boundaries, buffers, or anything running on a different dispatcher.

The simple case

A linear graph with no async boundaries is a single actor under the hood. The Reactive Streams protocol works exactly like you'd expect: the downstream sends Request(n) upstream, the upstream emits at most n elements, rinse and repeat. If your sink processes 1000 elements/sec and your source can produce 10000, the source literally won't run its next tick until the sink signals it's ready.

This is why the first question to ask about a "slow Akka Stream" is: is there actually any async boundary in this graph? If not, the whole thing is one fused pipeline of Scala functions, and your throughput is just your sink throughput, minus some per-element overhead.
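A minimal sketch of that fully-fused case, assuming Akka Streams 2.6+ on the classpath (names here are illustrative, not from any particular codebase):

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}

object FusedPipeline extends App {
  implicit val system: ActorSystem = ActorSystem("demo")

  // No .async anywhere: source, map, and sink fuse into a single actor.
  // The source can only emit when the sink has signaled demand, so total
  // throughput is the sink's throughput (~10 elements/sec here).
  Source(1 to 100)
    .map(_ * 2)
    .runWith(Sink.foreach { n =>
      Thread.sleep(100) // simulate a slow sink
      println(n)
    })
}
```

Sleeping inside a stage is bad practice in real code (it blocks a dispatcher thread), but it makes the demand-driven pacing easy to observe.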

Where it gets interesting: .async

Drop an .async into the middle of a graph and you've split it into two fused segments, each materialized as its own actor on a dispatcher, connected by a bounded buffer. The default buffer size in Akka Streams is 16 elements. That buffer is load-bearing: it's the only thing that lets the two halves run concurrently.

Now the backpressure story has more moving parts:

  • Upstream produces as fast as it can and parks itself against the buffer's input.
  • The buffer fills up to its max size.
  • Downstream pulls one element at a time.
  • As the buffer drains, it signals demand upstream. (In practice Akka batches these requests rather than asking for one element at a time, but the effect is the same: the producer can never get more than a buffer's worth ahead.)

If the upstream is much faster than the downstream, this behaves almost identically to the sync case — just with a 16-element lag. If upstream is much slower, the buffer is mostly empty and the downstream is the one waiting. The interesting case is when they're close in throughput: buffer size starts to matter, because it determines how well you can absorb short bursts without stalling.
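The moving parts above can be sketched as follows (Akka Streams assumed; the sleeps stand in for real work, and Attributes.inputBuffer is how you'd widen the boundary's buffer from its default of 16):

```scala
import akka.actor.ActorSystem
import akka.stream.Attributes
import akka.stream.scaladsl.{Sink, Source}

object AsyncBoundary extends App {
  implicit val system: ActorSystem = ActorSystem("demo")

  Source(1 to 1000)
    .map { n => Thread.sleep(1); n }  // segment 1: the faster producer side
    .async                            // boundary: own actor + bounded buffer
    .addAttributes(Attributes.inputBuffer(initial = 1, max = 64)) // widen the cushion
    .map { n => Thread.sleep(2); n }  // segment 2: slower, runs concurrently with segment 1
    .runWith(Sink.ignore)
}
```

With similar throughputs on both sides, the max buffer size is what decides how large a burst segment 2 can fall behind by before segment 1 stalls.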

The mental model that held up

After a few years of maintaining Akka Streams pipelines in production, the mental model I settled on is: each .async boundary is a producer-consumer handoff with a small bounded queue. Backpressure isn't magic — it's the consumer pulling, and the queue blocking the producer when it's full.
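That mental model fits in a few lines of plain Scala, with no Akka at all, using a JDK blocking queue (sizes and sleeps are illustrative):

```scala
import java.util.concurrent.ArrayBlockingQueue

object BoundedHandoff extends App {
  // Same capacity as Akka's default async-boundary buffer.
  val queue = new ArrayBlockingQueue[Int](16)

  // put() blocks when the queue is full -- that blocking IS the backpressure.
  val producer = new Thread(() => (1 to 100).foreach(queue.put))
  val consumer = new Thread(() => (1 to 100).foreach { _ =>
    val n = queue.take()
    Thread.sleep(1) // slow consumer; producer can never get more than 16 ahead
  })

  producer.start(); consumer.start()
  producer.join(); consumer.join()
}
```

Everything Akka adds on top (demand batching, fusing, error propagation) is machinery around this core shape.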

Two practical consequences:

  • If you add .async in too many places, you end up with a pipeline full of small queues, each adding latency and none of them actually absorbing load in a useful way. Be deliberate about where the boundaries go.
  • The 16-element default buffer is fine for elements that are cheap to produce. If your upstream is generating a few KB per element and your downstream takes 100ms per element, you might want a bigger buffer — or a different architecture.
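When you do want a bigger cushion, an explicit .buffer stage makes the size and the overflow policy visible in the code rather than buried in config. A sketch, assuming Akka Streams:

```scala
import akka.actor.ActorSystem
import akka.stream.OverflowStrategy
import akka.stream.scaladsl.{Sink, Source}

object ExplicitBuffer extends App {
  implicit val system: ActorSystem = ActorSystem("demo")

  Source(1 to 10000)
    // backpressure keeps the clean pull-based model; dropHead/dropTail/dropNew
    // trade completeness for liveness when losing elements is acceptable.
    .buffer(256, OverflowStrategy.backpressure)
    .async
    .runWith(Sink.foreach(_ => Thread.sleep(10))) // slow downstream
}
```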

When backpressure lies

Two patterns quietly break the clean model:

1. mapAsync with effectively unbounded parallelism. mapAsync's parallelism is always technically bounded, but if you set it to something huge you've removed backpressure from that stage in practice — it accepts elements as fast as they come and fires off futures concurrently. Your downstream will then see a huge burst when the futures complete. The slow thing here is usually the external system those futures are calling, which now experiences a stampede.
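In other words, parallelism is the backpressure knob of mapAsync: at most that many futures are in flight, and demand stops until one completes. A sketch (Akka assumed; callExternal is a hypothetical stand-in for an HTTP or DB client):

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.Future

object MapAsyncDemo extends App {
  implicit val system: ActorSystem = ActorSystem("demo")
  import system.dispatcher

  def callExternal(id: Int): Future[String] =
    Future(s"result-$id") // hypothetical external call

  Source(1 to 1000)
    .mapAsync(parallelism = 8)(callExternal) // at most 8 in flight at once
    .runWith(Sink.ignore)

  // parallelism = 10000 would admit elements almost unconditionally and
  // stampede whatever callExternal actually talks to.
}
```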

2. Side effects inside map. If your transformation makes a DB call synchronously inside a map, the backpressure does propagate (the stage blocks until the call returns), but you've now coupled your pipeline's throughput to the DB's latency and tied up a dispatcher thread doing I/O. Use mapAsync with a reasonable parallelism, or move the call onto its own dispatcher.
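The dedicated-dispatcher version might look like this. A sketch: the dispatcher name "blocking-io-dispatcher" is something you'd define yourself in application.conf, and insertSync is a hypothetical stand-in for a synchronous DB call:

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}
import scala.concurrent.{ExecutionContext, Future}

object BlockingCalls extends App {
  implicit val system: ActorSystem = ActorSystem("demo")

  // A thread pool reserved for blocking I/O, configured in application.conf.
  val blockingEc: ExecutionContext =
    system.dispatchers.lookup("blocking-io-dispatcher")

  def insertSync(row: Int): Int = { Thread.sleep(5); row } // stand-in for a blocking DB write

  Source(1 to 100)
    // Backpressure is preserved (at most 4 in flight), and the stream's own
    // dispatcher never blocks on the I/O.
    .mapAsync(parallelism = 4)(row => Future(insertSync(row))(blockingEc))
    .runWith(Sink.ignore)
}
```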

Debugging in practice

When a real pipeline is running slow, I usually check three things in order:

  1. Stage-level metrics (elements in, elements out, demand, buffer fill ratio). If a stage has outstanding demand but nothing upstream to give, the upstream is the bottleneck; if a stage sees no demand from below, the downstream is.
  2. Dispatcher saturation. Are the threads actually running or blocked?
  3. External system latencies in anything wrapped by mapAsync. More often than you'd think, the actual problem is that a DB or HTTP call got slow and the rest of the pipeline is faithfully reporting that slowness upstream.
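If you don't have stage-level metrics wired up, .log is a cheap first probe — it reports each element (plus completion and failure) through the ActorSystem's logger, so comparing rates at two probes shows which side of a boundary is slower. A sketch, assuming Akka Streams:

```scala
import akka.actor.ActorSystem
import akka.stream.scaladsl.{Sink, Source}

object ProbeStages extends App {
  implicit val system: ActorSystem = ActorSystem("demo")

  Source(1 to 100)
    .log("before-async") // rate here = upstream's pace
    .async
    .map(identity)       // stand-in for real downstream work
    .log("after-async")  // rate here = downstream's pace; compare the two
    .runWith(Sink.ignore)
}
```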

Once you get the mental model right, Akka Streams is one of the most predictable systems I've worked with. Backpressure really is doing what it says on the tin — you just have to know what the tin is.