The 10 KB and 40 KB Cliffs: PQC Certificate Bloat as a Network Problem

A new arXiv preprint from Chou and Cao (UIUC / NCSA) puts numbers on the network cost of PQC certificate bloat. The headline: TTFB doesn’t degrade smoothly as chains grow — it steps up at two specific points where chain size breaches TCP’s flight windows.

Reference: arXiv 2604.24869v1, 27 April 2026.

The cliffs

The first cliff sits at ~10 KB. IW=10 caps first-RTT data at ~14 KB; cross it and you’re paying an extra round-trip. The second sits around 40 KB, where slow-start doubling combines with IW to give an effective threshold near 42 KB.

ML-DSA-44 with two intermediates lands at ~12 KB — already past the first cliff. SLH-DSA-192s blows past both. The authors measure up to ~1.5x TTFB inflation in extreme RTT cases purely from threshold crossings. Propagation delay alone barely moves the needle.

What works

Three mitigations evaluated, very different effectiveness ratios:

  • MTC: ~2–3x headroom. A Merkle proof in place of X.509 intermediates keeps a chain that would otherwise breach the 10 KB cliff under it (draft-ietf-plants-merkle-tree-certs).
  • CDN chain optimisation: ~1.6x headroom. Useful, not enough on its own for SLH-DSA.
  • Session resumption: extremely effective when it applies. Their NCSA Zeek data shows 94% resumption on CDN TLS 1.3 vs 46% non-CDN. Resumed sessions skip certificate transmission entirely. CDNs realised ~2x the TTFB savings of non-CDNs.

Why it matters here

Two compounding factors for NZ.

Distance amplifies every extra RTT — 120–200+ ms intercontinental is normal for us. A 12 KB ML-DSA chain to a US-East origin becomes a 150–200 ms penalty on every cold handshake, on top of the baseline.

NZ runs heavy on CDN edges precisely because of that distance. The finding that CDN-side mitigations recover most of the loss is good news — but only if your service is actually fronted, and only if your resumption rate looks like the NCSA aggregate. For first-touch flows, it almost certainly doesn’t.

Caveats

OQS stack overhead dominated absolute TTFB in the testbed (~50–55 ms attributable to implementation, not network), and packet loss and fragmentation weren’t measured directly. The thresholds themselves are real, though, and the qualitative finding — that PQC migration risk is a network-shape problem, not just a crypto-correctness problem — is the framing worth taking forward.