The 1.2 KB problem: how post-quantum breaks the ClientHello
Turn on post-quantum key exchange and some connections start failing. Not with a crypto error — with a hang, or a reset partway through the handshake, on paths that worked fine yesterday. The cause isn't the cryptography. It's that the ClientHello got too big to fit in one packet.
How big, exactly
A classical TLS 1.3 ClientHello is small. The X25519 key share is 32 bytes, and the whole message lands around 300 bytes, comfortably inside a single TCP segment. Add post-quantum and that changes: the ML-KEM-768 encapsulation key alone is 1184 bytes, the X25519MLKEM768 key share is 1216 bytes (38 times the X25519 share), and the whole ClientHello grows roughly fivefold, crossing 1.5 KB.
    # classical ClientHello (X25519 only)
    key_share: X25519                 32 bytes
    whole ClientHello               ~300 bytes   # one TCP segment, always

    # post-quantum ClientHello (X25519MLKEM768, codepoint 0x11ec)
    key_share: ML-KEM-768 ek        1184 bytes
             + X25519                 32 bytes
             = key_exchange         1216 bytes
    whole ClientHello              ~1600 bytes   # past the 1500-byte MTU -> 2+ segments
Those numbers are from a live handshake, not a spec table. The consequential one is the last: the ClientHello no longer fits in a single ~1500-byte Ethernet frame, so it arrives split across two or more TCP segments.
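To make the segment arithmetic concrete, here is a minimal sketch. It assumes a 1500-byte MTU with IPv4 and TCP headers of 20 bytes each and no options; the header sizes and the `segments_needed` helper are illustrative assumptions, not measurements from the handshake above. TCP options such as timestamps shrink the usable payload further.

```python
import math

MTU = 1500          # typical Ethernet MTU, as in the text
IP_HEADER = 20      # assumption: IPv4 header without options
TCP_HEADER = 20     # assumption: TCP header without options
MSS = MTU - IP_HEADER - TCP_HEADER  # payload bytes available per segment


def segments_needed(payload_bytes: int, mss: int = MSS) -> int:
    """Return how many full-sized TCP segments carry payload_bytes."""
    return math.ceil(payload_bytes / mss)


print(MSS)                    # 1460
print(segments_needed(300))   # classical ClientHello: 1
print(segments_needed(1600))  # post-quantum ClientHello: 2
```

Under these assumptions anything above 1460 bytes of TLS data spills into a second segment, so the roughly 1600-byte ClientHello cannot arrive in one piece.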
Why one packet ever mattered
TLS runs over TCP, which is a byte stream — nothing guarantees a record arrives in one segment, and a correct implementation reassembles before parsing. But for twenty-five years the classical ClientHello did fit in one segment, essentially always. So a great deal of network equipment was written assuming it would.
Middleboxes that route or filter on the server name — load balancers, SNI routers, inspecting firewalls — parse the ClientHello to read the SNI. Many read only the first TCP segment: a fast path with no full TCP reassembly. When the ClientHello spans two segments, that fast path sees a truncated message — no SNI, or a partial record — and depending on the implementation it drops the connection, times out, or routes it wrong. The bug was always there. Post-quantum just turned the multi-segment ClientHello from the never-case into the common case, and the latent bug became a daily outage.
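The difference between the fast path and a correct parser can be sketched in a few lines. This is an illustration under stated assumptions, not the code of any particular appliance: the record body is zero-filled filler rather than a real ClientHello, the segment size is fixed at 1460 bytes, and every function name here is hypothetical. It also assumes the ClientHello sits in a single TLS record.

```python
import struct

RECORD_HEADER_LEN = 5   # content type (1) + legacy version (2) + length (2)
HANDSHAKE = 0x16        # TLS record content type for handshake messages


def fake_client_hello_record(body_len: int) -> bytes:
    """Build a TLS record header followed by body_len bytes of filler."""
    return struct.pack("!BHH", HANDSHAKE, 0x0301, body_len) + bytes(body_len)


def split_segments(data: bytes, mss: int = 1460) -> list[bytes]:
    """Cut a byte stream into TCP-segment-sized chunks."""
    return [data[i:i + mss] for i in range(0, len(data), mss)]


def record_complete(first_segment: bytes) -> bool:
    """Fast-path check: does the first segment hold the whole record?"""
    if len(first_segment) < RECORD_HEADER_LEN:
        return False
    _ctype, _version, length = struct.unpack(
        "!BHH", first_segment[:RECORD_HEADER_LEN]
    )
    return len(first_segment) >= RECORD_HEADER_LEN + length


def read_record(segments: list[bytes]) -> bytes:
    """Correct path: buffer segments until the declared length has arrived."""
    buf = bytearray()
    for seg in segments:
        buf += seg
        if len(buf) >= RECORD_HEADER_LEN:
            (length,) = struct.unpack("!H", buf[3:5])
            if len(buf) >= RECORD_HEADER_LEN + length:
                return bytes(buf[:RECORD_HEADER_LEN + length])
    raise ValueError("stream ended before the record was complete")


classical = split_segments(fake_client_hello_record(300))
post_quantum = split_segments(fake_client_hello_record(1600))

print(len(classical), record_complete(classical[0]))        # 1 True
print(len(post_quantum), record_complete(post_quantum[0]))  # 2 False
print(len(read_record(post_quantum)))                       # 1605
```

A parser that only ever calls something like `record_complete` on the first segment has nothing sensible to do when it returns False. One that buffers, as `read_record` does, handles both cases the same way.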
What it looks like in the wild
The symptom is misleading. The handshake hangs or resets right after the ClientHello; it fails for some destinations and not others; and it correlates with a particular appliance in the path, not with either endpoint. It reads like a flaky network, not a cryptography problem — which is exactly why it eats diagnostic time. The crypto is fine. The bytes never arrived intact.
Why padding won't save you
There's prior art for ClientHello size bugs, and it points the wrong way. The padding extension (RFC 7685) exists because some servers once choked on a ClientHello in the 256–511 byte range; the fix was to pad it to at least 512 bytes. Post-quantum is the opposite problem: too big, not too small. You cannot pad your way out of 1.2 KB. The only levers are making the larger message fit, or making the path handle it correctly.
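A simplified sketch of that padding heuristic shows why it has nothing to offer here. The function below is an assumption-laden illustration, not the exact logic of any TLS library: it treats the 256 to 511 byte range as the only trigger and accounts for the 4-byte extension header (2 bytes of type, 2 bytes of length) as the minimum that can be added.

```python
PAD_FLOOR = 256       # smallest ClientHello size in the problematic range
PAD_TARGET = 512      # size the message is padded up to
EXT_HEADER_LEN = 4    # extension type (2 bytes) + extension length (2 bytes)


def padding_bytes_to_add(hello_len: int) -> int:
    """Bytes a padding extension would add (header included), else 0."""
    if PAD_FLOOR <= hello_len < PAD_TARGET:
        # The extension cannot be smaller than its own header.
        return max(PAD_TARGET - hello_len, EXT_HEADER_LEN)
    return 0


print(padding_bytes_to_add(400))   # 112: padded up to 512
print(padding_bytes_to_add(510))   # 4: header alone, lands on 514
print(padding_bytes_to_add(1600))  # 0: padding can only add bytes
```

Padding is a one-way lever. It can move a message up and out of a bad range, but a 1600-byte ClientHello is already above it, and no amount of added bytes makes it fit in one segment.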
Why this is the migration canary
The breakage isn't in your endpoints — both ends negotiate ML-KEM perfectly. It's in the path: the elements between them, which you frequently don't own — a cloud load balancer, a partner's firewall, an ISP's transparent proxy. This is the concrete first instance of the hard part of any PQC migration: the systems you don't control. And it bites first precisely because the ClientHello is the first post-quantum byte to cross the network at all — the opening flight, before a single secret is computed.
Finding where a split ClientHello dies is, once again, a wire problem: you have to watch the handshake to see which hop swallows the second segment. It is one step of the broader migration — see also why the share is 1.2 KB in the first place and the full byte-by-byte walk. When a PQC rollout "works in the lab" but fails in production, this is the first thing we look for — that's where we come in.