problems / verified-protocol-boundaries / widget
authors: quinn, nora
Before any of the AI-specific machinery in this chapter runs, a request has already crossed two or three protocol boundaries that predate AI entirely: the TLS session that carries the HTTPS request, the SSH daemon an operator uses to reach the box, and increasingly the WireGuard overlay that defines which machines can see the inference fleet at all. These are the oldest, most-studied trust boundaries in the stack, and they remain the ones whose single-bug blast radius is largest — Heartbleed [1], FREAK [2], and the Triple Handshake attack [3] on the TLS side; CVE-2024-6387 (regreSSHion) [4], a pre-authentication remote root on the SSH side that turned on a signal-handler race condition, not a parser bug. ARIA’s Safeguarded AI: Cybersecurity solicitation [5] names verified TLS 1.3, a verified SSH boundary, and a verified WireGuard control plane as three of its six example targets; they belong in this document for the same reason, and they share enough methodology to be one widget rather than three.
The shared shape is the LangSec [6] one: a network-facing boundary parses adversarial bytes and then drives a protocol state machine, and almost every catastrophic bug is either memory unsafety in the parser or an unintended transition in the state machine (downgrade, pre-auth confusion, stale-credential reuse). The formal methods community has already shown each piece is tractable in isolation. EverParse [7] produces verified zero-copy binary parsers; Project Everest [8], building on miTLS [9], verified a TLS reference implementation in F* down through the HACL*/EverCrypt [10], [11] cryptographic primitives, reaching the TLS 1.3 record layer and a substantial fragment of the handshake; Owl [12] compiles verified secure-channel protocol designs into interoperable implementations; Tamarin [13] and ProVerif [14] discharge the symbolic protocol-level secrecy and authentication proofs. What none of them deliver is a single machine-checked proof spanning the whole boundary — parser, both state machines, and the concurrency and signal-handling discipline of the daemon around them — shipped as a drop-in replacement at a real trust boundary. That gap, not a missing technique, is the widget.
The discipline that keeps this tractable is the assume-the-primitives line. The cryptographic primitives are assumed correct by reusing HACL*/EverCrypt; the OS kernel, hardware, and side channels are out of scope (and partly addressed by § and §); certificate-path validation is an assumed oracle. What remains is exactly the part the CVE record says actually breaks: wire-format parsing, the RFC state machines, downgrade resistance, and the honest exposure of authentication outcomes to the calling application. This is the same methodology as § — a formal grammar plus a verified parser at a trust boundary — scaled up from in-process text parsing to full network protocols.
Three instances, sharing a parser-plus-state-machine methodology so that the second and third are cheaper than the first.
Verified TLS 1.3 server. A protocol-compatible, drop-in TLS 1.3 server with a machine-checked proof covering wire-format parsing, the handshake and record-layer state machines, downgrade resistance, and faithful exposure of authentication outcomes to the application — both server and client roles (the latter for mTLS), a curated cipher-suite set, session resumption, 0-RTT, and post-handshake authentication. The specification: for any sequence of network inputs the implementation either processes input per RFC 8446 or rejects it; malformed inputs cannot trigger memory unsafety; state machines transition only along RFC 8446-permitted paths; negotiated parameters are the strongest mutually supported under policy; and an authenticated session is exposed to the application only when a valid handshake transcript has completed with an authenticated peer. miTLS [9] proved the core; the open frontier is miTLS-grade assurance at production performance and feature-completeness, shippable at the HTTPS edge.
Verified SSH boundary. At minimum the packet framing and parsing, the RFC 4253 transport layer, and the RFC 4252 user-authentication state machine, including the concurrency and signal-handling discipline of the daemon; a more ambitious version adds channel-opening and multiplexing (RFC 4254); the stretch goal is a full protocol-compatible drop-in server. The proof goal is memory safety, state-machine correctness under adversarial input, and the absence of unintended transitions through concurrent or asynchronous control flow — the last is what regreSSHion exploited, and what most existing SSH verification work omits. Vest’s [15] verified high-performance binary parsers and the HACL*/EverCrypt [10], [11] primitives are the reusable substrate; earlier work formally specifying SSH state machines found that even mature implementations violate parts of the standard [16], but no system covers the full boundary under one proof.
Verified WireGuard control plane. The WireGuard data plane (the tunnel itself) is assumed verified — Owl [12] and Tamarin [13] have addressed the handshake — so the target is the layer above: the control plane that, given device identities, access policy, and current network state, computes and distributes exactly the peer, key, and routing configuration realising the authorised connectivity graph, across the full device lifecycle (enrolment, approval, key rotation, policy change, revocation). The invariant is a zero-trust containment property: only authenticated and authorised devices can learn about and reach the peers and resources policy permits, preserved under every lifecycle transition. This is the same capability-graph containment analysis as §’s agent-authority sketch, applied to network overlay membership rather than gateway request routing. Tailscale and Cloudflare’s WARP are the widely deployed but unverified control-plane designs this would displace; the proof target is the control-plane state machine, with the tunnel protocol and identity issuance treated as trusted inputs.
// stack context
[execution-harness] Execution Harness
Execution Harness The execution harness is the outermost software layer of an ML deployment: the wrapper that actually calls the model and handles its inputs and outputs. A user’s prompt enters here, gets tokenized and batched, passes through the model, and returns as a completion. Because every int…
[orchestration-cloud] Orchestration & Cloud
Orchestration and Cloud Layer The orchestration layer manages the compute resources that ML workloads run on: scheduling jobs across GPU clusters, routing traffic between them, and controlling who can access what. It sits below the frameworks and above the firmware — a layer of distributed systems s…