Tractable Problems in AI Security via Formal Methods

Verified Protocol Trust Boundaries

Before any of the AI-specific machinery in this chapter runs, a request has already crossed two or three protocol boundaries that predate AI entirely: the TLS session that carries the HTTPS request, the SSH daemon an operator uses to reach the box, and increasingly the WireGuard overlay that defines which machines can see the inference fleet at all. These are the oldest, most-studied trust boundaries in the stack, and they remain the ones whose single-bug blast radius is largest — Heartbleed [1], FREAK [2], and the Triple Handshake attack [3] on the TLS side; CVE-2024-6387 (regreSSHion) [4], a pre-authentication remote root on the SSH side that turned on a signal-handler race condition, not a parser bug. ARIA’s Safeguarded AI: Cybersecurity solicitation [5] names verified TLS 1.3, a verified SSH boundary, and a verified WireGuard control plane as three of its six example targets; they belong in this document for the same reason, and they share enough methodology to be one widget rather than three.

The shared shape is the LangSec [6] one: a network-facing boundary parses adversarial bytes and then drives a protocol state machine, and almost every catastrophic bug is either memory unsafety in the parser or an unintended transition in the state machine (downgrade, pre-auth confusion, stale-credential reuse). The formal methods community has already shown each piece is tractable in isolation. EverParse [7] produces verified zero-copy binary parsers; Project Everest [8], building on miTLS [9], verified a TLS reference implementation in F* down through the HACL*/EverCrypt [10], [11] cryptographic primitives, reaching the TLS 1.3 record layer and a substantial fragment of the handshake; Owl [12] compiles verified secure-channel protocol designs into interoperable implementations; Tamarin [13] and ProVerif [14] discharge the symbolic protocol-level secrecy and authentication proofs. What none of them deliver is a single machine-checked proof spanning the whole boundary — parser, both state machines, and the concurrency and signal-handling discipline of the daemon around them — shipped as a drop-in replacement at a real trust boundary. That gap, not a missing technique, is the widget.

The discipline that keeps this tractable is the assume-the-primitives line. The cryptographic primitives are assumed correct by reusing HACL*/EverCrypt; the OS kernel, hardware, and side channels are out of scope (and partly addressed by § and §); certificate-path validation is an assumed oracle. What remains is exactly the part the CVE record says actually breaks: wire-format parsing, the RFC state machines, downgrade resistance, and the honest exposure of authentication outcomes to the calling application. This is the same methodology as § — a formal grammar plus a verified parser at a trust boundary — scaled up from in-process text parsing to full network protocols.

Solution/project Sketch

Three instances, sharing a parser-plus-state-machine methodology so that the second and third are cheaper than the first.

Verified TLS 1.3 server. A protocol-compatible, drop-in TLS 1.3 server with a machine-checked proof covering wire-format parsing, the handshake and record-layer state machines, downgrade resistance, and faithful exposure of authentication outcomes to the application — both server and client roles (the latter for mTLS), a curated cipher-suite set, session resumption, 0-RTT, and post-handshake authentication. The specification: for any sequence of network inputs the implementation either processes input per RFC 8446 or rejects it; malformed inputs cannot trigger memory unsafety; state machines transition only along RFC 8446-permitted paths; negotiated parameters are the strongest mutually supported under policy; and an authenticated session is exposed to the application only when a valid handshake transcript has completed with an authenticated peer. miTLS [9] proved the core; the open frontier is miTLS-grade assurance at production performance and feature-completeness, shippable at the HTTPS edge.

Verified SSH boundary. At minimum the packet framing and parsing, the RFC 4253 transport layer, and the RFC 4252 user-authentication state machine, including the concurrency and signal-handling discipline of the daemon; a more ambitious version adds channel-opening and multiplexing (RFC 4254); the stretch goal is a full protocol-compatible drop-in server. The proof goal is memory safety, state-machine correctness under adversarial input, and the absence of unintended transitions through concurrent or asynchronous control flow — the last is what regreSSHion exploited, and what most existing SSH verification work omits. Vest’s [15] verified high-performance binary parsers and the HACL*/EverCrypt [10], [11] primitives are the reusable substrate; earlier work formally specifying SSH state machines found that even mature implementations violate parts of the standard [16], but no system covers the full boundary under one proof.

Verified WireGuard control plane. The WireGuard data plane (the tunnel itself) is assumed verified — Owl [12] and Tamarin [13] have addressed the handshake — so the target is the layer above: the control plane that, given device identities, access policy, and current network state, computes and distributes exactly the peer, key, and routing configuration realising the authorised connectivity graph, across the full device lifecycle (enrolment, approval, key rotation, policy change, revocation). The invariant is a zero-trust containment property: only authenticated and authorised devices can learn about and reach the peers and resources policy permits, preserved under every lifecycle transition. This is the same capability-graph containment analysis as §’s agent-authority sketch, applied to network overlay membership rather than gateway request routing. Tailscale and Cloudflare’s WARP are the widely deployed but unverified control-plane designs this would displace; the proof target is the control-plane state machine, with the tunnel protocol and identity issuance treated as trusted inputs.

Bibliography

[1] Z. Durumeric et al., “The Matter of Heartbleed,” in Proceedings of the 2014 Internet Measurement Conference (IMC), 2014, pp. 475–488. doi: 10.1145/2663716.2663755.
[2] B. Beurdouche et al., “A Messy State of the Union: Taming the Composite State Machines of TLS,” in 2015 IEEE Symposium on Security and Privacy, 2015, pp. 535–552. doi: 10.1109/SP.2015.39.
[3] K. Bhargavan, A. Delignat-Lavaud, C. Fournet, A. Pironti, and P.-Y. Strub, “Triple Handshakes and Cookie Cutters: Breaking and Fixing Authentication over TLS,” in 2014 IEEE Symposium on Security and Privacy, 2014, pp. 98–113. doi: 10.1109/SP.2014.14.
[4] Qualys Threat Research Unit, “regreSSHion: Remote Unauthenticated Code Execution Vulnerability in OpenSSH Server (CVE-2024-6387).” [Online]. Available: https://blog.qualys.com/vulnerabilities-threat-research/2024/07/01/regresshion-remote-unauthenticated-code-execution-vulnerability-in-openssh-server
[5] Advanced Research and Invention Agency, “Safeguarded AI: Cybersecurity — Call for Proposals.” [Online]. Available: https://aria.org.uk/media/1padxpaf/safeguarded-ai_cybersecurity_solicitation.pdf
[6] L. Sassaman, M. L. Patterson, S. Bratus, M. E. Locasto, and A. Shubina, “Security Applications of Formal Language Theory,” 2013, pp. 489–500.
[7] T. Ramananandro et al., “EverParse: Verified Secure Zero-Copy Parsers for Authenticated Message Formats,” in Proceedings of the 28th USENIX Security Symposium, 2019.
[8] K. Bhargavan et al., “Everest: Towards a Verified, Drop-in Replacement of HTTPS,” in 2nd Summit on Advances in Programming Languages (SNAPL), 2017. [Online]. Available: https://drops.dagstuhl.de/entities/document/10.4230/LIPIcs.SNAPL.2017.1
[9] K. Bhargavan, C. Fournet, M. Kohlweiss, A. Pironti, and P.-Y. Strub, “Implementing TLS with Verified Cryptographic Security,” in IEEE Symposium on Security and Privacy (S&P), 2013.
[10] J.-K. Zinzindohoué, K. Bhargavan, J. Protzenko, and B. Beurdouche, “HACL*: A Verified Modern Cryptographic Library,” in ACM SIGSAC Conference on Computer and Communications Security (CCS), 2017. [Online]. Available: https://eprint.iacr.org/2017/536
[11] J. Protzenko et al., “EverCrypt: A Fast, Verified, Cross-Platform Cryptographic Provider,” in IEEE Symposium on Security and Privacy (S&P), 2020. [Online]. Available: https://eprint.iacr.org/2019/757
[12] J. Gancher, S. Gibson, P. Singh, S. Dharanikota, and B. Parno, “Owl: Compositional Verification of Security Protocols via an Information-Flow Type System,” in IEEE Symposium on Security and Privacy (S&P), 2023. [Online]. Available: https://eprint.iacr.org/2023/473
[13] S. Meier, B. Schmidt, C. Cremers, and D. Basin, “The TAMARIN Prover for the Symbolic Analysis of Security Protocols,” in 25th International Conference on Computer Aided Verification (CAV), in Lecture Notes in Computer Science, vol. 8044. 2013, pp. 696–701. doi: 10.1007/978-3-642-39799-8\_48.
[14] B. Blanchet, “An Efficient Cryptographic Protocol Verifier Based on Prolog Rules,” in 14th IEEE Computer Security Foundations Workshop (CSFW), 2001, pp. 82–96.
[15] Y. Cai et al., “Vest: Verified, Secure, High-Performance Parsing and Serialization for Rust,” in USENIX Security Symposium, 2025. [Online]. Available: https://www.usenix.org/conference/usenixsecurity25/presentation/cai-yi
[16] P. Fiterău-Broştean, T. Lenaerts, E. Poll, J. de Ruiter, F. Vaandrager, and P. Verleg, “Model Learning and Model Checking of SSH Implementations,” in Proceedings of the 24th ACM SIGSOFT International SPIN Symposium on Model Checking of Software, 2017, pp. 142–151. doi: 10.1145/3092282.3092289.