tractable problems
Adversarial Robustness of Formal Methods Agda ’s issue label “false” on GitHub sits at the time of this commit at 10 open and 76 closed . Agda ’s issue label “false” tracks the pro...
Formal Theory of Weird Machines in Agents A weird machine [1] is the unintended computational substrate an attacker programs when individually-legitimate operations compose into ca...
AI Control via Proof-Carrying Code Proof-carrying code [1] and AI Control [2] , [3] are two literatures about the same problem: keep an untrusted producer on the far side of an int...
Audit Log Integrity and Session State Channels The audit log is the primary accountability mechanism for deployed AI systems. Regulators, red teams, and incident responders all dep...
Capability-Safe Tool Interfaces A single tool call in an agentic harness is usually benign. Read a file, run a shell command, make an HTTP request — each is authorized individually...
Context Window Integrity Under Adversarial Length A model’s context window is a fixed-size resource. The serving harness fills it by concatenating, in priority order, a system prom...
Edge Policy Verification The API gateway in front of an inference service decides two things on every request: whether the caller is who they claim to be, and whether they have exc...
GPU Drivers for Verified Kernels Multi-tenant GPU workloads today run on hypervisors whose TCB is orders of magnitude too large to verify, and whose GPU-facing driver is proprietar...
Verified Network Tap The control plane that decides what flows where on a GPU cluster fabric — SDN controllers, fabric managers, tenant routing tables ( § ) — is the same piece of ...
Governance and Interpretability of Neuralese Proof Stacks table stakes is that proofs are cheap. We assume a proof oracle will make code and proofs that they’re correct too cheap t...
OCI Runtime Hardening The standard container runtime, runc , implements the OCI specification atop Linux namespaces and cgroups. It was never designed to resist an adversary inside...
Sampler Integrity and Determinism The sampler is the last piece of code between the model’s output logits and the token that reaches the user. It applies temperature scaling, top- ...
Scheduler Integrity and Co-Tenancy Isolation The dominant real-world mitigation for co-tenant attacks is not a scheduler policy but a purchase order: hyperscaler customers with ser...
Specification Elicitation and Validation If proofs are cheap and specs are expensive, then specs are the bottleneck. Every widget in this document terminates in a spec: a verified ...
Verified Input Parsers Every boundary in the AI serving stack where text is parsed or transformed is a security boundary. The abuse classifier preprocessor tokenizes, truncates, an...
Weight Integrity and Kernel Supply Chain Model weights are the highest-value static artifact in an ML deployment. A frontier model’s checkpoint runs tens to hundreds of gigabytes, ...