Tractable Problems in AI Security via Formal Methods

Hardware and Physical Supply Chain

At the bottom of the stack is the silicon itself: the GPU, TPU, and NPU dies that run every training and inference workload, and the global supply chain that produces them. Compromise here is durable: a hardware trojan survives every software update, every OS reinstall, every firmware reflash. It is also opaque: you cannot read out the RTL of a fabricated chip to check what it does.

Chip Architecture

The RTL design of a modern AI accelerator is proprietary. NVIDIA does not publish the register-transfer-level description of its GPU compute cores; neither do Google (TPUs), AMD, or any other vendor. This means the formal methods community cannot verify what the silicon actually computes. We take it on trust that an H100′s tensor cores implement matrix multiplication correctly, that its memory controllers enforce isolation between MIG partitions, and that no undocumented debug interfaces exist. Classen et al. [1] demonstrated that a hardware trojan implanted in a Xilinx Vitis AI DPU could backdoor a traffic-sign recognition model by replacing just 30 of 43,000 parameters at the accelerator level — expanding the circuit by 0.24% with zero runtime overhead, invisible to all software-side defenses. The attack needed no cooperation from the model or the training pipeline; the accelerator alone was sufficient.

The design tools themselves are part of the attack surface. The chip industry flows through a handful of EDA vendors — primarily Synopsys and Cadence — whose toolchains are millions of lines of proprietary code that no customer audits. Basu et al. demonstrated in their “CAD-Base” work [2] that every stage of the EDA pipeline — high-level synthesis, logic synthesis, place-and-route, verification — is a viable insertion point for hardware trojans. The place-and-route attack is nasty: the tool can emit one netlist for verification and a different netlist for tapeout, and existing equivalence-checking flows never compare the two. A malicious logic synthesis tool can exploit underspecified finite-state machines to insert extra transitions triggered by rare conditions, leading the circuit into attacker-controlled states. These are not hypothetical proof-of-concept attacks against toy designs; they target the actual tool interfaces that every fabricated chip passes through. Open-source EDA toolchains (Yosys, OpenROAD) exist but are not used for leading-edge AI accelerator design, so for now the industry trusts its toolchain vendors implicitly.

Beyond the design tools, modern SoCs integrate dozens of third-party IP cores — licensed blocks for PCIe controllers, memory interfaces, interconnect fabrics, security modules — that the chip designer incorporates without access to the underlying RTL. A third-party IP vendor can embed a hardware trojan that activates post-fabrication, leaks data through a covert channel, or degrades reliability on a trigger condition. The SoC integrator’s only recourse is black-box testing and whatever contractual guarantees the IP license provides, neither of which catch a well-designed trojan. Formal verification of third-party IP at the pre-silicon stage is an active research area, but it requires a golden specification of correct behavior — which is exactly what you don’t have when the IP is a licensed black box.

Hardware formal verification is used in industry, but its coverage is narrower than outsiders might expect. Intel adopted formal methods in earnest after the 1994 Pentium FDIV bug (a $475 million recall) [3]; by the Nehalem microarchitecture in 2008, formal verification was the primary validation method for floating-point and other arithmetic units. ARM’s ISA-Formal framework [4] uses bounded model checking in JasperGold to verify that RTL implementations conform to the instruction set specification — but bounded model checking only proves correctness up to a fixed sequence length, not for arbitrary execution traces, unless you can find the right invariants to lift it to an unbounded proof. In practice, formal verification at major vendors covers specific high-value blocks (FPUs, cache coherence protocols, security state machines) while the rest of the design relies on simulation-based testing. Nobody is formally verifying an entire GPU.

RISC-V changes part of this picture. The ISA specification is open, and several open-source RTL implementations exist (BOOM, Rocket, CVA6) with corresponding formal verification frameworks — YosysHQ’s riscv-formal [5] can check that an implementation conforms to the RISC-V spec using model checking, and the OpenHW Group’s CORE-V-VERIF [6] provides industrial-grade verification for the CV32 and CV64 families. MIT’s Riscy processors, built on the Coq-embedded Kami DSL [7], are formally verified open-source RISC-V cores. But open ISA is not open silicon. These designs still get fabricated at foundries through proprietary EDA toolchains and opaque manufacturing processes. An open RTL that you can verify pre-silicon is a real improvement over a proprietary RTL that you can’t — but it does not close the gap between the verified netlist and the actual transistors on the die. FPGAs remain the one place where the deployer controls the entire path from RTL to running hardware, which is why § proposes FPGA taps as a verification target.

On the GPU side, the architecture determines what isolation primitives are available to driver software. Whether a verified driver (§) can enforce strong tenant isolation depends on what the hardware exposes — and right now, we are relying on vendor documentation rather than verified specs for that interface.

Physical Security

Confidential computing has reached the GPU. NVIDIA’s H100 supports a hardware-rooted TEE that, in conjunction with CPU-side enclaves (Intel TDX, AMD SEV-SNP), can encrypt model weights in transit and at rest on the device, with attestation that the firmware and driver stack have not been tampered with. This matters for AI because model weights are the most valuable artifact in the ecosystem — worth billions in training compute — and multi-tenant cloud deployments expose them to the cloud operator’s entire software stack. But confidential computing protects against software adversaries with root access; it does not protect against physical adversaries with access to the hardware.

Modern AI accelerators use High Bandwidth Memory (HBM) stacked on a silicon interposer — the GPU die and HBM stacks sit side by side on a shared substrate connected by thousands of micrometer-scale copper traces. This interposer is a physical interface carrying model weights and activations at terabytes per second, unencrypted in current designs — even the H100′s confidential-computing mode draws the security boundary at the package and leaves on-package HBM traffic in the clear [8]. An attacker with physical access to the package could, in principle, probe these traces or tap the interposer to exfiltrate data. This is not a low-effort attack — it requires lab equipment and destructive or semi-destructive access to the package — but for a state-level adversary targeting a frontier model worth billions, it is not out of scope. HBM encryption at the interface level is not yet standard; until it is, the physical package boundary is the confidentiality boundary.

Side-channel attacks add another dimension. [9] showed that the magnetic flux from a GPU’s power cable — measurable with a $3 induction sensor — betrays the detailed topology and hyperparameters of a neural network being evaluated, achieving 96.8% classification accuracy for identifying network layers on an Nvidia Titan V. [10] demonstrated that electromagnetic emanations from GPU DVFS (dynamic voltage and frequency scaling) activity penetrate walls and can be captured at 3+ meters, enabling website fingerprinting and keystroke timing attacks. Scale these techniques to a data center with thousands of GPUs running a single training job, and the EM signature is correspondingly richer. TEMPEST-class shielding is well-understood for traditional compute environments, but GPU training clusters dissipate megawatts of power with correspondingly strong emanations, and it is not clear that standard data center construction provides adequate EM containment. This is an area where the AI security community has barely begun to assess the threat surface.

Counterfeit and recycled chips are a supply chain problem that predates AI but gains new significance when the chips in question carry model weights. The Semiconductor Industry Association estimates counterfeit semiconductors cost the U.S. industry roughly $7.5 billion a year in lost revenue [11]; the most common attack is remarking — taking a chip desoldered from e-waste, sanding the package markings, and relabeling it as a higher-grade or newer part. For commodity components this causes reliability failures; for security-critical components in a training cluster, it could mean a node with degraded or compromised silicon participating in a distributed training run. Detection methods exist — on-chip aging sensors (“odometers”), power side-channel profiling, X-ray inspection of die markings — but none are deployed systematically for GPU procurement at scale. The assumption is that buying from authorized distributors eliminates the problem, which is true until it isn’t.

The geographic concentration of semiconductor manufacturing is a systemic risk that no amount of formal verification can address, but it shapes the threat model for everything else in this section. TSMC fabricates over 90% of the world’s most advanced chips [12], including every NVIDIA AI accelerator. All of this capacity sits in Taiwan. A single earthquake, a fabrication contamination event, or a military conflict in the Taiwan Strait would halt frontier AI hardware production globally. The CHIPS Act and TSMC’s fabs in Arizona, Japan, and Germany are slowly redistributing this concentration, but leading-edge capacity outside Taiwan remains years away from volume production. Advanced chip packaging — the 2.5D interposer and CoWoS technology that makes HBM-equipped AI accelerators possible — is even more concentrated than fabrication itself. For security planning, this means the physical supply chain has a single point of failure that cannot be engineered around with software or formal methods; it can only be mitigated through industrial policy and supply chain diversification.

For training clusters of 10,000+ GPUs, the physical supply chain itself becomes a target at the system level, not just the component level. Every node in the cluster must be attested before it receives model weights or gradient updates. Every network link between nodes carries data that, if intercepted or modified, could poison a training run or exfiltrate a model. Physical tamper-evidence for installed hardware — seals on server chassis, tamper-detecting rack enclosures, logged physical access — is standard practice in classified environments but not in commercial data centers running AI training workloads. As frontier model training costs reach hundreds of millions of dollars, the gap between the physical security posture of a commercial hyperscaler and the value of the assets it protects is widening. FPGA-based network taps (§) address the link-level problem: a tap with a formally verified packet-processing pipeline can guarantee that it faithfully mirrors traffic without injection or modification — a property that no software tap running on a general-purpose OS can credibly claim.

Bibliography

[1] A. Classen, D. Hansma, T.-N. Ngo, M. Becker, and K. Rieck, “Evil from Within: Machine Learning Backdoors through Hardware Trojans,” in Proceedings of the IEEE Symposium on Security and Privacy, 2025. [Online]. Available: https://arxiv.org/abs/2304.08411
[2] K. Basu et al., “CAD-Base: An Attack Vector into the Electronics Supply Chain,” ACM Transactions on Design Automation of Electronic Systems (TODAES), vol. 24, no. 4, pp. 38:1–38:30, 2019, doi: 10.1145/3315574.
[3] J. Harrison, “Formal Verification at Intel,” in 18th Annual IEEE Symposium on Logic in Computer Science (LICS), 2003, pp. 45–54. doi: 10.1109/LICS.2003.1210044.
[4] A. Reid, “Trustworthy Specifications of ARM v8-A and v8-M System Level Architecture,” in Proceedings of Formal Methods in Computer-Aided Design (FMCAD), 2016. doi: 10.1109/FMCAD.2016.7886675.
[5] YosysHQ, “riscv-formal: RISC-V Formal Verification Framework.” [Online]. Available: https://github.com/YosysHQ/riscv-formal
[6] OpenHW Group, “CORE-V-VERIF: Functional Verification for CORE-V RISC-V Cores.” [Online]. Available: https://github.com/openhwgroup/core-v-verif
[7] J. Choi, M. Vijayaraghavan, B. Sherman, A. Chlipala, and Arvind, “Kami: A Platform for High-Level Parametric Hardware Specification and Its Modular Verification,” Proceedings of the ACM on Programming Languages (ICFP), vol. 1, 2017, doi: 10.1145/3110268.
[8] NVIDIA, “Confidential Compute on NVIDIA Hopper H100.” [Online]. Available: https://images.nvidia.com/aem-dam/en-zz/Solutions/data-center/HCC-Whitepaper-v1.0.pdf
[9] H. Maia, C. Chang, D. Sanchez, and S. Devadas, “Can one hear the shape of a neural network?: Snooping the GPU via magnetic side channel,” in Proceedings of the 31st USENIX Security Symposium, 2022.
[10] Z. Zhan et al., “Graphics Peeping Unit: Exploiting EM Side-Channel Information of GPUs,” in Proceedings of the IEEE Symposium on Security and Privacy (S&P), 2022. doi: 10.1109/SP46214.2022.9833773.
[11] Semiconductor Industry Association, “Winning the Battle Against Counterfeit Semiconductor Products.” [Online]. Available: https://www.semiconductors.org/wp-content/uploads/2018/01/SIA-Anti-Counterfeiting-Whitepaper.pdf
[12] Semiconductor Industry Association and Boston Consulting Group, “Strengthening the Global Semiconductor Supply Chain in an Uncertain Era,” technical report, Apr. 2021. [Online]. Available: https://www.semiconductors.org/strengthening-the-global-semiconductor-supply-chain-in-an-uncertain-era/

Hardware & Physical Supply Chain

Hardware and Physical Supply Chain

Chip Architecture

Physical Security

Bibliography