AI Safety

Nvidia's AI Chip Verification: Die-Level Too Slow for 2030 Timelines

Die-level chip verification takes years; board-level and software fixes might save the timeline.

Deep Dive

In a LessWrong post on compute verification for short AI timelines (pre-2030), skunnavakkam examines Nvidia's chip generations (Blackwell, Rubin, and Feynman) and asks whether die-level hardware verification is feasible in time. Die-level changes take over two years, so even if work started immediately, the earliest chip that could carry them would be Feynman in 2028. Rubin GPUs are due for release later this year, and chips typically reach deployment about a year after release, so die-level verification is too slow for urgent coordination.

Instead, the author proposes board-level and software/firmware modifications. Board-level changes, such as adding an MCU that hashes the weights sent to the die, can be implemented in months and could catch unauthorized weight updates. Software/firmware changes take mere weeks and can be applied at any production stage. To verify Rubin chips, such changes must be incorporated now. The author emphasizes that a mix of board and firmware modifications, though imperfect, offers a tractable path to compute verification on short timelines.
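The weight-hashing idea can be illustrated with a minimal sketch. This is not the post's design: the choice of SHA-256, the idea of comparing against a set of approved digests, and the names digest_weight_stream and is_authorized are all assumptions made for illustration. A real MCU would run firmware on the board rather than Python, and the post does not specify how digests would be reported or checked.

```python
import hashlib
import hmac


def digest_weight_stream(chunks):
    """Hash a stream of weight bytes incrementally, as a board-level
    monitor might while weights are transferred to the die.

    `chunks` is any iterable of bytes objects (hypothetical transfer units).
    """
    h = hashlib.sha256()  # assumed hash function; the source names none
    for chunk in chunks:
        h.update(chunk)
    return h.hexdigest()


def is_authorized(chunks, approved_digests):
    """Return True if the streamed weights match any approved digest.

    `approved_digests` is a hypothetical collection of hex digests for
    weight sets that have been declared to a verifier.
    """
    observed = digest_weight_stream(chunks)
    # Constant-time comparison avoids leaking how many characters matched.
    return any(hmac.compare_digest(observed, d) for d in approved_digests)


if __name__ == "__main__":
    declared = [b"layer0-weights", b"layer1-weights"]
    approved = {digest_weight_stream(declared)}

    print(is_authorized(declared, approved))
    # A changed chunk stands in for an undeclared weight update.
    print(is_authorized([b"layer0-weights", b"layer1-tampered"], approved))
```

Hashing incrementally means the monitor never needs to hold the full weight set in memory, which matters for a small microcontroller. Any change to the streamed bytes produces a different digest, which is how such a check could flag an undeclared weight update.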

Key Points
  • Die-level verification takes >2 years; earliest verified chip would be Nvidia's Feynman in 2028.
  • Board-level changes (e.g., MCU-based weight hashing) are feasible in months but must start now to cover Rubin chips.
  • Software/firmware modifications take weeks and can be applied at any production stage, making them the fastest option.

Why It Matters

Verifying AI compute before 2030 means shifting effort now from die-level changes to board-level and software/firmware changes.