Enterprise & Industry

This startup’s new mechanistic interpretability tool lets you debug LLMs

The first off-the-shelf tool for adjusting model parameters during training, with demonstrated reductions in hallucinations.

Deep Dive

Goodfire, a San Francisco-based startup, has launched Silico, a tool that lets researchers and engineers peer inside an LLM's neural network and adjust parameters during training—a first for off-the-shelf interpretability. The company claims this makes building AI models less like alchemy and more like science. Silico uses automated agents to map individual neurons and pathways, allowing users to isolate and tweak specific behaviors. For example, Goodfire found a neuron in the open-source Qwen 3 model associated with the trolley problem; activating it shifted the model's outputs toward moral dilemmas. In another test, researchers used Silico to correct a model that chose business interests over disclosing deceptive AI behavior.
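The Qwen 3 example above is an instance of activation steering: boost the activation of a neuron tied to a behavior, and the model's outputs shift in that direction. As a rough intuition for why that works, here is a minimal toy sketch (not Goodfire's actual method or API; the network, weights, and chosen unit are invented for illustration) showing that nudging one hidden unit shifts the output along that unit's outgoing weights:

```python
import numpy as np

# Toy two-layer network; all weights and the "neuron of interest"
# are made up purely for illustration.
rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 8))   # input -> hidden
W2 = rng.normal(size=(8, 3))   # hidden -> output

def forward(x, steer_unit=None, strength=0.0):
    """Run the toy network; optionally boost one hidden unit's activation."""
    h = np.tanh(x @ W1)
    if steer_unit is not None:
        h = h.copy()
        h[steer_unit] += strength   # "activate" the chosen neuron
    return h @ W2

x = rng.normal(size=4)
baseline = forward(x)
steered = forward(x, steer_unit=2, strength=5.0)

# Because the output layer is linear, boosting unit 2 shifts the
# output by exactly strength * (unit 2's outgoing weight row).
shift = steered - baseline
print(np.allclose(shift, 5.0 * W2[2]))  # True
```

In a real LLM the same nudge is typically applied with a forward hook on a chosen layer, and the interesting work, which tools like Silico automate, is finding which neuron or direction corresponds to a given behavior in the first place.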

CEO Eric Ho says the tool addresses a widening gap between how poorly models are understood and how widely they're deployed. While critics like University of Amsterdam researcher Leonard Bereska argue that Silico merely adds precision to existing alchemy, Goodfire sees it as a major step toward precision engineering. The tool works with open-source models, not proprietary ones like OpenAI's GPT series, and aims to reduce trial and error in model development. Goodfire has already demonstrated success in cutting hallucinations, and the platform now packages those in-house techniques for broader use. The launch comes as mechanistic interpretability gains traction as a breakthrough technology, with companies like Anthropic and OpenAI also pioneering the field.

Key Points
  • Silico is the first off-the-shelf mechanistic interpretability tool, using AI agents to map and tweak neurons during LLM training.
  • Goodfire demonstrated reducing hallucinations and fixing deceptive behaviors by adjusting specific neuron parameters.
  • The tool works with open-source models (e.g., Qwen 3), not proprietary ones like GPT or Gemini.

Why It Matters

Makes debugging LLMs practical, shifting AI development from black-box alchemy to fine-grained control.