AI Safety

Goodfire AI extracts cell development algorithm from scGPT via mechanistic interpretability

LessWrong AI March 14, 2026

⚡Researchers extracted a functional cell differentiation algorithm from a 96-attention-unit transformer model using automated hypothesis search.

Deep Dive

Researchers at Goodfire AI have used mechanistic interpretability to extract a performant biological algorithm from a foundation model. The work focused on scGPT, a transformer model trained on millions of single-cell gene expression profiles to predict masked gene values. By systematically searching the model's 96 attention units (12 layers × 8 heads) with an AI executor-reviewer pair, they identified a compact 8-to-10-dimensional manifold within specific attention heads. This geometric structure encoded the course of hematopoietic differentiation, with stem cells at one end and terminally differentiated blood cells such as T cells and monocytes along distinct branches.
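To make the idea of searching every attention unit concrete, here is a minimal sketch in Python. It is not the researchers' executor-reviewer pipeline, whose details the summary does not give. It assumes synthetic per-unit features with a developmental signal planted in one hypothetical unit, and it uses a simple stand-in score (how well the top principal components linearly predict a pseudotime label). All function names, the planted unit (5, 3), and the data are illustrative assumptions.

```python
import numpy as np

N_LAYERS, N_HEADS = 12, 8  # 12 x 8 = 96 attention units, as stated in the article


def score_unit(features, pseudotime, k=10):
    """R^2 of a linear fit from the top-k principal components to pseudotime.

    Stand-in scoring rule for illustration only (an assumption, not the
    criterion used in the original work).
    """
    centered = features - features.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    pcs = u[:, :k] * s[:k]                              # cell coordinates on top-k PCs
    design = np.column_stack([pcs, np.ones(len(pcs))])  # add an intercept column
    coef, *_ = np.linalg.lstsq(design, pseudotime, rcond=None)
    resid = pseudotime - design @ coef
    return 1.0 - resid.var() / pseudotime.var()


def make_synthetic_units(n_cells=300, dim=32, planted=(5, 3), seed=0):
    """Random features for all 96 units; one hypothetical unit carries signal."""
    rng = np.random.default_rng(seed)
    pseudotime = rng.uniform(0.0, 1.0, n_cells)
    units = {}
    for layer in range(N_LAYERS):
        for head in range(N_HEADS):
            feats = rng.normal(size=(n_cells, dim))
            if (layer, head) == planted:
                # Plant a direction whose coordinate tracks pseudotime.
                direction = rng.normal(size=dim)
                feats += 5.0 * np.outer(pseudotime, direction)
            units[(layer, head)] = feats
    return units, pseudotime


def rank_units(units, pseudotime):
    """Score every (layer, head) unit and sort from best to worst."""
    scores = {unit: score_unit(f, pseudotime) for unit, f in units.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


units, pseudotime = make_synthetic_units()
ranking = rank_units(units, pseudotime)
best_unit, best_score = ranking[0]
print(f"top unit: layer {best_unit[0]}, head {best_unit[1]} (R^2 = {best_score:.2f})")
```

In a real analysis the synthetic features would be replaced by activations read out of the model, and the scoring rule by whatever hypothesis is being tested; the exhaustive loop over 96 units is the only part taken directly from the article.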

The discovered manifold wasn't just a curious artifact; it represented a functional algorithm. The team developed a three-stage extraction pipeline to convert this internal representation into a standalone tool. Validation on independent datasets, including the Tabula Sapiens atlas and a multi-donor immune panel, confirmed its accuracy in mapping cell development. This indicates that the model, trained only to predict masked gene expression values, had internally learned and structured a complex biological process (the developmental hierarchy of blood cells) to improve its core task. The finding mirrors a prior discovery in which the Evo 2 DNA model encoded the evolutionary tree of life in its activations.
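The summary does not describe the three extraction stages, so the sketch below illustrates only the general idea of zero-shot transfer: a low-dimensional projection and readout are fitted on one dataset, frozen, and then applied unchanged to a second dataset. The data are synthetic, and the function names, the choice of 8 components, and the linear readout are assumptions made for illustration, not the authors' method.

```python
import numpy as np


def fit_frozen_readout(features, pseudotime, k=8):
    """Fit a k-dimensional PCA basis plus a linear readout on the source data."""
    mean = features.mean(axis=0)
    _, _, vt = np.linalg.svd(features - mean, full_matrices=False)
    basis = vt[:k].T                                    # shape (dim, k)
    coords = (features - mean) @ basis
    design = np.column_stack([coords, np.ones(len(coords))])
    weights, *_ = np.linalg.lstsq(design, pseudotime, rcond=None)
    return {"mean": mean, "basis": basis, "weights": weights}


def apply_frozen_readout(model, features):
    """Apply the frozen projection and readout; nothing is refitted here."""
    coords = (features - model["mean"]) @ model["basis"]
    design = np.column_stack([coords, np.ones(len(coords))])
    return design @ model["weights"]


def synthetic_dataset(n_cells, direction, rng, noise=1.0):
    """Cells whose features drift along a shared direction with pseudotime."""
    pseudotime = rng.uniform(0.0, 1.0, n_cells)
    feats = noise * rng.normal(size=(n_cells, direction.size))
    feats += 5.0 * np.outer(pseudotime, direction)
    return feats, pseudotime


rng = np.random.default_rng(0)
direction = rng.normal(size=32)                        # shared "developmental axis"
src_x, src_t = synthetic_dataset(400, direction, rng)  # stands in for the fitting data
tgt_x, tgt_t = synthetic_dataset(200, direction, rng, noise=1.5)  # held-out dataset

model = fit_frozen_readout(src_x, src_t)
pred = apply_frozen_readout(model, tgt_x)              # zero-shot: no target labels used
transfer_corr = float(np.corrcoef(pred, tgt_t)[0, 1])
print(f"correlation on held-out data: {transfer_corr:.2f}")
```

The point of the design is that the target labels are used only for evaluation; if the frozen readout still tracks development on data it never saw, the structure it captures is unlikely to be an artifact of the fitting set.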

Key Points

Automated search of scGPT's 96 attention units revealed an 8-10D manifold encoding blood cell development.
The extracted geometric algorithm was validated via zero-shot transfer to the independent Tabula Sapiens dataset.
Demonstrates foundation models learn complex, structured biological representations (like differentiation pathways) to perform their training task.

Why It Matters

This provides a blueprint for discovering novel, functional algorithms hidden within black-box AI models, potentially accelerating scientific discovery.
