New RFP on Interpretability from Schmidt Sciences
New $300k-$1M grants fund research to catch AI models that knowingly give harmful or misleading advice.
Schmidt Sciences, the philanthropic organization founded by former Google CEO Eric Schmidt, has issued a major Request for Proposals (RFP) to tackle one of AI's most pressing safety challenges: deceptive behavior. The initiative offers grants ranging from $300,000 to $1 million for one- to three-year projects, with a proposal deadline of May 26, 2026. Its core mission is to fund interpretability methods that can detect when large language models (LLMs) knowingly give misleading or harmful advice, a scenario where a model's internal representations contradict its external outputs.
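To make that scenario concrete, a common starting point in the interpretability literature is a linear probe trained on a model's hidden activations. The sketch below is illustrative only and is not taken from the RFP: the model name, the tiny labeled dataset, and the probed layer are all placeholder assumptions, and a serious proposal would have to show such a probe beats black-box baselines at scale.

```python
# Minimal linear-probe sketch: fit a classifier on hidden states for
# statements with known truth values, then flag cases where the probe's
# read of the internals disagrees with what the model says out loud.
# Model, layer, and dataset are illustrative placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from sklearn.linear_model import LogisticRegression

MODEL = "gpt2"  # stand-in; real proposals would target far larger LLMs
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL, output_hidden_states=True)
model.eval()

def last_token_activation(text: str, layer: int = -1) -> torch.Tensor:
    """Hidden state of the final token at the chosen layer."""
    with torch.no_grad():
        out = model(**tok(text, return_tensors="pt"))
    return out.hidden_states[layer][0, -1]

# Toy labeled data: statements whose truth value the model plausibly "knows".
statements = [
    ("The Earth orbits the Sun.", 1),
    ("Water boils at 100 degrees Celsius at sea level.", 1),
    ("The Sun orbits the Earth.", 0),
    ("Humans can breathe unaided underwater.", 0),
]
X = torch.stack([last_token_activation(s) for s, _ in statements]).numpy()
y = [label for _, label in statements]
probe = LogisticRegression(max_iter=1000).fit(X, y)

# A low internal "truth score" for a claim the model nonetheless asserts
# is the kind of deception signal the RFP asks researchers to surface.
claim = "The Sun orbits the Earth."
score = probe.predict_proba(last_token_activation(claim).numpy().reshape(1, -1))[0, 1]
print(f"probe P(true) = {score:.2f}")
```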
The RFP outlines three primary research directions. First, it seeks tools for detecting deceptive behaviors, moving beyond academic benchmarks to address concrete risks. Second, it aims to develop 'steering' methods that intervene on a model's internal reasoning to improve truthfulness, as sketched below. Critically, proposed techniques must outperform baselines that don't rely on access to model weights, demonstrating that mechanistic understanding offers a genuine advantage. Finally, the program encourages research into practical applications, exploring how these detection and steering methods can improve human-AI collaboration and multi-agent systems.
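As a rough illustration of the second direction, one generic activation-steering recipe computes a 'truthfulness direction' as the difference of mean activations over true versus false statements, then adds it to one layer's output during generation. This is a sketch of that generic technique, not the RFP's prescribed method; the model, layer index, steering coefficient, and toy statement sets are all assumptions.

```python
# Minimal activation-steering sketch: derive a "truthfulness direction"
# from mean activations over true vs. false statements, then add it to
# one block's output via a forward hook while generating.
# Model, LAYER, and the coefficient are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "gpt2"  # stand-in for a much larger target model
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)
model.eval()

LAYER = 6  # which transformer block to steer (a tunable choice)

def mean_activation(texts):
    """Mean last-token hidden state at the output of block LAYER."""
    acts = []
    for t in texts:
        with torch.no_grad():
            hs = model(**tok(t, return_tensors="pt"),
                       output_hidden_states=True).hidden_states
        acts.append(hs[LAYER + 1][0, -1])  # hs[0] is the embedding output
    return torch.stack(acts).mean(dim=0)

true_stmts = ["The Earth orbits the Sun.", "Paris is the capital of France."]
false_stmts = ["The Sun orbits the Earth.", "Berlin is the capital of France."]
direction = mean_activation(true_stmts) - mean_activation(false_stmts)
direction = direction / direction.norm()

def steer(module, inputs, output, coeff=4.0):
    # GPT-2 blocks return a tuple; output[0] holds the hidden states.
    return (output[0] + coeff * direction,) + output[1:]

handle = model.transformer.h[LAYER].register_forward_hook(steer)
prompt = "Q: Does the Sun orbit the Earth? A:"
ids = tok(prompt, return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=12,
                                pad_token_id=tok.eos_token_id)[0]))
handle.remove()  # detach the hook so later generations are unsteered
```

The RFP's key hurdle applies directly to interventions like this one: they must measurably beat prompting-only baselines that never touch the weights.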
Schmidt Sciences is positioning this as a pilot program, with the potential for significantly larger future investment if the research demonstrates meaningful progress. The organization will host informational webinars in April 2026 and has committed to supporting the substantial compute these ambitious, risky projects will require. This initiative represents a targeted, well-funded effort to move AI interpretability from theoretical analysis toward practical safety tools that can be deployed against emerging risks in advanced models.
- $300k-$1M grants available for 1-3 year projects focused on AI interpretability and deception detection.
- Seeks tools to detect when LLMs internally know the truth but output harmful advice, and methods to 'steer' them toward truthfulness.
- Proposals are due May 26, 2026, with the pilot program potentially unlocking larger future investments in AI safety.
Why It Matters
Directly funds the tools needed to catch and correct AI models that lie, a critical step for deploying trustworthy advanced AI.