AI Safety

New RFP on Interpretability from Schmidt Sciences

New $300k-$1M RFP seeks tools to spot when LLMs lie and steer them toward truth.

Deep Dive

Schmidt Sciences, the philanthropic venture founded by former Google CEO Eric Schmidt, has issued a major new Request for Proposals (RFP) focused on a critical frontier in AI safety: interpretability. The organization is offering grants ranging from $300,000 to $1 million for 1-3 year projects aimed at developing tools to detect and mitigate deceptive behaviors in large language models (LLMs). The core challenge is to identify when a model's internal representations contradict its outputs (essentially, catching an AI in a lie) and then to use that mechanistic understanding to steer the model toward more truthful reasoning.
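
For readers unfamiliar with this line of work, the sketch below illustrates the general probe-and-steer recipe the RFP gestures at: fit a linear probe on a model's hidden states to score whether it internally treats a statement as true, then nudge activations along that probe's direction during generation. Everything here is a hypothetical stand-in, not the RFP's method: the model ("gpt2"), the layer choice, the toy dataset, and the steering strength.

```python
# Minimal illustrative sketch of the probe-and-steer idea, NOT a method from
# the RFP. Assumptions (all hypothetical stand-ins): "gpt2" as the model,
# layer 6 as the probe site, and a four-statement toy dataset.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

LAYER = 6  # hypothetical; real work would sweep probes across layers

def hidden_state(text: str) -> torch.Tensor:
    """Mean-pooled hidden state at LAYER for the given text."""
    with torch.no_grad():
        out = model(**tok(text, return_tensors="pt"))
    return out.hidden_states[LAYER][0].mean(dim=0)

# Toy labeled statements (placeholder data, 1.0 = true).
data = [
    ("The sky is blue.", 1.0),
    ("Paris is the capital of France.", 1.0),
    ("Two plus two equals five.", 0.0),
    ("The sun orbits the Earth.", 0.0),
]
X = torch.stack([hidden_state(s) for s, _ in data])
y = torch.tensor([label for _, label in data])

# Detection: a logistic-regression probe learns one linear "truth direction".
probe = torch.nn.Linear(X.shape[1], 1)
opt = torch.optim.Adam(probe.parameters(), lr=1e-3)
for _ in range(200):
    opt.zero_grad()
    loss = torch.nn.functional.binary_cross_entropy_with_logits(
        probe(X).squeeze(-1), y)
    loss.backward()
    opt.step()

# A low probe score on a claim the model nonetheless asserts is the kind of
# internal/external mismatch the RFP wants tools to catch.
score = torch.sigmoid(probe(hidden_state("Two plus two equals five.")))
print(f"probe truth score: {score.item():.2f}")

# Steering: crudely add the (normalized) truth direction to the residual
# stream at LAYER during generation, in the spirit of activation addition.
truth_dir = probe.weight.detach().squeeze(0)
truth_dir = truth_dir / truth_dir.norm()

def steer_hook(module, inputs, output):
    # GPT-2 blocks return a tuple whose first element is the hidden states;
    # 4.0 is an arbitrary steering strength.
    return (output[0] + 4.0 * truth_dir,) + output[1:]

handle = model.transformer.h[LAYER - 1].register_forward_hook(steer_hook)
ids = tok("Two plus two equals", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=5)[0]))
handle.remove()
```

Simple linear probes like this are exactly the baselines the RFP expects funded tools to outperform, at frontier scale and on concrete risks rather than toy datasets.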

Proposals, due by May 26, 2026, should focus on three key directions: detecting deceptive behaviors, creating targeted steering methods to improve truthfulness, and exploring practical applications for these techniques in human-AI teams or multi-agent systems. The RFP emphasizes moving beyond academic benchmarks to address concrete, real-world risks. To support ambitious research, Schmidt Sciences will provide not just funding but also access to cutting-edge GPU/CPU compute resources, software engineering support, and API credits from leading frontier model providers, making this a well-resourced push for actionable AI safety tools.

Key Points
  • Offers $300k-$1M grants for 1-3 year projects focused on detecting when LLMs internally believe something different from what they output.
  • Seeks practical interpretability tools that outperform baseline methods, with proposals due May 26, 2026, and informational webinars in April.
  • Provides extensive support including compute resources, API credits from frontier model providers, and software engineering assistance to funded teams.

Why It Matters

As AI systems take on high-stakes tasks, tools that check a model's honesty against its internal state, rather than trusting its outputs alone, are crucial for ensuring those systems remain trustworthy and aligned with human intent.