Perspective: Towards sustainable exploration of chemical spaces with machine learning
New framework tackles AI's massive energy footprint in chemistry, aiming to cut computational waste by 90%.
A major collaboration of 23 scientists from leading institutions has published a critical perspective on unsustainable computational practices in AI-driven materials and drug discovery. The paper, which stems from the 'SusML' workshop in Dresden, argues that the field's reliance on generating enormous quantum-mechanical (QM) datasets to train models incurs prohibitive energy and infrastructure costs. This 'brute-force' approach has enabled progress, but it creates a significant environmental and economic bottleneck for scaling discovery.
The authors propose a strategic shift towards hierarchical, multi-fidelity workflows. This involves using fast, general-purpose machine learning models (ML surrogates) for broad screening and reserving expensive, high-accuracy QM calculations only for the most promising candidates. They highlight key efficiency strategies like active learning, where the AI decides which data points are most valuable to compute next, and model distillation, which compresses large models into smaller, more efficient versions.
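The two-stage idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration, not the authors' method: `expensive_qm` is a toy function standing in for a costly QM calculation, `cheap_surrogate` imitates a fast ML model by adding noise to the true value, and `hierarchical_screen` is a hypothetical helper name.

```python
import random


def expensive_qm(x: float) -> float:
    """Hypothetical stand-in for a costly QM calculation (lower is better)."""
    return (x - 0.3) ** 2


def cheap_surrogate(x: float, noise: float, rng: random.Random) -> float:
    """Hypothetical fast ML surrogate: the true value plus a random error."""
    return expensive_qm(x) + rng.gauss(0.0, noise)


def hierarchical_screen(candidates, budget, noise=0.01, seed=0):
    """Rank all candidates cheaply, then refine only the top `budget` ones.

    Returns the best refined candidate and the number of expensive calls made.
    """
    rng = random.Random(seed)
    # Stage 1: score every candidate with the cheap surrogate and rank them.
    ranked = sorted(candidates, key=lambda x: cheap_surrogate(x, noise, rng))
    # Stage 2: spend the expensive budget only on the top-ranked shortlist.
    shortlist = ranked[:budget]
    refined = {x: expensive_qm(x) for x in shortlist}
    best = min(refined, key=refined.get)
    return best, len(refined)


if __name__ == "__main__":
    pool = [i / 100 for i in range(101)]  # 101 toy "molecules"
    best, n_calls = hierarchical_screen(pool, budget=10)
    print(f"best candidate: {best}, expensive calls: {n_calls} of {len(pool)}")
```

In this toy setting only 10 of the 101 candidates reach the expensive stage. An active-learning variant would presumably replace the one-shot ranking with a loop that retrains the surrogate after each batch of expensive results, but that loop is not shown here.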
Crucially, the perspective emphasizes that sustainability must be coupled with real-world practicality. AI models must therefore account for synthesizability and multi-objective design criteria, not just theoretical performance, so that discoveries can be physically realized. The researchers advocate for open data, reusable workflows, and domain-specific AI systems as foundational pillars for a new era of efficient and responsible scientific discovery.
- Critiques the massive energy costs of generating quantum-mechanical datasets for AI training in chemistry and materials science.
- Proposes a hierarchical workflow using fast ML surrogates broadly and expensive QM methods selectively to optimize resource use.
- Advocates for open data, reusable workflows, and AI that accounts for real-world synthesizability to ensure practical impact.
Why It Matters
This framework could drastically reduce the cost and carbon footprint of discovering new batteries, drugs, and advanced materials.