Mozi: Governed Autonomy for Drug Discovery LLM Agents

Researchers introduce a governed supervisor-worker hierarchy that enforces tool isolation and drives reflection-based replanning.

Deep Dive

A research team led by He Cao has introduced Mozi, an architecture designed to bring governed autonomy to large language model (LLM) agents in the high-stakes domain of drug discovery. The system targets two bottlenecks that have kept AI agents from being deployed reliably in pharmaceutical pipelines: unconstrained tool use and poor long-horizon reliability. Mozi's core idea is a dual-layer design that pairs the flexibility of generative AI with the deterministic rigor required in computational biology, with the aim of stopping early-stage errors from compounding multiplicatively and derailing autonomous scientific workflows.
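The multiplicative compounding mentioned above can be illustrated with a small calculation. This is a minimal sketch, not a model from the paper: it assumes every step succeeds independently with the same probability, and the function name `chain_success` and the 0.95 figure are illustrative choices rather than reported values.

```python
def chain_success(per_step_reliability: float, steps: int) -> float:
    """Probability that every step in a chain succeeds.

    Assumes steps fail independently and share one reliability value,
    which is a simplification chosen for illustration.
    """
    if not 0.0 <= per_step_reliability <= 1.0:
        raise ValueError("per_step_reliability must be in [0, 1]")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_reliability ** steps


# Print how end-to-end reliability falls as the workflow gets longer.
for n in (1, 5, 10, 20):
    print(f"{n:>2} steps at 0.95 per step -> {chain_success(0.95, n):.3f}")
```

Under these assumptions, a 20-step workflow with 95% per-step reliability finishes without error only about 36% of the time, which is why catching errors at early stages matters for long-horizon agents.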

Mozi's architecture consists of two planes. The Control Plane establishes a governed supervisor-worker hierarchy that enforces role-based tool isolation, limits execution to constrained action spaces, and drives reflection-based replanning. The Workflow Plane operationalizes the canonical drug discovery stages, from Target Identification to Lead Optimization, as stateful, composable skill graphs. This layer integrates strict data contracts and strategic human-in-the-loop (HITL) checkpoints to safeguard scientific validity. Evaluated on the PharmaBench benchmark, Mozi demonstrated higher orchestration accuracy than existing baselines. In end-to-end case studies, the authors report that it navigated massive chemical spaces, enforced stringent toxicity filters, and generated highly competitive in silico drug candidates, moving the LLM from a fragile conversationalist toward a traceable and reliable co-scientist.
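The two planes can be sketched in a few lines of Python. This is a hypothetical illustration under stated assumptions, not Mozi's actual code or API: the role names, tool names, intermediate stage names, and the `Stage`, `call_tool`, and `run_workflow` helpers are all invented for this example, the skill graph is reduced to a linear sequence, and reflection-based replanning is left out for brevity.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical role-to-tool allowlist (Control Plane: role-based tool isolation).
ROLE_TOOLS: Dict[str, Set[str]] = {
    "target_worker": {"literature_search"},
    "chemistry_worker": {"generate_molecules", "toxicity_filter"},
}


class ToolAccessError(PermissionError):
    """Raised when a worker requests a tool outside its role's allowlist."""


def call_tool(role: str, tool: str, registry: Dict[str, Callable], state: dict) -> dict:
    # Refuse any tool that is not explicitly granted to this role.
    if tool not in ROLE_TOOLS.get(role, set()):
        raise ToolAccessError(f"{role} may not call {tool}")
    return registry[tool](state)


@dataclass
class Stage:
    name: str
    role: str
    tool: str
    required_outputs: Set[str]          # data contract: keys the stage must produce
    needs_human_approval: bool = False  # HITL checkpoint flag


def run_workflow(
    stages: List[Stage],
    registry: Dict[str, Callable],
    state: dict,
    approve: Callable[[str, dict], bool],
) -> Tuple[dict, List[str]]:
    """Run stages in order, checking contracts and checkpoints (Workflow Plane)."""
    trace: List[str] = []
    for stage in stages:
        result = call_tool(stage.role, stage.tool, registry, state)
        missing = stage.required_outputs - set(result)
        if missing:
            raise ValueError(f"{stage.name} violated its contract, missing: {sorted(missing)}")
        if stage.needs_human_approval and not approve(stage.name, result):
            trace.append(f"{stage.name}: rejected")
            break  # stop before unapproved results enter the shared state
        state = {**state, **result}
        trace.append(f"{stage.name}: ok")
    return state, trace


# Toy tools standing in for real discovery software (placeholder data only).
registry = {
    "literature_search": lambda s: {"target": "EXAMPLE_TARGET"},
    "generate_molecules": lambda s: {"candidates": ["mol_a", "mol_b"]},
    "toxicity_filter": lambda s: {"candidates": [m for m in s["candidates"] if m != "mol_b"]},
}
stages = [
    Stage("target_identification", "target_worker", "literature_search", {"target"}),
    Stage("hit_generation", "chemistry_worker", "generate_molecules", {"candidates"}),
    Stage("toxicity_screen", "chemistry_worker", "toxicity_filter", {"candidates"},
          needs_human_approval=True),
]
final_state, trace = run_workflow(stages, registry, {}, approve=lambda name, result: True)
print(trace)
print(final_state)
```

The allowlist check happens before any tool runs, so a worker cannot reach a tool outside its role even if the LLM proposes it, and the returned trace gives a per-stage record of what ran and what a reviewer approved or rejected.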

Key Points
  • Introduces a dual-layer architecture with a Control Plane for governed tool-use and a Workflow Plane with stateful skill graphs for drug discovery stages.
  • Designed to mitigate error accumulation in long-horizon tasks through tool isolation, reflection-based replanning, and strategic human-in-the-loop checkpoints.
  • Demonstrated higher orchestration accuracy than existing baselines on the PharmaBench benchmark and generated competitive in silico drug candidates in end-to-end case studies.

Why It Matters

Mozi offers a blueprint for deploying reliable, auditable AI agents in critical scientific domains, which could accelerate pharmaceutical R&D.