EndoGov: Multi-agent AI enforces clinical rules for cancer risk

arXiv cs.MA April 28, 2026

⚡AI that follows clinical guidelines hits 97.3% macro AUC in endometrial cancer risk stratification

Deep Dive

A multi-institution research team has introduced EndoGov, a knowledge-governed multi-agent expert system designed to enforce clinical guideline compliance in endometrial cancer (EC) risk stratification. Standard multimodal AI models optimize for aggregate accuracy but can ignore mandatory clinical overrides, such as assigning POLE-mutated tumors to the low-risk group regardless of high-grade morphology. EndoGov instead splits decision-making into two explicit tiers. Tier 1 deploys specialist agents (pathology, molecular, and clinical) that independently generate schema-constrained reports from frozen foundation-model features or structured records. Tier 2, the governance agent, then queries an evidence-level-weighted Guideline Knowledge Graph, applying deterministic hard-path rules for high-priority overrides and constrained soft-path reasoning for ambiguous cases.
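The two-tier split described above can be illustrated with a minimal Python sketch. This is not EndoGov's implementation: the class names, field names, and the `govern` function are hypothetical, and the only rule encoded is the POLE override mentioned in the text. The soft path is left as a stub because its reasoning is not detailed here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpecialistReport:
    """Hypothetical schema-constrained Tier 1 output (field names are illustrative)."""
    pole_mutated: Optional[bool]  # from the molecular agent; None if not assessed
    high_grade: Optional[bool]    # from the pathology agent; None if not assessed


@dataclass(frozen=True)
class Decision:
    risk_group: str  # e.g. "low", or "unresolved" when deferred
    path: str        # "hard" (deterministic rule) or "soft" (constrained reasoning)
    rationale: str


def soft_path(report: SpecialistReport) -> Decision:
    """Stub for constrained soft-path reasoning; the real logic is not specified here."""
    return Decision("unresolved", "soft", "no hard-path rule fired; deferred to soft path")


def govern(report: SpecialistReport) -> Decision:
    """Hypothetical Tier 2 governance step: hard-path overrides are checked first."""
    # Hard-path override taken from the article's example: POLE-mutated tumors
    # go to the low-risk group regardless of high-grade morphology.
    if report.pole_mutated is True:
        return Decision("low", "hard", "POLE-mutated override")
    return soft_path(report)


decision = govern(SpecialistReport(pole_mutated=True, high_grade=True))
print(decision.risk_group, decision.path)
```

Because the hard path is ordinary deterministic code, its outcome does not depend on any model call, which is consistent with the invariance to the LLM backend reported for hard-path compliance.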

On the TCGA-UCEC cohort (n=541), EndoGov achieved 0.943 accuracy, 0.973 macro AUC, and a conditional logic-violation rate (C-LVR) of just 0.93% among trigger-exposed cases, meaning it almost never broke a clinical rule when one applied. On the CPTAC-UCEC cohort (n=95), where reference labels are guideline-derived, EndoGov reached 0.842 accuracy compared with less than 0.31 for locked-transfer neural baselines, indicating that the governance pathway transfers robustly under distribution shift. End-to-end safety decomposition showed that residual failures stemmed primarily from upstream molecular detection, not the governance layer. Backend-swap experiments further confirmed that hard-path compliance is invariant to the LLM backend, so rule enforcement does not depend on which model is used. This work offers a practical blueprint for auditable, guideline-compliant AI in high-stakes medical decisions.
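To make the C-LVR figure concrete, here is a small Python sketch of one plausible way to compute a conditional logic-violation rate. The exact definition is not given in this summary, so the sketch assumes it is the share of trigger-exposed cases (cases where some hard rule mandates a risk group) whose prediction differs from the mandated group. The field names and the toy records are invented for illustration and are not study data.

```python
from typing import Optional, Sequence, TypedDict


class Case(TypedDict):
    predicted: str           # risk group output by the system
    required: Optional[str]  # group mandated by a triggered rule; None if no rule fired


def conditional_lvr(cases: Sequence[Case]) -> float:
    """Violations divided by trigger-exposed cases (assumed definition of C-LVR)."""
    exposed = [c for c in cases if c["required"] is not None]
    if not exposed:
        raise ValueError("no trigger-exposed cases; the rate is undefined")
    violations = sum(1 for c in exposed if c["predicted"] != c["required"])
    return violations / len(exposed)


# Toy records, made up for this example only.
toy_cases: list[Case] = [
    {"predicted": "low", "required": "low"},
    {"predicted": "high", "required": "low"},   # a rule fired but was not followed
    {"predicted": "low", "required": "low"},
    {"predicted": "high", "required": "high"},
    {"predicted": "intermediate", "required": None},  # not exposed, so excluded
]
print(conditional_lvr(toy_cases))
```

Conditioning on exposed cases matters: dividing by all cases would dilute the rate with patients for whom no override could have been violated in the first place.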

Key Points

Two-tier architecture: specialist agents extract structured evidence, and a governance agent enforces the clinical rule set
Achieved 97.3% macro AUC and 94.3% accuracy on TCGA-UCEC (n=541), with a conditional logic-violation rate of only 0.93% among trigger-exposed cases
Outperformed locked-transfer neural baselines (84.2% vs <31% accuracy) on CPTAC-UCEC under distribution shift

Why It Matters

Enables auditable, guideline-compliant AI for cancer risk stratification, reducing dangerous rule violations in clinical decisions.
