97.9% Fidelity: New Framework Verifies Multi-Agent AI Safety by Distilling Neural Policies into Decision Trees
97.9% fidelity to neural policies with provable safety guarantees for drone swarms
Multi-agent reinforcement learning (MARL) has enabled drones and autonomous vehicles to coordinate through emergent communication, but the neural policies behind that coordination remain black boxes, a critical safety risk for real-world deployment. In a paper accepted at IROS 2026, researchers Ahmad Farooq and Kamran Iqbal introduce what they present as the first framework that formally verifies safety properties of learned multi-agent communication policies by distilling them into interpretable decision trees.
The pipeline has four stages: domain-specific feature extraction; decision tree distillation, which reaches 97.9% ± 1.2% fidelity to the original neural networks; automated translation into specifications for the PRISM probabilistic model checker; and compositional verification of Probabilistic Computation Tree Logic (PCTL) properties using pairwise decomposition.
Evaluated on Vector-Quantized Variational Information Bottleneck (VQ-VIB) policies for multi-drone coordination with 5–7 agents, the framework checked 18 temporal logic properties covering safety, liveness, and cooperation, of which 88.9% (16 of 18) were satisfied. All five safety thresholds were met, including a collision probability of 0.3% against a 1% threshold. Monte Carlo validation confirmed that the verified properties transfer to the original neural policies with a deviation of at most 0.6 percentage points (95% CI). The discrete VQ-VIB messages gave fidelity advantages of 11.6 to 13.6 percentage points over continuous methods and made verification 3–4x faster. The framework bridges deep MARL and formal safety workflows, offering a practical path to certifying multi-robot systems.
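To make the fidelity figure concrete, here is a minimal Python sketch that assumes fidelity means the fraction of sampled observations on which the distilled tree picks the same action as the original policy; the article does not spell out the exact definition. The two policies, the observation layout (distance, message), the action names, and the thresholds are all hypothetical stand-ins, not the authors' models.

```python
import random

def teacher_policy(obs):
    """Hypothetical stand-in for a trained neural policy (returns a discrete action)."""
    distance, message = obs
    if distance < 2.0:
        return "avoid"
    if message == 1 and distance < 5.0:
        return "yield"
    return "advance"

def tree_policy(obs):
    """Hypothetical distilled decision tree, written out as nested threshold tests."""
    distance, message = obs
    if distance < 2.0:
        return "avoid"
    if message == 1 and distance < 4.5:  # threshold deliberately off from the teacher
        return "yield"
    return "advance"

def fidelity(teacher, student, observations):
    """Fraction of observations on which the student matches the teacher's action."""
    agree = sum(teacher(o) == student(o) for o in observations)
    return agree / len(observations)

# Sample invented observations: a distance in [0, 10) and a binary discrete message.
rng = random.Random(0)
observations = [(rng.uniform(0.0, 10.0), rng.randrange(2)) for _ in range(10_000)]

score = fidelity(teacher_policy, tree_policy, observations)
print(f"fidelity = {score:.1%}")
```

In this toy setup the two policies disagree only in a narrow band of distances when the message is 1, so the score lands a little below 100%. A real evaluation would draw observations from rollouts of the trained agents rather than from a uniform sampler.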
- Decision tree distillation achieves 97.9% ± 1.2% fidelity to original neural policies
- Verified 18 temporal logic properties on VQ-VIB drone coordination, with collision probability of 0.3% (threshold: 1%)
- Discrete VQ-VIB messages provide 3–4x faster verification and +11.6–13.6 pp fidelity boost over continuous methods
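A threshold such as "collision probability at most 1%" corresponds to a PCTL reachability query of the form P<=0.01 [ F collision ]. The Python sketch below shows, on a toy discrete-time Markov chain, the kind of computation a probabilistic model checker performs for such a query. The states and transition probabilities are invented for illustration; they are not the paper's model, and this plain value iteration is not PRISM itself.

```python
# Toy discrete-time Markov chain (DTMC). Every state name and probability
# here is invented for illustration and is not taken from the paper.
TRANSITIONS = {
    "cruise":    {"cruise": 0.90, "near": 0.05, "done": 0.05},
    "near":      {"cruise": 0.90, "near": 0.098, "collision": 0.002},
    "collision": {"collision": 1.0},  # absorbing unsafe state
    "done":      {"done": 1.0},       # absorbing safe state
}

def reach_probability(transitions, target, tol=1e-12, max_iters=100_000):
    """Probability of eventually reaching `target` from each state.

    Uses value iteration starting from 0, which converges to the least
    fixed point of the reachability equations.
    """
    prob = {s: 1.0 if s == target else 0.0 for s in transitions}
    for _ in range(max_iters):
        new = {
            s: 1.0 if s == target
            else sum(p * prob[t] for t, p in transitions[s].items())
            for s in transitions
        }
        delta = max(abs(new[s] - prob[s]) for s in transitions)
        prob = new
        if delta < tol:
            break
    return prob

def satisfies_upper_bound(transitions, start, target, bound):
    """Evaluate the PCTL-style query P<=bound [ F target ] from `start`."""
    p = reach_probability(transitions, target)[start]
    return p, p <= bound

p_collision, ok = satisfies_upper_bound(TRANSITIONS, "cruise", "collision", 0.01)
print(f"P(eventually collision) = {p_collision:.4%}, within 1% bound: {ok}")
```

For this chain the reachability equations can also be solved by hand, which gives a quick cross-check of the iteration. A model checker applies the same idea to far larger state spaces, which is where the discrete message alphabet and the pairwise decomposition described above would matter.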
Why It Matters
Presented as the first framework to formally verify safety properties of learned multi-agent communication policies, a step toward certifying safety-critical drone and vehicle fleets.