Multi-Agent LLM Simulator Recreates Nuclear Team Failures, Passing Up to 53% of Face-Validity Checks
AI model reproduces team failure patterns from Chernobyl and Three Mile Island, including a decision delay within about 3 minutes of the historical record.
TEAM-SimHRA, developed by researchers including Xingyu Xiao, reimagines human reliability analysis (HRA) for high-stakes team environments. Traditional HRA assigns fixed error probabilities to individual tasks, but it fails to capture how team dynamics such as delayed diagnosis, suppressed dissent, and authority-driven error propagation cause catastrophic failures. This multi-agent LLM framework instead treats reliability as an emergent property of team interactions, simulating real-time communication and role-conditioned authority as an accident progresses.
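To make "authority-driven error propagation" concrete, here is a deliberately simple toy model. It is not TEAM-SimHRA's method and involves no LLM: the `Operator` class, the integer `authority` and `assertiveness` scales, and the rule that a subordinate dissents only when assertiveness meets or exceeds the authority gap are all hypothetical, chosen only to show how a propagation depth could be counted along a chain of command.

```python
from dataclasses import dataclass


@dataclass
class Operator:
    name: str
    authority: int      # hypothetical scale: higher means more senior
    assertiveness: int  # hypothetical scale: willingness to challenge a superior


def propagation_depth(chain):
    """Count how many levels an erroneous directive travels down the chain.

    `chain` is ordered from most senior to most junior. Under this toy rule,
    a subordinate voices dissent (and stops the error) only if their
    assertiveness is at least the authority gap to their direct superior.
    """
    depth = 0
    for superior, subordinate in zip(chain, chain[1:]):
        gap = superior.authority - subordinate.authority
        if subordinate.assertiveness >= gap:
            break  # dissent is voiced, so the error stops here
        depth += 1  # dissent is suppressed, so the error moves one level down
    return depth


# A deferential crew: nobody challenges the level above them.
deferential_crew = [
    Operator("shift supervisor", authority=5, assertiveness=0),
    Operator("senior operator", authority=3, assertiveness=1),
    Operator("junior operator", authority=1, assertiveness=1),
]

# Same crew, but the senior operator is willing to push back.
assertive_crew = [
    Operator("shift supervisor", authority=5, assertiveness=0),
    Operator("senior operator", authority=3, assertiveness=3),
    Operator("junior operator", authority=1, assertiveness=1),
]

deferential_depth = propagation_depth(deferential_crew)
assertive_depth = propagation_depth(assertive_crew)
print(deferential_depth, assertive_depth)
```

In this sketch the deferential crew carries the error through both lower levels, while a single assertive operator stops it at the first handoff. That kind of team-level outcome is what a fixed per-task error probability cannot express.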
Validated against two of the most thoroughly documented nuclear accidents, Three Mile Island (1979) and Chernobyl (1986), TEAM-SimHRA achieved face-validity pass rates of 43.5% and 52.6%, respectively. It also reproduced key historical metrics: a decision delay close to the record (134.8 minutes simulated vs. 138 actual), perfect communication suppression stability, and full authority pressure cascades at accurate propagation depths. These results suggest that multi-agent LLM simulations can extract quantitative team-level reliability indicators that traditional methods cannot reach, which could open the way to dynamic probabilistic risk assessment in safety-critical settings such as nuclear plants, aviation, and military command centers.
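As a quick check on the decision-delay figures, the gap between the simulated and historical values can be worked out directly. The snippet below uses only the two numbers reported above; the variable names are illustrative.

```python
# Figures as reported in the article (minutes).
simulated_delay_min = 134.8
historical_delay_min = 138.0

# Absolute gap and gap relative to the historical value.
abs_error_min = historical_delay_min - simulated_delay_min
rel_error = abs_error_min / historical_delay_min

print(f"{abs_error_min:.1f} min, {rel_error:.1%}")  # prints "3.2 min, 2.3%"
```

The simulated delay is therefore about 3.2 minutes short of the historical one, a deviation of roughly 2.3%.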
- Uses multi-agent LLMs to model team interactions, not individual error rates
- Validated against the Three Mile Island (43.5% face-validity pass rate) and Chernobyl (52.6%) accidents
- Reproduced decision delay to within about 3.2 minutes of historical data (134.8 vs 138 min)
- Captures authority pressure cascades and communication suppression at accurate depths
Why It Matters
This framework could transform risk assessment for nuclear, aviation, and other safety-critical team operations.