Preserving Disagreement: Architectural Heterogeneity and Coherence Validation in Multi-Agent Policy Simulation
Assigning a different LLM to each value perspective cut consensus by roughly 25 percentage points in policy tests.
A new arXiv paper by Ariel Sela introduces the AI Council, a three-phase deliberation framework designed to combat artificial consensus in multi-agent LLM policy simulations. The study runs 120 deliberations across two policy scenarios—child welfare and housing—and tests two key interventions. First, architectural heterogeneity, where each value perspective is assigned a different 7-9B parameter model, significantly reduces first-choice concentration compared to a homogeneous baseline: from 70.9% to 46.1% in child welfare (p < 0.001, r = 0.58) and 46.0% to 22.9% in housing (p < 0.001, r = 0.50). This contrasts with accuracy-oriented multi-agent debate, where model diversity does not reduce convergence, suggesting that diversity operates differently when there is no objectively correct answer.
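The headline metric, first-choice concentration, can be read as the share of deliberations whose top pick is the modal option. The following is a minimal sketch of that reading, not the paper's exact definition, and the vote tallies below are invented for illustration:

```python
from collections import Counter

def first_choice_concentration(first_choices):
    """Fraction of deliberations that land on the single most popular option.

    `first_choices` is a list of option labels, one per deliberation.
    This is one plausible reading of the metric, not the paper's formula.
    """
    counts = Counter(first_choices)
    modal_count = counts.most_common(1)[0][1]  # size of the largest cluster
    return modal_count / len(first_choices)

# Invented tallies, loosely shaped like the child-welfare numbers:
homogeneous = ["A"] * 71 + ["B"] * 19 + ["C"] * 10   # concentration 0.71
heterogeneous = ["A"] * 46 + ["B"] * 34 + ["C"] * 20  # concentration 0.46
```

Under this reading, heterogeneity lowers concentration simply by spreading first choices across more options, which is what the reported drops (70.9% to 46.1%, 46.0% to 22.9%) describe.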
Second, coherence validation—using a frontier model to check whether each evaluator's reasoning aligns with its assigned values—reveals a fidelity-diversity tradeoff. On the child welfare scenario (with a dominant option), coherence validation further reduces concentration from 46.1% to 40.8% (p = 0.004). However, on the housing scenario (with genuinely competitive options), it nominally increases concentration from 22.9% to 26.6% (p = 0.96, not statistically significant) by amplifying high-coherence evaluators who cluster on one option. The paper also reports negative results from three failed Delphi designs, shows that 8B models exhibit binary rather than graded responses to counter-arguments, and proposes the trustworthy tension rate as a diagnostic measure for small-model deliberation capabilities.
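One way coherence validation could reshape outcomes is by screening out or down-weighting evaluators whose reasoning does not match their assigned value. The threshold-and-weight scheme below is a hypothetical sketch (the paper's exact procedure may differ), with `judge` standing in for the frontier-model coherence check:

```python
def coherence_weighted_tally(evaluations, judge, threshold=0.5):
    """Tally first choices, weighting each evaluator by a coherence score.

    `evaluations`: list of (assigned_value, reasoning, first_choice) tuples.
    `judge`: callable (assigned_value, reasoning) -> score in [0, 1],
             a stand-in for the frontier-model coherence check.
    Evaluators below `threshold` are dropped; the rest contribute their
    score as vote weight. This is an illustrative scheme, not the paper's.
    """
    tally = {}
    for value, reasoning, choice in evaluations:
        score = judge(value, reasoning)
        if score >= threshold:  # discard low-coherence evaluators
            tally[choice] = tally.get(choice, 0.0) + score
    return tally
```

A scheme like this makes the fidelity-diversity tradeoff visible: if the high-coherence evaluators happen to cluster on one option, weighting by coherence concentrates the outcome even as it improves value fidelity.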
- Architectural heterogeneity (a different 7-9B model per value) cut first-choice concentration by roughly 25 percentage points—a 35-50% relative reduction—across 120 policy deliberations.
- Coherence validation reduced diversity in balanced scenarios (22.9% to 26.6%) but helped in dominant-option scenarios (46.1% to 40.8%).
- 8B models showed binary responses to counter-arguments, limiting nuanced deliberation in smaller systems.
Why It Matters
This framework could make AI policy simulations more realistic by helping them avoid false consensus in public opinion modeling.