Agent Frameworks

One Qwen-3-14B seed agent teaches others to cooperate without training

A single aligned agent can spread cooperation to untrained teammates through natural language.

Deep Dive

Researchers from multiple institutions have demonstrated that a single aligned AI agent can spread cooperative behavior to untrained agents through natural language interaction alone, a phenomenon they call 'Alignment Propagation.' Their testbed is the Red-Black Game, a team-based iterated Prisoner's Dilemma in which agents deliberate and vote on collective actions. They distilled the cooperative reasoning and persuasive dialogue of a teacher model into a Qwen-3-14B model to create a 'seed agent.' When placed among four untrained teammates, the seed agent raised the cooperation rate from 24.8% to 62.2%, roughly a 2.5-fold increase, outperforming both the teacher model and vanilla Gemini-3.1-Pro.
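To make the game structure concrete, here is a minimal Python sketch of one round of a team-based Prisoner's Dilemma with voting. It is only an illustration under stated assumptions: the payoff values, the strict-majority rule, the tie-break, the per-round definition of cooperation rate, and all names (`PAYOFFS`, `team_action`, `play_round`, `cooperation_rate`) are hypothetical and not taken from the study. The labels "cooperate" and "defect" are used because this summary does not say which color maps to which action.

```python
from collections import Counter

COOPERATE, DEFECT = "cooperate", "defect"

# Hypothetical payoffs as (team A score, team B score).
# Illustrative Prisoner's Dilemma values only; not from the study.
PAYOFFS = {
    (COOPERATE, COOPERATE): (3, 3),
    (COOPERATE, DEFECT): (-6, 6),
    (DEFECT, COOPERATE): (6, -6),
    (DEFECT, DEFECT): (-3, -3),
}


def team_action(votes):
    """Collapse individual votes into one team action by strict majority.

    Ties fall back to DEFECT, which is an assumption made for this sketch.
    """
    counts = Counter(votes)
    return COOPERATE if counts[COOPERATE] > counts[DEFECT] else DEFECT


def play_round(votes_a, votes_b):
    """Resolve both teams' votes and look up the resulting payoffs."""
    action_a = team_action(votes_a)
    action_b = team_action(votes_b)
    return action_a, action_b, PAYOFFS[(action_a, action_b)]


def cooperation_rate(team_actions):
    """Fraction of rounds in which the team's collective action was COOPERATE."""
    return sum(a == COOPERATE for a in team_actions) / len(team_actions)


if __name__ == "__main__":
    # A five-agent team: one seed plus four teammates, two of whom it persuaded.
    team_a = [COOPERATE, COOPERATE, COOPERATE, DEFECT, DEFECT]
    team_b = [DEFECT] * 5
    print(play_round(team_a, team_b))
```

In a setup like this, a single persuasive agent matters because the team acts on its collective vote: the seed only needs to shift enough teammates during deliberation to flip the majority.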

Remarkably, the same seed agent, trained exclusively on the Red-Black Game, transferred zero-shot to Sugarscape, a spatially grounded survival simulation with pairwise trading. There it achieved a 91.5% trade success rate versus a 21.6% baseline. These results reframe multi-agent alignment: instead of a problem that requires training every agent, it becomes a social capability that can be engineered by placing seed agents strategically. That could change how safe and cooperative behavior is ensured in large-scale AI systems.

Key Points
  • Seed agent fine-tuned from Qwen-3-14B, using distilled reasoning from a teacher model.
  • Cooperation rate in the Red-Black Game rose from 24.8% to 62.2% with just one seed agent among four untrained agents.
  • Zero-shot transfer to Sugarscape yielded 91.5% trade success, far above the 21.6% baseline.
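
The size of these gains can be checked with a few lines of arithmetic. The sketch below uses only the four percentages reported above; the helper name `gain` is made up for illustration. The two environments report different metrics (cooperation rate and trade success), so the ratios are not directly comparable to each other.

```python
def gain(baseline_pct, treated_pct):
    """Return (absolute gain in percentage points, treated-to-baseline ratio)."""
    return treated_pct - baseline_pct, treated_pct / baseline_pct


if __name__ == "__main__":
    results = {
        "Red-Black Game cooperation rate": (24.8, 62.2),
        "Sugarscape trade success rate": (21.6, 91.5),
    }
    for name, (baseline, treated) in results.items():
        points, ratio = gain(baseline, treated)
        print(f"{name}: +{points:.1f} points, {ratio:.2f}x baseline")
    # Red-Black Game cooperation rate: +37.4 points, 2.51x baseline
    # Sugarscape trade success rate: +69.9 points, 4.24x baseline
```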

Why It Matters

This suggests that multi-agent alignment could shift from training every agent to strategically placing a few cooperative seeds.