SIGMA Builds AI Agents from Reusable Skills, Outperforms Best Static-Topology Baseline by Up to 2.36 Points Across Six Benchmarks
New framework composes agents on the fly using skill libraries, outperforming static topologies.
SIGMA, introduced by a team of researchers (Kun Zeng et al., EMNLP 2026), tackles a key limitation in multi-agent systems: static agent definitions that can't combine skills for novel tasks. Instead of optimizing communication over a fixed set of agents, SIGMA models each agent as a task-conditioned bundle of skills drawn from a library. It predicts a skill-agent incidence matrix that records which skills belong to which agent, uses that assignment to compose the agents, and then decodes their communication topology. During execution, skill-specific mailboxes route each message to the relevant capability.
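To make the idea of a skill-agent incidence matrix and skill-specific mailboxes concrete, here is a toy Python sketch. It is not SIGMA's implementation: the skill names, the hand-written matrix, the hand-picked topology, and the helpers `compose_agents`, `Agent`, and `route` are all assumptions made for illustration, whereas SIGMA predicts the matrix and decodes the topology per task.

```python
# Hypothetical skill library (names invented for this sketch).
SKILLS = ["parse_problem", "write_code", "run_tests", "verify_math"]

# INCIDENCE[i][j] == 1 means skill i is assigned to agent j.
# SIGMA predicts this matrix per task; here it is fixed by hand.
INCIDENCE = [
    [1, 0, 0],  # parse_problem -> agent 0
    [0, 1, 0],  # write_code    -> agent 1
    [0, 1, 1],  # run_tests     -> agents 1 and 2
    [1, 0, 1],  # verify_math   -> agents 0 and 2
]


def compose_agents(skills, incidence):
    """Read each column of the incidence matrix as one agent's skill bundle."""
    n_agents = len(incidence[0])
    return [
        [skill for skill, row in zip(skills, incidence) if row[j] == 1]
        for j in range(n_agents)
    ]


class Agent:
    """An agent with one mailbox per skill it holds."""

    def __init__(self, name, skills):
        self.name = name
        self.mailboxes = {skill: [] for skill in skills}

    def deliver(self, skill, message):
        """Accept a message only if this agent holds the addressed skill."""
        if skill not in self.mailboxes:
            return False
        self.mailboxes[skill].append(message)
        return True


def route(agents, edges, sender, skill, message):
    """Send along directed edges out of `sender`; return names of acceptors."""
    receivers = []
    for src, dst in edges:
        if src == sender and agents[dst].deliver(skill, message):
            receivers.append(agents[dst].name)
    return receivers


bundles = compose_agents(SKILLS, INCIDENCE)
agents = [Agent(f"agent_{j}", bundle) for j, bundle in enumerate(bundles)]

# Hand-picked directed communication topology (an assumption of this sketch).
EDGES = [(0, 1), (0, 2), (1, 2)]

delivered = route(agents, EDGES, 0, "run_tests", "please test solution v1")
print(bundles)
print(delivered)
```

In this sketch a message addressed to `run_tests` reaches only the neighbors that hold that skill, which is the routing behavior the mailboxes are described as providing; how SIGMA learns the matrix and the topology is outside what this example shows.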
Tested on six reasoning and coding benchmarks using three base LLMs (e.g., GPT variants), SIGMA achieves the highest average performance, improving over the best static-topology baseline, CARD, by 2.06 to 2.36 points. Its performance also drops by only 0.96 points on average when it is tested with unseen skill libraries, which indicates strong generalization. The code is available on GitHub. These results suggest that dynamically composing agents from skills is a promising new axis for multi-agent design, beyond just optimizing how agents talk to each other.
- SIGMA dynamically constructs agents as task-conditioned bundles of reusable skills drawn from a skill library.
- Outperforms CARD by 2.06–2.36 points on average across 6 reasoning and coding benchmarks with 3 LLMs.
- Shows strong robustness: only a 0.96-point performance drop on average with unseen skill libraries.
Why It Matters
Composing agents from reusable skills at task time could let multi-agent systems adapt to novel tasks without retraining, saving time and improving generalization.