EvoGM merges LLMs with generative evolution, beating manual methods
No more hand-crafted operators: a generative model learns how to merge LLMs.
A new paper accepted at ICML 2026 introduces EvoGM (Evolutionary Generative Merging), a framework that automates the composition of large language models (LLMs) through parameter-space search without requiring retraining. Traditional evolutionary model merging relies on stochastic, hand-crafted operators that struggle to navigate the underlying performance landscape. EvoGM replaces these manual heuristics with a learnable generative modeling approach, specifically a dual-generator architecture trained with cycle-consistent learning.
This design allows the system to adaptively sample and refine promising merging candidates. By constructing winner-loser pairs from historical search trajectories, EvoGM learns where high-performing parameters lie while reusing evaluations it has already run, which improves data efficiency. The generative process is integrated into a multi-round evolutionary pipeline in which elite merged models serve as the expert foundations for the next round. Extensive experiments across diverse benchmarks show that EvoGM significantly outperforms existing baselines and generalizes robustly to both seen and unseen tasks. Code and data are publicly available.
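To make the winner-loser idea concrete, here is a minimal Python sketch of how pairs could be built from a search history. It is an illustration under assumptions, not the paper's implementation: the function name `build_winner_loser_pairs`, the `min_gap` threshold, the representation of a candidate as a tuple of merging coefficients with a single score, and the sample numbers are all hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

# Hypothetical record of one evaluated candidate:
# (merging coefficients, benchmark score). Not taken from the paper.
Candidate = Tuple[Tuple[float, ...], float]
Pair = Tuple[Tuple[float, ...], Tuple[float, ...]]


def build_winner_loser_pairs(history: List[Candidate], min_gap: float = 0.0) -> List[Pair]:
    """Pair up past candidates, putting the higher-scoring one first.

    Pairs whose score difference is at most min_gap are skipped,
    since a tie carries no preference signal.
    """
    pairs: List[Pair] = []
    for (coeffs_a, score_a), (coeffs_b, score_b) in combinations(history, 2):
        if abs(score_a - score_b) <= min_gap:
            continue
        if score_a > score_b:
            pairs.append((coeffs_a, coeffs_b))
        else:
            pairs.append((coeffs_b, coeffs_a))
    return pairs


# Made-up search history: three candidates that were already evaluated.
history = [
    ((0.5, 0.5), 0.61),
    ((0.8, 0.2), 0.70),
    ((0.2, 0.8), 0.55),
]
pairs = build_winner_loser_pairs(history)
for winner, loser in pairs:
    print("winner", winner, "loser", loser)
```

In this sketch, n past evaluations can yield up to n(n-1)/2 training pairs, which is one way reusing search history can stretch a limited evaluation budget. Whether EvoGM pairs candidates exhaustively or more selectively is not stated in the summary above.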
- A dual-generator architecture trained with cycle-consistent learning optimizes merging coefficients, replacing hand-crafted operators.
- Winner-loser pairs built from historical search trajectories improve data efficiency and help capture high-performance parameter distributions.
- Outperforms existing baselines across diverse benchmarks and generalizes to unseen tasks; the paper was accepted at ICML 2026.
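For readers new to merging coefficients, the following Python sketch shows the simplest training-free merge: a weighted average of expert parameters. It is a generic illustration, not EvoGM's actual merge rule, which the summary above does not specify. The function `merge_models`, the toy parameter dictionaries, and the coefficient values are hypothetical.

```python
from typing import Dict, List, Sequence

# Toy stand-in for a model: parameter name -> flat list of weights.
Params = Dict[str, List[float]]


def merge_models(experts: Sequence[Params], coeffs: Sequence[float]) -> Params:
    """Merge experts by a coefficient-weighted average of each parameter.

    Coefficients are normalized to sum to 1, and no gradient step is taken,
    so the merge requires no retraining.
    """
    if len(experts) != len(coeffs):
        raise ValueError("need exactly one coefficient per expert")
    total = sum(coeffs)
    if total == 0:
        raise ValueError("coefficients must not sum to zero")
    weights = [c / total for c in coeffs]

    merged: Params = {}
    for name in experts[0]:
        # Group the i-th weight of every expert, then average them.
        columns = zip(*(expert[name] for expert in experts))
        merged[name] = [
            sum(w * value for w, value in zip(weights, column))
            for column in columns
        ]
    return merged


# Two hypothetical experts sharing the same architecture.
math_expert = {"layer.weight": [1.0, 2.0]}
code_expert = {"layer.weight": [3.0, 6.0]}

merged = merge_models([math_expert, code_expert], coeffs=[1.0, 3.0])
print(merged)
```

In this simplified picture, searching over merges means searching over `coeffs`. That is the quantity a learned generator would propose in place of a hand-crafted mutation or crossover step.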
Why It Matters
Enables automated, training-free LLM composition, making model merging smarter and more scalable for real-world deployment.