GEAR: Genetic AutoResearch evolves autonomous agents with population-based search
Population-based search beats single-path for autonomous AI research agents
Autonomous research agents can already run machine learning experiments without human supervision, but many rely on a narrow search strategy: they repeatedly modify one program and keep changes only when they improve the current best result. This single-path approach discards useful partial ideas, alternative promising directions, and insights from failed experiments. GEAR addresses this by maintaining a population of candidate research states, each storing code changes, reflections, and performance data. It selects parents based on productivity, novelty, and coverage, then explores new ideas through mutation and crossover. This allows the system to build on past discoveries while exploring diverse strategies in parallel.
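The loop described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not GEAR's implementation: the `ResearchState` fields, the `evaluate` objective (a stand-in for running a real experiment), the equal weighting of productivity, novelty, and coverage, and the specific definitions of those three signals are all hypothetical choices made for the example.

```python
import math
import random
from dataclasses import dataclass


@dataclass(eq=False)  # compare states by identity, not by field values
class ResearchState:
    """One candidate in the population (field names are illustrative)."""
    params: list               # stand-in for accumulated code changes
    score: float               # performance data; higher is better here
    reflection: str = ""       # short note on how the state was produced
    times_selected: int = 0    # how often this state was chosen as a parent
    children_improved: int = 0 # how often a child beat this state


def evaluate(params):
    # Toy objective standing in for an experiment; the optimum is at all ones.
    return -sum((p - 1.0) ** 2 for p in params)


def novelty(state, population):
    # Assumed definition: mean distance to every other state.
    others = [s for s in population if s is not state]
    if not others:
        return 0.0
    return sum(math.dist(state.params, s.params) for s in others) / len(others)


def parent_weight(state, population):
    # Assumed definitions: productivity is the share of selections that
    # produced a better child; coverage favors rarely selected states.
    productivity = state.children_improved / (1 + state.times_selected)
    coverage = 1.0 / (1 + state.times_selected)
    return productivity + novelty(state, population) + coverage


def select_parent(population, rng):
    weights = [parent_weight(s, population) for s in population]
    parent = rng.choices(population, weights=weights, k=1)[0]
    parent.times_selected += 1
    return parent


def mutate(parent, rng, step=0.3):
    # Perturb every coordinate with Gaussian noise.
    return [p + rng.gauss(0.0, step) for p in parent.params]


def crossover(a, b, rng):
    # Take each coordinate from one of the two parents at random.
    return [rng.choice(pair) for pair in zip(a.params, b.params)]


def run_search(generations=200, capacity=8, dim=3, seed=0):
    rng = random.Random(seed)
    population = []
    for _ in range(capacity):
        params = [rng.uniform(-3.0, 3.0) for _ in range(dim)]
        population.append(ResearchState(params, evaluate(params), "initial"))

    for _ in range(generations):
        first = select_parent(population, rng)
        if rng.random() < 0.5:
            parents, op = [first], "mutation"
            child_params = mutate(first, rng)
        else:
            second = select_parent(population, rng)
            parents, op = [first, second], "crossover"
            child_params = crossover(first, second, rng)

        child = ResearchState(child_params, evaluate(child_params), f"from {op}")
        for parent in parents:
            if child.score > parent.score:
                parent.children_improved += 1

        # Keep the population bounded by dropping the lowest-scoring state.
        population.append(child)
        population.remove(min(population, key=lambda s: s.score))
    return population


if __name__ == "__main__":
    final = run_search()
    print(max(s.score for s in final))
```

Unlike a single-path loop that keeps only the current best program, this sketch retains several states at once and can still select a weaker but novel or rarely explored state as a parent.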
Three versions of GEAR were studied: one controlled through prompting, one using a fixed programmatic search controller, and one where the controller itself evolves during the run. All three outperformed the baseline AutoResearch approach under the same compute budget. Crucially, while the baseline tends to settle into one local optimum, GEAR continues finding improvements over longer runs. The results suggest that autonomous research agents become more effective when they maintain multiple promising directions and can adapt their search strategy over time, pointing to a new paradigm for automated AI research.
- Replaces single-path search with population-based search across multiple research states
- Three versions tested: a prompting-controlled one, a fixed programmatic controller, and a controller that evolves during the run
- Outperforms the AutoResearch baseline under the same compute budget and keeps finding improvements in longer runs, where the baseline tends to settle into a local optimum
Why It Matters
GEAR could make autonomous AI research more efficient and creative by exploring diverse solutions in parallel.