
LEVI: smarter search replaces costly LLMs, slashing budgets by up to 35×

arXiv cs.NE May 12, 2026

⚡ New framework cuts evolutionary search costs by 3.3–6.7× while matching top results.

Deep Dive

LEVI tackles a major inefficiency in LLM-guided evolutionary algorithms: existing frameworks like AlphaEvolve burn through costly frontier-model API calls because they use weak archives, assign models to mutations blindly, and evaluate every candidate on the full example set. Instead of throwing larger models at the problem, LEVI redesigns the search architecture itself. It introduces a solution database that preserves diversity from the start, a mutation router that sends routine edits to small models and harder ones to large models, and a rank-preserving proxy benchmark that cuts expensive rollouts on redundant examples.
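To make the first two components concrete, here is a minimal Python sketch. It is not LEVI's actual code: the summary above does not say how the archive or the router work internally, so the cell-based archive (one best candidate per descriptor, in the style of MAP-Elites), the `descriptor` field, the `edit_kind` labels, and the `patience` threshold are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Candidate:
    code: str
    score: float
    # Hypothetical behaviour descriptor, e.g. (size bucket, strategy id).
    descriptor: tuple


@dataclass
class Archive:
    """Assumed design: keep the best candidate per descriptor cell,
    so a strong solution in one niche cannot evict a different niche."""
    cells: dict = field(default_factory=dict)

    def add(self, cand: Candidate) -> bool:
        best = self.cells.get(cand.descriptor)
        if best is None or cand.score > best.score:
            self.cells[cand.descriptor] = cand
            return True   # candidate entered the archive
        return False      # an equal or better occupant already holds the cell


def route(edit_kind: str, stagnant_rounds: int, patience: int = 3) -> str:
    """Hypothetical routing rule: use the small model for routine edits and
    escalate to the large model for restructuring or when search has stalled."""
    if edit_kind == "restructure" or stagnant_rounds >= patience:
        return "large"
    return "small"


archive = Archive()
archive.add(Candidate("v1", 1.0, (0, 0)))
archive.add(Candidate("v2", 0.2, (1, 0)))  # weaker, but a different niche: kept
print(len(archive.cells), route("tweak", stagnant_rounds=0))
```

The point of the sketch is the cost structure: most mutations go through the cheap path, and the archive keeps low-scoring but distinct candidates available as parents.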

On systems-research benchmarks, LEVI attains the highest score with a budget 3.3–6.7× smaller than published runs of methods like ShinkaEvolve, GEPA, and AdaEvolve. In one case it matched the existing best at 35× lower cost. On prompt optimization, LEVI matches or exceeds GEPA using less than half the rollout budget across four benchmarks. The fully open-source framework shows that smarter architecture, not bigger models, is the path to affordable evolutionary discovery.

Key Points

LEVI achieved top scores on systems benchmarks with a budget 3.3–6.7× smaller than ShinkaEvolve, GEPA, and AdaEvolve.
On one problem, LEVI matched the existing best performance at 35× lower cost.
The framework uses a diversity-preserving archive, a smart mutation router (small vs. large LLMs), and a proxy benchmark to reduce costly rollouts.
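
One way to read "rank-preserving proxy benchmark" is that a small subset of evaluation examples is acceptable if it orders candidates the same way the full set does. The Python sketch below checks that property with Kendall's tau. The score matrix, the function names, and the use of Kendall's tau are assumptions for illustration; the summary does not describe how LEVI builds or validates its proxy.

```python
from itertools import combinations


def kendall_tau(a, b):
    """Kendall tau-a between two equal-length score lists.
    Tied pairs count as neither concordant nor discordant."""
    concordant = discordant = 0
    for i, j in combinations(range(len(a)), 2):
        s = (a[i] - a[j]) * (b[i] - b[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    pairs = len(a) * (len(a) - 1) // 2
    return (concordant - discordant) / pairs if pairs else 1.0


def mean_score(row, cols):
    return sum(row[c] for c in cols) / len(cols)


def proxy_rank_agreement(scores, subset):
    """scores[i][j] is candidate i's score on example j (hypothetical data).
    Returns how well the subset reproduces the full-set ranking, in [-1, 1]."""
    all_cols = range(len(scores[0]))
    full = [mean_score(row, all_cols) for row in scores]
    proxy = [mean_score(row, subset) for row in scores]
    return kendall_tau(full, proxy)


# Three candidates scored on four examples (made-up numbers).
scores = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
]
print(proxy_rank_agreement(scores, [1, 2]))  # informative examples
print(proxy_rank_agreement(scores, [0, 3]))  # examples every candidate ties on
```

In this toy data, examples 0 and 3 give every candidate the same score, so running rollouts on them costs budget without changing the ranking; a proxy built from examples 1 and 2 alone recovers the full ordering.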

Why It Matters

LEVI's results suggest that stronger search design can cut LLM costs by 3.3–6.7× on systems benchmarks, and by 35× in one case, making evolutionary methods practical for more teams.
