Research & Papers

Harvard Researchers Crack AI Ensemble Selection with Efficient Greedy Algorithm

A failure-conditioned greedy algorithm cuts model calls while preserving the classic (1-1/e) approximation guarantee.

Deep Dive

Organizations increasingly deploy multiple AI systems across task domains, but selecting a small, high-performing ensemble often requires expensive model calls, benchmark runs, and human evaluation. A new paper from researchers at Harvard and other institutions frames this problem as a distributional variant of multiwinner voting. For binary feedback (each output is judged correct or incorrect), they design a failure-conditioned greedy algorithm that preserves the classic (1-1/e) approximation guarantee while achieving instance-dependent query savings over exhaustive elicitation baselines. The paper also provides matching worst-case lower bounds, indicating that the query cost cannot be improved in the worst case.
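The summary does not spell out the algorithm, so the following is only a minimal sketch of one plausible reading of "failure-conditioned": in each greedy round, a candidate model is queried only on sampled tasks where every current committee member has already failed. The names `failure_conditioned_greedy`, `sample_tasks`, and `is_correct` are hypothetical, and the toy correctness table is made up for illustration; none of this is taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple


def failure_conditioned_greedy(
    n_models: int,
    k: int,
    sample_tasks: Callable[[], Sequence[int]],
    is_correct: Callable[[int, int], bool],
) -> Tuple[List[int], int]:
    """Greedily pick up to k models; return (committee, number of queries).

    Assumption (not from the paper): a candidate's marginal gain is the
    number of sampled tasks it solves among those the committee fails.
    """
    committee: List[int] = []
    n_queries = 0
    for _ in range(k):
        # Marginal-gain counters for models not yet selected.
        gains = {m: 0 for m in range(n_models) if m not in committee}
        if not gains:
            break
        for task in sample_tasks():
            # Check the committee with early stopping at the first success.
            covered = False
            for m in committee:
                n_queries += 1
                if is_correct(m, task):
                    covered = True
                    break
            if covered:
                continue  # no candidate is queried on an already-solved task
            # The committee failed: only now are the candidates queried.
            for m in gains:
                n_queries += 1
                if is_correct(m, task):
                    gains[m] += 1
        best = max(gains, key=gains.get)
        if gains[best] == 0:
            break  # no candidate fixes any observed failure
        committee.append(best)
    return committee, n_queries


# Toy data (illustrative only): CORRECT[m][t] is 1 if model m solves task t.
CORRECT = [
    [1, 1, 1, 0],  # model 0 solves tasks 0, 1, 2
    [1, 1, 0, 0],  # model 1 overlaps with model 0
    [0, 0, 0, 1],  # model 2 solves the one task model 0 misses
]

committee, n_queries = failure_conditioned_greedy(
    n_models=3,
    k=2,
    sample_tasks=lambda: range(4),
    is_correct=lambda m, t: bool(CORRECT[m][t]),
)
print(committee, n_queries)
```

In this toy run the second pick is model 2 rather than the individually stronger model 1, because only model 2 solves the task the committee still fails. The second round also uses fewer calls than the first, since candidates are skipped on tasks the committee already solves.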

For pairwise feedback (where candidate outputs are compared by preference), the team studies θ-winning committees. They show that the full-information optimization problem admits a PTAS (polynomial-time approximation scheme) but no EPTAS (efficient PTAS) under Gap-ETH (the Gap Exponential Time Hypothesis), and that the objective is monotone but not submodular. Because the standard greedy guarantee relies on submodularity, this motivates a weighted ordinal coverage relaxation that is submodular and supports a failure-conditioned greedy oracle under pairwise feedback. The oracle's output is converted back into θ-type guarantees through finite-family auditing or a minimax wrapper. Small-scale LLM experiments validate the predicted query savings and show how complementarity between models drives better committee selection.
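To make the θ objective concrete, here is a small full-information sketch. It assumes the definition common in the multiwinner voting literature: a committee is θ-winning if, for every candidate outside it, at least a θ fraction of voters (here, tasks) prefer some committee member to that candidate. The summary does not state the paper's exact distributional definition, so the function `theta_of_committee` and the toy rankings are illustrative assumptions only.

```python
from typing import Sequence


def theta_of_committee(
    rankings: Sequence[Sequence[int]], committee: Sequence[int]
) -> float:
    """Largest theta for which `committee` is theta-winning (assumed definition).

    rankings[i] lists all candidates from most to least preferred on task i.
    """
    members = set(committee)
    outsiders = set(rankings[0]) - members
    if not outsiders:
        return 1.0  # nothing left outside the committee to lose against
    worst = 1.0
    for rival in outsiders:
        wins = 0
        for ranking in rankings:
            position = {c: i for i, c in enumerate(ranking)}
            # The committee wins a task if any member is ranked above the rival.
            if any(position[w] < position[rival] for w in members):
                wins += 1
        worst = min(worst, wins / len(rankings))
    return worst


# Toy preference cycle over three models (illustrative only).
RANKINGS = [
    [0, 1, 2],
    [1, 2, 0],
    [2, 0, 1],
]

theta_single = theta_of_committee(RANKINGS, [0])
theta_pair = theta_of_committee(RANKINGS, [0, 2])
print(theta_single, theta_pair)
```

On this cycle, model 0 alone beats model 2 on only one task in three, while the pair {0, 2} beats the remaining model 1 on two tasks in three. The sketch reads full rankings; the pairwise-feedback setting described above would instead estimate these comparisons from sampled preference queries.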

Key Points
  • Failure-conditioned greedy algorithm achieves (1-1/e) guarantee with instance-dependent query savings for binary feedback ensemble selection.
  • Pairwise feedback objective is monotone but not submodular; PTAS exists but no EPTAS under Gap-ETH.
  • Weighted ordinal coverage relaxation is submodular and supports a greedy oracle, whose output is converted back into θ-type guarantees by finite-family auditing or a minimax wrapper; LLM experiments confirm query savings and complementarity benefits.

Why It Matters

The results give organizations a way to select small, cost-effective AI ensembles with provable approximation guarantees while reducing expensive evaluations and human effort.