Research & Papers

EVE-Agent forces AI agents to justify every training example

Self-evolving agents must now prove their answers with source-grounded evidence

Deep Dive

Self-evolving AI agents that generate their own training data risk learning from fluent but unsubstantiated examples. The new EVE-Agent framework tackles this by adding an evidence verifier to the proposer-solver architecture. The proposer generates a question, an answer, and a verbatim evidence span from a source. The evidence verifier then measures the marginal accuracy gain, that is, how much more often the solver answers correctly when given that evidence than without it. Only examples whose evidence genuinely helps receive high rewards, creating a self-generated curriculum that is both reliable and auditable.
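
To make the reward idea concrete, here is a minimal sketch of a marginal-gain score. It is an illustration under stated assumptions, not the paper's implementation: the solver signature, exact-match scoring, the sample count, and the names `evidence_reward` and `toy_solver` are all hypothetical, and the "answer" is the proposer's own proposed answer rather than an oracle label.

```python
from typing import Callable, Optional

# Hypothetical solver interface: (question, optional evidence) -> answer text.
Solver = Callable[[str, Optional[str]], str]


def accuracy(solver: Solver, question: str, answer: str,
             evidence: Optional[str], n_samples: int) -> float:
    """Fraction of sampled solver answers that match the proposed answer.

    Exact match after trimming and lowercasing is an assumption made here.
    """
    target = answer.strip().lower()
    hits = sum(
        solver(question, evidence).strip().lower() == target
        for _ in range(n_samples)
    )
    return hits / n_samples


def evidence_reward(solver: Solver, question: str, answer: str,
                    evidence: str, n_samples: int = 8) -> float:
    """Marginal accuracy gain: accuracy with evidence minus accuracy without."""
    with_evidence = accuracy(solver, question, answer, evidence, n_samples)
    without_evidence = accuracy(solver, question, answer, None, n_samples)
    return with_evidence - without_evidence


def toy_solver(question: str, evidence: Optional[str]) -> str:
    # Stand-in for a real model: it answers correctly only when the
    # evidence actually contains the answer.
    if evidence and "Paris" in evidence:
        return "Paris"
    return "unknown"


question = "What is the capital of France?"
helpful = evidence_reward(toy_solver, question, "Paris",
                          "Paris is the capital of France.")
unhelpful = evidence_reward(toy_solver, question, "Paris",
                            "France is in Europe.")
print(helpful, unhelpful)  # 1.0 0.0
```

In this toy setup, the span that states the answer earns the full reward, while a fluent but irrelevant span earns none, which is the behavior the verifier is meant to encourage.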

EVE-Agent leaves the backbone model, retriever, and search tools unchanged while substantially improving evidence-grounded correctness compared to prior self-evolving search agents. The system operates without oracle answers or human labels, making it scalable. Each training example includes an inspectable source span, so developers can verify exactly why the agent trusts a particular answer. This addresses a critical trust gap in autonomous AI learning systems.
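
As a rough illustration of what "inspectable" could look like in practice, the sketch below checks that a stored evidence span really appears verbatim in its source and reports where. The record layout and the names `TrainingExample` and `locate_evidence` are assumptions for illustration; the source does not describe how EVE-Agent stores its examples.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class TrainingExample:
    # Hypothetical record layout, chosen for this sketch only.
    question: str
    answer: str
    evidence: str      # span the proposer claims to have copied verbatim
    source_text: str   # document the span was supposedly copied from


def locate_evidence(example: TrainingExample) -> Optional[Tuple[int, int]]:
    """Return (start, end) offsets of the evidence in the source.

    Returns None when the span is empty or is not a verbatim substring.
    """
    if not example.evidence:
        return None
    start = example.source_text.find(example.evidence)
    if start == -1:
        return None
    return start, start + len(example.evidence)


source = "Marie Curie won the Nobel Prize in Physics in 1903."
grounded = TrainingExample(
    question="When did Marie Curie win the Nobel Prize in Physics?",
    answer="1903",
    evidence="Nobel Prize in Physics in 1903",
    source_text=source,
)
paraphrased = TrainingExample(
    question=grounded.question,
    answer="1903",
    evidence="Curie received the physics prize in 1903",
    source_text=source,
)

span = locate_evidence(grounded)
print(span is not None, locate_evidence(paraphrased))  # True None
```

A reviewer can slice `source_text` with the returned offsets to see the exact passage behind an answer, while a paraphrased or invented span fails the check.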

Key Points
  • EVE-Agent uses a proposer-solver framework where the proposer generates questions, answers, and verbatim evidence spans
  • An evidence verifier rewards spans based on marginal accuracy gain when the evidence is provided to the solver
  • The method improves evidence-grounded correctness over prior self-evolving agents while keeping all underlying models and tools unchanged

Why It Matters

EVE-Agent makes autonomous AI training auditable and steers agents away from learning from unsupported or false examples.