PARNESS: A Paper Harness for End-to-End Automated Scientific Research with Dynamic Workflows, Full-Text Indexing, and Cross-Run Knowledge Accumulation
First open-source system to combine declarative pipelines, full-PDF indexing, and reusable knowledge across experiments
Existing automated research systems (AI-Scientist, PaperOrchestra, AutoSOTA, etc.) force a fixed control-flow pattern — linear pipelines, single-agent loops, or skill packs. This rigidity stems from five core problems: workflows differ by discipline, ideation is limited by LLM context size, summary-only views miss paper bodies, paper-to-code links are ignored, and no existing tool accumulates knowledge across runs and feeds it back within a finite LLM context. PARNESS, built by Yuchen Wang and Zhongzhi Luan, tackles all five with four design moves.
First, a thin DAG kernel with a four-field Agent contract (inputs, outputs, tools, constraints) expressed in user-editable YAML lets any discipline's loop be defined without modifying source code. Second, a full-text PDF-parsing subsystem indexes paper bodies, figures, and tables as typed objects, falling back to abstracts when needed. Third, a knowledge graph over papers, ideas, experiments, and code repositories enables scenario-typed retrieval — similar, contradictory, cross-domain, or counter-intuitive — feeding a focused slice into each LLM call. Fourth, an extension surface allows agents like Claude Code, Cursor, Copilot, and OpenCode to replace modules. PARNESS is claimed to be the first open-source system combining declarative pipelines, full-PDF and code-repository indexing, and cross-run knowledge accumulation, and has already produced a complete paper generated end to end.
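The four-field contract plus DAG wiring can be sketched as plain data and a topological pass. The agent names, field values, and `build_dag` helper below are illustrative assumptions, not PARNESS's actual schema or API:

```python
from graphlib import TopologicalSorter

# Hypothetical four-field Agent contracts (inputs, outputs, tools, constraints),
# as they might be declared in user-editable YAML and loaded into dicts.
agents = {
    "ideation": {
        "inputs": ["literature_slice"],
        "outputs": ["idea"],
        "tools": ["llm"],
        "constraints": {"max_tokens": 8000},
    },
    "experiment": {
        "inputs": ["idea"],
        "outputs": ["results"],
        "tools": ["python_sandbox"],
        "constraints": {"timeout_s": 3600},
    },
    "writeup": {
        "inputs": ["idea", "results"],
        "outputs": ["paper_draft"],
        "tools": ["llm", "latex"],
        "constraints": {},
    },
}

def build_dag(agents):
    """Wire agents into a DAG: an edge A -> B exists when B consumes an output of A."""
    producers = {out: name for name, a in agents.items() for out in a["outputs"]}
    deps = {
        name: {producers[i] for i in a["inputs"] if i in producers}
        for name, a in agents.items()
    }
    return list(TopologicalSorter(deps).static_order())

print(build_dag(agents))  # ['ideation', 'experiment', 'writeup']
```

Because the loop is data, a different discipline swaps in a different YAML file — new agent names and edges, same unmodified kernel.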
- PARNESS uses a declarative DAG kernel with YAML-defined Agent contracts, enabling any discipline's workflow loop without framework code changes.
- Full-text PDF parsing indexes paper bodies, figures, and tables as typed objects, with graceful abstract-only fallback for paywalled content.
- Knowledge graph with scenario-typed retrieval (similar, contradictory, cross-domain, counter-intuitive) surfaces targeted context into each LLM call.
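Scenario-typed retrieval as described can be approximated with typed edges and a per-call context budget. Everything here — the edge list, the node IDs, the character budget — is an illustrative assumption, not PARNESS's data model:

```python
# Toy knowledge graph: nodes are papers/ideas/experiments/repos; each edge
# carries one of the four scenario types. Retrieval filters neighbors by the
# requested scenario, then trims the slice to a context budget before it is
# handed to an LLM call.
SCENARIOS = {"similar", "contradictory", "cross_domain", "counter_intuitive"}

edges = [
    ("idea:42", "paper:A", "similar"),
    ("idea:42", "paper:B", "contradictory"),
    ("idea:42", "repo:torch-impl", "similar"),
    ("idea:42", "paper:C", "cross_domain"),
]

summaries = {
    "paper:A": "A" * 120,
    "paper:B": "B" * 120,
    "repo:torch-impl": "R" * 120,
    "paper:C": "C" * 120,
}

def retrieve(node, scenario, budget_chars=300):
    """Return scenario-matched neighbor summaries, capped at the budget."""
    assert scenario in SCENARIOS, f"unknown scenario: {scenario}"
    slice_, used = [], 0
    for src, dst, kind in edges:
        if src == node and kind == scenario:
            text = summaries[dst]
            if used + len(text) > budget_chars:
                break  # the focused slice stays within the context budget
            slice_.append((dst, text))
            used += len(text)
    return slice_

hits = retrieve("idea:42", "similar")  # paper:A and repo:torch-impl fit in 300 chars
```

The budget cap is what makes cross-run accumulation compatible with a finite context: the graph can grow without bound while each LLM call still sees only a typed, size-limited slice.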
Why It Matters
Automates end-to-end scientific research with flexible workflows, full-text knowledge, and cross-run persistence — a major leap for AI-driven discovery.