Research & Papers

CoCoDA framework lets 8B AI models match 32B performance via smart tool DAG

A new code-native DAG lets small models scale tool libraries without ballooning context.

Deep Dive

Researchers Ziyang Yu, Qiyue Li, and Liang Zhao have introduced CoCoDA (Co-evolving Compositional DAG for Tool-Augmented Agents), a framework that targets a scaling problem in tool-augmented language models. As tool libraries grow, traditional flat or text-indexed retrieval forces prompt context to expand linearly with library size, which becomes impractical for smaller models with fixed context windows. CoCoDA instead stores tools as nodes in a directed acyclic graph (DAG): edges encode invocation dependencies, and each node holds a typed signature, a description, pre- and post-conditions, and worked examples. At inference time, the system prunes candidates in four stages: symbolic unification on signatures, description scoring, behavioral specification filtering, and finally example disambiguation. Expensive context materialization is reserved for the few candidates that survive.
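To make the staged pruning concrete, here is a minimal Python sketch of a four-stage filter over typed tool nodes. It is an illustration under stated assumptions, not the authors' implementation: the `ToolNode` fields, the `retrieve` function, the word-overlap scorer, and the toy library are all hypothetical stand-ins for whatever CoCoDA actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolNode:
    """Hypothetical node layout; field names are assumptions, not CoCoDA's schema."""
    name: str
    inputs: tuple                 # typed signature: input type names
    output: str                   # typed signature: output type name
    description: str
    preconditions: frozenset = frozenset()
    examples: tuple = ()          # worked examples, reduced here to short phrases
    deps: tuple = ()              # DAG edges: names of tools this node invokes


def overlap(a, b):
    """Toy lexical score (Jaccard word overlap), standing in for a real scorer."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(library, query, have_types, want_type, facts, k=3):
    # Stage 1: signature check (exact type match as a stand-in for unification).
    cands = [n for n in library
             if n.output == want_type and set(n.inputs) <= have_types]
    # Stage 2: description scoring, keeping the top k.
    cands = sorted(cands, key=lambda n: overlap(query, n.description),
                   reverse=True)[:k]
    # Stage 3: behavioral filter, every precondition must be a known fact.
    cands = [n for n in cands if n.preconditions <= facts]
    if not cands:
        return None
    # Stage 4: example disambiguation among the survivors.
    return max(cands, key=lambda n: max((overlap(query, e) for e in n.examples),
                                        default=0.0))


LIST, NUM = "list[float]", "float"
library = [
    ToolNode("mean", (LIST,), NUM, "compute the average of a list of numbers",
             frozenset({"non_empty"}), ("average of the quiz scores",)),
    ToolNode("median", (LIST,), NUM, "compute the median of a list of numbers",
             frozenset({"non_empty"}), ("middle value of the prices",)),
    ToolNode("variance", (LIST,), NUM, "compute the variance of a list of numbers",
             frozenset({"non_empty", "at_least_two"}),
             ("spread of the readings",), deps=("mean",)),
    ToolNode("count_rows", ("table",), "int", "count the rows of a table"),
]

best = retrieve(library, "what is the average of the test scores",
                {LIST}, NUM, {"non_empty"})
print(best.name)  # prints "mean"
```

In this toy run, `count_rows` fails the signature check, `variance` is dropped because its `at_least_two` precondition is not among the known facts, and the worked examples break the tie between `mean` and `median`. Only the final survivor would need its full specification placed in the prompt.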

During training, CoCoDA folds successful trajectories into validated composite nodes and rewards the planner based on the primitive expansion size of the composites it uses, which incentivizes reuse. The authors state theoretical guarantees for the framework: sublinear retrieval time, compositional advantage under the shaped reward, and monotonic co-evolution. Experimentally, an 8B parameter student model using CoCoDA matched or exceeded a 32B teacher on the GSM8K and MATH benchmarks, and consistently outperformed strong tool-use baselines across mathematical reasoning, tabular analysis, and code tasks. This suggests that a well-structured tool library can partly compensate for raw parameter count, a useful insight for deploying capable AI on resource-constrained hardware.
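The folding step and the reuse incentive can be sketched in a few lines of Python. This is a hedged illustration only: the `Tool` class, the `fold` and `shaped_reward` functions, the `validate` callback, and the linear bonus formula are assumptions chosen for clarity, since the summary above does not give the paper's exact reward.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    children: tuple = ()   # empty for primitives; callee names for composites


def expansion_size(name, library, memo=None):
    """Number of primitive calls a tool expands to (1 for a primitive)."""
    memo = {} if memo is None else memo
    if name not in memo:
        kids = library[name].children
        memo[name] = 1 if not kids else sum(
            expansion_size(c, library, memo) for c in kids)
    return memo[name]


def fold(library, name, trajectory, validate):
    """Return a library extended with a composite node, or the same library.

    The composite may only call tools that already exist and must use a new
    name, so the graph stays acyclic and the library only ever grows.
    """
    known = all(step in library for step in trajectory)
    if name in library or not known or not validate(trajectory):
        return library
    extended = dict(library)
    extended[name] = Tool(name, tuple(trajectory))
    return extended


def shaped_reward(plan, library, task_reward, bonus=0.5):
    """Assumed shaping: task reward plus a bonus per primitive call saved."""
    saved = sum(expansion_size(step, library) - 1 for step in plan)
    return task_reward + bonus * saved


primitives = {n: Tool(n) for n in ("load_csv", "filter_rows", "mean")}
lib = fold(primitives, "filtered_mean",
           ["load_csv", "filter_rows", "mean"], validate=lambda t: True)

print(expansion_size("filtered_mean", lib))                         # prints 3
print(shaped_reward(["filtered_mean"], lib, task_reward=1.0))       # prints 2.0
print(shaped_reward(["load_csv", "filter_rows", "mean"], lib, 1.0)) # prints 1.0
```

Under this assumed shaping, a plan that calls the composite `filtered_mean` earns more than the equivalent three-step plan of primitives, which is the kind of pressure toward reuse the article describes.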

Key Points
  • Typed DAG retrieval prunes candidates via symbolic signature unification, description scoring, and behavioral specs, keeping context materialization for only the most relevant tools.
  • An 8B parameter student model using CoCoDA matches or exceeds a 32B teacher's performance on GSM8K and MATH benchmarks.
  • The framework provides theoretical guarantees including sublinear retrieval time, monotonic co-evolution, and DAG well-formedness.

Why It Matters

If the results hold up, smaller AI models could rival larger ones on tool-heavy tasks by structuring and retrieving from their tool libraries more intelligently, which would reduce compute and cost.