Research & Papers

GoSkills paper rethinks agent skill retrieval with role-labeled groups

New group-structured retrieval method outperforms flat and dependency baselines on ALFWorld...

Deep Dive

Skill-augmented agents increasingly rely on large, reusable skill libraries, but conventional retrieval methods return either atomic skills or dependency bundles that carry no explicit role information. The agent must then infer execution entry points, support skills, validation requirements, and failure-avoidance guidance on its own. A new paper introduces Group of Skills (GoSkills), an inference-time, group-structured retrieval method that changes what the agent receives: instead of a flat list, it gets a compact, role-labeled execution contract with four explicit fields: Start, Support, Check, and Avoid. GoSkills constructs anchor-centered skill groups from a typed skill graph, expands support groups via a group graph, and bottlenecks the selected group plan into a bounded set of atomic payloads, all without altering the downstream agent or execution environment.
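To make the pipeline described above concrete, here is a minimal Python sketch of one way a role-labeled contract could be assembled. It is not the paper's implementation. The data layout (`typed_edges`, `group_graph`), the round-robin budget rule, and all skill names are assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass, field

# The four contract fields named in the text, in lowercase.
ROLES = ("start", "support", "check", "avoid")


@dataclass
class SkillGroup:
    """Hypothetical anchor-centered group: an entry skill plus role-labeled neighbors."""
    anchor: str
    support: list[str] = field(default_factory=list)
    check: list[str] = field(default_factory=list)
    avoid: list[str] = field(default_factory=list)


def build_group(anchor, typed_edges):
    """Collect the anchor's neighbors by edge type (assumed graph encoding)."""
    return SkillGroup(
        anchor=anchor,
        support=list(typed_edges.get((anchor, "support"), [])),
        check=list(typed_edges.get((anchor, "check"), [])),
        avoid=list(typed_edges.get((anchor, "avoid"), [])),
    )


def expand(anchor, group_graph, typed_edges):
    """Return the anchor group followed by support groups linked in the group graph."""
    anchors = [anchor] + list(group_graph.get(anchor, []))
    return [build_group(a, typed_edges) for a in anchors]


def bottleneck(groups, budget):
    """Flatten a group plan into a contract holding at most `budget` atomic skills.

    Assumed policy: take one skill per role in turn, skipping duplicates,
    so that no single role can use up the whole budget.
    """
    primary, extra = groups[0], groups[1:]
    queues = {
        "start": [primary.anchor],
        "support": primary.support + [s for g in extra for s in [g.anchor, *g.support]],
        "check": [s for g in groups for s in g.check],
        "avoid": [s for g in groups for s in g.avoid],
    }
    contract = {role: [] for role in ROLES}
    seen = set()
    used = 0
    while used < budget and any(queues.values()):
        for role in ROLES:
            if used >= budget:
                break
            while queues[role]:
                skill = queues[role].pop(0)
                if skill not in seen:
                    seen.add(skill)
                    contract[role].append(skill)
                    used += 1
                    break
    return contract


# Toy household-task skills; the names are invented for this example.
typed_edges = {
    ("heat_object", "support"): ["find_object", "open_microwave"],
    ("heat_object", "check"): ["verify_object_hot"],
    ("heat_object", "avoid"): ["heat_without_holding"],
    ("find_object", "support"): ["open_receptacle"],
    ("find_object", "check"): ["verify_object_held"],
}
group_graph = {"heat_object": ["find_object"]}

plan = expand("heat_object", group_graph, typed_edges)
contract = bottleneck(plan, budget=5)
for role in ROLES:
    print(f"{role.capitalize()}: {contract[role]}")
```

The point of the sketch is the shape of the output: the agent receives a small dictionary keyed by role rather than a ranked list, and the budget caps the total number of atomic skills regardless of how many groups were expanded.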

Experiments on the SkillsBench benchmark and the ALFWorld environment show that GoSkills preserves visible-requirement coverage even under small skill budgets, consistently improving over flat skill-access baselines. It also often achieves better reward and agent-only runtime than structural retrieval baselines. The 30-page paper, with 4 figures and 24 tables, includes detailed ablation studies and comparisons. The work addresses a practical bottleneck in building autonomous agents that must retrieve and compose skills on the fly, making it relevant to anyone designing agentic systems or tool-using LLMs.
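This summary does not give the paper's exact definition of visible-requirement coverage. As a rough illustration only, the sketch below assumes the simplest reading: the fraction of a task's required skills that appear in the retrieved payload at a given budget. The function name, skill names, and ranked list are all hypothetical.

```python
def coverage(retrieved, required):
    """Fraction of required skills present in the retrieved payload (assumed definition)."""
    required = set(required)
    if not required:
        return 1.0
    return len(required & set(retrieved)) / len(required)


# A flat ranked list in which one irrelevant skill outranks two required ones.
ranked = ["heat_object", "cool_object", "find_object", "verify_object_hot"]
required = {"heat_object", "find_object", "verify_object_hot"}

for budget in (1, 2, 4):
    print(budget, round(coverage(ranked[:budget], required), 2))
```

Under this reading, a flat list truncated at a small budget can lose required skills to irrelevant high-ranked ones, which is the failure mode a role-structured payload is meant to limit.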

Key Points
  • GoSkills returns a 'Start, Support, Check, Avoid' execution contract instead of a flat skill list
  • Uses anchor-centered skill groups from a typed skill graph and a group graph for expansion
  • Consistently improves over flat skill-access baselines on SkillsBench and ALFWorld, and often beats structural retrieval baselines in reward and agent-only runtime

Why It Matters

GoSkills makes skill retrieval for autonomous agents more interpretable by telling the agent how each retrieved skill should be used, and it often improves agent-only runtime.