From History to State: Constant-Context Skill Learning for LLM Agents
AI agents learn reusable skills without carrying full interaction histories, improving both privacy and efficiency.
LLM agents that operate browsers, file systems, and tools face a fundamental tension: cloud models execute complex multi-step workflows but expose sensitive intermediate context to external APIs, while local models preserve privacy but underperform. Both approaches also waste tokens on long skill prompts and growing histories. A new paper from researchers led by Haoyang Xie introduces constant-context skill learning, a context-to-weights framework that distills recurring workflows into lightweight task-family modules. Instead of conditioning inference on the full interaction history, the agent processes only the current observation and a compact state block maintained by a deterministic tracker. The tracker maps task progress to a fixed-size state and supplies aligned subgoal rewards, enabling step-level supervised fine-tuning followed by online reinforcement learning.
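To make the mechanism concrete, here is a minimal sketch of such a deterministic tracker in Python. The subgoal names, string-match checks, state-block format, and reward values are illustrative assumptions, not the paper's actual implementation; the point is that the state block stays the same size however long the episode runs, and that subgoal completion yields an aligned reward signal for RL.

```python
from dataclasses import dataclass, field

# Hypothetical tracker for one task family (e.g., "clean an object and put
# it somewhere" in ALFWorld). All names and rules below are assumptions for
# illustration, not the paper's code.

SUBGOALS = ["find_object", "clean_object", "place_object"]  # fixed per family

@dataclass
class Tracker:
    done: list = field(default_factory=lambda: [False] * len(SUBGOALS))

    def update(self, observation: str) -> float:
        """Advance deterministically from the raw observation.

        Returns an aligned subgoal reward: 1.0 the first time the current
        subgoal is satisfied, else 0.0 (the shaping signal for online RL).
        """
        idx = self.done.index(False) if False in self.done else None
        if idx is not None and self._satisfied(SUBGOALS[idx], observation):
            self.done[idx] = True
            return 1.0
        return 0.0

    def state_block(self) -> str:
        """Render the fixed-size state block that replaces the full history."""
        lines = [f"{g}: {'done' if d else 'pending'}"
                 for g, d in zip(SUBGOALS, self.done)]
        return "STATE\n" + "\n".join(lines)

    def _satisfied(self, subgoal: str, observation: str) -> bool:
        # Toy string-match rules standing in for the environment-specific
        # checks a real tracker would implement.
        keywords = {"find_object": "you see",
                    "clean_object": "is clean",
                    "place_object": "you put"}
        return keywords[subgoal] in observation.lower()
```

Because the block is fixed-size, the per-turn prompt cost stays constant rather than growing linearly with trajectory length.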
The method was validated on three benchmarks—ALFWorld, WebShop, and SciWorld—using Qwen3-4B, Qwen3-8B, and Llama-3.1-8B. With Qwen3-8B and SFT+RL, success rates hit 89.6% on unseen ALFWorld tasks, 76.8% on WebShop, and 66.4% on SciWorld, matching or exceeding prior agent-training results while reducing prompt tokens per turn by 2–7× relative to controlled ReAct baselines. By shifting procedural context from prompts into weights, the framework enables private, cost-effective personal assistants that retain cloud-level reliability without leaking history.
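The token savings follow directly from that design. The snippet below contrasts a ReAct-style prompt, which accumulates the full history, with a constant-context prompt built from the fixed state block; the observation text and the rough 4-characters-per-token estimate are stand-ins, so exact ratios will differ from the paper's measured 2–7×.

```python
# Hypothetical comparison of per-turn prompt sizes; values are illustrative.

def react_prompt(history: list[str], observation: str) -> str:
    # ReAct-style prompt: full interleaved history plus the new observation.
    return "\n".join(history) + "\n" + observation

def constant_context_prompt(state_block: str, observation: str) -> str:
    # Constant-context prompt: only the fixed-size state block and the
    # current observation; skill knowledge lives in the fine-tuned weights.
    return state_block + "\n" + observation

history = []
block = "STATE\nfind_object: done\nclean_object: pending\nplace_object: pending"
for step in range(1, 21):
    obs = f"Obs {step}: you see a countertop with a mug on it."
    full = react_prompt(history, obs)
    const = constant_context_prompt(block, obs)
    history.append(obs + " -> Action: examine mug")
    if step % 10 == 0:
        # Rough token estimate (~4 chars/token) just to show the scaling gap.
        print(f"step {step:2d}: ReAct ~{len(full)//4} tokens, "
              f"constant-context ~{len(const)//4} tokens")
```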
- Reduces prompt tokens per turn by 2–7× vs. ReAct baselines, drastically lowering API costs and latency.
- Achieves 89.6% unseen success on ALFWorld and 76.8% on WebShop using Qwen3-8B with SFT+RL.
- Enables local LLM agents to match cloud-level performance without exposing sensitive intermediate context.
Why It Matters
Constant-context learning makes private, cost-efficient personal AI assistants viable without sacrificing capability.