New SDP framework enables AI to build its own state spaces

arXiv cs.AI May 14, 2026

⚡ No more hand-crafted state spaces: AI agents now self-construct MDPs from raw text.

Deep Dive

Language environments like web browsers, code terminals, and interactive simulations emit only raw text, not explicit states. Traditional MDP analysis requires a well-defined state space, an observation-to-state mapping, certified transitions, and a termination criterion – none of which exist in these settings. The paper "State-Centric Decision Process" (author affiliation not specified) proposes SDP, a framework that inverts the usual arrangement: instead of the environment providing structure, the agent constructs it while acting. At each step, the agent commits to a natural-language predicate describing a desired world state, takes an action, and checks the resulting observation against that predicate. If the predicate passes, the observation becomes a certified state. The resulting trajectory supplies all four missing MDP objects.
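To make the commit-act-check loop concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's implementation: the names `Step`, `Trajectory`, `run_episode`, and `demo` are hypothetical, and the toy substring check stands in for whatever predicate verifier the authors actually use (presumably a language model).

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    predicate: str    # natural-language description of the desired world state
    action: str       # action taken to try to reach that state
    observation: str  # raw text emitted by the environment
    certified: bool   # True if the observation satisfied the predicate


@dataclass
class Trajectory:
    steps: List[Step] = field(default_factory=list)

    def certified_states(self) -> List[str]:
        # Only observations that passed their predicate count as states.
        return [s.observation for s in self.steps if s.certified]


def run_episode(
    propose: Callable[[Trajectory], str],
    act: Callable[[Trajectory, str], str],
    env_step: Callable[[str], str],
    check: Callable[[str, str], bool],
    is_done: Callable[[Trajectory], bool],
    max_steps: int = 10,
) -> Trajectory:
    """Hypothetical loop: commit to a predicate, act, observe, certify."""
    traj = Trajectory()
    for _ in range(max_steps):
        predicate = propose(traj)               # commit before acting
        action = act(traj, predicate)
        observation = env_step(action)          # environment returns raw text
        certified = check(predicate, observation)
        traj.steps.append(Step(predicate, action, observation, certified))
        if certified and is_done(traj):         # termination criterion
            break
    return traj


def demo() -> Trajectory:
    # Toy stand-ins (assumptions): fixed goals and canned environment replies.
    goals = ["the door is open", "the agent is in the kitchen"]
    plan = {goals[0]: "open door", goals[1]: "go kitchen"}
    replies = {
        "open door": "You open the door. The door is open.",
        "go kitchen": "You walk in. The agent is in the kitchen.",
    }
    return run_episode(
        propose=lambda traj: goals[len(traj.certified_states())],
        act=lambda traj, predicate: plan[predicate],
        env_step=lambda action: replies[action],
        check=lambda predicate, obs: predicate.lower() in obs.lower(),
        is_done=lambda traj: len(traj.certified_states()) == len(goals),
    )
```

In this sketch, the state space is the set of certified observations, the observation-to-state mapping is the `check` call, the transitions are the consecutive certified steps, and `is_done` plays the role of the termination criterion. How the paper handles a failed predicate is not described in this summary, so the sketch simply records the failure and continues.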

Evaluated across five benchmarks in planning, scientific exploration, web reasoning, and multi-hop question answering, SDP achieves the best training-free results on all tasks, with the advantage growing as the horizon increases. Beyond raw performance, the certified trajectories unlock capabilities that reactive agents lack: per-predicate credit assignment (pinpointing which parts of a task worked), failure localization, partial-progress measurement, and modular swapping of operators without retraining. This approach promises more reliable and interpretable autonomous agents in unstructured environments.
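As a rough illustration of what a per-predicate record makes possible, the Python sketch below computes credit, the first failing step, and partial progress from a list of (predicate, certified) pairs. The function names and the exact definitions are assumptions made for this example; the summary does not say how the paper defines these measurements.

```python
from typing import Dict, List, Optional, Tuple

# One entry per step: (natural-language predicate, did the observation pass?)
Record = Tuple[str, bool]


def per_predicate_credit(records: List[Record]) -> Dict[str, bool]:
    # A predicate earns credit if any attempt at it was certified.
    credit: Dict[str, bool] = {}
    for predicate, certified in records:
        credit[predicate] = credit.get(predicate, False) or certified
    return credit


def first_failure(records: List[Record]) -> Optional[int]:
    # Index of the first step whose predicate was not certified, if any.
    for index, (_, certified) in enumerate(records):
        if not certified:
            return index
    return None


def partial_progress(records: List[Record]) -> float:
    # Fraction of steps that were certified (0.0 for an empty record list).
    if not records:
        return 0.0
    return sum(1 for _, certified in records if certified) / len(records)


# Hypothetical web-task trace used only for illustration.
records: List[Record] = [
    ("the page is loaded", True),
    ("the search box contains the query", True),
    ("the results list is visible", False),
]
```

For the trace above, `first_failure(records)` points at the third step and `partial_progress(records)` is two thirds. A reactive agent that keeps only raw actions and observations has no comparable per-step pass/fail signal to compute these from.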

Key Points

SDP constructs certified state spaces from raw text using natural-language predicates, eliminating the need for pre-defined MDP structure.
Achieves best training-free results across all five benchmarks, with widening advantage on long-horizon tasks.
Enables per-predicate credit assignment, failure localization, and modular operator replacement – capabilities unavailable to reactive agents.

Why It Matters

SDP could unlock more reliable, explainable AI agents for real-world tasks like web automation and scientific discovery.
