New SDP framework enables AI to build its own state spaces
No more hand-crafted state spaces: AI agents now self-construct MDPs from raw text.
Language environments such as web browsers, code terminals, and interactive simulations emit only raw text, not explicit states. Traditional MDP analysis requires a well-defined state space, an observation-to-state mapping, certified transitions, and a termination criterion, none of which exist in these settings. In the paper "State-Centric Decision Process," researchers propose SDP, a framework that inverts this arrangement: instead of the environment providing structure, the agent constructs it while acting. At each step, the agent commits to a natural-language predicate describing a desired world state, takes an action, and checks the resulting observation against that predicate. If the predicate passes, the observation becomes a certified state. The resulting trajectory contains all four missing MDP objects.
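The commit, act, and check loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the names `run_sdp`, `propose`, `check`, `is_done`, and `ToyEnv` are hypothetical, and the keyword matcher stands in for whatever model-based predicate check the authors actually use.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Step:
    predicate: str    # natural-language description of the desired world state
    action: str       # action taken to try to reach that state
    observation: str  # raw text returned by the environment
    certified: bool   # True if the observation satisfied the predicate


@dataclass
class Trajectory:
    steps: List[Step] = field(default_factory=list)

    def certified_states(self) -> List[str]:
        # Observations that passed their predicate form the constructed state space.
        return [s.observation for s in self.steps if s.certified]


def run_sdp(
    env_step: Callable[[str], str],
    propose: Callable[[str], Tuple[str, str]],
    check: Callable[[str, str], bool],
    is_done: Callable[[str], bool],
    initial_observation: str,
    max_steps: int = 10,
) -> Trajectory:
    """Hypothetical loop: commit to a predicate, act, then check the observation."""
    trajectory = Trajectory()
    state = initial_observation
    for _ in range(max_steps):
        if is_done(state):                      # termination criterion
            break
        predicate, action = propose(state)      # commit before acting
        observation = env_step(action)
        ok = check(predicate, observation)      # observation-to-state check
        trajectory.steps.append(Step(predicate, action, observation, ok))
        if ok:
            state = observation                 # certified transition
    return trajectory


class ToyEnv:
    """Tiny text-only environment used purely for illustration."""

    def __init__(self) -> None:
        self.has_key = False

    def step(self, action: str) -> str:
        if action == "take key":
            self.has_key = True
            return "You are holding the key."
        if action == "open door":
            return "The door is open." if self.has_key else "The door is locked."
        return "Nothing happens."


# Assumption: a keyword lookup replaces a model-based predicate check.
KEYWORDS = {
    "the agent is holding the key": "holding the key",
    "the door is open": "door is open",
}


def propose(state: str) -> Tuple[str, str]:
    if "holding the key" in state.lower():
        return "the door is open", "open door"
    return "the agent is holding the key", "take key"


def check(predicate: str, observation: str) -> bool:
    return KEYWORDS[predicate] in observation.lower()


def is_done(state: str) -> bool:
    return "door is open" in state.lower()


env = ToyEnv()
trajectory = run_sdp(
    env.step, propose, check, is_done,
    initial_observation="You are in a room with a key and a locked door.",
)
```

In this sketch the certified observations play the role of states, `check` plays the role of the observation-to-state mapping, consecutive certified steps are the transitions, and `is_done` is the termination criterion.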
Evaluated across five benchmarks in planning, scientific exploration, web reasoning, and multi-hop question answering, SDP achieves the best training-free results on all tasks, with the advantage growing as the horizon increases. Beyond raw performance, the certified trajectories unlock capabilities that reactive agents lack: per-predicate credit assignment (pinpointing which parts of a task worked), failure localization, partial-progress measurement, and modular swapping of operators without retraining. This approach promises more reliable and interpretable autonomous agents in unstructured environments.
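To make the diagnostic capabilities concrete, here is one simple way such quantities could be read off a log of predicate checks. The summary above does not give the paper's exact definitions, so the functions `pass_rates`, `first_failure`, and `partial_progress`, as well as the sample records, are assumptions made for illustration only.

```python
from typing import Dict, List, Optional, Tuple

# Each record is (predicate, certified). The values below are made up.
Record = Tuple[str, bool]

records: List[Record] = [
    ("logged in", True),
    ("search results shown", True),
    ("item in cart", False),
    ("item in cart", True),
    ("order confirmed", False),
]


def pass_rates(records: List[Record]) -> Dict[str, float]:
    """Per-predicate credit: fraction of attempts at each predicate that passed."""
    totals: Dict[str, int] = {}
    passes: Dict[str, int] = {}
    for predicate, certified in records:
        totals[predicate] = totals.get(predicate, 0) + 1
        passes[predicate] = passes.get(predicate, 0) + int(certified)
    return {p: passes[p] / totals[p] for p in totals}


def first_failure(records: List[Record]) -> Optional[int]:
    """Failure localization: index of the first step whose predicate failed."""
    return next((i for i, (_, ok) in enumerate(records) if not ok), None)


def partial_progress(records: List[Record]) -> float:
    """Share of distinct predicates that were certified at least once."""
    attempted = {p for p, _ in records}
    achieved = {p for p, ok in records if ok}
    return len(achieved) / len(attempted) if attempted else 0.0
```

A reactive agent that only emits actions leaves no per-step predicates to aggregate, which is why these measurements depend on the certified trajectory.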
- SDP constructs certified state spaces from raw text using natural-language predicates, eliminating the need for pre-defined MDP structure.
- Achieves the best training-free results on all five benchmarks, with an advantage that widens on long-horizon tasks.
- Enables per-predicate credit assignment, failure localization, and modular operator replacement – capabilities unavailable to reactive agents.
Why It Matters
SDP could unlock more reliable, explainable AI agents for real-world tasks like web automation and scientific discovery.