SDOF Framework Boosts Multi-Agent Accuracy to 80.9% by Enforcing Business Stage Constraints

arXiv cs.AI May 18, 2026

⚡ New state-machine approach with a 7B router beats zero-shot GPT-4o by 32 percentage points

Deep Dive

Existing multi-agent frameworks such as LangChain and CrewAI route tasks through pipelines but do not enforce the stage constraints of real business processes. SDOF addresses this by treating execution as a constrained state machine with two defensive layers: an Online-RLHF Specialized Intent Router (a 7B model fine-tuned via Generative Reward Modeling) and a StateAwareDispatcher that runs GoalStage finite-automaton checks and validates each skill's preconditions and postconditions from a SkillRegistry. This design eliminates the 'alignment tax' (performance drops caused by unconstrained agent actions) while keeping outputs auditable.
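The dispatch-time checks described above can be pictured with a small sketch. It is an illustration only, not SDOF's actual code: the stage names, the `Skill` fields, the `TRANSITIONS` table, and the two HR skills are all hypothetical, and the only names taken from the summary are the concepts of a stage automaton, a skill registry, and precondition/postcondition validation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set

State = Dict[str, object]

# Hypothetical stage automaton: which stage may follow which.
TRANSITIONS: Dict[str, Set[str]] = {
    "screening": {"interview"},
    "interview": {"offer"},
    "offer": set(),
}


@dataclass(frozen=True)
class Skill:
    name: str
    stage: str                      # stage in which the skill is allowed to run
    next_stage: str                 # stage entered after a successful run
    pre: Callable[[State], bool]    # must hold before execution
    post: Callable[[State], bool]   # must hold after execution
    run: Callable[[State], State]


class Dispatcher:
    """Toy stand-in for a state-aware dispatcher (assumed design)."""

    def __init__(self, registry: Dict[str, Skill], stage: str) -> None:
        self.registry = registry
        self.stage = stage

    def dispatch(self, skill_name: str, state: State) -> State:
        skill = self.registry.get(skill_name)
        if skill is None:
            raise PermissionError(f"unknown skill: {skill_name}")
        # Stage check: the skill must belong to the current stage and
        # lead to a stage the automaton allows next.
        if skill.stage != self.stage or skill.next_stage not in TRANSITIONS[self.stage]:
            raise PermissionError(f"{skill_name} is not legal in stage {self.stage}")
        if not skill.pre(state):
            raise PermissionError(f"precondition failed for {skill_name}")
        new_state = skill.run(dict(state))  # run on a copy of the state
        if not skill.post(new_state):
            raise RuntimeError(f"postcondition failed for {skill_name}")
        self.stage = skill.next_stage       # advance only after all checks pass
        return new_state


# Hypothetical registry with two HR skills.
REGISTRY = {
    "schedule_interview": Skill(
        "schedule_interview", "screening", "interview",
        pre=lambda s: s.get("resume_ok") is True,
        post=lambda s: "interview_slot" in s,
        run=lambda s: {**s, "interview_slot": "TBD"},
    ),
    "send_offer": Skill(
        "send_offer", "interview", "offer",
        pre=lambda s: s.get("interview_passed") is True,
        post=lambda s: s.get("offer_sent") is True,
        run=lambda s: {**s, "offer_sent": True},
    ),
}
```

In this sketch, a request to run `send_offer` while the dispatcher is still in the `screening` stage is rejected before the skill executes, which is the kind of stage-order violation the framework is described as blocking.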

On the Beisen iTalent platform, which serves 6,000+ enterprises, 185 expert-curated scenarios triggered 1,671 live API calls. The 7B Intent Router scored 80.9% joint accuracy on FSM-constrained adversarial routing, against 48.9% for zero-shot GPT-4o, a gap of 32 percentage points. In end-to-end execution, SDOF achieved 86.5% task completion (95% CI: 80.8–90.7) and blocked all 22 injection/illegal operations. A broader message-level audit yielded 100% precision, 88% recall, and expert agreement of kappa=0.94. A separate evaluation across 960 dialogues from 8 service domains identified 201 stage-order conflicts under the FSM mapping. These results cover only the scope evaluated so far; extended training comparisons are forthcoming.
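As a sanity check on the reported interval, the sketch below computes a Wilson score interval for a binomial proportion. Two things here are assumptions rather than facts from the source: that 86.5% of 185 scenarios corresponds to 160 completed tasks, and that a Wilson interval is the method used. Under those assumptions the result is consistent with the reported 80.8–90.7 range.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 for ~95%)."""
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Assumed counts: 160 of 185 scenarios completed (160 / 185 is about 86.5%).
low, high = wilson_interval(160, 185)
print(f"{low:.1%} to {high:.1%}")
```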

Key Points

7B Intent Router achieves 80.9% joint accuracy vs zero-shot GPT-4o's 48.9% on FSM-constrained adversarial routing
Blocks all 22 injection and illegal HR operations, with 100% precision and 88% recall in message audits
86.5% end-to-end task completion on Beisen's 185-scenario benchmark spanning 1,671 live API calls

Why It Matters

Enforcing stage-order constraints in multi-agent AI makes enterprise workflows auditable and blocks injected or out-of-order operations before they execute.
