SentinelAgent framework secures AI agents with 100% attack detection and zero false positives

arXiv cs.MA April 06, 2026

⚡ New system blocks 30/30 attacks in federal AI tests with mechanically verified security properties.

Deep Dive

Researcher KrishnaSaiReddy Patil has published SentinelAgent, a framework aimed at a security gap in multi-agent AI systems: delegation chains that become opaque and unverifiable. The system introduces a Delegation Chain Calculus (DCC) that defines seven security properties (six deterministic, one probabilistic) for tracking authorization from User X through Agent A to Tool C. A non-LLM Delegation Authority Service enforces these properties at runtime. On the DelegationBench v4 benchmark, which includes 516 scenarios across 10 attack categories and 13 federal domains, it achieves a 100% true positive rate with zero false positives.

Under adversarial testing, the system blocked all 30 black-box attacks. Intent verification fared worse: it degraded to 13% against sophisticated paraphrasing, and fine-tuning on 190 government delegation examples raised its true positive rate from 1.7% to 88.3%. The six deterministic properties (including authority narrowing and forensic reconstructibility) were mechanically verified via TLA+ model checking across 2.7 million states with zero violations, so that even if intent verification is bypassed, adversaries remain constrained to permitted API calls and traceable actions. The framework already integrates with LangChain agents, so it can be applied to existing agent workflows in sensitive environments.
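
To make "authority narrowing" concrete, here is a minimal Python sketch of what a deterministic, non-LLM check along a delegation chain could look like. It is not taken from the paper: the `Delegation` class, the `check_authority_narrowing` function, and the scope strings are all hypothetical, and the rule assumed here is simply that each hop may pass on no more authority than the delegator itself holds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Delegation:
    """One hop in a delegation chain (hypothetical structure, not from the paper)."""
    delegator: str
    delegate: str
    scopes: frozenset


def check_authority_narrowing(root_scopes, chain):
    """Return (ok, reason) for a linear delegation chain.

    Assumed rule: every hop must be issued by the current authority holder
    and may only pass on a subset of the scopes that holder has.
    """
    allowed = frozenset(root_scopes)
    holder = chain[0].delegator if chain else None
    for i, hop in enumerate(chain):
        if hop.delegator != holder:
            return False, f"hop {i}: {hop.delegator} is not the current authority holder"
        extra = hop.scopes - allowed
        if extra:
            return False, f"hop {i}: scopes {sorted(extra)} exceed the delegator's authority"
        # Authority can only stay the same or shrink from here on.
        allowed, holder = hop.scopes, hop.delegate
    return True, "ok"


# Illustrative chain: User X -> Agent A -> Tool C (scope names are made up).
user_scopes = {"records:read", "records:write"}

narrowing_chain = [
    Delegation("user_x", "agent_a", frozenset({"records:read"})),
    Delegation("agent_a", "tool_c", frozenset({"records:read"})),
]

widening_chain = [
    Delegation("user_x", "agent_a", frozenset({"records:read"})),
    Delegation("agent_a", "tool_c", frozenset({"records:read", "records:write"})),
]

print(check_authority_narrowing(user_scopes, narrowing_chain))
print(check_authority_narrowing(user_scopes, widening_chain))
```

Because the check is plain set arithmetic, it gives the same answer for the same chain regardless of how a request is phrased, which is the kind of behavior the article attributes to the deterministic properties.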

Key Points

Achieves 100% true positive rate with 0% false positives on DelegationBench v4 (516 attack scenarios across 13 federal domains)
Blocks all 30 black-box adversarial attacks while maintaining six mechanically verified security properties across 2.7 million states
Fine-tuning improves intent verification from 1.7% to 88.3% TPR using 190 government delegation examples

Why It Matters

Enables secure, auditable deployment of multi-agent AI systems in government and enterprise where accountability is non-negotiable.
