[D] Seeking feedback: Safe autonomous agents for enterprise systems
Open-source agent 'Sentri' combines policy, RAG, and an LLM judge to prevent unsafe actions in databases.
An experienced infrastructure researcher is formalizing a novel framework for creating safe autonomous AI agents in high-stakes enterprise environments such as databases, cloud systems, and financial platforms. The core problem is that most LLM agent frameworks prioritize capability over verifiable safety, even though in these environments a single unsafe action can have serious real-world consequences. The proposed solution, demonstrated in an open-source tool named 'Sentri,' is a three-layer safety architecture: hard-coded policy enforcement to block destructive operations, retrieval-augmented generation (RAG) to ground decisions in past incidents and policy documents, and a final independent 'LLM judge' that evaluates safety before any action is executed.
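The three layers described above compose naturally as sequential gates: an action runs only if it passes the deterministic policy check, is reviewed against retrieved context, and is approved by the judge. A minimal Python sketch, with all names hypothetical (the post does not publish Sentri's actual API; the RAG retrieval and LLM judge are stubbed with placeholder logic):

```python
from typing import Callable

# Layer 1: hard-coded policy. A denylist of destructive SQL operations
# blocked unconditionally, before any model is consulted.
DESTRUCTIVE_KEYWORDS = ("DROP", "TRUNCATE", "DELETE", "ALTER")

def policy_allows(sql: str) -> bool:
    """Return False if the statement contains a hard-blocked operation."""
    upper = sql.upper()
    return not any(kw in upper for kw in DESTRUCTIVE_KEYWORDS)

def retrieve_context(sql: str) -> list[str]:
    """Layer 2 (stub): a real system would query a store of past
    incidents and policy documents relevant to this statement."""
    return []

def judge_approves(sql: str, context: list[str]) -> bool:
    """Layer 3 (stub): a real system would prompt an independent LLM
    with the statement plus retrieved context and parse its verdict."""
    return True  # placeholder verdict

def guarded_execute(sql: str, execute: Callable[[str], str]) -> str:
    """Run `execute(sql)` only if all three safety layers pass."""
    if not policy_allows(sql):
        return "blocked: policy violation"
    context = retrieve_context(sql)
    if not judge_approves(sql, context):
        return "blocked: judge rejected"
    return execute(sql)
```

Putting the cheap deterministic check first means a destructive statement never reaches a model at all, which is what makes that layer's guarantee verifiable rather than probabilistic.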
Sentri is built specifically for database remediation, automating Level 2 DBA workflows through an 'Alert → Root Cause Analysis → Remediation → Guarded Execution' pipeline. Early validation indicates it generates 'significantly fewer unsafe actions' than agents without this guardrail system. The researcher, who has a background from Georgia Tech and extensive industry experience, is now seeking community feedback to strengthen the work before publication. Key questions include the best academic venue (AI safety venues vs. systems conferences like VLDB), how to rigorously prove an agent is 'production-safe,' and whether to demonstrate deep capability in one domain or pursue lighter validation across multiple infrastructure domains.
The work also touches on broader research interests, including multi-agent financial reasoning benchmarks. By open-sourcing Sentri and soliciting input on baselines and adversarial testing methods, the researcher aims to create a generalized safety pattern that could apply to various autonomous systems operating under real-world constraints, moving beyond theoretical safety to practical, deployable guardrails.
- Proposes a 3-layer safety architecture: policy rules, RAG verification, and an independent LLM judge for pre-execution checks.
- Open-source 'Sentri' agent automates database remediation, showing 'significantly fewer unsafe actions' vs. naive LLM agents.
- Researcher seeks feedback on evaluation metrics and domain generalization before arXiv submission and conference presentation.
Why It Matters
Provides a blueprint for deploying powerful AI agents in critical systems without risking destructive, real-world errors.