Open Source

Agents given the choice between natural language and structured queries abandoned NL within minutes

When given the choice, AI agents abandoned natural language for structured queries within minutes, choosing reliability over "natural" interfaces.

Deep Dive

In a revealing experiment, the team at Cala observed unexpected behavior from AI agents interacting with their newly launched MCP (Model Context Protocol) server. The server provided three distinct interfaces to a knowledge graph: natural language queries, a structured query language, and direct entity/relationship traversal. The assumption was that agents, powered by large language models (LLMs), would default to the natural language interface—the hallmark of modern AI. Instead, most agents independently abandoned natural language within minutes, migrating to structured queries and graph traversal without any external prompting or guidance.
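The article does not publish Cala's actual server code, but the three access modes it describes can be sketched with a toy in-memory knowledge graph. All names here (KnowledgeGraph, nl_query, structured_query, traverse) are illustrative assumptions, not Cala's API; the point is the contrast between a lossy keyword-matching NL interface and the two deterministic ones.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeGraph:
    # entity -> {relation: [target entities]}
    edges: dict = field(default_factory=dict)

    def add(self, src, relation, dst):
        self.edges.setdefault(src, {}).setdefault(relation, []).append(dst)

    def nl_query(self, text):
        """Lossy interface: keyword-match entities mentioned in free text.
        The caller gets candidate entities and must guess which was meant."""
        return [e for e in self.edges if e.lower() in text.lower()]

    def structured_query(self, entity, relation):
        """Deterministic interface: exact entity + relation lookup."""
        return self.edges.get(entity, {}).get(relation, [])

    def traverse(self, start, *relations):
        """Direct traversal: follow a chain of relations from an entity."""
        frontier = [start]
        for rel in relations:
            frontier = [t for e in frontier
                        for t in self.edges.get(e, {}).get(rel, [])]
        return frontier

# Toy data for demonstration only.
kg = KnowledgeGraph()
kg.add("MCP", "specified_by", "Anthropic")
kg.add("Anthropic", "located_in", "San Francisco")

print(kg.nl_query("tell me about mcp"))                   # ['MCP']
print(kg.structured_query("MCP", "specified_by"))         # ['Anthropic']
print(kg.traverse("MCP", "specified_by", "located_in"))   # ['San Francisco']
```

Even in this toy form, the trade-off the experiment surfaced is visible: `nl_query` returns ambiguous candidates that still need interpretation, while `structured_query` and `traverse` return exactly one deterministic answer per call.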

This finding challenges a core premise of agent tooling design. The team posits that LLMs, optimized for correctness through techniques like Reinforcement Learning from Human Feedback (RLHF), develop an implicit drive for efficiency as a side effect. Natural language, while intuitive for humans, acts as a 'lossy' interface that introduces an unnecessary interpretation layer and uncertainty. When presented with a deterministic, structured path to an answer, the agents consistently chose reliability over the perceived naturalness of the conversation. This raises critical questions for developers about whether current tooling over-indexes on natural language interfaces and if MCP servers should prioritize structured access patterns by default.

The implications extend to fundamental agent architecture. If agents inherently prefer deterministic, graph-based traversal for accuracy, it may reshape how tools and APIs are designed for autonomous AI systems. The experiment suggests that for complex, multi-step reasoning tasks involving structured data, providing a direct, low-uncertainty pathway might be more effective than forcing interaction through a conversational layer. This could lead to a new generation of agent tooling that blends natural language for user input with structured, programmatic interfaces for internal execution, optimizing for both usability and reliability.
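A minimal sketch of that blended design might look like the following, with natural language confined to the user boundary and a deterministic lookup handling internal execution. The regex stands in for an LLM's intent-parsing step, and every name and data value here is a hypothetical illustration, not a description of any shipped system.

```python
import re

# Toy graph: (entity, relation) -> targets. Illustrative data only.
GRAPH = {("MCP", "specified_by"): ["Anthropic"]}

def structured_lookup(entity, relation):
    # Internal interface: exact key lookup, no interpretation layer.
    return GRAPH.get((entity, relation), [])

def answer(user_question):
    # Natural language is handled once, at the edge; the output of this
    # step is an explicit, low-uncertainty structured call.
    m = re.search(r"who specifies (\w+)", user_question, re.IGNORECASE)
    if not m:
        return "Could not map the question to a structured query."
    return structured_lookup(m.group(1), "specified_by")

print(answer("Who specifies MCP?"))  # ['Anthropic']
```

The design choice mirrors the article's conclusion: ambiguity is paid for exactly once at the user-facing layer, and every step after that is deterministic.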

Key Points
  • Cala's MCP server offered agents three knowledge-graph access methods: natural language queries, a structured query language, and direct entity/relationship traversal.
  • Agents autonomously abandoned natural language queries within minutes for more deterministic structured options.
  • The behavior suggests LLMs trained for correctness (RLHF) inherently seek reliable, low-uncertainty solution paths.

Why It Matters

This could fundamentally shift how developers design tools and APIs for AI agents, prioritizing reliability and determinism over conversational interfaces.