Research & Papers

flexvec: SQL Vector Retrieval with Programmatic Embedding Modulation

New 'Programmatic Embedding Modulation' technique allows AI agents to directly manipulate retrieval logic.

Deep Dive

Researcher Damian Delmas has published a paper introducing flexvec, a retrieval kernel designed for use by AI agents. Its core idea is 'Programmatic Embedding Modulation' (PEM), which exposes the embedding matrix and the score array as a programmable surface. The caller, typically an AI agent, can perform arithmetic on both vectors and scores *before* the final selection step, rather than going through a traditional 'black box' retrieval API. A query materializer integrates the system into a SQL interface, so developers can build composable query primitives directly into their agent workflows.
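
To make the idea of arithmetic on vectors and scores before selection concrete, here is a minimal NumPy sketch. It is not flexvec's API or the paper's implementation: the function name `modulated_top_k`, its parameters, the random toy data, and the two chosen modulations (steering the query away from an unwanted direction, then adding a recency bonus to the scores) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def unit(v):
    """Scale a vector to unit length."""
    return v / np.linalg.norm(v)

# Toy stand-ins (assumed, not from the paper): 1,000 chunks, 64-dim embeddings.
embeddings = rng.standard_normal((1000, 64)).astype(np.float32)
embeddings /= np.linalg.norm(embeddings, axis=1, keepdims=True)
query = unit(rng.standard_normal(64).astype(np.float32))
avoid = unit(rng.standard_normal(64).astype(np.float32))
ages_days = rng.uniform(0, 365, size=1000).astype(np.float32)

def modulated_top_k(embeddings, query, avoid, ages_days, k=5,
                    avoid_weight=0.5, recency_weight=0.1, half_life_days=90.0):
    """Hypothetical helper: modulate vectors and scores, then select top k."""
    # Vector arithmetic: push the query away from an unwanted direction.
    steered = unit(query - avoid_weight * avoid)
    # Exact scoring: one matrix-vector product over every chunk, no ANN index.
    scores = embeddings @ steered
    # Score arithmetic: add a bonus that halves every `half_life_days`.
    scores = scores + recency_weight * np.exp2(-ages_days / half_life_days)
    # Final selection happens only after both modulations have been applied.
    top = np.argpartition(-scores, k - 1)[:k]
    return top[np.argsort(-scores[top])], scores

top_idx, scores = modulated_top_k(embeddings, query, avoid, ages_days)
print(top_idx, scores[top_idx])
```

The point of the sketch is the ordering: the caller changes the query vector and the score array first, and top-k selection runs last, over the modulated scores.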

On a production corpus of 240,000 text chunks, three composed modulation operations executed in 19 milliseconds end-to-end on a standard desktop CPU, without the approximate indexing techniques that typically trade accuracy for speed. The same operations on one million chunks took 82 ms. This combination of programmability, SQL integration, and fast exact retrieval on commodity hardware could change how developers architect retrieval-augmented generation (RAG) systems and agentic workflows, giving AI systems finer-grained control over their own knowledge access.
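
As a rough reading of the two reported figures, the short calculation below derives an average cost per chunk and compares growth in corpus size with growth in latency. Only the two (chunks, milliseconds) pairs come from the summary above; the variable names are illustrative, and because the timings are end-to-end, the per-chunk numbers include any fixed overhead and should be read as estimates.

```python
# Reported figures: corpus size in chunks -> end-to-end latency in ms.
benchmarks = {240_000: 19.0, 1_000_000: 82.0}

# Average time per chunk in nanoseconds (1 ms = 1e6 ns).
per_chunk_ns = {n: ms * 1e6 / n for n, ms in benchmarks.items()}

# How much the corpus grew versus how much the latency grew.
size_ratio = 1_000_000 / 240_000
latency_ratio = benchmarks[1_000_000] / benchmarks[240_000]

for n, ns in per_chunk_ns.items():
    print(f"{n:>9,} chunks: {ns:.1f} ns per chunk")
print(f"corpus grew {size_ratio:.2f}x, latency grew {latency_ratio:.2f}x")
```

This works out to about 79 ns per chunk at 240,000 chunks and 82 ns per chunk at one million. Latency grows about 4.3x for a 4.2x larger corpus, which is close to linear and is what one would expect from an exact scan that touches every chunk.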

Key Points
  • Introduces Programmatic Embedding Modulation (PEM), allowing arithmetic operations on embeddings and scores before the final selection step.
  • Runs three composed modulation operations in 19 ms on 240K chunks and 82 ms on 1M chunks, using exact search on a desktop CPU.
  • Integrates modulation capabilities directly into a SQL interface, enabling composable query primitives for AI agents.

Why It Matters

Gives AI agents precise, programmable control over retrieval, enabling more sophisticated reasoning and action in RAG systems.