Research & Papers

Cognis: Context-Aware Memory for Conversational AI Agents

Open-source memory system combining BM25 keyword search, vector similarity, and a cross-encoder reranker.

Deep Dive

A team of researchers from Lyzr has introduced Cognis, a memory architecture designed to address a core limitation of current LLM agents: the lack of persistent, personalized memory. Published on arXiv, the system provides a unified backend that lets conversational AI agents remember past interactions across sessions, avoiding the reset that occurs in most current implementations. Its core contribution is a multi-stage retrieval pipeline that combines keyword, semantic, and reranking stages to improve recall.

Cognis employs a dual-store approach, pairing the keyword-matching strength of OpenSearch BM25 with the semantic matching of Matryoshka-embedding vector similarity search; results from the two stores are fused using Reciprocal Rank Fusion (RRF). The pipeline is context-aware: it retrieves existing memories before extracting new ones, enabling version tracking and consistency across updates. A final refinement stage uses a BGE-2 cross-encoder reranker to improve result quality, while temporal boosting prioritizes recent, time-sensitive information.
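Reciprocal Rank Fusion itself is simple to state: each document's fused score is the sum of 1/(k + rank) over the rankings it appears in. The sketch below is plain Python, not the Cognis codebase; the document identifiers and the two result lists are hypothetical, and k=60 is the constant from the original RRF paper, not a value taken from this write-up:

```python
def rrf_fuse(rankings, k=60):
    """Fuse multiple ranked lists with Reciprocal Rank Fusion.

    rankings: list of ranked doc-id lists (best first).
    k: smoothing constant (60 is the default from the original RRF paper).
    Returns doc ids sorted by fused score, best first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical results from the two stores:
bm25_hits = ["mem3", "mem1", "mem7"]    # keyword ranking
vector_hits = ["mem1", "mem7", "mem2"]  # semantic ranking

fused = rrf_fuse([bm25_hits, vector_hits])
# "mem1" wins: it ranks highly in both lists, which is exactly the
# behavior that makes RRF a good fit for hybrid retrieval.
```

Because RRF works on ranks rather than raw scores, it needs no calibration between BM25 scores and cosine similarities, which is a common reason hybrid pipelines choose it over weighted score averaging.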

The team validates the system on two independent benchmarks, LoCoMo and LongMemEval, reporting state-of-the-art results across eight different answer-generation models, which suggests robustness and model-agnostic utility. Crucially, Lyzr has released Cognis as open source and reports that it is already deployed in production serving real-world conversational AI applications, moving it from research concept to practical tool for developers.

Key Points
  • Uses a dual-store retrieval pipeline combining OpenSearch BM25 keyword search with Matryoshka vector similarity, fused via Reciprocal Rank Fusion (RRF).
  • Achieved state-of-the-art performance on both the LoCoMo and LongMemEval benchmarks across eight different answer generation models.
  • The system is open-source and already deployed in production for conversational AI applications, featuring context-aware ingestion and temporal boosting.
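The write-up does not spell out how temporal boosting is computed. One common formulation, shown here purely as an assumption rather than the Cognis implementation, multiplies each memory's retrieval score by an exponential decay on its age, so recent memories outrank stale ones at equal relevance:

```python
def temporal_boost(score, age_days, half_life_days=30.0):
    """Decay a retrieval score by memory age.

    half_life_days: age at which the boost halves (illustrative value,
    not taken from the paper).
    """
    return score * 0.5 ** (age_days / half_life_days)

# A fresh memory keeps its full score; a 30-day-old one is halved.
fresh = temporal_boost(1.0, age_days=0)   # -> 1.0
stale = temporal_boost(1.0, age_days=30)  # -> 0.5
```

Applied after fusion but before (or combined with) cross-encoder reranking, a decay like this lets time-sensitive facts ("my flight is on Friday") displace older, superficially similar memories.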

Why It Matters

Enables truly personalized AI assistants that remember user history, a critical step beyond single-session chatbots.