Models & Releases

Open-source memory layer for OpenAI apps. Your chatbot can now remember things between sessions and say "I don't know" when it should.

New system cuts token usage from 20k to 500 by scoring and retrieving only relevant conversation facts.

Deep Dive

A new open-source tool called Widemem AI tackles a fundamental limitation of current chatbot applications: their inability to remember information between user sessions. Built by developer remete618, it acts as a memory layer that sits between an application and the OpenAI API. Instead of expensively re-sending an entire conversation history (which can be 20,000 tokens or more), Widemem AI extracts important facts, scores them for relevance, and retrieves only the critical 500 or so tokens needed for the next query. This approach can reduce context window usage and associated API costs by over 95%.
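The core idea — score extracted facts for relevance and greedily fill a small token budget — can be sketched as follows. This is an illustrative outline, not Widemem AI's actual API; the `Fact` type and `select_facts` function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Fact:
    text: str
    relevance: float  # similarity of this stored fact to the current query
    tokens: int       # approximate token count of the fact

def select_facts(facts, budget=500):
    """Greedily pick the highest-relevance facts that fit the token budget."""
    chosen, used = [], 0
    for fact in sorted(facts, key=lambda f: f.relevance, reverse=True):
        if used + fact.tokens <= budget:
            chosen.append(fact)
            used += fact.tokens
    return chosen

# Instead of resending a 20k-token transcript, only the few facts that
# clear the relevance ranking and fit the budget go into the next prompt.
facts = [
    Fact("User's name is Dana", relevance=0.92, tokens=6),
    Fact("Prefers metric units", relevance=0.40, tokens=5),
    Fact("Discussed the weather last year", relevance=0.05, tokens=7),
]
context = select_facts(facts, budget=10)
```

A real implementation would also handle fact extraction and embedding-based scoring, but the budget-limited selection above is what drives the 20k-to-500 token reduction.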

The recently released v1.4 introduces a major upgrade: confidence scoring. This allows the system to recognize when it lacks sufficient, high-quality context to answer a question accurately. Developers can now configure their chatbots to respond with "I don't know" instead of generating a potentially incorrect hallucination based on weak vector matches. The tool offers three distinct modes—Strict, Helpful, and Creative—to balance accuracy with user experience, alongside configurable retrieval modes (Fast, Balanced, Deep) for tuning the speed-accuracy tradeoff. A `mem.pin()` function ensures critical facts are never forgotten.
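Confidence-gated answering of this kind can be sketched in a few lines. The threshold values and function name below are assumptions chosen for illustration, not Widemem AI's real defaults.

```python
# Hypothetical per-mode confidence thresholds: Strict abstains readily,
# Creative almost always attempts an answer.
THRESHOLDS = {"strict": 0.8, "helpful": 0.5, "creative": 0.2}

def answer_or_abstain(retrieved, mode="strict"):
    """Use retrieved (text, score) matches only if the best score clears
    the mode's threshold; otherwise abstain instead of hallucinating."""
    confidence = max((score for _, score in retrieved), default=0.0)
    if confidence < THRESHOLDS[mode]:
        return "I don't know"
    context = [text for text, score in retrieved if score >= THRESHOLDS[mode]]
    return f"Answering from context: {context}"

# A weak vector match (score 0.3) is rejected in Strict mode but
# accepted in Creative mode.
weak = [("something vaguely related", 0.3)]
print(answer_or_abstain(weak, mode="strict"))
```

The practical effect is that hallucination risk becomes a tunable dial: the same retrieval results produce an honest abstention or a best-effort answer depending on the mode.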

Widemem AI is model-agnostic, offering native support for OpenAI's GPT-4o and GPT-4o-mini, Anthropic's models, and local models via Ollama. This flexibility allows developers to build more efficient, cost-effective, and reliable conversational agents without being locked into a single provider. By providing granular control over memory, confidence, and retrieval, the tool moves beyond simple chat history management toward creating AI assistants with persistent, contextual understanding.
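Model-agnostic designs like this typically put a thin backend interface between the memory layer and the provider. The sketch below shows that pattern with hypothetical class names; it is not Widemem AI's code, and the `complete` methods return placeholder strings rather than calling any real API.

```python
from abc import ABC, abstractmethod

class ChatBackend(ABC):
    """Minimal provider interface the memory layer can target."""
    @abstractmethod
    def complete(self, prompt: str) -> str: ...

class OpenAIBackend(ChatBackend):
    def __init__(self, model="gpt-4o-mini"):
        self.model = model
    def complete(self, prompt):
        # A real backend would call the OpenAI chat completions API here.
        return f"[{self.model}] {prompt}"

class OllamaBackend(ChatBackend):
    def __init__(self, model="llama3"):
        self.model = model
    def complete(self, prompt):
        # A real backend would call a local Ollama server here.
        return f"[ollama/{self.model}] {prompt}"

def ask(backend: ChatBackend, memory_context: str, question: str) -> str:
    """Prepend the retrieved memory facts and hand off to any provider."""
    return backend.complete(f"{memory_context}\n\nQ: {question}")
```

Because the memory layer only ever sees the `ChatBackend` interface, swapping GPT-4o for an Anthropic model or a local Ollama model changes one constructor call, not the retrieval pipeline.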

Key Points
  • Cuts typical 20k token chat history to ~500 relevant tokens, slashing API costs by over 95%.
  • v1.4 adds confidence scoring, enabling chatbots to say "I don't know" and reduce hallucinations.
  • Offers three answer modes (Strict/Helpful/Creative) and supports OpenAI, Anthropic, and Ollama models.

Why It Matters

Enables developers to build affordable, persistent, and honest AI assistants that remember user context across sessions without runaway API costs.