
Semble code search cuts token usage 98% for AI agents

Hacker News May 18, 2026

⚡ Indexes a full codebase in ~250 ms and answers queries in ~1.5 ms on CPU.

Deep Dive

Semble is a new code search library purpose-built for AI agents, offering large efficiency gains over traditional grep-based approaches. Instead of combing through entire files with keywords, agents can query in natural language (e.g., "How is authentication handled?") and receive only the code snippets they need. The system claims to use ~98% fewer tokens than the typical grep+read workflow, reducing both cost and latency. An average repository indexes in about 250 milliseconds, and searches complete in around 1.5 milliseconds, all running on a standard CPU.
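To make the ~98% figure concrete, the arithmetic behind a token-savings percentage can be sketched as below. The function name `token_savings` and both token counts are hypothetical, chosen only to illustrate the calculation; they are not measurements from Semble or its benchmarks.

```python
def token_savings(baseline_tokens: int, snippet_tokens: int) -> float:
    """Return the fraction of tokens saved relative to the baseline workflow."""
    if baseline_tokens <= 0:
        raise ValueError("baseline_tokens must be positive")
    return 1.0 - snippet_tokens / baseline_tokens


# Hypothetical numbers: an agent that would read 50,000 tokens of whole files
# via grep+read, versus 1,000 tokens of returned snippets.
baseline = 50_000
snippets = 1_000

savings = token_savings(baseline, snippets)
print(f"{savings:.0%} fewer tokens")
```

Under these assumed counts the snippet-based workflow consumes one fiftieth of the baseline, which is what a 98% reduction means in practice.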

Built for seamless agent integration, Semble can run as an MCP server or be invoked via bash. It supports tools like Claude Code, Cursor, Codex, OpenCode, and any MCP-compatible agent. The library indexes local directories or remote git URLs, caches indexes per session, and watches local paths for automatic re-indexing on changes. For sub-agents that cannot call MCP tools directly, a bash integration via AGENTS.md/CLAUDE.md is also provided. No API keys, GPUs, or external services are required.
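The idea of caching an index and rebuilding it when files change can be illustrated with a minimal sketch. This is not Semble's implementation: it swaps file watching for a simpler content fingerprint that is recomputed on each lookup, and the names `fingerprint`, `IndexCache`, and `build_index` are all hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Any, Callable, Dict, Tuple


def fingerprint(root: Path) -> str:
    """Hash the relative path and contents of every file under root."""
    digest = hashlib.sha256()
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        digest.update(str(path.relative_to(root)).encode("utf-8"))
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


class IndexCache:
    """In-memory cache that rebuilds an index only when the directory changes."""

    def __init__(self, build_index: Callable[[Path], Any]) -> None:
        self._build_index = build_index  # hypothetical indexing function
        self._entries: Dict[Path, Tuple[str, Any]] = {}
        self.rebuilds = 0  # counts how often the index was actually rebuilt

    def get(self, root: Path) -> Any:
        root = root.resolve()
        current = fingerprint(root)
        cached = self._entries.get(root)
        if cached is not None and cached[0] == current:
            return cached[1]  # directory unchanged: reuse the cached index
        index = self._build_index(root)
        self._entries[root] = (current, index)
        self.rebuilds += 1
        return index
```

Hashing every file on each lookup is far slower than reacting to file-system events, so a real tool would more likely rely on an OS-level watcher; the sketch only shows the cache-invalidation logic.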

Benchmarks show Semble achieves an NDCG@10 of 0.854, on par with code-specialized transformer models like CodeBERT, while being roughly 200x faster to index and 10x faster to query. This combination of speed, accuracy, and zero-setup makes it ideal for developers looking to reduce token costs and improve agent response times in code-oriented tasks. The tool also tracks token savings with a `semble savings` command.
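For readers unfamiliar with the metric, NDCG@10 scores a ranked result list between 0 and 1, rewarding relevant results that appear near the top. The sketch below shows one common formulation (linear gain, log2 discount). It assumes the ideal ordering is derived from the same list of graded relevances, which is a simplification, and it is not taken from Semble's benchmark code.

```python
import math
from typing import Sequence


def dcg_at_k(relevances: Sequence[float], k: int) -> float:
    """Discounted cumulative gain over the top-k results (rank 1 is undiscounted)."""
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: Sequence[float], k: int = 10) -> float:
    """DCG normalized by the DCG of the best possible ordering of the same items."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    if ideal == 0:
        return 0.0  # no relevant results at all
    return dcg_at_k(relevances, k) / ideal


# Hypothetical graded relevances for one query, in the order the results were returned.
ranked = [3, 2, 0, 1]
print(f"NDCG@10 = {ndcg_at_k(ranked):.3f}")
```

A perfectly ordered list scores 1.0, so a reported average of 0.854 indicates that relevant snippets usually, but not always, appear in the top positions.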

Key Points

98% fewer tokens than grep+read by returning only relevant code snippets
~250 ms indexing and ~1.5 ms query times on CPU, no GPU or API keys needed
Works as MCP server or bash tool with Claude Code, Cursor, Codex, OpenCode

Why It Matters

Cuts token costs and latency for AI agents, making code exploration vastly more efficient.
