TextWeb converts web pages to markdown for LLM interaction
Ditch expensive screenshots: TextWeb lets LLMs browse the web as native markdown.
TextWeb, built by developer woheller69, is a markdown-based web renderer designed specifically for AI agents. Instead of capturing screenshots and passing them through a vision-language model, which is costly, TextWeb renders entire web pages as structured markdown text. Large language models (LLMs) can then read and interact with web content directly, with no vision capabilities required. The tool supports full JavaScript execution, so it can handle dynamic pages and single-page applications. It also annotates interactive elements such as text fields, buttons, and scrollable areas, so the LLM can refer to them and act on them in plain text.
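To make the idea of annotated interactive elements concrete, here is a minimal Python sketch of how an agent might pick them out of rendered markdown. The annotation syntax (`[2:button "Search"]`), the `Element` type, and the `find_elements` helper are all hypothetical and chosen only for illustration; TextWeb's actual output format may look quite different.

```python
import re
from typing import NamedTuple


class Element(NamedTuple):
    ref: int    # numeric handle the agent would use to address the element
    kind: str   # e.g. "input", "button"
    label: str  # human-readable label shown on the page


# Hypothetical annotation syntax, e.g. [2:button "Search"].
# This is an assumption for illustration, not TextWeb's documented format.
ANNOTATION = re.compile(r'\[(\d+):(\w+) "([^"]*)"\]')


def find_elements(markdown: str) -> list[Element]:
    """Return every annotated interactive element found in the markdown."""
    return [
        Element(int(m.group(1)), m.group(2), m.group(3))
        for m in ANNOTATION.finditer(markdown)
    ]


page = '''# Example Search
[1:input "Query"] [2:button "Search"]
'''

elements = find_elements(page)
for element in elements:
    print(element.ref, element.kind, element.label)
```

Because each element carries a plain-text handle, the model can name its target ("click element 2") without any pixel coordinates or image input.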
The tool comes with a command-line interface (CLI) for quick testing and an MCP (Model Context Protocol) server for integration into agent frameworks. It works with llama.cpp's web UI, providing a local, low-cost way for LLMs to browse and interact with web pages. Developers can use TextWeb to build agents that navigate, scroll, enter text, and click buttons, all without a vision model. This reduces API costs and latency while keeping the interaction purely text-based, which many LLMs handle more reliably. TextWeb is open source and available on GitHub, where community contributions and custom integrations are welcome.
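For readers unfamiliar with MCP, the following Python sketch shows the general shape of a tool invocation an agent framework might send to such a server. MCP messages are JSON-RPC 2.0, and tools are invoked with the `tools/call` method; the tool name `click` and its `ref` argument are assumptions made up for this example, not TextWeb's documented tool list.

```python
import json


def make_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 request that invokes an MCP tool."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# "click" and "ref" are hypothetical names used only for illustration.
request = make_tool_call(1, "click", {"ref": 2})
print(request)
```

In practice an MCP client library would handle this framing; the sketch only illustrates that every browsing action travels as a small text message rather than an image.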
- TextWeb renders web pages as markdown, with full JavaScript execution and annotated interactive elements
- Offers both a CLI and an MCP server for easy integration into LLM workflows
- Works with llama.cpp web UI, enabling local, vision-free web browsing for AI agents
Why It Matters
Replaces costly vision-based web browsing for LLMs, enabling cheaper, faster agentic web interaction.