Show HN: I taught GPT-OSS-120B to see using Google Lens and OpenCV
A clever hack gives any local LLM real vision capabilities without requiring API keys.
Deep Dive
A developer built an MCP server that gives local LLMs such as GPT-OSS-120B real vision by combining OpenCV and Google Lens. The system detects objects in an image, crops each one, and sends the crop to Lens for identification; it successfully recognized hardware such as an NVIDIA DGX Spark. The server also exposes 17 other Google services, including Search and Maps, without API keys, though commenters immediately raised concerns about terms-of-service violations and fragility.
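The detect, crop, identify loop described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes a grayscale image held as a list of pixel rows, swaps in a pure-Python flood fill for the OpenCV detection step, and takes a caller-supplied `identify` callable where the real tool would query Google Lens. All names here (`detect_regions`, `crop`, `describe_image`, `identify`) are hypothetical.

```python
from collections import deque
from typing import Callable, List, Tuple

Image = List[List[int]]          # grayscale rows; low values = background
Box = Tuple[int, int, int, int]  # (x, y, width, height)


def detect_regions(img: Image, threshold: int = 128) -> List[Box]:
    """Return bounding boxes of 4-connected blobs at or above `threshold`.

    Assumption: a simple stand-in for the OpenCV detection step.
    """
    height = len(img)
    width = len(img[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    boxes: List[Box] = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or img[y0][x0] < threshold:
                continue
            # Breadth-first flood fill, tracking the blob's extent.
            seen[y0][x0] = True
            queue = deque([(x0, y0)])
            min_x = max_x = x0
            min_y = max_y = y0
            while queue:
                x, y = queue.popleft()
                min_x, max_x = min(min_x, x), max(max_x, x)
                min_y, max_y = min(min_y, y), max(max_y, y)
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    if (0 <= nx < width and 0 <= ny < height
                            and not seen[ny][nx] and img[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((min_x, min_y, max_x - min_x + 1, max_y - min_y + 1))
    return boxes


def crop(img: Image, box: Box) -> Image:
    """Cut the sub-image covered by `box` out of `img`."""
    x, y, w, h = box
    return [row[x:x + w] for row in img[y:y + h]]


def describe_image(img: Image,
                   identify: Callable[[Image], str],
                   threshold: int = 128) -> List[Tuple[Box, str]]:
    """Detect regions, crop each, and label it with `identify`.

    `identify` is a hypothetical hook; the real tool would send the
    crop to Google Lens at this point.
    """
    return [(box, identify(crop(img, box)))
            for box in detect_regions(img, threshold)]
```

The text-only model never sees pixels in this design. It receives only the list of boxes and labels, which is why the approach can in principle work with any LLM that supports tool calls.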
Why It Matters
This hack demonstrates a low-cost way to add multimodal capabilities to any text model, but the terms-of-service exposure and technical fragility that commenters flagged are major risks.