Media & Culture

Subquadratic claims to break LLM scaling limits with 1,000x lower costs

Ex-DeepMind and Meta engineers say their model handles 12M tokens at 52x efficiency.

Deep Dive

Subquadratic, a startup founded by former DeepMind and Meta engineers, has announced a new LLM architecture it claims shatters current scaling limits. The company asserts its model achieves linear computational scaling, meaning doubling the input data only doubles processing requirements, in stark contrast to the quadratic cost growth of attention in traditional transformers. Subquadratic reports a 12-million-token context window and a 52x efficiency gain at the 1-million-token scale compared to standard architectures. If validated, this would allow models to natively process entire datasets without retrieval-augmented generation (RAG) or vector databases, drastically simplifying AI pipelines and reducing costs by up to 1,000x.
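The scaling contrast at the heart of the claim is simple arithmetic. The sketch below is purely illustrative (it does not reflect Subquadratic's actual architecture): it compares the pairwise attention cost of a standard transformer, which grows with the square of the context length, against a hypothetical linear-scaling model.

```python
# Illustrative only: relative compute cost as context length n grows.
# A hypothetical linear architecture is assumed; real constant factors differ.

def quadratic_cost(n: int) -> int:
    """Standard attention: every token attends to every other token."""
    return n * n

def linear_cost(n: int) -> int:
    """Hypothetical linear scaling: cost proportional to token count."""
    return n

# Doubling the context from 1,000 to 2,000 tokens quadruples the
# quadratic cost but only doubles the linear one.
for n in (1_000, 2_000):
    print(f"n={n}: quadratic={quadratic_cost(n):,} linear={linear_cost(n):,}")
```

At a 1-million-token context, the gap between the two curves is a factor of one million, which is why even a heavily optimized subquadratic design could plausibly report large efficiency multiples at that scale.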

The bold claims have been met with significant skepticism from the scientific community. Critics point to the lack of independent peer review, the closed-source nature of the model, and unresolved fundamental issues: notably the theoretical trade-off between simple data retrieval and complex global reasoning, and the physical memory-bandwidth bottlenecks that software alone cannot circumvent. Despite these concerns, Subquadratic has already secured $29 million in seed funding from investors including the Vision Fund, a Tinder co-founder, and early backers of OpenAI and Anthropic, pushing its valuation to $500 million. Whether this is a genuine architectural leap or a heavily optimized hybrid approach with hidden accuracy losses remains to be seen.

Key Points
  • Subquadratic claims linear scaling: doubling input doubles compute, versus quadratic cost in standard transformers.
  • Reports a 52x efficiency gain at 1M tokens and a 12M-token context window, potentially eliminating the need for RAG pipelines.
  • Raised $29M seed at $500M valuation; faces skepticism over lack of peer review and hardware bandwidth limits.

Why It Matters

If validated, Subquadratic could slash LLM costs 1,000x and render RAG obsolete, reshaping AI infrastructure.