Startups & Funding

Startup Gimlet Labs is solving the AI inference bottleneck in a surprisingly elegant way

Startup's software splits AI workloads across CPUs, GPUs, and high-memory systems, boosting efficiency 3-10x.

Deep Dive

Gimlet Labs, a startup founded by Stanford adjunct professor Zain Asgar, has secured an $80 million Series A funding round led by Menlo Ventures. The company is tackling the massive and costly AI inference bottleneck with a novel software solution. Its core product is a 'multi-silicon inference cloud,' an orchestration layer that intelligently splits complex, multi-step AI agent workloads to run across diverse hardware types simultaneously, including traditional CPUs, AI-tuned GPUs, and high-memory systems. The approach exploits the fact that the different stages of a workload, such as inference, decoding, and tool calls, have different hardware needs: some are compute-bound, others memory-bound or network-bound.
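To make the idea concrete, here is a purely illustrative sketch of routing workload stages to hardware by their bottleneck type. Gimlet has not published its scheduler, so the pool names, stage labels, and round-robin policy below are all hypothetical placeholders, not the company's actual design.

```python
from dataclasses import dataclass

# Hypothetical hardware pools keyed by the bottleneck they serve best.
HARDWARE_POOL = {
    "compute-bound": ["gpu-0", "gpu-1"],   # e.g. AI-tuned GPUs
    "memory-bound":  ["himem-0"],          # high-memory systems
    "network-bound": ["cpu-0", "cpu-1"],   # commodity CPUs for tool calls
}

@dataclass
class Stage:
    name: str
    bound: str  # "compute-bound" | "memory-bound" | "network-bound"

def schedule(workload):
    """Route each stage to the pool matching its bottleneck, round-robin within a pool."""
    counters = {kind: 0 for kind in HARDWARE_POOL}
    placement = {}
    for stage in workload:
        pool = HARDWARE_POOL[stage.bound]
        placement[stage.name] = pool[counters[stage.bound] % len(pool)]
        counters[stage.bound] += 1
    return placement

agent_workload = [
    Stage("decode", "memory-bound"),
    Stage("prefill", "compute-bound"),
    Stage("tool_call", "network-bound"),
]
print(schedule(agent_workload))
# → {'decode': 'himem-0', 'prefill': 'gpu-0', 'tool_call': 'cpu-0'}
```

A production scheduler would of course weigh live utilization, queue depth, and data locality rather than a static round-robin, but the core idea of matching a stage's bound type to the silicon that relieves it is the same.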

Gimlet's software aims to tap the vast underused capacity in existing data centers. Co-founder Asgar notes that current AI apps use their deployed hardware only 15-30% of the time, representing hundreds of billions in wasted resources. By dynamically allocating workloads to the best-suited available silicon, Gimlet claims it can speed up AI inference by 3x to 10x at the same cost and power. The technology can even slice a single underlying model to run across different chip architectures. The startup, which launched publicly in October 2024 with reported eight-figure revenue, targets not everyday developers but large AI model labs and cloud providers. It has already partnered with major chipmakers including NVIDIA, AMD, Intel, ARM, Cerebras, and d-Matrix.
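Slicing a single model across chip architectures is commonly done by splitting its layers into contiguous pipeline stages. The sketch below is only an assumption of what such a split could look like; the device names are placeholders and this is not Gimlet's published method.

```python
def slice_model(num_layers, devices):
    """Assign contiguous layer ranges to devices, balancing layer counts."""
    base, extra = divmod(num_layers, len(devices))
    plan, start = [], 0
    for i, dev in enumerate(devices):
        count = base + (1 if i < extra else 0)  # spread the remainder over early devices
        plan.append((dev, range(start, start + count)))
        start += count
    return plan

# Hypothetical example: a 32-layer model over three different chip types.
for dev, layers in slice_model(32, ["nvidia-gpu", "amd-gpu", "cpu-himem"]):
    print(f"{dev}: layers {layers.start}..{layers.stop - 1}")
# → nvidia-gpu: layers 0..10
# → amd-gpu: layers 11..21
# → cpu-himem: layers 22..31
```

The hard part in practice is not the split itself but balancing stages so that each device's bottleneck (FLOPs, memory bandwidth, or interconnect) is saturated at roughly the same rate, which is where an orchestration layer earns its efficiency gains.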

With the new capital, Gimlet Labs plans to scale its operations. The founding team, which previously worked together at the observability startup Pixie (acquired by New Relic), has deep technical expertise in distributed systems. The oversubscribed round brings the company's total raised to $92 million. Backers include notable angel investors such as Sequoia's Bill Coughran and Intel CEO Lip-Bu Tan, alongside firms such as Factory and Eclipse Ventures. The company currently employs 30 people and has seen its customer base more than double in recent months, including a major model maker and a large cloud provider.

Key Points
  • Raised $80M Series A led by Menlo Ventures to commercialize its multi-silicon inference orchestration software.
  • Claims 3-10x faster AI inference at the same cost by tapping idle hardware that today sits at only 15-30% utilization.
  • Targets large AI labs/data centers, has eight-figure revenue and partnerships with NVIDIA, AMD, Intel, ARM, Cerebras, and d-Matrix.

Why It Matters

Could dramatically reduce the cost and energy footprint of running large-scale AI, challenging the 'just buy more GPUs' paradigm.