LLMs show up to 39-point provider bias in code generation, study finds
New benchmark finds AI coding assistants favor their own company's ecosystem, and early choices in agentic tasks carry into downstream files up to 90.3% of the time.
Large Language Models (LLMs) may favor their own providers' ecosystems when generating code, a behavior called Vertical Integration Bias (VIB). A new benchmark, VIBench, measures this in direct and agentic code generation across 20 provider-selectable scenarios. Evaluating 10 provider-affiliated models against 3 non-affiliated controls reveals significant bias: up to +18.8 percentage points in direct generation and +39.2 points in agentic workflows. Early ecosystem choices in agentic tasks persist into downstream files up to 90.3% of the time, raising concerns about developer lock-in.
- Six of ten provider-affiliated LLMs showed statistically significant VIB, with direct code generation bias up to +18.8 percentage points.
- Agentic workflows amplify bias to as much as +39.2 percentage points, and early ecosystem choices persist into downstream files up to 90.3% of the time.
- The VIBench benchmark spans 20 software-integration scenarios and was run on 10 provider-affiliated models and 3 non-affiliated controls.
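The summary reports bias in percentage points but does not spell out how VIBench computes it. As a rough sketch only, assuming the score is the gap between how often an affiliated model picks its own provider's ecosystem and how often the non-affiliated controls pick that same ecosystem, the arithmetic could look like this. The function names, the `acme` provider, and the pick counts are all hypothetical and are not taken from the study.

```python
def selection_rate(picks, provider):
    """Percent of scenarios in which `provider` was the chosen ecosystem."""
    if not picks:
        raise ValueError("picks must not be empty")
    return 100.0 * sum(1 for p in picks if p == provider) / len(picks)


def bias_points(affiliated_picks, control_picks, provider):
    """Percentage-point gap between an affiliated model and the controls.

    Assumed definition for illustration; the benchmark's actual metric
    may differ.
    """
    return (selection_rate(affiliated_picks, provider)
            - selection_rate(control_picks, provider))


# Made-up picks over 20 scenarios, not data from the study.
affiliated = ["acme"] * 12 + ["other"] * 8   # own ecosystem in 12 of 20
controls = ["acme"] * 8 + ["other"] * 12     # same ecosystem in 8 of 20

gap = bias_points(affiliated, controls, "acme")
print(f"{gap:+.1f} percentage points")
```

Under this reading, a positive gap means the affiliated model chooses its parent company's ecosystem more often than neutral models do on the same scenarios, which is the kind of difference the reported +18.8 and +39.2 figures describe.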
Why It Matters
Developers risk vendor lock-in as AI coding tools increasingly steer code toward their parent company's ecosystem.