Anthropic downgraded cache TTL on March 6th
Cache duration dropped from 1 hour to 5 minutes, causing subscription users to hit quota limits unexpectedly.
Anthropic, the AI company behind Claude, has come under scrutiny after a silent server-side change significantly increased costs for developers. An analysis of 119,866 API calls from January to April 2026 shows that the company changed the default cache TTL (Time To Live) for its Claude Code API from 1 hour to just 5 minutes around March 6-8, 2026. The change reversed an improvement from February, when users enjoyed consistent 1-hour caching. As a result, cache creation costs surged by 20-32%, pushing subscription users toward their quota limits for the first time.
The impact was documented in a detailed GitHub issue showing day-by-day token usage patterns. Before March 6, users saw 33 consecutive days of clean 1-hour-only caching with near-zero wasted cost. After the change, 5-minute cache tokens became dominant, accounting for 83% of usage by March 8 and 93% by March 21. The data comes from two independent machines on different accounts, which strengthens the case that this was a server-side configuration change rather than client-side behavior.
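A day-by-day breakdown of this kind can be sketched as follows. This is a minimal illustration, not the script used in the GitHub issue: the record fields `day`, `tokens_5m`, and `tokens_1h` are hypothetical names, and the sample values are made up purely to show the calculation.

```python
from collections import defaultdict


def five_minute_share_by_day(records):
    """Return {day: share of cache-creation tokens written with a 5-minute TTL}.

    `records` is an iterable of dicts with the hypothetical keys
    'day', 'tokens_5m', and 'tokens_1h'.
    """
    totals = defaultdict(lambda: [0, 0])  # day -> [5-minute tokens, 1-hour tokens]
    for record in records:
        totals[record["day"]][0] += record["tokens_5m"]
        totals[record["day"]][1] += record["tokens_1h"]

    shares = {}
    for day, (t5, t1) in sorted(totals.items()):
        total = t5 + t1
        # Days with no cache writes get a share of 0.0 rather than dividing by zero.
        shares[day] = t5 / total if total else 0.0
    return shares


# Made-up sample records, for illustration only (not data from the issue).
sample = [
    {"day": "day-1", "tokens_5m": 0, "tokens_1h": 1000},
    {"day": "day-2", "tokens_5m": 750, "tokens_1h": 250},
    {"day": "day-2", "tokens_5m": 250, "tokens_1h": 750},
]
shares = five_minute_share_by_day(sample)
print(shares)
```

A sustained jump in the daily share, seen on independent machines at the same time, is the kind of signal that points to a server-side change rather than a client-side one.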
Cost analysis using official Anthropic pricing shows dramatic differences. In March 2026 alone, users paid $2,776.11 when they would have paid $2,057.01 with 1-hour caching: a $719.09 overpayment, or 25.9% of the bill. Anthropic closed the issue as 'not planned,' indicating that the company does not intend to revert the change. Developers are left to absorb the increased costs or significantly adjust their usage patterns.
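The March figures can be checked with a few lines of arithmetic. The sketch below uses only the two totals quoted above; the function name `overpayment` is made up for illustration. Subtracting the rounded totals gives $719.10, one cent more than the reported $719.09, presumably because the original analysis worked from unrounded amounts.

```python
from decimal import Decimal


def overpayment(actual: Decimal, counterfactual: Decimal) -> tuple[Decimal, Decimal]:
    """Return (extra amount paid, extra as a percentage of the actual bill)."""
    extra = actual - counterfactual
    # Express the waste as a share of what was actually paid, to one decimal place.
    share = (extra / actual * 100).quantize(Decimal("0.1"))
    return extra, share


# Totals quoted in the article for March 2026.
extra, share = overpayment(Decimal("2776.11"), Decimal("2057.01"))
print(f"overpayment: ${extra}, waste: {share}%")
```

Note that the percentage is taken relative to the actual bill; measured against the 1-hour baseline instead, the same overpayment would be a larger share.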
- Cache TTL silently regressed from 1 hour to 5 minutes in early March 2026, confirmed by analysis of 119,866 API calls
- Cache creation costs increased 20-32%, with March showing 25.9% waste (a $719 overpayment on a $2,776 bill)
- Subscription users hit quota limits unexpectedly after the change, with 5-minute cache tokens making up 93% of usage by March 21
Why It Matters
Silent API changes can dramatically increase operational costs for businesses relying on AI services, so teams need to monitor usage carefully and adjust budgets accordingly.