Google's Gemini 3.5 Flash costs 5.5x more but runs 70% faster
A 5.5x price hike buys 280 tokens per second, but the model ends up 75% pricier than Pro in practice.
Google quietly released Gemini 3.5 Flash this week, and its pricing is raising eyebrows across the AI community. According to Artificial Analysis data, the new model costs roughly 5.5 times more to run than the older 3.0 Flash. Input pricing has tripled to $1.50 per million tokens, and output now costs $9.00 per million tokens, a steep climb that puts the model in the premium tier. More concerning for developers: 3.5 Flash takes an average of 49 processing steps on complex tasks, compared with just 23 for the heavier 3.1 Pro. With more than twice as many steps per task, 3.5 Flash ends up about 75% more expensive to run in practice than the Pro model itself, despite being the 'Flash' variant.
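The step counts and the 75% figure together say something about per-step cost. Here is a rough sketch of that arithmetic in Python. It assumes that the cost of a task is simply the number of steps multiplied by an average per-step cost. That model is an assumption for illustration, since the article does not say how the 75% figure was derived, and the variable names are hypothetical.

```python
# Figures quoted in the article.
FLASH_STEPS = 49          # average steps per complex task, Gemini 3.5 Flash
PRO_STEPS = 23            # average steps per complex task, Gemini 3.1 Pro
TOTAL_COST_RATIO = 1.75   # "about 75% more expensive" than Pro

# How many more steps Flash takes per task.
step_ratio = FLASH_STEPS / PRO_STEPS

# ASSUMPTION: task cost = steps * average cost per step.
# Under that model, the per-step cost of Flash relative to Pro is:
implied_per_step_ratio = TOTAL_COST_RATIO / step_ratio

print(f"step ratio: {step_ratio:.2f}x")                       # step ratio: 2.13x
print(f"implied per-step cost: {implied_per_step_ratio:.2f}x")  # implied per-step cost: 0.82x
```

Under this simplified model, each Flash step would still be somewhat cheaper than a Pro step (about 0.82x), but taking roughly 2.13x as many steps more than cancels that out.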
On the performance side, there are real trade-offs. The model is fast, producing 280 tokens per second, a 70% speed improvement over previous versions. It also posted a strong IQ index score of 55, beating Grok 4.3 and Claude Sonnet 4.6. However, its coding benchmark score is a weak 45, and while its hallucination rate fell by 31 points to 61%, that is still a high error rate. The pricing mirrors a broader industry trend: OpenAI's GPT-5.5 is 50-90% more expensive than its predecessor, and Claude Opus 4.7 is up 30-40%. The shift toward autonomous, multi-step AI systems is driving compute costs upward, forcing everyone to rethink API budgets.
- Gemini 3.5 Flash is 5.5x more expensive than 3.0 Flash: $1.50/M input, $9.00/M output tokens.
- Averages 49 steps per complex task (vs. 23 for 3.1 Pro), making it about 75% costlier than Pro in practice.
- Benchmarks: IQ 55 (beats Grok 4.3, Claude Sonnet 4.6), coding 45, hallucinations down 31 pts to 61%.
Why It Matters
API costs are ballooning across models; developers must rethink budgets as autonomous systems demand more compute.
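To make the budgeting point concrete, here is a minimal sketch of a monthly cost estimate at the quoted Gemini 3.5 Flash prices ($1.50 per million input tokens, $9.00 per million output tokens). The `Workload` class, the function name, and every workload number except the 49-step average are hypothetical values chosen for illustration, not measurements.

```python
from dataclasses import dataclass

# Prices quoted in the article, in USD per million tokens.
INPUT_USD_PER_M = 1.50
OUTPUT_USD_PER_M = 9.00


@dataclass
class Workload:
    """Hypothetical agent workload; all fields are illustrative assumptions."""
    tasks_per_day: int
    steps_per_task: int
    input_tokens_per_step: int
    output_tokens_per_step: int


def monthly_cost_usd(w: Workload, days: int = 30) -> float:
    """Estimate API spend over `days` days for a multi-step workload."""
    steps = w.tasks_per_day * w.steps_per_task * days
    input_cost = steps * w.input_tokens_per_step * INPUT_USD_PER_M / 1_000_000
    output_cost = steps * w.output_tokens_per_step * OUTPUT_USD_PER_M / 1_000_000
    return input_cost + output_cost


# 49 steps per task is the article's average; the other numbers are made up.
example = Workload(
    tasks_per_day=1_000,
    steps_per_task=49,
    input_tokens_per_step=2_000,
    output_tokens_per_step=500,
)
print(f"${monthly_cost_usd(example):,.2f}")  # $11,025.00
```

Because cost scales linearly with `steps_per_task` in this model, the step count matters as much to the estimate as the per-token price does.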