Models & Releases

Spent 7,356,000,000 input tokens in November 🫣 All about tokens

⚡After processing 7.3B tokens in one month, an AI founder shares hard-won optimization strategies.

Deep Dive

An AI founder's analysis of processing 7.3 billion input tokens in November, which would cost roughly $110,000 at OpenAI's o1 input-token pricing, reveals critical insights for managing large-scale AI deployments. Tilen, who runs an SEO automation agent, found that token pricing varies dramatically between models: GPT-4o-mini charges $0.15 per million input tokens while OpenAI's reasoning model o1 costs $15 per million, a 100x difference. He also emphasizes that output tokens typically cost about 4x more than input tokens, making response optimization crucial.
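The per-token arithmetic can be checked directly. The helper below is ours, using the per-million input prices quoted in the article:

```python
# Per-million-input-token prices quoted in the article (USD).
PRICE_PER_MILLION = {
    "gpt-4o-mini": 0.15,
    "o1": 15.00,
}

def input_cost(tokens: int, model: str) -> float:
    """Cost in USD for a given number of input tokens."""
    return tokens / 1_000_000 * PRICE_PER_MILLION[model]

tokens = 7_356_000_000  # November's input-token total

print(f"gpt-4o-mini: ${input_cost(tokens, 'gpt-4o-mini'):,.2f}")  # $1,103.40
print(f"o1:          ${input_cost(tokens, 'o1'):,.2f}")           # $110,340.00
```

The same volume costs three orders of magnitude more on the reasoning model, which is why per-task model choice dominates the bill.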

To combat these costs, Tilen developed five key strategies. First, prompt caching (enabled by default in OpenAI's API) can significantly reduce token usage when dynamic content is placed at the end of prompts. Second, structuring prompts to return minimal data—like position numbers instead of full text—reduced his output costs by 60%. Third, OpenAI's Batch API offers 50% discounts for non-urgent processing with 24-hour turnaround. Fourth, choosing the right model for each task is essential given the massive price variations. Finally, he recommends setting billing alerts after learning from painful experience.
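The first two strategies can be sketched in one message-building function. This is an illustrative shape, not Tilen's actual code: prompt caching matches on an identical leading span of the prompt, so the static instructions go first and the per-request content goes last, and the user message asks for position numbers rather than full text.

```python
def build_messages(static_instructions: str, candidates: list[str], query: str) -> list[dict]:
    """Order messages so the reusable prefix can be cached.

    Caching keys on an identical prompt prefix, so anything that
    changes per request belongs at the end.
    """
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(candidates, start=1))
    return [
        # Static part first: identical across requests, so it forms a cacheable prefix.
        {"role": "system", "content": static_instructions},
        # Dynamic part last: changes on every request.
        {
            "role": "user",
            "content": (
                f"Candidates:\n{numbered}\n\nQuery: {query}\n"
                # Strategy 2: ask for a position, not full text, to shrink output tokens.
                "Reply with only the position number of the best candidate."
            ),
        },
    ]

messages = build_messages(
    "You rank SEO keyword candidates for relevance.",
    ["best running shoes", "shoe repair near me", "running shoe reviews"],
    "buying running shoes",
)
```

Returning a single digit instead of the winning candidate's text is the kind of restructuring that cut his output costs by 60%.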

The analysis also clarifies token fundamentals: tokens function as "LEGO pieces" for language processing, with approximately 1 token equaling 4 English characters or ¾ of a word. Non-English languages often require more tokens per phrase, and OpenAI provides a free tokenizer tool for precise calculations. For teams using multiple providers, Tilen recommends the OpenRouter API as a unified interface supporting OpenAI, Claude, DeepSeek, and Gemini models through one client.
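The 4-characters-per-token rule of thumb turns into a quick back-of-envelope estimator. This is only a heuristic for English text; non-English input usually needs more tokens than it predicts, and OpenAI's tokenizer tool gives exact counts:

```python
import math

def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate using the ~4 English characters per token rule."""
    return math.ceil(len(text) / chars_per_token)

# 42 characters -> roughly 11 tokens under the heuristic.
estimate_tokens("Spent 7.3 billion input tokens in November")  # 11
```

A crude estimator like this is enough for budgeting and billing-alert thresholds, where being within 10-20% is fine.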

Key Points
  • Processed 7.3B tokens in November with costs ranging from $0.15/M (GPT-4o-mini) to $15/M (OpenAI o1)
  • Output tokens cost 4x more than input tokens—prompt restructuring reduced costs by 60%
  • Batch API offers 50% savings for non-urgent processing with 24-hour turnaround
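The Batch API saving in the last point works by submitting a JSONL file of requests and collecting results within 24 hours. Below is a sketch of building those request lines; the payload shape follows OpenAI's documented batch format, while the ids, model, and prompts are made-up placeholders:

```python
import json

def batch_line(custom_id: str, model: str, prompt: str) -> str:
    """One JSONL line in the Batch API's request format."""
    return json.dumps({
        "custom_id": custom_id,          # your id, echoed back in the results file
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    })

lines = [
    batch_line(f"req-{i}", "gpt-4o-mini", f"Summarize article {i}.")
    for i in range(3)
]
jsonl = "\n".join(lines)
# Upload `jsonl` as a file with purpose="batch", then create the batch with
# completion_window="24h"; completed requests are billed at 50% of the
# synchronous price.
```

Anything that doesn't need an interactive response, such as bulk summarization or classification, is a candidate for this half-price lane.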

Why It Matters

As AI scales, token costs become major budget items—these optimizations can save enterprises millions annually.