The Hidden Cost of Thinking: Energy Use and Environmental Impact of LMs Beyond Pretraining
New Olmo 3 analysis shows 82% of compute goes to experiments, not final models.
A comprehensive new study from researchers at Carnegie Mellon University and the Allen Institute for AI (Jacob Morrison, Noah A. Smith, Emma Strubell) provides the first detailed breakdown of the environmental impact of a full language model development pipeline, moving beyond the usual narrow focus on training a single final model. The paper analyzes the Olmo 3 model family (7B and 32B parameters) across all stages: pretraining, supervised fine-tuning, preference optimization, and reinforcement learning (RL). The key finding? Reasoning models, those fine-tuned via RL for chain-of-thought, require a staggering 17x more datacenter energy to post-train than their instruction-tuned counterparts, driven almost entirely by rollout generation during RL.
The study further reveals a massive hidden cost: what the authors call "development costs" (experimentation, failed runs, and ablations) accounts for 82.2% of total compute used in the entire process, roughly a 65% relative increase over the ~50% share reported in prior work that covered only pretraining. In absolute terms, Olmo 3 development consumed approximately 12.3 GWh of datacenter energy, emitted 4,251 metric tons of CO2 equivalent (tCO2eq), and used 15,887 kiloliters of water. Notably, water consumption was driven entirely by thermoelectric power generation infrastructure rather than by datacenter cooling. The authors argue these costs go almost entirely unreported by model developers and are growing rapidly as post-training pipelines become more complex, and they call for updated environmental reporting standards.
- Reasoning models require 17x more energy to post-train than instruction-tuned models, driven by RL rollout generation.
- Development experiments and failed runs account for 82.2% of total compute, roughly 65% more in relative terms than prior pretraining-only estimates (see the quick check below).
- Totals for Olmo 3: 12.3 GWh of energy, 4,251 tCO2eq of emissions, and 15,887 kL of water (water use entirely from power generation, not cooling).
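To make the headline arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes (our assumption, not the paper's) that the 82.2% development compute share maps uniformly onto the 12.3 GWh of datacenter energy, and the implied carbon and water intensities at the end are averages derived from the reported totals rather than figures quoted from the study.

```python
# Back-of-the-envelope check of the reported Olmo 3 figures.
# Assumption (ours, not the paper's): the 82.2% development *compute*
# share maps uniformly onto the 12.3 GWh of datacenter *energy*.

TOTAL_ENERGY_GWH = 12.3   # total datacenter energy, full pipeline
DEV_SHARE = 0.822         # development: experiments, failed runs, ablations
PRIOR_SHARE = 0.50        # ~50% share reported by prior pretraining-only work
EMISSIONS_TCO2EQ = 4_251  # total emissions, metric tons CO2-equivalent
WATER_KL = 15_887         # total water use, kiloliters

dev_gwh = TOTAL_ENERGY_GWH * DEV_SHARE                  # ~10.1 GWh on development
final_gwh = TOTAL_ENERGY_GWH - dev_gwh                  # ~2.2 GWh on final runs
rel_increase = (DEV_SHARE - PRIOR_SHARE) / PRIOR_SHARE  # ~0.64, i.e. "~65% more"

# Implied averages (derived here, not quoted from the paper):
g_co2_per_kwh = EMISSIONS_TCO2EQ * 1e6 / (TOTAL_ENERGY_GWH * 1e6)  # ~346 g/kWh
liters_per_kwh = WATER_KL * 1e3 / (TOTAL_ENERGY_GWH * 1e6)         # ~1.3 L/kWh

print(f"Development energy: {dev_gwh:.1f} GWh vs. final runs: {final_gwh:.1f} GWh")
print(f"Relative increase over prior ~50% estimate: {rel_increase:.0%}")
print(f"Implied carbon intensity: {g_co2_per_kwh:.0f} gCO2eq/kWh")
print(f"Implied water intensity: {liters_per_kwh:.2f} L/kWh")
```

Note the trap in "65% more": it is a relative increase (82.2/50 ≈ 1.64), not a 65-percentage-point jump; in absolute terms, the development share grew by about 32 percentage points.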
Why It Matters
As AI reasoning models proliferate, unreported development energy could dwarf the cost of final training runs, demanding new transparency standards.