Hugging Carbon: Quantifying the Training Carbon Emissions of AI Models at Scale
Training the most popular open-source AI models emitted ~58,000 metric tons of CO2.
A team of researchers (Xinlei Wang, Ruibo Ming, Jing Qiu, Junhua Zhao, and Jinjin Gu) has released an arXiv preprint (2605.01549) titled "Hugging Carbon: Quantifying the Training Carbon Emissions of AI Models at Scale." The paper addresses the lack of systematic carbon accounting in the AI industry by treating Hugging Face as a large, publicly accessible corpus of models. The authors propose a FLOPs-based framework that estimates aggregate training emissions for open-source models even when training metadata is incomplete. To handle uneven disclosure quality, they introduce a tiered approach supported by empirical regressions, and they define a new metric, AI Training Carbon Intensity (ATCI), which measures emissions per unit of training compute. Their analysis finds that the most popular open-source models (those with over 5,000 downloads) have collectively produced about 58,000 metric tons of CO2 emissions.
The study's findings highlight a growing environmental concern as AI scales under the scaling-law paradigm. By focusing on open-source models widely shared on Hugging Face, the authors provide a reproducible, audit-ready methodology that can inform future standards and sustainability strategies. The tiered framework enables estimation even when key details such as GPU type or training duration are missing, using statistical regressions to fill the gaps. This work not only quantifies current emissions but also lays a foundation for ongoing monitoring. As AI adoption accelerates, such carbon accounting tools become essential for developers, policymakers, and industry leaders aiming to balance performance with environmental responsibility.
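To make the estimation pipeline concrete, here is a minimal Python sketch of a tiered, FLOPs-based emissions estimator in the spirit of the framework described above. Every name and constant (the GPU throughput and TDP values, the PUE and grid-intensity figures, the 6*N*D compute approximation, and the tier-3 regression placeholder) is an illustrative assumption, not the authors' actual values or code.

```python
# Hypothetical sketch of a tiered, FLOPs-based emissions estimator.
# All constants and the regression fallback are illustrative assumptions,
# not the paper's actual implementation.

from dataclasses import dataclass
from typing import Optional

@dataclass
class ModelCard:
    name: str
    train_flops: Optional[float] = None   # total training FLOPs, if disclosed
    params: Optional[float] = None        # parameter count, if disclosed
    tokens: Optional[float] = None        # training tokens, if disclosed
    gpu_tdp_watts: float = 400.0          # assumed accelerator power draw
    gpu_flops_per_sec: float = 3.12e14    # assumed sustained throughput (~A100 BF16 peak)

PUE = 1.2                 # assumed data-center power usage effectiveness
GRID_KGCO2_PER_KWH = 0.4  # assumed grid carbon intensity

def estimate_flops(card: ModelCard) -> float:
    """Tiered FLOPs estimate: disclosed FLOPs (tier 1), else the common
    6*N*D approximation (tier 2), else a placeholder regression on
    parameter count alone (tier 3)."""
    if card.train_flops is not None:
        return card.train_flops
    if card.params is not None and card.tokens is not None:
        return 6.0 * card.params * card.tokens
    if card.params is not None:
        # Placeholder regression: tokens ~ 20 * params (Chinchilla-style heuristic).
        return 6.0 * card.params * (20.0 * card.params)
    raise ValueError(f"{card.name}: not enough metadata to estimate FLOPs")

def estimate_emissions_kg(card: ModelCard) -> float:
    """CO2 in kg: FLOPs -> GPU-seconds -> kWh -> kg CO2."""
    gpu_seconds = estimate_flops(card) / card.gpu_flops_per_sec
    kwh = gpu_seconds * card.gpu_tdp_watts / 1000.0 / 3600.0
    return kwh * PUE * GRID_KGCO2_PER_KWH

# Example: a hypothetical 7B-parameter model trained on 2T tokens.
card = ModelCard(name="example-7b", params=7e9, tokens=2e12)
print(f"{estimate_emissions_kg(card) / 1000:.1f} t CO2")
```

The tiered structure mirrors the disclosure problem the paper tackles: the estimator degrades gracefully from reported values to derived ones as metadata quality drops.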
- Proposes a FLOPs-based framework that estimates training emissions for Hugging Face models with incomplete metadata using a tiered statistical approach.
- Top open-source models (5,000+ downloads) collectively produced ~58,000 metric tons of CO2 during training.
- Introduces AI Training Carbon Intensity (ATCI), a new efficiency metric for comparing sustainability across models (see the sketch after this list).
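The ATCI metric itself reduces to a simple normalization of emissions by compute. The sketch below is hypothetical: the unit choice (kg CO2 per exaFLOP) and the example numbers are assumptions for illustration, since the paper defines the exact normalization.

```python
# Hypothetical sketch of the ATCI idea: emissions normalized by compute.
# The kg-CO2-per-exaFLOP unit is an illustrative assumption.

def atci(emissions_kg: float, train_flops: float) -> float:
    """AI Training Carbon Intensity: kg CO2 per exaFLOP (1e18 FLOPs) of training compute."""
    return emissions_kg / (train_flops / 1e18)

# Comparing two hypothetical models with equal compute: lower ATCI = cleaner training.
print(atci(14_400, 8.4e22))   # ~0.17 kg CO2 / exaFLOP
print(atci(31_000, 8.4e22))   # ~0.37 kg CO2 / exaFLOP
```

Normalizing by compute rather than reporting raw totals lets models of very different sizes be compared on how cleanly, not just how much, they trained.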
Why It Matters
Provides the first scalable, transparent carbon accounting for open-source AI, enabling sustainability benchmarks for the industry.