MLCommons Chakra standardizes AI workload traces for benchmarking
Open execution traces enable reproducible performance analysis of distributed ML training.
MLCommons, the industry consortium behind the MLPerf benchmarks, has unveiled Chakra, an open and portable ecosystem for performance benchmarking and co-design of distributed AI/ML workloads. The centerpiece is the Chakra Execution Trace (ET), a standardized graph-based representation that captures key operations such as compute, memory, and communication, along with data and control dependencies, timing, and resource constraints. The format enables reproducible observation, analysis, and optimization of production-scale distributed training behavior. Chakra also includes a complementary suite of tools for trace collection, analysis, and generation, and for adopting traces across a broad range of simulators, emulators, and replay tools. The project has active contributions from industry leaders including NVIDIA, AMD, Meta, Keysight, HPE, and Scala, and the accompanying paper has been accepted at the 9th Conference on Machine Learning and Systems (MLSys 2026).
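To make the idea of a graph-based execution trace concrete, here is a minimal Python sketch. It is not the Chakra schema or its tooling: the `NodeKind`, `TraceNode`, and `execution_order` names, the field names, and the durations are all hypothetical and chosen only for illustration. It shows operations as typed nodes with data and control dependencies, and one valid execution order derived from them.

```python
from dataclasses import dataclass, field
from enum import Enum
from graphlib import TopologicalSorter


class NodeKind(Enum):
    """Hypothetical operation categories; not Chakra's actual node types."""
    COMPUTE = "compute"
    MEMORY = "memory"
    COMMUNICATION = "communication"


@dataclass
class TraceNode:
    """One operation in an illustrative execution trace."""
    node_id: int
    name: str
    kind: NodeKind
    duration_us: float                                   # measured runtime, microseconds
    data_deps: list[int] = field(default_factory=list)   # producers of this node's inputs
    ctrl_deps: list[int] = field(default_factory=list)   # ordering-only predecessors


def execution_order(nodes):
    """Return node ids in an order that respects data and control dependencies."""
    graph = {n.node_id: set(n.data_deps) | set(n.ctrl_deps) for n in nodes}
    return list(TopologicalSorter(graph).static_order())


# A made-up single training step on one rank.
trace = [
    TraceNode(0, "load_weights", NodeKind.MEMORY, 40.0),
    TraceNode(1, "forward_matmul", NodeKind.COMPUTE, 120.0, data_deps=[0]),
    TraceNode(2, "backward_matmul", NodeKind.COMPUTE, 240.0, data_deps=[1]),
    TraceNode(3, "all_reduce_grads", NodeKind.COMMUNICATION, 90.0, data_deps=[2]),
    TraceNode(4, "optimizer_step", NodeKind.COMPUTE, 30.0, data_deps=[3], ctrl_deps=[2]),
]

order = execution_order(trace)
print([trace[i].name for i in order])
```

Because the dependencies are stored explicitly rather than implied by log order, a simulator or replay tool can reschedule independent nodes without losing correctness.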
By providing a common, interoperable format for execution traces, Chakra addresses a critical gap in the AI infrastructure ecosystem: the inability to easily reproduce and optimize distributed ML workloads across different hardware and software stacks. The accompanying paper analyzes traces collected on production AI clusters and uses real-world case studies to show how standardized traces can accelerate co-design cycles. For engineers and architects building next-generation AI systems, Chakra offers a foundation for benchmarking that moves beyond synthetic workloads to realistic, trace-driven simulation. Teams can identify bottlenecks, test new architectures, and optimize resource allocation with higher fidelity than synthetic workloads allow, with the aim of shortening the path from hardware concept to production deployment.
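As a rough illustration of what trace-driven "what-if" analysis looks like, the following Python sketch replays a toy dependency graph and rescales communication time to mimic a faster or slower network. Everything here is an assumption made for the example: the `TRACE` layout, the `replay` function, the `comm_scale` parameter, and the durations are hypothetical, and the model ignores resource contention (every node starts as soon as its dependencies finish). Real simulators model far more detail.

```python
from graphlib import TopologicalSorter

# Hypothetical toy trace: node id -> (kind, duration in microseconds, dependency ids).
TRACE = {
    0: ("compute", 100.0, []),         # backward pass, layer 2
    1: ("communication", 80.0, [0]),   # all-reduce of layer-2 gradients
    2: ("compute", 100.0, [0]),        # backward pass, layer 1
    3: ("communication", 80.0, [2]),   # all-reduce of layer-1 gradients
    4: ("compute", 20.0, [1, 3]),      # optimizer step
}


def replay(trace, comm_scale=1.0):
    """Estimate step time from dependencies alone.

    Assumes unlimited resources: a node starts when its last dependency
    finishes. Communication durations are multiplied by comm_scale.
    """
    deps = {node_id: set(d) for node_id, (_, _, d) in trace.items()}
    finish = {}
    for node_id in TopologicalSorter(deps).static_order():
        kind, duration, node_deps = trace[node_id]
        if kind == "communication":
            duration *= comm_scale
        start = max((finish[d] for d in node_deps), default=0.0)
        finish[node_id] = start + duration
    return max(finish.values())


baseline = replay(TRACE)                     # network as measured
faster_net = replay(TRACE, comm_scale=0.5)   # hypothetical network with 2x bandwidth
slower_net = replay(TRACE, comm_scale=4.0)   # hypothetical congested network
print(baseline, faster_net, slower_net)
```

In this toy case the baseline step takes 300 microseconds and halving communication time gives 260, because only the second all-reduce sits on the critical path; the first overlaps with compute. That kind of question, which operation actually limits step time, is what a dependency-aware trace lets a simulator answer.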
- Chakra ETs represent distributed ML workloads as standardized graphs capturing compute, memory, communication ops, dependencies, and resource constraints.
- Adopted by MLCommons with contributions from NVIDIA, AMD, Meta, Keysight, HPE, and Scala, giving vendors and tool builders a common trace format to interoperate on.
- Enables trace-driven simulation and replay across emulators and simulators for agile hardware-software co-design.
Why It Matters
Standardized traces unlock reproducible, realistic benchmarking, accelerating co-design cycles for next-generation AI hardware and software.