Research & Papers

ISOMORPH: open-source supply chain digital twin for forecasting benchmarks

First public multi-echelon logistics simulator, released with 30 scenario rollouts, 20 Latin-hypercube perturbations, and a zero-shot evaluation of four foundation models.

Deep Dive

ISOMORPH is a new open-source digital twin designed to fill a gap in time-series forecasting (TSF) benchmarks: supply chain logistics. Existing TSF datasets cover retail, energy, and weather, but no comparable public resource existed for multi-echelon supply chains. The simulator models a directed routing graph in discrete time, tracking per-node inventory, outstanding orders, in-transit shipments, and smoothed demand. Its dynamics form a Markov chain with a linear transition kernel, which allows rigorous verification via three conservation laws. The released data spans two catalog scales (C=50 and C=200) and includes six scenario sweeps (30 rollouts in total) plus 20 Latin-hypercube perturbations. Together these capture variance amplification, regime shifts, and cross-channel coupling, phenomena absent from static benchmarks.
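
To make the state variables concrete, here is a minimal Python sketch of a toy serial chain that tracks the same four quantities per node and checks one mass-balance invariant. It is an illustration only, not ISOMORPH's code: the function name `simulate_chain`, the order-up-to policy, the lost-sales rule, and all parameter values are assumptions, and the single invariant checked here is a generic flow-conservation identity rather than any of the paper's three laws.

```python
from collections import deque


def simulate_chain(demand, n_nodes=3, lead_time=2, alpha=0.3, cover=3.0):
    """Toy serial chain (hypothetical): node 0 is most upstream and draws on an
    unlimited source; node n_nodes-1 serves customers. Requires lead_time >= 1."""
    inventory = [0] * n_nodes        # on-hand stock per node
    outstanding = [0] * n_nodes      # units ordered by a node, not yet received
    smoothed = [0.0] * n_nodes       # exponentially smoothed demand per node
    pipeline = [deque([0] * lead_time) for _ in range(n_nodes)]  # in transit
    injected = 0                     # units that entered the chain at node 0
    delivered = 0                    # units handed to customers

    for d in demand:
        # 1. Shipments that have completed their transit time arrive.
        for i in range(n_nodes):
            arrived = pipeline[i].popleft()
            inventory[i] += arrived
            outstanding[i] -= arrived

        # 2. The last node serves customers; unmet demand is lost (assumption).
        sold = min(inventory[-1], d)
        inventory[-1] -= sold
        delivered += sold

        # 3. Walk upstream: smooth the demand seen, place an order, get shipped.
        incoming = d
        for i in range(n_nodes - 1, -1, -1):
            smoothed[i] = alpha * incoming + (1 - alpha) * smoothed[i]
            target = round(cover * smoothed[i])
            order = max(0, target - inventory[i] - outstanding[i])
            outstanding[i] += order
            owed = outstanding[i] - sum(pipeline[i])  # ordered, not yet shipped
            if i == 0:
                shipped = owed                        # unlimited external source
                injected += shipped
            else:
                shipped = min(inventory[i - 1], owed)  # limited by upstream stock
                inventory[i - 1] -= shipped
            pipeline[i].append(shipped)
            incoming = order                          # demand the upstream node sees

    return {
        "injected": injected,
        "delivered": delivered,
        "on_hand": sum(inventory),
        "in_transit": sum(sum(p) for p in pipeline),
        "inventory": inventory,
    }


# A step change in demand, loosely mimicking a regime shift.
result = simulate_chain([10] * 20 + [20] * 20)
# Every unit that entered the chain is on a shelf, on a truck, or delivered.
assert result["injected"] == (
    result["on_hand"] + result["in_transit"] + result["delivered"]
)
```

Orders are rounded to integers so the balance holds exactly rather than up to floating-point tolerance.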

To gauge how useful foundation models are for supply chain forecasting, the authors evaluated four models (Chronos, Moirai, TimesFM, Lag-Llama) in a zero-shot setting. The resulting MASE values outperform public GIFT-Eval references at low-to-moderate horizons. In addition, perturbing demand-side knobs via Latin-hypercube sampling lets the digital twin propagate parameter uncertainty into forecast uncertainty, a form of forward uncertainty quantification (UQ) that standard TSF datasets, which lack a tunable generator, cannot support. This suggests foundation models can serve as fast surrogates for the digital twin's UQ. ISOMORPH is released under the MIT license, with code and datasets publicly available, and aims to become a standard benchmark for supply chain forecasting research.
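
For readers unfamiliar with the two tools named above, the following Python sketch shows the standard MASE definition (forecast MAE scaled by the in-sample MAE of a seasonal-naive forecast) and a basic Latin-hypercube sampler. It is a generic illustration under stated assumptions, not the authors' evaluation code: the function names, the knob ranges, and the knob labels in the comments are hypothetical.

```python
import random


def mase(train, actual, forecast, season=1):
    """Mean absolute scaled error: forecast MAE divided by the in-sample MAE
    of the seasonal-naive forecast on the training series."""
    naive_errors = [
        abs(train[t] - train[t - season]) for t in range(season, len(train))
    ]
    scale = sum(naive_errors) / len(naive_errors)
    if scale == 0:
        raise ValueError("training series is constant; MASE is undefined")
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return mae / scale


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples points so that, in every dimension, each of the
    n_samples equal-width strata contains exactly one point."""
    rng = random.Random(seed)
    columns = []
    for low, high in bounds:
        # One uniform draw inside each stratum, then shuffle the strata so
        # dimensions are paired at random.
        points = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(points)
        columns.append([low + p * (high - low) for p in points])
    return [tuple(col[j] for col in columns) for j in range(n_samples)]


# 20 perturbations over two hypothetical demand-side knobs:
# a demand scale factor and a smoothing coefficient.
knob_sets = latin_hypercube(20, [(0.5, 1.5), (0.1, 0.9)], seed=42)

# A MASE below 1 means the forecast beat the naive baseline on average.
score = mase(train=[1, 2, 3, 4, 5], actual=[6, 7], forecast=[5, 5])
```

In a forward-UQ workflow, each knob set would drive one simulator rollout, and the spread of the resulting trajectories approximates the output uncertainty induced by the parameter uncertainty.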

Key Points
  • First public digital twin of a multi-echelon logistics network with interpretable parameters and modular topology.
  • Datasets at two catalog scales (C=50 and C=200), with 30 scenario rollouts and 20 Latin-hypercube perturbations that capture the bullwhip effect (variance amplification) and regime shifts.
  • Zero-shot evaluation of Chronos, Moirai, TimesFM, and Lag-Llama yields MASE values that outperform public GIFT-Eval references at low-to-moderate horizons; perturbing demand-side knobs also enables forward UQ.

Why It Matters

This open benchmark lets supply chain teams rigorously test forecasting models on realistic dynamics previously unavailable in public datasets.