Where did we fail? -- Reproducing build failures in embedded open source software
New tool reconstructs 91.8% of 4,628 failing CI builds and preserves execution outcomes in 98% of cases.
Continuous integration (CI) builds in embedded systems frequently fail due to complex cross-compilation, board configurations, and toolchain constraints. Build logs are short-lived and difficult to reuse across heterogeneous runners and log formats, which makes debugging and failure analysis a persistent challenge. To address this, researchers developed PhantomRun, a unified abstraction layer and publicly reusable dataset that standardizes how CI build logs and metadata are retrieved, stored, and reproduced.
In an empirical evaluation with 4,628 failing CI runs from embedded open-source projects, PhantomRun reconstructed 91.8% of the original builds and preserved execution outcomes in 98% of cases. The reproduced builds closely matched their originals, differing only in timestamps or minor nondeterministic reordering. PhantomRun exposes all build artifacts in a uniform, machine-readable format, enabling large-scale historical CI reconstruction and longitudinal studies of failure patterns. This work was presented at EASE 2026 and is available on arXiv.
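To make "differing only in timestamps or minor nondeterministic reordering" concrete, here is a minimal Python sketch of one way such a comparison could be done. It is an illustration, not PhantomRun's actual method: the timestamp patterns, the `<ts>` placeholder, and the function names `normalize` and `logs_match` are all assumptions introduced here.

```python
import re
from collections import Counter

# Assumed timestamp shapes: ISO 8601 date-times and bare HH:MM:SS clocks.
# Real CI logs may use other formats; this pattern is illustrative only.
TIMESTAMP = re.compile(
    r"\d{4}-\d{2}-\d{2}[T ]\d{2}:\d{2}:\d{2}(?:\.\d+)?(?:Z|[+-]\d{2}:?\d{2})?"
    r"|\b\d{2}:\d{2}:\d{2}(?:\.\d+)?\b"
)


def normalize(log: str) -> Counter:
    """Mask timestamps, drop blank lines, and count lines so order is ignored."""
    lines = (TIMESTAMP.sub("<ts>", line).strip() for line in log.splitlines())
    return Counter(line for line in lines if line)


def logs_match(original: str, reproduced: str) -> bool:
    """True when two logs differ only in timestamps or in line order."""
    return normalize(original) == normalize(reproduced)


if __name__ == "__main__":
    # Made-up log lines, used only to exercise the functions above.
    original = "2024-05-01T12:00:03Z CC main.o\n2024-05-01T12:00:04Z CC uart.o\n"
    reproduced = "2025-01-10T08:15:40Z CC uart.o\n2025-01-10T08:15:41Z CC main.o\n"
    print(logs_match(original, reproduced))
```

Comparing line counts rather than line sequences tolerates reordering from parallel build steps, at the cost of also accepting reorderings that might matter; a stricter check could compare ordered lines within each build stage.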
- Reconstructed 91.8% of 4,628 failing CI builds from embedded systems
- Preserved execution outcomes in 98% of evaluated cases
- Provides a unified, machine-readable format for build logs and metadata
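As a rough illustration of what a unified, machine-readable format might involve, the Python sketch below maps two differently shaped runner payloads onto one record type. Everything in it is hypothetical: the `BuildRecord` fields, the `runner_a` and `runner_b` payload shapes, and the helper names are assumptions for illustration and do not describe PhantomRun's real schema or any specific CI service.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class BuildRecord:
    """Hypothetical uniform record; field names are assumptions."""
    project: str
    run_id: str
    runner: str   # which CI runner produced the run
    outcome: str  # normalized to "success" or "failure"
    log: str


def from_runner_a(raw: dict) -> BuildRecord:
    # Assumed payload shape: {"repo", "id", "conclusion", "log_text"}
    return BuildRecord(
        project=raw["repo"],
        run_id=str(raw["id"]),
        runner="runner_a",
        outcome="success" if raw["conclusion"] == "success" else "failure",
        log=raw["log_text"],
    )


def from_runner_b(raw: dict) -> BuildRecord:
    # Assumed payload shape: {"project_path", "job", "status", "trace"}
    return BuildRecord(
        project=raw["project_path"],
        run_id=str(raw["job"]),
        runner="runner_b",
        outcome="success" if raw["status"] == "passed" else "failure",
        log=raw["trace"],
    )


def to_json(record: BuildRecord) -> str:
    """Serialize with sorted keys so records are stable across runs."""
    return json.dumps(asdict(record), sort_keys=True)
```

Once both sources share one record shape, downstream analysis can query fields such as `outcome` without knowing which runner produced a given run.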
Why It Matters
PhantomRun enables reproducible research and systematic analysis of embedded CI failures at scale.