Is this Build Failure Related to my Patch? An Empirical Study of Unrelated Build Failures in Continuous Integration
A study of 77,354 build failures from 7 Apache projects finds that 20% are unrelated to the patch being built.
Deep Dive
Researchers analyzed 77,354 CI build failures from 7 open source Apache projects. They found that developers spend a median of 4 hours determining whether a failure is related to their patch. Using semi-supervised positive-unlabeled (PU) learning with 33 features (including latency, error repeats, and comments), their models achieved 0.70–0.88 precision and 0.63–0.97 AUC, which could help engineers skip false alarms and focus on actionable failures.
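To make the PU learning idea concrete, here is a minimal sketch of one well-known PU approach (the Elkan and Noto rescaling trick). The summary does not say which PU algorithm or classifier the authors used, so treat this as an illustration rather than their method. The data is synthetic, the two features are hypothetical stand-ins for the paper's 33, and all function names (`make_data`, `fit_logistic`, `pu_scores`) are invented for this example. Only some unrelated failures carry a label; everything else is unlabeled.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, x):
    # w[0] is the bias; the remaining weights pair with the features.
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


def fit_logistic(X, s, epochs=300, lr=0.1):
    # Batch gradient descent on the logistic loss.
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, label in zip(X, s):
            err = predict(w, x) - label
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def make_data(rng, n=400, label_frac=0.3):
    # Synthetic failures with two hypothetical, standardized features.
    # y = 1 means "unrelated to the patch" (hidden ground truth).
    # s = 1 means "known to be unrelated" (the only labels we observe).
    X, y, s = [], [], []
    for _ in range(n):
        unrelated = rng.random() < 0.5
        mu = 1.5 if unrelated else -1.5
        X.append([rng.gauss(mu, 1.0), rng.gauss(mu, 1.0)])
        y.append(1 if unrelated else 0)
        s.append(1 if unrelated and rng.random() < label_frac else 0)
    return X, y, s


def pu_scores(X, s):
    # Step 1: train a classifier to separate labeled from unlabeled.
    w = fit_logistic(X, s)
    g = [predict(w, x) for x in X]
    # Step 2: estimate c = P(labeled | unrelated) as the mean score on
    # labeled positives. (A held-out split is preferable; the training
    # set is reused here only to keep the sketch short.)
    labeled = [gi for gi, si in zip(g, s) if si == 1]
    c = sum(labeled) / len(labeled)
    # Step 3: rescale to approximate P(unrelated | features).
    return [min(1.0, gi / c) for gi in g], c


rng = random.Random(0)
X, y, s = make_data(rng)
scores, c = pu_scores(X, s)
print("estimated label frequency c:", round(c, 3))
```

The design point is that the classifier never sees a confirmed "related" failure: it learns only from a labeled subset of unrelated failures plus an unlabeled pool, and the constant `c` corrects for the fact that most unrelated failures were never labeled.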
Key Points
- Developers spend a median of 4 hours determining whether a CI failure is related to their patch.
- 20% of unrelated failures are due to test flakiness or infrastructure, not code changes.
- PU learning models using 33 features achieved 0.70–0.88 precision and 0.63–0.97 AUC across 7 Apache projects.
Why It Matters
Automatically flagging CI failures that are unrelated to a patch could save engineering teams the hours they currently spend triaging each one.