Human-Machine Co-Boosted Bug Report Identification with Mutualistic Neural Active Learning
New AI framework uses 'mutualistic' active learning to cut the human effort of labeling bug reports, with a reported effort reduction of up to 95.8% in terms of readability.
A team of researchers has introduced a novel AI framework called Mutualistic Neural Active Learning (MNAL) designed to tackle the growing challenge of managing software bug reports. As software projects scale, manually identifying, categorizing, and assigning bug reports becomes a massive time sink. MNAL addresses this by creating a collaborative loop between a machine learning model and human developers. The system uses a neural language model to learn from bug reports across different GitHub projects, coupled with an active learning component that intelligently selects which reports need human review.
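To make the selection step concrete, here is a minimal sketch of how an active learning component might pick reports for human review. The summary above does not say which query strategy MNAL uses, so this assumes entropy-based uncertainty sampling, a common stand-in. The function names, report ids, and probabilities are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_review(predictions, budget):
    """Return the ids of the `budget` reports the model is least sure about.

    `predictions` maps a report id to the model's class probabilities,
    e.g. [P(bug), P(not bug)]. Higher entropy means more uncertainty.
    This is an assumed strategy, not necessarily the one MNAL uses.
    """
    ranked = sorted(
        predictions,
        key=lambda report_id: entropy(predictions[report_id]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical model outputs for four unlabeled reports.
predictions = {
    "issue-101": [0.98, 0.02],
    "issue-102": [0.55, 0.45],
    "issue-103": [0.10, 0.90],
    "issue-104": [0.50, 0.50],
}

queried = select_for_review(predictions, budget=2)
print(queried)  # ['issue-104', 'issue-102']
```

Only the two most ambiguous reports go to a developer. The confident predictions are left to the model, which is where the labeling savings in a setup like this would come from.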
The key innovation is the 'mutualistic' relationship it fosters between the model and its human labelers. The reports that developers are asked to label are more readable and easier to identify. In return, the human-labeled reports and their corresponding machine-generated 'pseudo-labels' are used to update and improve the model. This creates a positive feedback cycle in which the model's accuracy and the labelers' efficiency improve together. Evaluated on a large-scale dataset, MNAL outperformed state-of-the-art approaches and baseline methods at bug report identification.
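The feedback cycle described above can be sketched as a single training round. This is an illustration under stated assumptions, not MNAL's actual procedure: `ToyBugClassifier` is a hypothetical word-count stand-in for a neural language model, pseudo-labels come from a simple confidence threshold (a common self-training heuristic), and the readability and identifiability criteria for the human-facing reports are not modeled.

```python
from collections import Counter


class ToyBugClassifier:
    """Hypothetical stand-in for a neural language model (word counts only)."""

    def __init__(self):
        self.bug_words = Counter()
        self.other_words = Counter()

    def fit(self, texts, labels):
        # Rebuild the word counts from scratch on the given labeled set.
        self.bug_words.clear()
        self.other_words.clear()
        for text, label in zip(texts, labels):
            target = self.bug_words if label == 1 else self.other_words
            target.update(text.lower().split())

    def predict_proba(self, text):
        # Probability that `text` is a bug report, with add-one smoothing.
        words = text.lower().split()
        bug = 1 + sum(self.bug_words[w] for w in words)
        other = 1 + sum(self.other_words[w] for w in words)
        return bug / (bug + other)


def mutualistic_round(model, labeled, unlabeled, ask_human, budget=1, threshold=0.7):
    """One round: humans label uncertain reports, the model pseudo-labels confident ones."""
    # Most uncertain first: probability closest to 0.5.
    ranked = sorted(unlabeled, key=lambda t: abs(model.predict_proba(t) - 0.5))
    to_human, rest = ranked[:budget], ranked[budget:]
    human_labeled = [(t, ask_human(t)) for t in to_human]

    # Assumed heuristic: pseudo-label only reports the model is confident about.
    pseudo_labeled, still_unlabeled = [], []
    for t in rest:
        p = model.predict_proba(t)
        if p >= threshold:
            pseudo_labeled.append((t, 1))
        elif p <= 1 - threshold:
            pseudo_labeled.append((t, 0))
        else:
            still_unlabeled.append(t)

    # Update the model on human labels and pseudo-labels together.
    labeled = labeled + human_labeled + pseudo_labeled
    model.fit([t for t, _ in labeled], [y for _, y in labeled])
    return labeled, still_unlabeled, human_labeled, pseudo_labeled


# Hypothetical seed data: 1 = bug report, 0 = not a bug report.
seed = [
    ("app crashes with null pointer exception", 1),
    ("please add dark mode feature", 0),
]
pool = [
    "crashes on startup with exception",
    "add export feature please",
    "dark mode crashes app",
]

model = ToyBugClassifier()
model.fit([t for t, _ in seed], [y for _, y in seed])

# The "human" here is a stub that always answers "bug".
labeled, remaining, human_labeled, pseudo_labeled = mutualistic_round(
    model, seed, pool, ask_human=lambda text: 1
)
print("human:", human_labeled)
print("pseudo:", sorted(pseudo_labeled))
```

In this toy run the only report sent to the human is the ambiguous "dark mode crashes app". The other two are pseudo-labeled by the model, and all three feed the next model update. Because the loop only calls `fit` and `predict_proba`, any classifier exposing those two methods could be swapped in, which is one simple way to read the model-agnostic claim.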
The results are substantial for engineering teams. During human labeling, the framework achieved up to a 95.8% effort reduction in terms of readability and up to a 196.0% effort reduction in terms of identifiability. A figure above 100% cannot be a literal share of time saved, so it presumably reflects a relative improvement on the study's identifiability measure. In practice, labelers should spend less effort deciphering poorly written bug reports. Furthermore, MNAL is model-agnostic, meaning it can boost the performance of various underlying neural language models, such as BERT or GPT variants. In a qualitative case study, 10 human participants rated MNAL as more effective while saving time and resources.
- Achieves up to a 95.8% reduction in labeling effort in terms of bug report readability.
- Reports up to a 196.0% effort reduction in terms of identifiability during human labeling.
- Framework is model-agnostic, working with various neural language models like BERT or GPT.
Why It Matters
By automating the most tedious parts of issue triage while keeping developers in the loop, MNAL could reduce software maintenance costs and accelerate bug resolution.