Research & Papers

RuleEdit lets you preview AI model edits before applying them

Failure-guided model editing with prospective previews improves human-AI accuracy by 14%.

Deep Dive

RuleEdit is a new interactive system from arXiv researchers that addresses a critical gap in human-AI collaboration: how to let practitioners detect model failures and inspect the consequences of edits before applying them. The system uses interpretable mismatch signals from rule tables to surface likely failures, then supports user-authored rule feedback with prospective previews of projected performance changes and embedding shifts. In a study with health professionals and students working on stroke rehabilitation assessments, rule-guided failure detection significantly increased Human+AI performance by 14.16% (p<0.001), while reducing over-reliance, under-reliance, and ChangedToWrong decisions.

The prospective embedding previews proved especially powerful—participants who saw them improved their model adaptation feedback, boosting post-update local performance gains from 11.50% to 36.38% after incorporating their rule-based feedback (p<0.001). However, the study also uncovered a critical local-global tradeoff: edits that help a specific case can degrade performance when transferred globally. This finding underscores the need for failure-aware, controllable human-AI systems that let experts understand not just immediate impacts, but ripple effects across the model's entire deployment space.

Key Points
  • RuleEdit surfaces likely AI failures via interpretable mismatch signals from rule tables, enabling targeted editing
  • Prospective impact previews boosted post-update local performance gains from 11.50% to 36.38% (p<0.001)
  • Study revealed a local-global tradeoff: edits beneficial for one case can degrade overall model performance when scaled

Why It Matters

Empowers domain experts to safely edit AI models by predicting failure impacts before deployment.