EvalAI Assistants Boost Illegal Content Reporting Accuracy Under DSA

arXiv cs.HC May 25, 2026

⚡ New study tests LLM assistants for EU Digital Services Act reporting

Deep Dive

A new study by Marie-Therese Sekwenz, Shreyan Biswas, Rita Hermann-Gsenger, and Ujwal Gadiraju investigates how large language model (LLM) assistants can support users in reporting illegal content under the EU Digital Services Act (DSA). Article 16 of the DSA requires user notices to be sufficiently substantiated, which places a heavy burden on individuals to interpret legal categories. The team conducted a controlled user study (N=450) using an interface modeled on a major platform's reporting workflow. They compared three conditions: unaided reporting, a conventional explainable AI (XAI) assistant that suggests a single legal category with a rationale, and an evaluative AI (EvalAI) assistant that presents balanced pro and con arguments across candidate legal provisions. The study also introduced systematically varied AI error regimes to mimic real-world imperfections.

Results show that EvalAI improves provision-level accuracy under AI error and reduces misclassification distance, particularly for near-miss and overbreadth errors. When AI output is correct, conventional XAI enables faster decisions. However, neither form of AI assistance reliably improves the quality of users' substantiated explanations compared to unaided reporting. The authors discuss trade-offs between accuracy, deliberation speed, explanation quality, and vulnerability to misleading AI output. These findings have direct implications for designing compliance-oriented reporting interfaces under the DSA: LLM assistants can help, but they must be carefully calibrated to avoid over-reliance or degradation of user reasoning.

Key Points

EvalAI improved provision-level accuracy under AI error, especially for near-miss and overbreadth errors.
Conventional XAI enabled faster decisions when AI output was correct, but not when errors occurred.
Neither form of AI assistance reliably improved the quality of users' substantiated explanations compared to unaided reporting.

Why It Matters

As DSA enforcement ramps up, the design of AI assistance can either help or hinder legal compliance in content moderation.
