MELD detector beats commercial AI-text detectors on RAID leaderboard
New open-source detector catches 99.9% of AI texts at a 1% false positive rate on its MELD-eval benchmark.
MELD (Multi-Task Equilibrated Learning Detector) is an open-source AI-text detector that sets a new state of the art on the RAID leaderboard. It matches or beats leading commercial detectors, especially under adversarial attacks and at low false-positive rates. Developed by researchers Chenjun Li, Cheng Wan, and Johannes C. Paetzold, MELD enriches the standard binary AI/human classification with three auxiliary tasks: predicting the generator family, the attack type (e.g., rewriting, paraphrasing), and the source domain. The four losses are balanced via learned homoscedastic uncertainty weights, so the shared encoder learns robust representations that generalize far better than those of single-task detectors.
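The article does not give MELD's exact weighting formula, so the following is only a minimal sketch of one common form of homoscedastic uncertainty weighting, in which each task loss is scaled by exp(-s) and regularized by + s, where s is a learned log-variance. The function name `uncertainty_weighted_loss` and the loss values are hypothetical. In training, the log-variances would be learnable parameters updated by the optimizer alongside the encoder.

```python
import math

def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses using log-variances s_i = log(sigma_i^2).

    Each task contributes exp(-s_i) * loss_i + s_i. A large s_i down-weights
    a noisy task, while the + s_i term keeps s_i from growing without bound.
    This is an assumed, commonly used form, not necessarily MELD's exact one.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need one log-variance per task loss")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))

# Hypothetical per-batch losses for the four heads, in a fixed order:
# binary AI/human, generator family, attack type, source domain.
losses = [0.30, 1.20, 0.80, 0.50]

# With every log-variance at 0, each weight is exp(0) = 1,
# so the total is just the plain sum of the four losses.
equal = uncertainty_weighted_loss(losses, [0.0, 0.0, 0.0, 0.0])

# Raising the log-variance of the generator-family head halves its weight
# (exp(-log 2) = 0.5) and adds log 2 as a penalty.
downweighted = uncertainty_weighted_loss(losses, [0.0, math.log(2.0), 0.0, 0.0])
```

Because the weights are learned rather than hand-tuned, no single auxiliary task has to be balanced against the binary head by manual search.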
To further improve robustness, MELD adds two training components. An exponential moving average (EMA) teacher predicts on clean inputs, and a student trained on attack-augmented inputs is distilled toward the teacher's predictions. A hard-negative pairwise ranking loss then widens the score margin between AI outputs and the most confusable human texts. At inference, all auxiliary heads are discarded, so MELD has the same cost and API as any standard detector. On the newly introduced MELD-eval benchmark, built from recent chat models by four major LLM providers, MELD achieves a 99.9% true positive rate at a 1% false positive rate without any additional fine-tuning, while many baselines degrade sharply. This makes MELD a practical, deployable option for academic integrity, content moderation, and provenance tracking as AI-generated writing becomes ubiquitous.
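The two training components described above can be illustrated with a small sketch. It assumes parameters are flat lists of floats, that a higher score means "more likely AI", and that the ranking term is a hinge loss against the top-k human scores. The names `ema_update` and `hard_negative_ranking_loss`, the decay, the margin, and k are all hypothetical choices, not values reported for MELD, and the distillation loss itself is left out.

```python
def ema_update(teacher, student, decay=0.999):
    """Move each teacher parameter a small step toward the student's value."""
    return [decay * t + (1.0 - decay) * s for t, s in zip(teacher, student)]

def hard_negative_ranking_loss(ai_scores, human_scores, margin=1.0, k=2):
    """Mean hinge loss between each AI score and the k hardest human texts.

    The highest-scoring human texts are the ones most easily mistaken for
    AI output, so only those are used as negatives.
    """
    hard_negatives = sorted(human_scores, reverse=True)[:k]
    terms = [max(0.0, margin - (a - h)) for a in ai_scores for h in hard_negatives]
    if not terms:
        return 0.0
    return sum(terms) / len(terms)

# Toy example: the teacher drifts 10% of the way toward the student.
teacher = ema_update([1.0, -2.0], [0.0, 0.0], decay=0.9)

# One AI text (score 2.0) sits within the margin of the hardest human
# text (score 1.5), so it is penalized; the other (score 3.0) is not.
loss = hard_negative_ranking_loss([3.0, 2.0], [1.5, 0.0, -1.0], margin=1.0, k=1)
```

Restricting the loss to the hardest negatives concentrates the gradient on the human texts nearest the decision boundary, which is where low false-positive operation is decided.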
- MELD uses multi-task learning with four heads (binary AI/human, generator family, attack type, source domain) balanced by learned uncertainty weights.
- On the public RAID leaderboard, MELD is the strongest open-source detector and competitive with leading commercial models, especially under attack.
- Without fine-tuning, MELD achieves 99.9% TPR at 1% FPR on a held-out evaluation pool of recent chat models (MELD-eval).
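For readers unfamiliar with the metric, "TPR at 1% FPR" means the threshold is set so that at most 1% of human texts are flagged, and the reported number is the share of AI texts caught at that threshold. A minimal sketch follows, assuming higher scores mean "more likely AI". The function name `tpr_at_fpr` and the toy scores are hypothetical and are not MELD-eval data.

```python
def tpr_at_fpr(human_scores, ai_scores, target_fpr=0.01):
    """Share of AI texts flagged when at most target_fpr of humans are flagged."""
    if not human_scores or not ai_scores:
        raise ValueError("need at least one human and one AI score")
    ranked = sorted(human_scores, reverse=True)
    allowed = int(target_fpr * len(ranked))  # human texts we may misflag
    # Flag only scores strictly above the (allowed + 1)-th highest human
    # score, so no more than `allowed` human texts can exceed the threshold.
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    flagged = sum(1 for score in ai_scores if score > threshold)
    return flagged / len(ai_scores)

# Toy data with a 25% FPR budget: one of four human texts may be flagged.
human = [0.1, 0.2, 0.3, 0.9]
ai = [0.25, 0.5, 0.95, 0.99]
rate = tpr_at_fpr(human, ai, target_fpr=0.25)  # 0.75
```

At a strict budget such as 1%, the threshold is driven by the handful of highest-scoring human texts, which is why detectors that look similar on average accuracy can differ sharply on this metric.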
Why It Matters
MELD gives institutions and platforms a free, open-source detector that rivals commercial tools, which matters for academic integrity and content moderation.