MedFabric and EtHER: A Data-Centric Framework for Word-Level Fabrication Generation and Detection in Medical LLMs

arXiv cs.CL May 07, 2026

⚡ Medical AI hallucinations have a new adversary: word-level detection that outperforms state-of-the-art detectors by over 15%.

Deep Dive

Tung Sum Thomas Kwok and colleagues have introduced MedFabric and EtHER, a data-centric framework for generating and detecting word-level fabrications in medical large language models (LLMs). The problem is critical: LLMs often produce fluent but factually incorrect statements in expert domains such as medicine, a phenomenon known as hallucination. Existing datasets for detecting such fabrications suffer from limited coverage, stylistic differences between human-written and AI-generated text, and distributional drift. MedFabric addresses this with a pipeline that creates realistic, subtle factual deviations while preserving syntax and style.
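To make the idea of a word-level fabrication concrete, here is a minimal Python sketch that swaps exactly one factual word in a sentence and leaves everything else untouched. It is a toy illustration only, not the MedFabric pipeline: the `SWAPS` table, the `fabricate` function, and the example sentence are all hypothetical and do not come from the paper.

```python
import random

# Hypothetical substitution table (an assumption, not from the paper): each key is
# a factual term, each value a list of plausible but incorrect replacements.
SWAPS = {
    "metformin": ["insulin", "warfarin"],
    "500": ["50", "5000"],
    "daily": ["weekly", "hourly"],
}

def fabricate(sentence, rng):
    """Replace exactly one swappable word.

    Returns (fabricated_sentence, word_index), or None if no word can be swapped.
    """
    words = sentence.split()
    candidates = [i for i, w in enumerate(words) if w.lower() in SWAPS]
    if not candidates:
        return None
    i = rng.choice(candidates)
    words[i] = rng.choice(SWAPS[words[i].lower()])
    return " ".join(words), i

original = "The patient takes metformin 500 mg twice daily"
result = fabricate(original, random.Random(0))
print(result)
```

The fabricated sentence keeps the same length and word order as the original, so the returned index serves as a word-level label for the single changed token.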

Building on this dataset, EtHER is a modular detector that combines three components: Text2Table Decomposition (structuring text into tables for easier comparison), Word Masking and Filling (identifying suspect tokens), and Hybrid Sentence Pair Evaluation (assessing factual alignment). According to the reported results, EtHER outperforms state-of-the-art detectors by over 15% on word-level fabrication benchmarks and maintains its performance on structurally similar sentences. The framework offers a more reliable way to check that medical LLM output stays factually accurate, a key step toward clinical applications.
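The goal of word-level detection can be illustrated with a much simpler stand-in. The sketch below aligns a candidate sentence against a trusted reference using Python's standard `difflib` and flags the words that differ. It is not EtHER's method and uses none of its three components; the `flag_suspect_words` function and the example sentences are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

def flag_suspect_words(reference, candidate):
    """Return (index, word) pairs for candidate words that do not align with the reference."""
    ref, cand = reference.split(), candidate.split()
    matcher = SequenceMatcher(a=ref, b=cand, autojunk=False)
    suspects = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        # "replace" and "insert" mark candidate words with no match in the reference.
        if tag in ("replace", "insert"):
            suspects.extend((j, cand[j]) for j in range(j1, j2))
    return suspects

reference = "The patient takes metformin 500 mg twice daily"
candidate = "The patient takes metformin 5000 mg twice daily"
suspects = flag_suspect_words(reference, candidate)
print(suspects)  # [(4, '5000')]
```

A plain token alignment like this only works when a trusted reference sentence is available and differs from the candidate by surface wording alone. A detector for real medical text would need to judge factual alignment rather than string equality.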

Key Points

MedFabric generates realistic word-level medical fabrications with subtle factual deviations, providing better training data for detectors.
EtHER combines Text2Table Decomposition, Word Masking and Filling, and Hybrid Sentence Pair Evaluation for modular factuality checking.
EtHER outperforms existing state-of-the-art detectors by over 15% on word-level fabrication benchmarks.

Why It Matters

For healthcare AI, catching subtle fabrications at the word level helps keep dangerous misinformation out of clinical decisions.
