GuangJian's DeFakerOne unifies fake image detection across all forgery types
New model sets state-of-the-art results across 48 benchmarks and detects and localizes fakes from generators such as GPT-Image-2.
The rapid evolution of generative AI has opened a gap between attack and defense. Forgery techniques across document editing, natural image manipulation, DeepFakes, and full-AIGC synthesis are converging, yet research on fake image detection and localization (FIDL) remains fragmented, with most methods built for a single domain. The GuangJian Team proposes DeFakerOne to close this gap. It is a data-centric, unified FIDL foundation model that pairs InternVL2 (the vision-language backbone) with SAM2 (segmentation) to perform image-level detection and pixel-level localization at the same time, across all forgery types. This design lets the model learn and exploit artifact patterns shared across domains instead of being limited to a single manipulation class.
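To make the two output levels concrete, here is a minimal sketch of what a joint detection-and-localization result could look like and how each part might be scored. It is an illustration only: `FIDLResult`, `is_fake`, `mask_iou`, the 0.5 threshold, and the use of intersection over union are assumptions made for this example, not DeFakerOne's actual interface or the paper's evaluation protocol.

```python
from dataclasses import dataclass
from typing import List

Mask = List[List[int]]  # 1 = pixel flagged as forged, 0 = pixel treated as authentic


@dataclass
class FIDLResult:
    """Hypothetical container for one image's joint output (not the paper's API)."""
    fake_score: float  # image-level probability that the image is forged
    mask: Mask         # pixel-level localization map


def is_fake(result: FIDLResult, threshold: float = 0.5) -> bool:
    """Image-level detection: threshold the score (0.5 is an assumed default)."""
    return result.fake_score >= threshold


def mask_iou(pred: Mask, truth: Mask) -> float:
    """Pixel-level localization quality as intersection over union."""
    inter = union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    # Two empty masks agree perfectly (e.g. an authentic image).
    return 1.0 if union == 0 else inter / union


result = FIDLResult(
    fake_score=0.91,
    mask=[[0, 1, 1],
          [0, 1, 1],
          [0, 0, 0]],
)
ground_truth = [[0, 0, 1],
                [0, 1, 1],
                [0, 0, 0]]

print(is_fake(result))                      # True
print(mask_iou(result.mask, ground_truth))  # 0.75
```

The point of the sketch is that a unified system answers two questions per image, "is it fake?" and "where?", and that each answer is judged separately: one decision for the image, one overlap score for the mask.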
DeFakerOne reports state-of-the-art results on 48 benchmarks, 39 for forgery detection and 9 for localization, outperforming prior baselines. It is also more robust to real-world perturbations and to modern generators such as GPT-Image-2. Beyond the headline results, the paper systematically analyzes data scaling laws, cross-domain artifact transfer and interference, the need for fine-grained supervision, and the importance of preserving artifacts at the original image resolution. Together, these findings offer design principles for building scalable, robust, and truly unified FIDL systems as generative AI continues to evolve.
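One way to see why original-resolution artifacts might matter is a toy example in which resizing erases a fine-grained pattern. The sketch below is not taken from the paper: the checkerboard is a deliberately extreme stand-in for a high-frequency forgery trace, and `downsample_2x` and `neighbor_contrast` are hypothetical helpers written for this illustration.

```python
from typing import List

Image = List[List[float]]


def downsample_2x(img: Image) -> Image:
    """Halve the resolution by averaging each 2x2 block (simple box filter)."""
    return [
        [
            (img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4
            for c in range(0, len(img[0]), 2)
        ]
        for r in range(0, len(img), 2)
    ]


def neighbor_contrast(img: Image) -> float:
    """Mean absolute difference between horizontally adjacent pixels."""
    diffs = [abs(row[c + 1] - row[c]) for row in img for c in range(len(row) - 1)]
    return sum(diffs) / len(diffs)


# Toy "artifact": a pixel-level checkerboard, standing in for a high-frequency trace.
size = 8
checkerboard = [[float((r + c) % 2) for c in range(size)] for r in range(size)]

small = downsample_2x(checkerboard)
print(neighbor_contrast(checkerboard))  # 1.0
print(neighbor_contrast(small))         # 0.0
```

After a single 2x reduction every pixel becomes 0.5 and the pattern is gone. Real forgery traces are far subtler than a checkerboard, but the same mechanism suggests why a detector fed resized inputs could lose evidence that is present at native resolution.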
- Integrates InternVL2 and SAM2 for simultaneous fake detection and pixel-level localization across all forgery domains.
- Achieves state-of-the-art results on 39 detection benchmarks and 9 localization benchmarks.
- Robust against GPT-Image-2 and real-world perturbations, with analysis of data scaling and cross-domain artifact transfer.
Why It Matters
Unified detection across all image forgery types becomes critical as AI-generated content grows indistinguishable from real images.