Aletheia: Physics-Conditioned Localized Artifact Attention (PhyLAA-X) for End-to-End Generalizable and Robust Deepfake Video Detection
New detector combines optical flow, heart rate, and reflections to spot AI-generated videos with 4-7% better cross-generator accuracy.
Researcher Devendra Ghori has introduced Aletheia, a novel deepfake detection framework built around a core innovation called Physics-Conditioned Localized Artifact Attention (PhyLAA-X). The system directly addresses the primary weakness of current detectors: their failure to generalize across different AI generators, compression levels, and adversarial attacks. PhyLAA-X works by injecting three differentiable, physics-based feature volumes—optical-flow discontinuities (curl), inconsistencies in specular reflections (skewness), and spatially-upsampled heart-rate signals (rPPG power spectra)—into the model's attention computation. This forces the neural network to focus on manipulation boundaries where semantic artifacts and physical law violations co-occur, regions that are inherently difficult for generative models to replicate consistently.
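The paper's exact formulation is not reproduced here, but the core idea of PhyLAA-X (biasing attention logits with physics-derived artifact scores) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the tensor shapes, and the choice to average the three cues into a single key-side additive bias are all assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def physics_conditioned_attention(q, k, v, physics_maps):
    """Hypothetical sketch of physics-conditioned attention.

    q, k, v:       (batch, tokens, dim) query/key/value projections
    physics_maps:  (batch, tokens, 3) per-token scores for the three cues
                   (optical-flow curl, reflection skewness, rPPG power)
    """
    d = q.shape[-1]
    logits = q @ k.transpose(0, 2, 1) / np.sqrt(d)      # (B, T, T)
    # Collapse the three physics cues into one artifact score per token
    # and add it as a key-side bias, so every query attends more strongly
    # to tokens where physical-law violations are largest.
    bias = physics_maps.mean(axis=-1)                   # (B, T)
    logits = logits + bias[:, None, :]                  # broadcast over queries
    return softmax(logits, axis=-1) @ v
```

Because the bias enters the logits before the softmax, it remains differentiable, which matches the paper's description of the feature volumes as differentiable inputs to the attention computation.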
Aletheia embeds PhyLAA-X across an efficient spatiotemporal ensemble of models, including EfficientNet-B4 with BiLSTM and ResNeXt-101 with a Transformer, using uncertainty-aware adaptive weighting. The results are significant: 97.2% accuracy (0.992 AUC-ROC) on the FaceForensics++ benchmark, 94.9% on Celeb-DF v2, and 90.8% on the challenging DFDC dataset. Crucially, it outperforms the previous state-of-the-art (LAA-Net) by 4.1-7.3% in cross-generator settings and maintains 79.4% accuracy under strong adversarial attacks (epsilon=0.02 PGD-10). Ablation studies confirm the physics conditioning alone delivers a 4.2% cross-dataset AUC gain.
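One plausible reading of "uncertainty-aware adaptive weighting" is that each backbone's vote is down-weighted when its prediction is uncertain. The sketch below uses inverse binary entropy as the uncertainty proxy; the function name and the specific weighting rule are assumptions, not the published method.

```python
import numpy as np

def uncertainty_weighted_ensemble(probs):
    """Hypothetical uncertainty-aware ensemble combiner.

    probs: (n_models, n_samples) per-model probabilities that a clip is fake.
    Each model is weighted by the inverse of its binary predictive entropy,
    so confident models (probabilities near 0 or 1) dominate the vote.
    """
    eps = 1e-8
    p = np.clip(probs, eps, 1 - eps)
    entropy = -(p * np.log(p) + (1 - p) * np.log(1 - p))  # (M, N)
    weights = 1.0 / (entropy + eps)
    weights /= weights.sum(axis=0, keepdims=True)         # normalize per sample
    return (weights * p).sum(axis=0)                      # (N,)
```

For example, if one backbone outputs 0.99 and another 0.60 for the same clip, the combined score lands much closer to 0.99 than a plain average would, because the confident model carries far lower entropy.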
The full production system, version 1.2, is completely open-sourced under an MIT license. The release includes pretrained weights, a new adversarial training corpus (ADC-2026), and full reproducibility artifacts. This move aims to accelerate research and deployment of robust, generalizable deepfake detection in real-world applications, from content moderation to forensic analysis.
- PhyLAA-X injects 3 physics-based features (optical flow curl, reflection skewness, heart-rate rPPG) directly into AI attention mechanisms for 4.1-7.3% better cross-generator accuracy.
- The system achieves 97.2% accuracy on FaceForensics++ and maintains 79.4% accuracy under epsilon=0.02 PGD-10 adversarial attacks, a key robustness benchmark.
- The complete framework is open-sourced (MIT license) with pretrained models and a new adversarial dataset (ADC-2026) to spur industry and research adoption.
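The "epsilon=0.02 PGD-10" robustness figure refers to a standard evaluation: 10 iterations of projected gradient descent inside an L-infinity ball of radius 0.02 around each input. The toy sketch below shows the attack loop against a simple logistic "detector" with an analytic gradient; the real evaluation attacks the full video model, and the step size and model here are illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pgd_attack(x, y, w, b, eps=0.02, steps=10):
    """L_inf PGD-10 against a toy logistic detector p(fake) = sigmoid(w.x + b).

    x: (n, d) inputs, y: (n,) labels in {0, 1}.
    The gradient of the binary cross-entropy loss w.r.t. x is (p - y) * w,
    so each step ascends the loss by moving along the gradient's sign,
    then projects back into the eps-ball around the clean input.
    """
    alpha = 2.5 * eps / steps            # common step-size heuristic
    x_adv = x.copy()
    for _ in range(steps):
        p = sigmoid(x_adv @ w + b)
        grad = (p - y)[:, None] * w[None, :]
        x_adv = x_adv + alpha * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)  # L_inf projection
    return x_adv
```

Holding 79.4% accuracy under this attack means the detector's decision survives the worst-case bounded perturbation the loop can find, not merely random noise.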
Why It Matters
Provides a robust, physics-grounded defense against increasingly sophisticated AI-generated video forgeries that threaten media integrity.