Research & Papers

Liveness Detection Models Struggle With New Synthetic Media Generation Techniques

Can models trained on old deepfakes detect today's AI-generated faces?

Deep Dive

The core issue is that production liveness detection systems were built around an older threat model: attackers submitting static images or basic replay videos. Today's synthetic media, produced by diffusion models, GANs, and real-time face swapping, differs in kind from what those original training datasets captured. A model trained on deepfakes from 2020, for example, has never seen the artifacts left by recent video synthesis techniques, which can render photorealistic avatars that blink and speak in real time. The question is whether these systems can generalize to techniques that did not exist when their training data was assembled.
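One way to make the generalization question measurable is a temporal hold-out: score the detector separately on synthetic samples whose generation method predates the training cutoff and on those whose method came later. The sketch below is only an illustration of that idea. The `Sample` fields, the method labels, the cutoff date, and the toy data are all assumptions invented for the example, not anything reported by the vendors or the author.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Sample:
    """One labeled evaluation clip (all fields are illustrative)."""
    method: str            # hypothetical generation-method label
    method_released: date  # when that generation method became available
    is_fake: bool          # ground truth
    flagged_fake: bool     # the detector's decision


def recall_on_fakes(samples):
    """Fraction of synthetic samples the detector flagged; None if there are none."""
    fakes = [s for s in samples if s.is_fake]
    if not fakes:
        return None
    return sum(s.flagged_fake for s in fakes) / len(fakes)


def temporal_gap(samples, training_cutoff):
    """Return recall on methods released up to the cutoff and after it."""
    seen = [s for s in samples if s.method_released <= training_cutoff]
    unseen = [s for s in samples if s.method_released > training_cutoff]
    return recall_on_fakes(seen), recall_on_fakes(unseen)


# Toy data, invented purely to exercise the functions; not a measurement.
cutoff = date(2020, 12, 31)
samples = [
    Sample("older_method", date(2019, 6, 1), True, True),
    Sample("older_method", date(2019, 6, 1), True, True),
    Sample("newer_method", date(2023, 3, 1), True, False),
    Sample("newer_method", date(2023, 3, 1), True, True),
]
seen_recall, unseen_recall = temporal_gap(samples, cutoff)
print(f"recall on pre-cutoff methods:  {seen_recall:.2f}")
print(f"recall on post-cutoff methods: {unseen_recall:.2f}")
```

A large drop from the first number to the second on real evaluation data would indicate that the detector is not generalizing to newer generation techniques. A real evaluation would also need genuine (non-synthetic) samples to track false rejections.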

When the author asked two identity verification vendors directly about this temporal gap, both gave answers that sounded confident but avoided the central question: whether their training data covers generation methods from the past 12–18 months. If it does not, and the evidence suggests that is likely, then the update cycle becomes critical. Vendors that claim deepfake detection as a core capability would need to retrain continuously on new synthetic media samples, but many lack access to the latest generation tools. The result is a cat-and-mouse problem in which attackers using cutting-edge AI stay a step ahead of detection models trained on older data.

Key Points
  • Most liveness detection systems were built to catch static images and replay videos, not modern synthetic media.
  • Current generation techniques (diffusion models, real-time face swap) produce artifacts unseen in training data.
  • Two vendors gave confident answers but did not say whether their training data covers recent generation methods.

Why It Matters

If liveness detection cannot generalize, identity verification systems may be vulnerable to modern AI-generated attacks.