Intervention-Based Self-Supervised Learning: A Causal Probe Paradigm for Remote Photoplethysmography
Researchers crack the 'correlation trap' with active intervention on video chrominance.
Remote photoplethysmography (rPPG) measures heart rate from video without contact, but existing self-supervised learning (SSL) methods often fall into a 'correlation trap': they learn dominant periodic noise (e.g., head motion, lighting flicker) instead of the faint true rPPG signal. To solve this, a team led by Zhiyi Niu introduces a new SSL paradigm called Physiological Causal Probing (PCP). The key insight is to shift from passive correlation learning to active, precise intervention: the model hypothesizes an rPPG signal, then intervenes on the video in the chrominance domain to verify whether the resulting changes match physical expectations.
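To make the 'correlation trap' concrete, here is a minimal synthetic sketch in Python. It is not taken from the paper: the frame rate, amplitudes, and frequencies are illustrative assumptions, and `dominant_frequency_hz` is a hypothetical helper. It shows how a naive "pick the strongest periodicity" estimate locks onto a strong motion component rather than a faint pulse.

```python
import numpy as np

def dominant_frequency_hz(signal, fps):
    """Return the frequency (Hz) of the largest spectral peak after mean removal."""
    signal = np.asarray(signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(signal - signal.mean()))
    freqs = np.fft.rfftfreq(signal.size, d=1.0 / fps)
    return float(freqs[np.argmax(spectrum)])

# Assumed toy setup: 20 s of video at 30 fps (600 frames).
fps = 30.0
t = np.arange(600) / fps

pulse = 0.05 * np.sin(2 * np.pi * 1.2 * t)   # faint pulse at 1.2 Hz (72 bpm)
motion = 1.00 * np.sin(2 * np.pi * 0.5 * t)  # strong periodic head motion at 0.5 Hz
trace = pulse + motion                       # what a camera-derived trace might contain

pulse_only_peak = dominant_frequency_hz(pulse, fps)  # recovers the pulse frequency
mixed_peak = dominant_frequency_hz(trace, fps)       # locks onto the motion frequency
```

In this toy setting the strongest periodic component belongs to the motion, so any objective that only rewards periodicity has no reason to prefer the pulse.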
The proposed framework, Interv-rPPG, consists of two components: an rPPG extractor named PhysMambaFormer that hypothesizes the signal, and a Controllable Physiological Signal Editor that performs targeted edits on low-frequency chrominance components. Validation uses two principles – 'Falsifiability via Nulling' (the null hypothesis of no rPPG signal should be rejected) and 'Axiomatic Equivariance' (interventions should produce physically consistent outputs). Results show that PCP improves both in-domain and cross-domain performance on challenging datasets like VIPL-HR and MMPD, even surpassing supervised baselines in complex cross-dataset settings while remaining competitive on clean datasets.
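The hypothesize, intervene, verify loop can be sketched with a toy example. This is one possible reading of the two principles, not the authors' implementation: the "video" is reduced to a luminance trace and a chrominance trace, the editor simply rescales the hypothesized signal inside the chrominance trace (the real editor is described as operating on low-frequency chrominance components of the video), and the two extractors are hand-written stand-ins for a learned model such as PhysMambaFormer. All function names and numbers are hypothetical.

```python
import numpy as np

def edit_chrominance(video, hypothesis, alpha):
    """Toy editor: rescale the hypothesized signal inside the chrominance channel.

    video has shape (T, 2) with columns [luminance, chrominance].
    alpha=0 removes the hypothesized signal (nulling); alpha=2 doubles it.
    """
    edited = video.copy()
    edited[:, 1] += (alpha - 1.0) * hypothesis
    return edited

def chroma_extractor(video):
    """Stand-in extractor that reads the zero-mean chrominance trace."""
    chroma = video[:, 1]
    return chroma - chroma.mean()

def trapped_extractor(video):
    """Stand-in extractor caught in the correlation trap: it reads luminance motion."""
    luma = video[:, 0]
    return luma - luma.mean()

def probe(extractor, video, alpha=2.0):
    """Return (nulling_residual, equivariance_error), relative to the hypothesis norm."""
    hypothesis = extractor(video)
    scale = np.linalg.norm(hypothesis) + 1e-12
    # Nulling check: after removing the hypothesis, little should be extracted.
    nulled = extractor(edit_chrominance(video, hypothesis, 0.0))
    # Equivariance check: scaling the signal in the video should scale the output.
    scaled = extractor(edit_chrominance(video, hypothesis, alpha))
    nulling_residual = np.linalg.norm(nulled) / scale
    equivariance_error = np.linalg.norm(scaled - alpha * hypothesis) / scale
    return float(nulling_residual), float(equivariance_error)

# Assumed toy video: strong motion in luminance, faint pulse in chrominance.
t = np.arange(600) / 30.0
luma = 1.00 * np.sin(2 * np.pi * 0.5 * t)
chroma = 0.05 * np.sin(2 * np.pi * 1.2 * t)
video = np.stack([luma, chroma], axis=1)

good = probe(chroma_extractor, video)    # both scores near 0: hypothesis survives probing
bad = probe(trapped_extractor, video)    # both scores near 1: the edits have no effect
```

In this sketch the trapped extractor fails both checks because editing chrominance does not change what it reads, which is the kind of mismatch an intervention-based objective could penalize.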
- Proposes Physiological Causal Probing (PCP), a new SSL paradigm that actively intervenes on video chrominance to validate rPPG hypotheses.
- PhysMambaFormer extracts the hypothesized rPPG signal; Controllable Physiological Signal Editor edits low-frequency chrominance components.
- Outperforms supervised baselines in complex cross-dataset settings (VIPL-HR, MMPD); diagnostic analysis indicates resistance to motion and illumination artifacts.
Why It Matters
Enables robust, non-contact heart rate monitoring from video even under noisy real-world conditions.