LRP exposes shortcut learning in EEG foundation models
AI models for brain signals were taking a shortcut: reading eye movements instead of neural activity.
A team of researchers led by Justus Meyer zu Bexten applied Layer-wise Relevance Propagation (LRP) to interpret EEG foundation models, the black-box Transformer architectures increasingly used for diagnostics and brain-computer interfaces. These models show promise even though EEG data is scarce, but their opacity hinders clinical adoption. The team extended LRP, previously used on CNN-based EEG models, to Transformer-based ones, allowing them to trace which input signals drive a model's predictions.
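To make the idea of tracing relevance back to inputs concrete, here is a minimal sketch of the LRP epsilon rule on a toy two-layer ReLU network in NumPy. It is an illustration only, not the team's Transformer extension: the network, its random weights, and the helper name `lrp_epsilon_dense` are all assumptions made for this example, and attention and normalization layers would need additional propagation rules that are not shown.

```python
import numpy as np

def lrp_epsilon_dense(a, W, R_out, eps=1e-6):
    """Redistribute relevance R_out from a bias-free dense layer's outputs
    back to its inputs `a` using the LRP epsilon rule (hypothetical helper)."""
    z = a @ W                                    # pre-activations, shape (n_out,)
    z = z + eps * np.where(z >= 0, 1.0, -1.0)    # stabilizer keeps the division safe
    s = R_out / z                                # relevance per unit of pre-activation
    return a * (W @ s)                           # each input's share, shape (n_in,)

# Toy stand-in for a model: 8 input features -> 5 hidden units -> 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=8)
W1 = rng.normal(size=(8, 5))
W2 = rng.normal(size=(5, 3))

# Forward pass.
h = np.maximum(0.0, x @ W1)                      # ReLU hidden activations
logits = h @ W2
target = int(np.argmax(logits))

# Backward relevance pass: start from the predicted class logit only.
R_out = np.zeros(3)
R_out[target] = logits[target]
R_h = lrp_epsilon_dense(h, W2, R_out)            # relevance of hidden units
R_x = lrp_epsilon_dense(x, W1, R_h)              # relevance of input features

print("input relevance:", np.round(R_x, 3))
print("sum of relevance:", R_x.sum(), "target logit:", logits[target])
```

With zero biases and a small epsilon, the input relevances sum to approximately the explained logit. This conservation property is what lets a relevance map be read as a decomposition of the prediction over input signals.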
Their analysis revealed a critical flaw: in motor imagery tasks, models exhibited 'Clever Hans' behavior, relying on eye movement artifacts rather than actual motor cortex activity. In naturalistic affect prediction, LRP consistently highlighted a central electrode cluster, pointing to a candidate sensorimotor signature of arousal. These findings position LRP as a dual-purpose tool, both for verifying model integrity and for discovering biologically plausible neural correlates, with the potential to accelerate trustworthy EEG AI.
- LRP exposed 'Clever Hans' shortcuts in EEG Transformers that used ocular signals instead of motor brain activity in motor imagery tasks.
- In affect prediction, LRP identified a recurring central electrode cluster as a key driver, suggesting an arousal-related sensorimotor signature.
- The method extends LRP from CNN-based EEG models to modern Transformer foundation models, enabling post-hoc interpretability.
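One way to turn a relevance map into a shortcut check like the one described above is to ask how much of the total relevance lands on electrodes near the eyes versus electrodes over sensorimotor cortex. The sketch below assumes a relevance array of shape (channels, time) and uses standard 10-20 electrode names (Fp1/Fp2 near the eyes, C3/Cz/C4 over central cortex). The synthetic values, the function names, and the choice to sum absolute relevance are illustrative assumptions, not the authors' pipeline.

```python
import numpy as np

def channel_relevance_share(relevance, channel_names):
    """Fraction of total absolute relevance carried by each channel.
    `relevance` has shape (n_channels, n_times)."""
    per_channel = np.abs(relevance).sum(axis=1)
    total = per_channel.sum()
    if total == 0:
        shares = np.zeros_like(per_channel, dtype=float)
    else:
        shares = per_channel / total
    return dict(zip(channel_names, shares))

def group_share(shares, group):
    """Combined share of a named electrode group (missing names are ignored)."""
    return float(sum(shares[ch] for ch in group if ch in shares))

# Synthetic relevance map (assumption): frontal-polar channels dominate.
channels = ["Fp1", "Fp2", "C3", "Cz", "C4", "O1"]
relevance = np.full((6, 100), 0.05)
relevance[0] = 0.4   # Fp1
relevance[1] = 0.4   # Fp2

shares = channel_relevance_share(relevance, channels)
ocular = group_share(shares, ["Fp1", "Fp2"])         # electrodes near the eyes
central = group_share(shares, ["C3", "Cz", "C4"])    # sensorimotor region

print(f"ocular share: {ocular:.2f}, central share: {central:.2f}")
if ocular > central:
    print("Relevance concentrates near the eyes: possible 'Clever Hans' shortcut.")
```

A high frontal-polar share in a motor imagery model would be a warning sign worth investigating, not proof of a shortcut; a real analysis would compare across subjects and trials and against artifact-cleaned data.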
Why It Matters
By revealing hidden shortcuts, LRP helps make EEG AI more reliable, paving the way for trustworthy diagnostics and brain-computer interfaces.