EmoMind reads fMRI to generate personalized emotional captions
First AI system decodes continuous 34D emotion vectors from brain scans.
Researchers Bilal A. Mohammed, Lin Gu, and Ruogu Fang introduced EmoMind, the first system to decode emotionally rich captions directly from fMRI brain scans, moving beyond standard semantic decoding that discards affect. Current brain-to-text systems recover only factual content. Language models like GPT-4, meanwhile, generate emotional text only when given coarse categorical labels (e.g., “happy”), which loses the rich individual variability of human emotion.
EmoMind addresses this with a two-stage pipeline. First, it retrieves a neutral scene description from brain-decoded visual features. Second, it rewrites that description using a continuous 34-dimensional emotion vector decoded from the same fMRI recording. The system uses classifier-free guidance to balance content preservation against emotional expression. Across two independent emotion fMRI datasets, EmoMind significantly outperforms GPT-4 prompted with brain-decoded top-5 emotion labels, especially on metrics that measure subject-specific affective structure rather than population-level averages. The results point toward personalized affective AI that reflects individual brain organization.
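To make the guidance step concrete, here is a minimal sketch of classifier-free guidance applied to next-word scores. The article does not say how EmoMind implements it, so the logit-level formula, the toy vocabulary, the numbers, and the function names `cfg_logits` and `softmax` are all illustrative assumptions rather than the authors' code.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities (numerically stable)."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cfg_logits(uncond_logits, cond_logits, guidance_scale):
    """Common classifier-free guidance form: start at the unconditional
    scores and move toward (or past) the conditional ones."""
    return [u + guidance_scale * (c - u)
            for u, c in zip(uncond_logits, cond_logits)]

# Hypothetical three-word vocabulary and made-up scores.
vocab = ["walks", "strolls", "trudges"]
uncond = [2.0, 1.0, 0.5]  # assumed scores from the neutral description alone
cond = [1.0, 1.0, 2.0]    # assumed scores when conditioned on the emotion vector

for scale in (0.0, 1.0, 2.0):
    probs = softmax(cfg_logits(uncond, cond, scale))
    best = vocab[probs.index(max(probs))]
    print(f"scale={scale}: most likely word = {best}")
```

In this toy setup a scale of 0 reproduces the neutral wording, a scale of 1 follows the emotion-conditioned scores, and larger values push further toward emotional wording at the cost of the neutral content.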
- EmoMind is the first end-to-end pipeline to decode affective captions directly from fMRI signals.
- It uses a continuous 34-dimensional emotion vector instead of discrete categorical labels, preserving inter-subject variability.
- It outperforms label-prompted GPT-4 on three validation axes: subject specificity, structural geometry, and causal control.
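The difference between a continuous vector and top-5 labels can be illustrated with a small sketch. The two "subjects" below are made-up numbers, not data from the study, and emotion categories are referred to by index because the article does not list the 34 category names. The helper names are hypothetical.

```python
import math

N_DIMS = 34  # dimensionality stated in the article

def top_k_indices(vec, k=5):
    """Indices of the k largest entries, i.e. the 'top-k labels'."""
    return sorted(range(len(vec)), key=lambda i: vec[i], reverse=True)[:k]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Two invented subjects: the same five emotions are strongest for both,
# but the intensity ordering is reversed.
subject_a = [0.1] * N_DIMS
subject_b = [0.1] * N_DIMS
for i, (va, vb) in enumerate(zip([0.9, 0.8, 0.7, 0.6, 0.5],
                                 [0.5, 0.6, 0.7, 0.8, 0.9])):
    subject_a[i] = va
    subject_b[i] = vb

same_labels = set(top_k_indices(subject_a)) == set(top_k_indices(subject_b))
similarity = cosine_similarity(subject_a, subject_b)

print("same top-5 label set:", same_labels)
print("cosine similarity of full vectors:", round(similarity, 3))
```

Both subjects yield the same set of five labels, so a label-based prompt would treat them identically, while the full vectors remain distinguishable (cosine similarity below 1).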
Why It Matters
EmoMind enables personalized emotional captioning from brain data, advancing affective computing and individual brain mapping.