Atrial Fibrillation Detection Using Machine Learning
A new machine learning framework detects atrial fibrillation, a dangerous heart arrhythmia, from combined PPG and ECG sensor data with up to 98.7% test accuracy.
A new research paper presents a highly accurate machine learning framework for detecting atrial fibrillation (AF), a common cardiac arrhythmia that significantly increases stroke risk. The work, led by researchers Ankit Singh, Vidhi Thakur, and Nachiket Tapas, demonstrates how combining data from photoplethysmogram (PPG) and electrocardiogram (ECG) sensors with ensemble learning techniques can achieve near-perfect detection rates.
The team analyzed 481 data segments from 35 subjects, extracting 22 features from each segment, including time-domain statistics and heart-rate variability metrics. They trained and evaluated three classifiers using 10-fold cross-validation. The subspace k-nearest neighbors (KNN) model was the top performer with 98.7% test accuracy, slightly ahead of bagged decision trees (97.9%) and a cubic-kernel support vector machine (97.1%). All top models maintained sensitivity and specificity above 95%. Sensitivity is the share of AF segments correctly flagged and specificity is the share of normal-rhythm segments correctly cleared, so both false negatives and false positives were rare.
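To make the term "subspace KNN" concrete, here is a minimal sketch of a random-subspace ensemble of nearest-neighbor classifiers. It is not the authors' implementation: the function names, the ensemble size (30), the subspace size (11 of 22 features), the value of k, and the synthetic data are all assumptions chosen for illustration.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, feature_idx, k):
    """Majority vote of the k nearest training rows, with Euclidean
    distance computed on the selected feature indices only."""
    dists = sorted(
        (math.dist([row[i] for i in feature_idx], [x[i] for i in feature_idx]), label)
        for row, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


def draw_subspaces(n_features, n_learners=30, subspace_size=11, seed=0):
    """Draw one random feature subset per ensemble member (hypothetical defaults)."""
    rng = random.Random(seed)
    return [rng.sample(range(n_features), subspace_size) for _ in range(n_learners)]


def subspace_knn_predict(subspaces, train_X, train_y, x, k=3):
    """Each learner runs KNN in its own feature subspace; the ensemble
    returns the label that most learners voted for."""
    votes = Counter(knn_predict(train_X, train_y, x, idx, k) for idx in subspaces)
    return votes.most_common(1)[0][0]


# Synthetic stand-in data (NOT the study's data): 22 features per segment,
# label 1 = AF, label 0 = normal rhythm, 20 segments per class.
rng = random.Random(42)
train_X = [[rng.gauss(label, 0.1) for _ in range(22)] for label in (0, 1) for _ in range(20)]
train_y = [label for label in (0, 1) for _ in range(20)]

subspaces = draw_subspaces(n_features=22)
prediction = subspace_knn_predict(subspaces, train_X, train_y, [0.9] * 22)
```

Because each learner sees only part of the feature vector, the ensemble is less dependent on any single feature than a plain KNN over all 22 features would be, which is the usual motivation for the random-subspace approach.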
The result matters because AF often goes undetected until it causes serious complications like stroke, and current detection methods typically require clinical visits or specialized monitoring equipment. This research suggests that the sensor data already available from many consumer wearables (like smartwatches with PPG) and portable ECG devices, when processed with suitable ML models, could provide continuous, passive screening. The practical implication is a potential pathway to broader early AF detection, in which personal health devices alert users to seek medical evaluation before a serious event occurs.
- Subspace KNN classifier achieved 98.7% test accuracy on AF detection from combined PPG/ECG data
- Framework analyzed 481 data segments from 35 subjects, extracting 22 features per segment for model training
- All top models maintained >95% sensitivity and specificity, minimizing false positives and negatives
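The metrics quoted in these key points can be computed from predicted and true labels as in the sketch below. The function name and the ten example labels are hypothetical and only illustrate the arithmetic; they are not the study's data.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, and specificity for binary labels.
    Assumes y_true contains at least one positive and one negative example."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # AF correctly flagged
    fn = sum(t == positive and p != positive for t, p in pairs)  # AF missed
    tn = sum(t != positive and p != positive for t, p in pairs)  # normal correctly cleared
    fp = sum(t != positive and p == positive for t, p in pairs)  # false alarm
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


# Illustrative labels only: 1 = AF segment, 0 = normal rhythm.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

In this toy case one AF segment is missed and one normal segment is falsely flagged, so accuracy is 8/10, sensitivity is 3/4, and specificity is 5/6.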
Why It Matters
Could enable early, non-invasive detection of a major stroke risk factor using data from existing consumer wearables and health devices.