Audio & Speech

LMU-Based Sequential Learning and Posterior Ensemble Fusion for Cross-Domain Infant Cry Classification

A compact acoustic framework uses an enhanced Legendre Memory Unit for stable, efficient on-device monitoring.

Deep Dive

A research team from the University of Ottawa and Carleton University has published a novel AI framework designed to tackle the challenging problem of automatically classifying the causes of infant crying. The paper, "LMU-Based Sequential Learning and Posterior Ensemble Fusion for Cross-Domain Infant Cry Classification," addresses core issues in healthcare monitoring: short, non-stationary audio signals, limited annotated data, and significant domain shifts between different infants and datasets. The proposed system aims to move beyond controlled laboratory conditions toward a practical tool that generalizes across real-world recording scenarios.

The technical innovation centers on a compact acoustic model that fuses three types of audio features—MFCCs, STFT, and pitch—using a multi-branch convolutional neural network (CNN) encoder. For modeling the sequence of these features, the researchers employed an enhanced Legendre Memory Unit (LMU), a recurrent neural network variant that provides stable sequence modeling with substantially fewer parameters than traditional LSTMs, enabling efficient deployment. A key contribution is the "calibrated posterior ensemble fusion" technique, which uses entropy-gated weighting to intelligently combine predictions from domain-specific experts, mitigating dataset bias and improving cross-dataset generalization. Experiments demonstrated improved macro-F1 scores under rigorous cross-domain evaluation protocols, including leakage-aware data splits, confirming the model's robustness and its feasibility for real-time, on-device monitoring applications.
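The paper does not publish its exact cell equations, but the standard Legendre Memory Unit (from Voelker et al.'s original LMU work) maintains a fixed-size state that approximates a sliding window of the input via Legendre polynomials. A minimal sketch of that memory recurrence, using the well-known continuous-time (A, B) matrices and a simple Euler discretization (the parameters `d`, `theta`, and `dt` here are illustrative, not the paper's):

```python
import numpy as np

def lmu_matrices(d):
    """Continuous-time (A, B) of the LMU memory, per the standard derivation.

    d is the memory order (number of Legendre coefficients tracked)."""
    A = np.zeros((d, d))
    B = np.zeros(d)
    for i in range(d):
        B[i] = (2 * i + 1) * (-1) ** i
        for j in range(d):
            # a_ij = (2i+1) * (-1 if i < j else (-1)^(i-j+1))
            A[i, j] = (2 * i + 1) * (-1 if i < j else (-1) ** (i - j + 1))
    return A, B

def lmu_memory(signal, d=6, theta=1.0, dt=0.01):
    """Run a scalar signal through the LMU memory.

    theta is the length (in seconds) of the sliding window the state
    approximates; the state size is d regardless of window length, which
    is why the cell needs far fewer parameters than an LSTM."""
    A, B = lmu_matrices(d)
    m = np.zeros(d)
    states = []
    for u in signal:
        # Euler step of: theta * dm/dt = A m + B u
        m = m + (dt / theta) * (A @ m + B * u)
        states.append(m.copy())
    return np.array(states)  # shape: (len(signal), d)
```

In the full model, a learned encoder output would play the role of `u`, and the `d`-dimensional state feeds the classifier head; the fixed, non-learned (A, B) pair is what gives the LMU its stability and parameter efficiency.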

Key Points
  • Uses an enhanced Legendre Memory Unit (LMU) backbone for stable sequence modeling with fewer parameters than LSTMs, enabling efficient deployment.
  • Introduces calibrated posterior ensemble fusion with entropy-gated weighting to improve cross-dataset generalization and mitigate bias.
  • Demonstrates improved macro-F1 scores on Baby2020 and Baby Crying datasets with a framework designed for real-time, on-device health monitoring.
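The paper does not give the exact fusion formula, but the "entropy-gated weighting" idea can be sketched simply: each domain expert's posterior is weighted by its confidence, with low-entropy (confident) predictions weighted up and high-entropy ones weighted down. A minimal illustration (the softmax-over-negative-entropy gate and the `temperature` parameter are assumptions for this sketch, not the authors' method):

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of a probability vector (or batch of them)."""
    return -np.sum(p * np.log(p + eps), axis=-1)

def entropy_gated_fusion(posteriors, temperature=1.0):
    """Fuse per-expert class posteriors, gating by prediction entropy.

    posteriors: list of (n_classes,) probability vectors, one per
    domain-specific expert. Lower entropy => higher fusion weight."""
    P = np.stack(posteriors)          # (n_experts, n_classes)
    H = entropy(P)                    # (n_experts,)
    w = np.exp(-H / temperature)      # confident experts get larger weights
    w = w / w.sum()
    fused = (w[:, None] * P).sum(axis=0)
    return fused, w
```

For example, a near-one-hot expert posterior would dominate a near-uniform one, which is how such a gate can mitigate dataset bias: an expert that is uncertain on out-of-domain audio contributes little to the fused prediction.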

Why It Matters

This research is a step toward practical, deployable AI for infant health monitoring, potentially enabling early detection of distress or medical issues from cry patterns.