Cross-Linguistic Rhythmic and Spectral Feature-Based Analysis of Nyishi and Adi: Two Under-Resourced Languages of Arunachal Pradesh
Researchers use rhythm formant analysis to differentiate two under-resourced Tani languages...
A new study from researchers at an Indian institution uses speech-rhythm and spectral analysis to differentiate Nyishi and Adi, two closely related but under-resourced languages of the Tani subgroup spoken in Arunachal Pradesh, North-East India. The paper, submitted to Sadhana (Indian Academy of Sciences), employs rhythm formant analysis (RFA), a frequency-domain method based on the low-frequency (LF) spectrum of the amplitude modulation (AM) envelope, to capture macro-temporal speech rhythm patterns.
The team extracted three rhythm formant features (Number of Dominant Peaks, NDP; Mean Frequency of Dominant Peaks, MFDP; and Variance of Dominant Frequencies, VFDP) along with Discrete Cosine Transform (DCT) coefficients and Mel Frequency Cepstral Coefficients (MFCC), and found that Nyishi exhibits higher dominant modulation frequencies and greater dispersion than Adi. Rhythm-only features achieved 84-85% classification accuracy with a support vector machine (SVM), while adding MFCC representations raised accuracy to 90.9% with the SVM and 93.96% with a multilayer perceptron (MLP). The results indicate that low-frequency modulation captures constrained macro-temporal structure, while spectral features reflect finer phonological differentiation.
- Rhythm-only features (NDP, MFDP, VFDP) achieved 84-85% classification accuracy using SVM
- Adding MFCC spectral features boosted accuracy to 93.96% with MLP and 90.9% with SVM
- Nyishi shows higher dominant modulation frequencies and greater dispersion than Adi, revealing hierarchical differentiation
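To make the three rhythm formant features more concrete, the following is a minimal sketch of how NDP, MFDP, and VFDP could be computed from the low-frequency spectrum of an amplitude envelope. It is not the authors' implementation: the function names, the 20 ms moving-average envelope, the 0.25 to 10 Hz analysis range, and the rule that a peak is "dominant" when it reaches half the largest magnitude are all assumptions chosen for illustration.

```python
import math


def amplitude_envelope(signal, sample_rate, window_s=0.02):
    """Rectify the waveform and smooth it with a causal moving average.

    Assumption: a crude stand-in for whatever envelope extraction the paper uses.
    """
    win = max(1, int(round(window_s * sample_rate)))
    rectified = [abs(x) for x in signal]
    envelope, acc = [], 0.0
    for i, value in enumerate(rectified):
        acc += value
        if i >= win:
            acc -= rectified[i - win]
        envelope.append(acc / min(i + 1, win))
    return envelope


def lf_spectrum(envelope, sample_rate, f_max=10.0, step=0.25):
    """Evaluate the envelope's Fourier magnitude directly at low frequencies only."""
    mean = sum(envelope) / len(envelope)
    centered = [e - mean for e in envelope]  # remove the DC component
    freqs, mags = [], []
    for k in range(1, int(f_max / step) + 1):
        f = k * step
        w = 2.0 * math.pi * f / sample_rate
        re = sum(c * math.cos(w * n) for n, c in enumerate(centered))
        im = sum(c * math.sin(w * n) for n, c in enumerate(centered))
        freqs.append(f)
        mags.append(math.hypot(re, im))
    return freqs, mags


def rhythm_formant_features(freqs, mags, rel_threshold=0.5):
    """Return (NDP, MFDP, VFDP) from local maxima of the LF spectrum.

    Assumption: a peak counts as dominant if it reaches rel_threshold
    times the largest magnitude in the analysed range.
    """
    top = max(mags)
    dominant = [
        freqs[i]
        for i in range(1, len(mags) - 1)
        if mags[i] > mags[i - 1]
        and mags[i] >= mags[i + 1]
        and mags[i] >= rel_threshold * top
    ]
    ndp = len(dominant)
    if ndp == 0:
        return 0, 0.0, 0.0
    mfdp = sum(dominant) / ndp
    vfdp = sum((f - mfdp) ** 2 for f in dominant) / ndp  # population variance
    return ndp, mfdp, vfdp
```

Under these assumptions, a recording with faster and more widely spread envelope modulations would yield larger MFDP and VFDP values, which is the direction of the Nyishi versus Adi difference reported above. Real speech would need a more careful envelope estimate and a tuned peak-selection rule than this sketch provides.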
Why It Matters
This work supports the documentation and analysis of under-resourced languages by showing that two closely related languages can be classified automatically from rhythm and spectral features, which is useful for linguistic research and digital inclusion.