Trade-offs between structural richness and communication efficiency in music network representations
Researchers find simpler music encodings yield denser, higher-uncertainty transition networks, while richer encodings sharpen predictions at the cost of a larger state space and higher model error.
A research team including Lluc Bono Rosselló, Robert Jankowski, and M. Ángeles Serrano published a paper titled "Trade-offs between structural richness and communication efficiency in music network representations" on arXiv. The study investigates how different ways of encoding musical features affect the transition networks AI models build from music sequences. Using eight distinct encodings of piano music—from simple single-feature vocabularies to complex multi-feature combinations—the researchers analyzed how these choices reorganize the state space and reshape network topology, fundamentally altering how uncertainty is distributed across musical transitions.
The team found that compressed, single-feature representations create dense transition structures with higher entropy rates, meaning higher average uncertainty per step, yet maintain low model error—indicating the constrained estimates stay close to actual corpus transitions. In contrast, richer multi-feature representations preserve finer musical distinctions but dramatically expand the state space, sharpen transition profiles, lower entropy rates, and increase model error. Across all representations, uncertainty concentrates in diffusion-central nodes while model error remains low there, suggesting an informational landscape where predictable musical flow coexists with localized moments of surprise.
To connect these technical findings to human perception, the researchers adopted a perceptual-constraint model that simulates imperfect access to transition statistics—mimicking how listeners actually process music. The results demonstrate that feature encoding choice doesn't just affect the networks AI reconstructs; it determines whether the resulting uncertainty serves as a plausible proxy for the expectations real listeners can learn and use. This has significant implications for designing AI music generation systems that better align with human musical cognition.
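One simple way to simulate imperfect access to transition statistics is to blend each true transition-probability row with a uniform distribution. This is an illustrative stand-in, not the paper's exact perceptual-constraint model; the `precision` parameter and the example row are assumptions:

```python
import math

def perceive(p_row, precision):
    """Blend a true transition-probability row with a uniform distribution,
    a simple stand-in for a listener's imperfect access to the statistics
    (precision=1.0 -> perfect access, precision=0.0 -> pure guessing).
    Illustrative sketch only, not the paper's model."""
    k = len(p_row)
    return [precision * p + (1 - precision) / k for p in p_row]

true_row = [0.7, 0.2, 0.1]  # hypothetical transition profile of one state
for prec in (1.0, 0.5, 0.0):
    row = perceive(true_row, prec)
    surprisal = -math.log2(row[0])  # surprisal of the most likely next state
    print(f"precision={prec}: perceived={[round(x, 2) for x in row]}, "
          f"surprisal={surprisal:.2f} bits")
```

As precision drops, the perceived distribution flattens and the surprisal of even the most likely continuation grows, which is the sense in which a representation's uncertainty may or may not remain learnable for real listeners.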
Key Findings
- Tested eight feature encodings of piano music, from single-feature vocabularies to multi-feature combinations
- Simple encodings yield dense networks with higher entropy (more uncertainty per step) but lower model error
- Complex encodings expand the state space and lower entropy, but at the cost of higher model error
Why It Matters
Helps AI music systems better model human expectation and surprise, leading to more natural generative music.