Audio & Speech

Spheres Dataset: 23-mic orchestral recordings for AI source separation

Over one hour of multitrack orchestral music, recorded with 23 microphones, for training source separation models.

Deep Dive

A multi-institution team of researchers led by Jaime Garcia-Martinez has released The Spheres Dataset, a collection of over one hour of multitrack orchestral recordings designed to advance machine learning for music source separation and music information retrieval (MIR) in classical music. The dataset features the Colibrì Ensemble performing two canonical works (Tchaikovsky’s Romeo and Juliet and Mozart’s Symphony No. 40), along with chromatic scales and solo excerpts for each instrument. The sessions were recorded at The Spheres studio with 23 microphones (close spot, main, and ambient). This setup allows the creation of realistic stereo mixes with controlled microphone bleed and provides isolated stems for supervised training. Room impulse responses estimated for each instrument position add further acoustic characterization of the recording space.
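The idea of a stereo mix with controlled bleed can be illustrated with a minimal sketch. This is not the dataset authors' mixing procedure. It assumes a simplified model in which each close microphone picks up its own source at unity gain and every other source at a single, uniform bleed level, and it uses a standard constant-power pan law. The function names `simulate_close_mics` and `stereo_mix`, and the parameters `bleed_db` and `pans`, are hypothetical.

```python
import numpy as np

def simulate_close_mics(stems, bleed_db):
    """Model each close mic as its own source plus attenuated copies of the others.

    stems: array of shape (n_sources, n_samples) with isolated signals.
    bleed_db: level of the other sources in each mic, in dB (e.g. -20.0).
    This uniform-bleed model is an illustrative assumption, not measured data.
    """
    n = stems.shape[0]
    gain = 10.0 ** (bleed_db / 20.0)      # dB to linear amplitude
    bleed = np.full((n, n), gain)         # off-diagonal: leakage from other sources
    np.fill_diagonal(bleed, 1.0)          # diagonal: the mic's own source
    return bleed @ stems

def stereo_mix(mics, pans):
    """Pan each mic signal with a constant-power law; pans lie in [-1, 1]."""
    theta = (np.asarray(pans, dtype=float) + 1.0) * np.pi / 4.0
    left = np.cos(theta) @ mics           # weighted sum over mics
    right = np.sin(theta) @ mics
    return np.stack([left, right])

# Two toy "sources", each active on a different sample.
stems = np.array([[1.0, 0.0],
                  [0.0, 1.0]])
mics = simulate_close_mics(stems, bleed_db=-20.0)
mix = stereo_mix(mics, pans=[-1.0, 1.0])  # hard left, hard right
```

Lowering `bleed_db` toward large negative values approaches the isolated stems, while raising it makes the simulated close microphones resemble each other more, which is the harder case for a debleeding model.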

Baseline evaluations using X-UMX-based models demonstrate the dataset’s utility for two tasks: separating orchestral instrument families and removing bleed from individual microphone signals (debleeding). The results highlight both the promise and the challenges of source separation in complex orchestral scenarios, positioning the dataset as a benchmark for future research in separation, localization, dereverberation, and immersive rendering. Described in a paper published in IEEE Transactions on Audio, Speech and Language Processing (2026), The Spheres Dataset fills a gap in classical music MIR resources, enabling reproducible comparisons and exploration of new approaches.
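The summary above does not say which metric the baselines were scored with. As an illustration only, the sketch below computes a plain signal-to-distortion ratio (SDR), a common choice in source separation work, between an isolated reference stem and a separated estimate. The function name `sdr_db` and the `eps` parameter are hypothetical, and the paper's evaluation may use a different metric or variant.

```python
import numpy as np

def sdr_db(reference, estimate, eps=1e-12):
    """Signal-to-distortion ratio in dB between a reference stem and an estimate.

    Higher is better. eps guards against division by zero on silent signals.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    distortion = reference - estimate     # everything the estimate got wrong
    signal_power = np.sum(reference ** 2) + eps
    distortion_power = np.sum(distortion ** 2) + eps
    return 10.0 * np.log10(signal_power / distortion_power)

# Toy check: an estimate that is the reference scaled by 0.9.
reference = np.array([1.0, 0.0, -1.0, 0.0])
estimate = 0.9 * reference
score = sdr_db(reference, estimate)
```

For debleeding, the same function could be applied twice, once to the raw close-microphone signal and once to the model output, with the difference indicating how much bleed was removed relative to the isolated stem.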

Key Points
  • Over one hour of multitrack orchestral recordings using 23 microphones across close spot, main, and ambient positions
  • Includes Tchaikovsky's Romeo and Juliet and Mozart's Symphony No. 40, plus chromatic scales and solo excerpts
  • Baseline X-UMX-based models are evaluated on orchestral family separation and microphone debleeding, giving future work a reference point for comparison

Why It Matters

Provides a benchmark for isolating instruments in complex orchestral mixes, supporting AI research for music production and analysis.