STRUM: AI turns any song into playable Guitar Hero charts
Open-source pipeline converts raw audio into Clone Hero charts for all instruments.
STRUM (Spectral Transcription and Rhythm Understanding Model) is an open-source pipeline that turns raw audio recordings into playable rhythm-game charts for drums, guitar, bass, vocals, and keys, with no metadata or manual annotation required. Developed by Joshua Opria for the Clone Hero and YARG communities, it uses a different method for each instrument: a two-stage CRNN onset detector with a six-model ensemble for drums, neural onset detectors plus monophonic pitch tracking for guitar and bass, word-aligned ASR for vocals, and spectral keyboard detection for keys. The system was evaluated on a curated 30-song benchmark screened for audio quality.
Reported onset F1 scores are 0.838 for drums, 0.694 for bass, 0.651 for guitar, and 0.539 for vocals, measured at a ±100 ms tolerance with per-song global offset optimization. The paper also includes an ablation study of seven drum-pipeline components using paired Wilcoxon tests, a timing distribution analysis of community charts, and a per-class confusion matrix for drums. By releasing the code, model weights, and full benchmark manifest, STRUM aims to let anyone turn a song into a playable rhythm-game chart without manual transcription.
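To make the scoring protocol concrete, here is a minimal sketch of onset F1 at a fixed tolerance combined with a per-song global offset search. It is not STRUM's evaluation code: the function names, the greedy matching strategy, and the search range and step (`max_offset`, `step`) are assumptions chosen for illustration.

```python
def match_onsets(reference, predicted, tolerance=0.100):
    """Count one-to-one matches between onset times (seconds).

    Greedy two-pointer matching over sorted lists; a simplification,
    not necessarily the matcher used in the paper.
    """
    ref = sorted(reference)
    pred = sorted(predicted)
    i = j = hits = 0
    while i < len(ref) and j < len(pred):
        diff = pred[j] - ref[i]
        if abs(diff) <= tolerance:
            hits += 1
            i += 1
            j += 1
        elif diff < 0:
            j += 1  # prediction is too early for this reference onset
        else:
            i += 1  # reference onset has no prediction close enough
    return hits


def onset_f1(reference, predicted, tolerance=0.100):
    """F1 score of predicted onsets against reference onsets."""
    if not reference or not predicted:
        return 0.0
    hits = match_onsets(reference, predicted, tolerance)
    if hits == 0:
        return 0.0
    precision = hits / len(predicted)
    recall = hits / len(reference)
    return 2 * precision * recall / (precision + recall)


def best_global_offset(reference, predicted, tolerance=0.100,
                       max_offset=0.200, step=0.010):
    """Try one constant shift per song and keep the best-scoring one.

    max_offset and step are hypothetical values, not taken from the paper.
    """
    best_offset = 0.0
    best_f1 = onset_f1(reference, predicted, tolerance)
    n_steps = int(round(max_offset / step))
    for k in range(-n_steps, n_steps + 1):
        offset = k * step
        shifted = [t + offset for t in predicted]
        f1 = onset_f1(reference, shifted, tolerance)
        if f1 > best_f1:
            best_offset, best_f1 = offset, f1
    return best_offset, best_f1


if __name__ == "__main__":
    chart = [1.0, 2.0, 3.0, 4.0]               # reference chart onsets
    detected = [1.15, 2.15, 3.15, 4.15, 5.5]   # late by 150 ms, one extra note
    print(onset_f1(chart, detected))
    print(best_global_offset(chart, detected))
```

In this toy example every detection is 150 ms late, so nothing matches within ±100 ms until a constant shift is applied. The offset search exists to absorb that kind of whole-song alignment difference between the audio and the reference chart, so that the score reflects note placement and not a fixed sync error.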
- STRUM converts raw audio into Clone Hero/YARG charts for drums, guitar, bass, vocals, and keys using a multi-stage hybrid pipeline.
- Benchmark results: drums F1=0.838, bass F1=0.694, guitar F1=0.651, vocals F1=0.539 at ±100 ms tolerance with per-song global offset search.
- Full open-source release: code, model weights, and a 30-song benchmark manifest are available on GitHub.
Why It Matters
Lets players and musicians generate charts from songs in their own libraries without manual charting, which could greatly expand the catalog available for rhythm games.