ArtifactNet: Detecting AI-Generated Music via Forensic Residual Physics
A new lightweight model detects AI-generated music by analyzing neural codec artifacts, using 49x fewer parameters than the previous state-of-the-art model, CLAM.
Researcher Heewon Oh has introduced ArtifactNet, a framework that reframes AI-generated music detection as a problem of forensic physics. Instead of analyzing musical content, the system extracts and examines the physical artifacts that neural audio codecs imprint on generated audio. The architecture pairs a lightweight 3.6M-parameter UNet (ArtifactUNet), which extracts codec residuals from spectrograms, with a compact 0.4M-parameter CNN for classification, for a total of about 4M parameters. That is roughly 49x fewer parameters than CLAM, the previous state-of-the-art model.
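The parameter figures above can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the numbers reported in the article; the function names are hypothetical, and the baseline size it derives is an inference from the rounded "4M" and "49x" figures, not a reported parameter count for CLAM.

```python
# Parameter counts reported in the article, in millions.
ARTIFACT_UNET_M = 3.6    # ArtifactUNet, the codec-residual extractor
CLASSIFIER_CNN_M = 0.4   # compact CNN classifier
REPORTED_REDUCTION = 49  # "49x fewer parameters" than CLAM


def total_params_m(*components_m):
    """Sum component sizes (in millions), rounded to one decimal place."""
    return round(sum(components_m), 1)


def implied_baseline_m(total_m, reduction_factor):
    """Baseline size implied by the reduction factor.

    Assumption: both inputs are rounded figures, so the result is only
    an approximate, inferred value.
    """
    return total_m * reduction_factor


total = total_params_m(ARTIFACT_UNET_M, CLASSIFIER_CNN_M)
baseline = implied_baseline_m(total, REPORTED_REDUCTION)

print(f"ArtifactNet total: {total}M parameters")
print(f"Implied baseline size: about {baseline:.0f}M parameters")
```

Because the inputs are rounded, the implied baseline should be read as an order-of-magnitude estimate rather than an exact model size.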
ArtifactNet was evaluated on ArtifactBench, a new benchmark comprising 6,183 tracks from 22 AI music generators and 6 real music sources. On an unseen test partition of 2,263 tracks, the system achieved an F1 score of 0.9829 with a false positive rate of 1.49%, outperforming existing methods. A key component is codec-aware training with 4-way audio format augmentation (WAV/MP3/AAC/Opus), which reduced cross-codec probability drift by 83% and addresses a major failure mode of previous detection systems.
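The metrics named above can be illustrated with a short sketch. The F1 and false positive rate formulas are the standard definitions. The article does not define "cross-codec probability drift", so `probability_drift` below assumes one plausible reading: the mean absolute change in a detector's predicted AI probability when the same track is re-encoded from WAV into another format. All counts and probabilities are toy values chosen for illustration, not ArtifactBench results.

```python
def f1_score(tp, fp, fn):
    """Standard F1: harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def false_positive_rate(fp, tn):
    """Share of real tracks wrongly flagged as AI-generated."""
    return fp / (fp + tn) if (fp + tn) else 0.0


def probability_drift(probs_by_format, reference="wav"):
    """Assumed drift metric (not the paper's definition): mean absolute
    difference between each re-encoded track's score and its score in
    the reference format."""
    ref = probs_by_format[reference]
    diffs = [
        abs(p - ref_p)
        for fmt, probs in probs_by_format.items()
        if fmt != reference
        for p, ref_p in zip(probs, ref)
    ]
    return sum(diffs) / len(diffs)


def relative_reduction(before, after):
    """Fractional reduction, e.g. 0.83 for an 83% drop."""
    return (before - after) / before


# Toy scores for two tracks under each format (hypothetical values).
baseline_probs = {
    "wav": [0.9, 0.8], "mp3": [0.7, 0.6],
    "aac": [0.8, 0.7], "opus": [0.6, 0.5],
}
codec_aware_probs = {
    "wav": [0.9, 0.8], "mp3": [0.85, 0.75],
    "aac": [0.9, 0.8], "opus": [0.8, 0.7],
}

drift_before = probability_drift(baseline_probs)
drift_after = probability_drift(codec_aware_probs)
print(f"Toy F1: {f1_score(tp=90, fp=2, fn=3):.4f}")
print(f"Toy FPR: {false_positive_rate(fp=2, tn=105):.4f}")
print(f"Toy drift reduction: {relative_reduction(drift_before, drift_after):.0%}")
```

Under this reading, a detector trained on all four formats should give nearly the same score to a track regardless of how it was encoded, which is what a lower drift value would indicate.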
The research presents forensic physics as a more generalizable and parameter-efficient paradigm for AI music detection than representation learning. By targeting the technical artifacts introduced during the audio generation pipeline rather than learning musical patterns, ArtifactNet generalizes better across generators and audio formats. Its small size and accuracy make it well suited to real-world deployment where computational resources are limited.
- Achieves an F1 score of 0.9829 on the unseen 2,263-track test set with a 1.49% false positive rate
- Uses only 4M total parameters, 49x fewer than the previous state-of-the-art model, CLAM
- Reduces cross-codec probability drift by 83% through codec-aware training
Why It Matters
ArtifactNet gives platforms an efficient, accurate tool for detecting AI-generated music, addressing copyright and authenticity concerns in the streaming era.