Pupu-Vocoder and Pupu-Codec deliver aliasing-free neural audio synthesis

arXiv eess.AS May 14, 2026

⚡ New anti-aliasing technique beats existing models on singing voice and music

Deep Dive

Aliasing artifacts (distortions introduced by nonlinear activations and upsampling) have long plagued neural audio synthesis, especially for high-fidelity music and singing voice. In a paper accepted by IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP), a team led by Yicheng Gu introduces Pupu-Vocoder and Pupu-Codec, which integrate differentiable anti-aliasing directly into the activation functions and upsampling modules. The authors also built a dedicated test-signal benchmark to evaluate anti-aliased modules, and validated their models on speech, singing voice, music, and general audio benchmarks.
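To make the aliasing problem concrete, here is a minimal Python sketch of one widely used remedy: oversample, apply the nonlinearity, low-pass filter, then decimate. It is not the paper's implementation. The function names, the windowed-sinc filter, the 2x oversampling factor, and the cubic stand-in activation are all assumptions chosen for illustration. The cubic is convenient because the cube of a sine contains exactly one extra harmonic, at three times the input frequency, so the aliased component is easy to locate.

```python
import math


def windowed_sinc(cutoff, num_taps):
    """Hann-windowed sinc low-pass taps; cutoff is in cycles per sample (0 to 0.5)."""
    mid = (num_taps - 1) / 2
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2 * cutoff if x == 0 else math.sin(2 * math.pi * cutoff * x) / (math.pi * x)
        hann = 0.5 - 0.5 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * hann)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity gain at DC


def fir_same(signal, taps):
    """Zero-padded, centered FIR filtering that preserves the input length."""
    half = len(taps) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, t in enumerate(taps):
            j = i + half - k
            if 0 <= j < len(signal):
                acc += t * signal[j]
        out.append(acc)
    return out


def anti_aliased_activation(signal, act, factor=2, num_taps=63):
    """Illustrative oversampled activation (hypothetical helper, not the paper's module)."""
    taps = windowed_sinc(0.5 / factor, num_taps)
    # Upsample: zero-stuff, scale by `factor` to keep amplitude, then low-pass.
    stuffed = []
    for s in signal:
        stuffed.append(s * factor)
        stuffed.extend([0.0] * (factor - 1))
    upsampled = fir_same(stuffed, taps)
    # Apply the nonlinearity at the higher rate, where its harmonics still fit.
    activated = [act(v) for v in upsampled]
    # Remove content above the original Nyquist frequency, then decimate.
    filtered = fir_same(activated, taps)
    return filtered[::factor]


def tone_magnitude(signal, freq):
    """Amplitude of a sinusoid at `freq` (cycles per sample) via a single-bin DFT."""
    re = sum(s * math.cos(2 * math.pi * freq * i) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i) for i, s in enumerate(signal))
    return 2 * math.hypot(re, im) / len(signal)


if __name__ == "__main__":
    cube = lambda v: v ** 3  # stand-in nonlinearity with a known third harmonic
    tone = [math.sin(2 * math.pi * 0.2 * n) for n in range(200)]
    # The third harmonic lands at 0.6 cycles/sample and folds back to 0.4.
    naive = [cube(v) for v in tone][50:150]
    smooth = anti_aliased_activation(tone, cube)[50:150]
    print("alias at 0.4, naive:       ", tone_magnitude(naive, 0.4))
    print("alias at 0.4, anti-aliased:", tone_magnitude(smooth, 0.4))
```

In the naive path, the folded harmonic appears at 0.4 cycles per sample with an amplitude of about 0.25. In the oversampled path it is removed by the low-pass filter before decimation, leaving only a small residual set by the filter's stopband. A differentiable version of the same idea would express the filters as fixed convolution kernels inside the network, which is the general direction the article describes. The specifics of Pupu-Vocoder and Pupu-Codec are in the released code.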

Experimental results show that Pupu-Vocoder and Pupu-Codec significantly outperform prior state-of-the-art models on singing voice, music, and general audio, while achieving comparable performance on speech. The approach could yield higher-quality synthetic audio for music production, voice synthesis, and audio codecs by reducing the metallic or blurry artifacts common in current neural audio systems. The researchers have released demos, code, and pretrained checkpoints to support further development.

Key Points

Integrates differentiable anti-aliasing into activation and upsampling layers to eliminate distortion
Outperforms existing models on singing voice, music, and general audio; matches speech quality
Accepted by TASLP; code, demos, and checkpoints are publicly available

Why It Matters

Delivers cleaner synthetic audio for music and voice with fewer aliasing artifacts, advancing neural vocoders and audio codecs
