P-SWA speeds neural video decoding 36% with parallel wavefronts

arXiv eess.IV May 21, 2026

⚡ Sliding window attention gets a parallel boost, cutting decoding latency and improving compression.

Deep Dive

Most neural video codecs rely on temporal conditioning, which lets errors propagate across long sequences. Transformer-based architectures such as the Video Compression Transformer (VCT) avoid this drift, but at high computational cost and with inferior rate-distortion (RD) performance. The recent Sliding Window Attention (SWA) method reduces complexity and improves RD performance, yet it forces strictly sequential raster-scan decoding, which creates a latency bottleneck. Researchers Alexander Kopte and André Kaup have now introduced P-SWA (Parallel Sliding Window Attention), which decodes along diagonal wavefronts to break that sequential dependency and enable parallel decoding. To make this work, the model embeds a hyperprior together with an accumulator that fuses side information with local spatial context.
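The difference between raster-scan and wavefront ordering can be sketched in a few lines of Python. This is only an illustration of the general wavefront idea, not the authors' implementation: the function names, the `lag` parameter, and the dependency rule (every block a given block depends on has a smaller value of `c + lag * r`, for example its left neighbor and the block above it when `lag=1`) are assumptions, since the article does not state P-SWA's exact dependency pattern.

```python
from collections import defaultdict


def wavefront_schedule(rows, cols, lag=1):
    """Group block positions (r, c) into wavefronts that could be decoded together.

    Assumption (not taken from the paper): every block that (r, c) depends on
    has a smaller value of t = c + lag * r. Blocks sharing the same t then do
    not depend on each other and can be processed in parallel.
    """
    waves = defaultdict(list)
    for r in range(rows):
        for c in range(cols):
            waves[c + lag * r].append((r, c))
    # Return the wavefronts in the order they become decodable.
    return [waves[t] for t in sorted(waves)]


def raster_schedule(rows, cols):
    """Strictly sequential order: one block per step, row by row."""
    return [[(r, c)] for r in range(rows) for c in range(cols)]


if __name__ == "__main__":
    rows, cols = 4, 6  # hypothetical grid of blocks
    waves = wavefront_schedule(rows, cols)
    print(len(raster_schedule(rows, cols)), "steps in raster order")
    print(len(waves), "steps with diagonal wavefronts")
    for step, wave in enumerate(waves):
        print(step, wave)
```

With `lag=1`, a grid of R rows and C columns needs R + C - 1 wavefront steps instead of R × C sequential ones, which is where the latency reduction of wavefront-style decoding comes from when enough parallel hardware is available.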

In experiments, P-SWA decodes 36% faster than the parallel VCT baseline while delivering Bjøntegaard Delta-rate (BD-rate) savings of up to 10.0% for I-frames and 7.1% for P-frames relative to the sequential SWA baseline. The paper has been accepted for ICIP 2026 and is available on arXiv. For professionals working on video streaming, real-time communications, or edge deployment, P-SWA represents a practical step toward fast, drift-free neural video decoding without sacrificing compression efficiency.
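To put the headline figures in concrete terms, the small sketch below converts them into decode time and bitrate. It assumes that a "36% speed increase" means 1.36 times the throughput, and it reads a BD-rate saving as the average bitrate reduction at equal quality. The baseline values (100 ms per frame, 1000 kbit/s) are made up purely for illustration and do not come from the paper.

```python
def time_after_speedup(baseline_time, speedup_percent):
    """Decode time after a throughput increase of speedup_percent."""
    return baseline_time / (1.0 + speedup_percent / 100.0)


def rate_after_bd_saving(baseline_rate, bd_saving_percent):
    """Average bitrate at equal quality after a BD-rate saving."""
    return baseline_rate * (1.0 - bd_saving_percent / 100.0)


if __name__ == "__main__":
    baseline_ms = 100.0     # hypothetical decode time per frame
    baseline_kbps = 1000.0  # hypothetical bitrate

    print(round(time_after_speedup(baseline_ms, 36.0), 1), "ms per frame")
    print(round(rate_after_bd_saving(baseline_kbps, 10.0), 1), "kbit/s for I-frames")
    print(round(rate_after_bd_saving(baseline_kbps, 7.1), 1), "kbit/s for P-frames")
```

Under that reading, 36% higher throughput corresponds to roughly 26% less time per frame, not 36% less, because time scales with the inverse of speed.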

Key Points

P-SWA uses diagonal wavefronts to enable parallel decoding of transformer-based neural video codecs.
Decoding speed increases by 36% over the parallel VCT baseline.
Achieves up to 10.0% BD-rate savings for I-frames and 7.1% for P-frames vs. sequential SWA.

Why It Matters

Faster neural video decoding without quality loss means better real-time streaming and lower latency for edge AI applications.
