Viewport-based Neural 360° Image Compression
A neural pipeline replaces the single 2D projection of the sphere with viewport extraction, reducing distortion and oversampling.
A team of researchers has introduced a new method for compressing 360° images, a format central to VR and increasingly common on social media. The conventional approach projects the spherical image onto a single 2D plane, which causes two problems: oversampling (the flat image stores more samples than the sphere requires, so bits are spent on redundant pixels) and geometric distortion. The pipeline, detailed in the arXiv paper 'Viewport-based Neural 360° Image Compression', takes a different route: instead of compressing one distorted projection, it extracts multiple 2D viewports, the rectangular windows a user actually sees, and compresses those.
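To make the idea of viewport extraction concrete, here is a minimal sketch of sampling one perspective viewport from a panorama. It assumes the source image is stored in equirectangular layout and uses a gnomonic (tangent-plane) projection with nearest-neighbour lookup; the function name `extract_viewport`, its parameters, and the 90° field of view are illustrative choices, not details taken from the paper.

```python
import numpy as np

def extract_viewport(erp, yaw, pitch, fov_deg=90.0, size=64):
    """Sample a size x size perspective viewport from an equirectangular image.

    erp:   H x W (optionally x C) array in equirectangular layout (assumed).
    yaw:   viewing longitude in radians (0 = image centre).
    pitch: viewing latitude in radians (positive = up).
    """
    h, w = erp.shape[:2]

    # Pixel centres on a tangent plane at unit distance from the sphere centre.
    half = np.tan(np.radians(fov_deg) / 2.0)
    coords = (np.arange(size) + 0.5) / size * 2.0 - 1.0   # values in (-1, 1)
    x, y = np.meshgrid(coords * half, -coords * half)     # image rows go down, y goes up
    dirs = np.stack([x, y, np.ones_like(x)], axis=-1)
    dirs /= np.linalg.norm(dirs, axis=-1, keepdims=True)  # unit viewing rays

    # Rotate the rays: pitch about the x-axis first, then yaw about the y-axis.
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    rot_x = np.array([[1, 0, 0], [0, cp, sp], [0, -sp, cp]])
    rot_y = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    dirs = dirs @ (rot_y @ rot_x).T

    # Convert rays to spherical coordinates, then to panorama pixel indices.
    lon = np.arctan2(dirs[..., 0], dirs[..., 2])           # [-pi, pi]
    lat = np.arcsin(np.clip(dirs[..., 1], -1.0, 1.0))      # [-pi/2, pi/2]
    col = ((lon / (2 * np.pi) + 0.5) * w).astype(int) % w
    row = np.clip(((0.5 - lat / np.pi) * h).astype(int), 0, h - 1)
    return erp[row, col]

# Usage: a synthetic panorama whose pixel value equals its column index.
panorama = np.tile(np.arange(360), (180, 1))
front = extract_viewport(panorama, yaw=0.0, pitch=0.0)
side = extract_viewport(panorama, yaw=np.pi / 2, pitch=0.0)
```

Calling the function over a set of viewing directions yields the collection of flat, low-distortion images that a 2D codec can then compress. How many viewports to extract and where to centre them is a design decision this sketch leaves open.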
Splitting the sphere into viewports creates a new challenge, however: each viewport is coded in isolation, so global information about the original sphere is lost. The team's key contribution is a neural viewport codec built around a transformer-based ViewPort ConText (VPCT) module. This module captures a 'global prior' and shares it across all the viewports, so that each individual view can be compressed more efficiently. When integrated with standard learning-based 2D image codecs, the authors report that the pipeline outperforms existing 360° image compression models, with an average bitrate saving of 14.01% and no loss in visual quality. Within the same pipeline, the VPCT-based codec also outperforms other 2D codecs, which supports the value of the context module itself.
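As a rough illustration of how a transformer can share information across viewports, the sketch below applies a single head of generic scaled dot-product self-attention to one feature vector per viewport. This is not the VPCT architecture from the paper: the function `cross_viewport_context`, the token layout, the random weights, and the dimensions are all assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_viewport_context(tokens, w_q, w_k, w_v):
    """Mix information across viewports with one self-attention head.

    tokens: (num_viewports, dim) array, one feature vector per viewport
            (a hypothetical layout, not the paper's).
    Returns the context vectors and the attention weights.
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # similarity between every viewport pair
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights               # each output blends all viewports

# Usage with random stand-in features and projection matrices.
rng = np.random.default_rng(0)
num_viewports, dim = 6, 16
tokens = rng.standard_normal((num_viewports, dim))
w_q, w_k, w_v = (rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3))
context, weights = cross_viewport_context(tokens, w_q, w_k, w_v)
```

Each row of `context` depends on every viewport's features, which is the sense in which a context vector can act as a shared prior when coding a single view.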
- Replaces the single 2D projection of the sphere with a viewport-extraction pipeline, reducing distortion and oversampling.
- Uses a transformer-based VPCT module to share global information across viewports; the full pipeline saves an average of 14.01% in bitrate.
- Outperforms existing 360° compression methods, and the VPCT-based codec beats standard 2D codecs when they are used within the same pipeline. Code is publicly available.
Why It Matters
Lower bitrates at the same visual quality reduce bandwidth and storage costs for VR experiences, 360° video streaming, and social media platforms hosting immersive content.