NeuralLVC: Neural Lossless Video Compression via Masked Diffusion with Temporal Conditioning
A new neural codec compresses video losslessly and, according to its authors, outperforms the lossless modes of H.264 and H.265 by a significant margin.
Researchers Tiberio Uricchio and Marco Bertini have introduced NeuralLVC, a neural codec for lossless video compression. The system applies masked diffusion models to a largely unexplored problem: compressing video with no loss of quality. Its core design is a hybrid I/P-frame architecture that mirrors the structure of traditional video codecs but implements both frame types with neural networks. The I-frame model compresses individual frames using a bijective linear tokenization, meaning the mapping from pixel values to tokens can be inverted exactly, so no information is lost before coding. The P-frame model handles temporal redundancy: it compresses only the differences between consecutive frames.
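The two ideas in this paragraph, an invertible tokenization and lossless frame differencing, can be illustrated with a minimal sketch in plain Python. This is not the authors' code. The function names are hypothetical, the pair-packing rule is only one simple example of a bijective linear map, and the modulo-256 residual is an assumed way to take differences between 8-bit frames without losing information.

```python
def pack_pair(a: int, b: int) -> int:
    """Map two 8-bit samples to one token in [0, 65536) with a linear rule."""
    if not (0 <= a < 256 and 0 <= b < 256):
        raise ValueError("samples must be 8-bit")
    return 256 * a + b


def unpack_pair(token: int) -> tuple[int, int]:
    """Exact inverse of pack_pair."""
    return divmod(token, 256)


def encode_residual(prev: bytes, cur: bytes) -> bytes:
    """Difference between two frames, wrapped modulo 256 so it stays 8-bit."""
    if len(prev) != len(cur):
        raise ValueError("frames must have the same size")
    return bytes((c - p) % 256 for p, c in zip(prev, cur))


def decode_residual(prev: bytes, residual: bytes) -> bytes:
    """Rebuild the current frame from the previous frame and the residual."""
    return bytes((p + r) % 256 for p, r in zip(prev, residual))


# Toy "frames" of four samples each (illustrative values only).
prev_frame = bytes([10, 200, 255, 0])
cur_frame = bytes([12, 198, 0, 255])

residual = encode_residual(prev_frame, cur_frame)
assert decode_residual(prev_frame, residual) == cur_frame
assert unpack_pair(pack_pair(12, 198)) == (12, 198)
```

Because both maps are exact inverses of each other on integers, any loss-free entropy coder placed after them still yields bit-exact frames.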
This P-frame model is conditioned on the previously decoded frame through a lightweight 'reference embedding' mechanism, which the authors note adds only 1.3% to the model's trainable parameters. Group-wise decoding lets users trade decoding speed against compression ratio. The codec is strictly lossless: it reconstructs YUV420 video planes or RGB image channels exactly as they were input.
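The speed-versus-ratio trade-off of group-wise decoding can be made concrete with a toy schedule. The sketch below assumes that tokens are split into groups, that each group is decoded in a single model pass, and that a token can only be conditioned on groups decoded before it. The interleaved partition and the names `group_schedule` and `mean_context` are illustrative assumptions, not details taken from the paper.

```python
def group_schedule(num_tokens: int, num_groups: int) -> list[list[int]]:
    """Split token positions into interleaved groups, one model pass per group."""
    if not 1 <= num_groups <= num_tokens:
        raise ValueError("need 1 <= num_groups <= num_tokens")
    return [list(range(g, num_tokens, num_groups)) for g in range(num_groups)]


def mean_context(num_tokens: int, num_groups: int) -> float:
    """Average number of already-decoded tokens visible when a token is predicted."""
    decoded = 0
    total = 0
    for group in group_schedule(num_tokens, num_groups):
        total += decoded * len(group)  # every token in the group sees the same context
        decoded += len(group)
    return total / num_tokens


for groups in (1, 4, 16):
    print(groups, "passes, mean context:", mean_context(16, groups))
# 1 passes, mean context: 0.0
# 4 passes, mean context: 6.0
# 16 passes, mean context: 7.5
```

Under these assumptions, more groups mean more sequential passes (slower decoding) but more context per prediction, which is what typically yields sharper probabilities and fewer bits.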
In experiments, NeuralLVC was tested on nine standard Xiph CIF video sequences, where, in the authors' words, it 'outperforms H.264 and H.265 lossless by a significant margin.' The team verified exact, bit-for-bit reconstruction through end-to-end encode-decode testing with arithmetic coding. The work positions masked diffusion with temporal conditioning as a promising direction for learned video compression in applications where data integrity is non-negotiable.
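An end-to-end encode-decode check of the kind described here can be sketched with a small arithmetic coder. The version below is a teaching sketch, not the NeuralLVC pipeline: it uses exact rational arithmetic instead of a practical range coder, and a static frequency table stands in for the probabilities that a neural model would supply. All function names are hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import ceil


def build_model(data: bytes) -> dict[int, tuple[Fraction, Fraction]]:
    """Static model: each symbol owns a sub-interval of [0, 1) sized by its frequency."""
    counts = Counter(data)
    total = len(data)
    model, start = {}, 0
    for sym in sorted(counts):
        model[sym] = (Fraction(start, total), Fraction(start + counts[sym], total))
        start += counts[sym]
    return model


def encode(data: bytes, model: dict[int, tuple[Fraction, Fraction]]) -> tuple[int, int]:
    """Return (code, nbits): the bitstream is `code` written with `nbits` binary digits."""
    low, width = Fraction(0), Fraction(1)
    for sym in data:
        lo, hi = model[sym]
        low += width * lo      # narrow the interval to the symbol's slice
        width *= hi - lo
    nbits = 0
    while Fraction(1, 2 ** nbits) > width:  # smallest nbits with 2**-nbits <= width
        nbits += 1
    code = ceil(low * 2 ** nbits)           # a dyadic point inside [low, low + width)
    return code, nbits


def decode(code: int, nbits: int, model: dict[int, tuple[Fraction, Fraction]], length: int) -> bytes:
    """Invert encode(): walk the same intervals and emit one symbol per step."""
    value = Fraction(code, 2 ** nbits)
    out = bytearray()
    for _ in range(length):
        for sym, (lo, hi) in model.items():
            if lo <= value < hi:
                out.append(sym)
                value = (value - lo) / (hi - lo)  # rescale back to [0, 1)
                break
    return bytes(out)


payload = b"abracadabra"
model = build_model(payload)
code, nbits = encode(payload, model)
assert decode(code, nbits, model, len(payload)) == payload  # bit-exact round trip
assert nbits < 8 * len(payload)                              # shorter than raw bytes
```

The round-trip assertion is the important part: a lossless codec is verified by decoding the actual bitstream and comparing it with the input, not by estimating code length from model probabilities alone.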
- Uses a masked diffusion model with an I/P-frame architecture to exploit temporal redundancy; frame conditioning adds only 1.3% extra trainable parameters.
- Reconstructs video frames exactly (losslessly), verified through end-to-end encode-decode testing with arithmetic coding.
- Outperforms the lossless modes of H.264 and H.265 on nine Xiph CIF test sequences.
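For readers who want to put codec comparisons like these on a common scale, the sketch below computes the raw size of a YUV420 frame and a bits-per-pixel figure. It assumes 8-bit samples and the standard CIF resolution of 352x288. Bits per pixel is a common way to report lossless results, but the summary above does not say which metric the authors use, and the helper names are hypothetical.

```python
def raw_frame_bytes(width: int, height: int) -> int:
    """Bytes in one 8-bit YUV420 frame: full-size Y plus quarter-size U and V."""
    if width % 2 or height % 2:
        raise ValueError("YUV420 needs even width and height")
    luma = width * height
    chroma = (width // 2) * (height // 2)
    return luma + 2 * chroma


def bits_per_pixel(compressed_bytes: int, width: int, height: int, frames: int) -> float:
    """Average coded bits per luma-resolution pixel over a sequence."""
    return 8 * compressed_bytes / (width * height * frames)


CIF = (352, 288)
print(raw_frame_bytes(*CIF))                                   # 152064
print(bits_per_pixel(raw_frame_bytes(*CIF), *CIF, frames=1))   # 12.0
```

Uncompressed 8-bit YUV420 therefore costs 12 bits per pixel, so any lossless codec's figure can be read as a fraction of that baseline.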
Why It Matters
Could enable more efficient archival of sensitive video data in medicine, science, and film, where every pixel must be preserved.