Image & Video

MLICv2: Enhanced Multi-Reference Entropy Modeling for Learned Image Compression

New learned image compression model cuts bitrate by roughly 16-24% relative to VVC Intra at comparable quality.

Deep Dive

A research team from Peking University and Peng Cheng Laboratory has introduced MLICv2 and its enhanced variant MLICv2+, successors to the MLIC series of learned image compression (LIC) models. The models target known limitations of earlier learned codecs, in particular performance degradation at high bitrates and suboptimal entropy modeling. The paper, accepted to ACM TOMM, reports that both models substantially outperform Versatile Video Coding (VVC) Intra, the benchmark they are measured against, and set a new state of the art in compression efficiency.

The technical contributions include a lightweight token mixing block, inspired by the MetaFormer architecture, that increases transform capacity, and a hyperprior-guided global correlation prediction mechanism that supplies global context starting from the very first slice of data. Combined with a channel reweighting module and the use of Stochastic Gumbel Annealing (SGA) for instance-specific optimization, the models reduce bitrate by 16.54% to 24.35% relative to VVC Intra on standard datasets such as Kodak and CLIC Pro. The results suggest that learned codecs can consistently outperform traditional, hand-engineered standards, which could make image storage and transmission across the web and mobile networks more efficient.
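To give a feel for the Stochastic Gumbel Annealing step mentioned above, here is a minimal pure-Python sketch of the relaxed rounding commonly associated with that technique: each value is softly assigned to its floor or ceiling, and lowering the temperature pushes the result toward hard rounding. This is an illustration under assumptions, not the MLICv2+ implementation. The function name `sga_relaxed_round`, the clamp `eps`, and the use of a single temperature for both the logits and the softmax are choices made for this example only.

```python
import math
import random


def sga_relaxed_round(y, tau, rng):
    """Softly round y to floor(y) or floor(y) + 1 at temperature tau (> 0)."""
    lo = math.floor(y)
    hi = lo + 1
    eps = 1e-5
    # Distances to the two candidates, clamped so atanh stays finite.
    d_lo = min(max(y - lo, eps), 1.0 - eps)
    d_hi = min(max(hi - y, eps), 1.0 - eps)
    # The nearer candidate gets the larger (less negative) logit.
    logits = [-math.atanh(d_lo) / tau, -math.atanh(d_hi) / tau]
    # Perturb with Gumbel(0, 1) noise, then apply a tempered softmax.
    scores = []
    for logit in logits:
        u = max(rng.random(), 1e-12)  # avoid log(0)
        gumbel = -math.log(-math.log(u))
        scores.append((logit + gumbel) / tau)
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    w_lo, w_hi = weights[0] / total, weights[1] / total
    # Convex combination of the two integer candidates.
    return w_lo * lo + w_hi * hi


rng = random.Random(0)
for tau in (1.0, 0.3, 0.001):
    # High tau gives a soft mix; very low tau is effectively hard rounding.
    print(tau, sga_relaxed_round(2.3, tau, rng))
```

In an instance-specific optimization loop, a relaxation of this kind would let gradients flow through the quantization step while the temperature is annealed toward zero. The loop itself, the loss, and the annealing schedule are omitted here.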

Key Points
  • Achieves a 16-24% lower bitrate at equal quality (Bjøntegaard-Delta rate) than VVC Intra on the Kodak, Tecnick, and CLIC datasets.
  • Introduces a hyperprior-guided global correlation predictor to capture context from the initial data slice, fixing a key entropy modeling flaw.
  • Uses a lightweight token mixing block to counter performance degradation at high bitrates without heavy computation.
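
The Bjøntegaard-Delta rate (BD-rate) behind the headline figure is the average bitrate difference between two codecs at equal quality. The sketch below shows the classic cubic-fit variant for exactly four rate/PSNR points per curve, in pure Python. The helper names and the sample numbers are made up for illustration, and the paper's own evaluation may differ in its details (for example, the interpolation method or the quality metric).

```python
import math


def lagrange_poly(xs, ys):
    """Return the polynomial through the given points (a cubic for 4 points)."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p


def simpson(f, a, b):
    """Simpson's rule on [a, b]; exact for polynomials up to degree 3."""
    return (b - a) / 6.0 * (f(a) + 4.0 * f((a + b) / 2.0) + f(b))


def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
    """Average bitrate change (%) of the test codec vs the anchor at equal PSNR."""
    # Model log-rate as a cubic function of quality for each codec.
    log_anchor = lagrange_poly(psnr_anchor, [math.log(r) for r in rate_anchor])
    log_test = lagrange_poly(psnr_test, [math.log(r) for r in rate_test])
    # Compare only over the quality range both curves cover.
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    avg_diff = (simpson(log_test, lo, hi) - simpson(log_anchor, lo, hi)) / (hi - lo)
    return (math.exp(avg_diff) - 1.0) * 100.0


# Synthetic data: the test codec needs 20% fewer bits at every quality level.
anchor_bpp = [0.2, 0.4, 0.8, 1.6]
psnr_db = [30.0, 33.0, 36.0, 39.0]
test_bpp = [0.8 * r for r in anchor_bpp]
print(round(bd_rate(anchor_bpp, psnr_db, test_bpp, psnr_db), 2))  # -20.0
```

A negative BD-rate means the test codec needs fewer bits than the anchor, so the reported 16-24% gains correspond to values of about -16 to -24 on this scale.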

Why It Matters

Enables significantly smaller image files at the same quality, reducing bandwidth costs and storage needs for websites and apps.