Research & Papers

EXaCTz: Guaranteed Extremum Graph and Contour Tree Preservation for Distributed- and GPU-Parallel Lossy Compression

New parallel algorithm processes 512GB datasets in under 48 seconds while preserving critical topological features.

Deep Dive

A research team from the University of Utah and other institutions has developed EXaCTz, a breakthrough algorithm that solves a critical bottleneck in scientific data compression. While error-bounded lossy compression is essential for managing petabytes of data from simulations in climate science, astrophysics, and engineering, existing methods that preserve topological features like contour trees and extremum graphs were painfully slow—operating at MB/s speeds while compression ran at GB/s. EXaCTz introduces a novel bounded-iteration approach that enforces topological consistency by making targeted edits to decompressed data, theoretically guaranteeing convergence and preserving critical-point classification and connectivity.

EXaCTz's performance gains are staggering. On a single GPU, it achieves 4.52 GB/s throughput, outperforming the previous state-of-the-art contour-tree-preserving method by 213x on comparable hardware and by 3,285x when comparing GPU to CPU implementations. The algorithm demonstrates exceptional scalability in distributed environments, maintaining 55.6% efficiency when scaling to 128 GPUs compared to just 6.4% for naive parallelization. This enables processing of massive 512GB datasets in under 48 seconds with aggregate throughput reaching 32.69 GB/s. The researchers have provided theoretical convergence bounds based on vulnerability graph analysis, addressing a key limitation of previous approaches.

The technology represents a fundamental shift from explicit topology reconstruction to consistency enforcement through min/max neighbor preservation and global ordering of critical points. By preserving merge/split events in scalar field data—essential for understanding phenomena like fluid dynamics, magnetic fields, and temperature gradients—EXaCTz enables scientists to compress data by orders of magnitude without sacrificing the structural information needed for accurate analysis. This breakthrough bridges the throughput gap that has long hindered practical adoption of topology-preserving compression in large-scale scientific workflows.

Key Points
  • Achieves 4.52 GB/s single-GPU throughput, 3,285x faster than previous state-of-the-art methods
  • Scales to 128 GPUs with 55.6% efficiency, processing 512GB datasets in under 48 seconds
  • Provides theoretical convergence guarantees for preserving extremum graphs and contour trees in compressed data

Why It Matters

Enables scientists to compress massive simulation datasets by orders of magnitude without losing critical structural features needed for analysis.