Spherical Flows for Sampling Categorical Data
New generative model learns discrete sequences by flowing on a sphere and outperforms geodesic and Euclidean alternatives on Sudoku and language tasks.
A new paper on arXiv (2605.05629) from researchers at TU Berlin proposes Spherical Flows for sampling categorical data, a new approach to generative modeling of discrete sequences. Unlike prior methods that operate in Euclidean space or on the probability simplex, Spherical Flows embed sequences on the hypersphere S^{d-1}. The authors use the von Mises-Fisher (vMF) distribution, which is defined directly on the sphere and has a closed-form conditional score. Because the vMF density is radially symmetric about its mean direction, the continuity equation on the sphere reduces to a scalar ODE in cosine similarity, whose unique bounded solution determines the velocity field. This reduction makes both ODE-based sampling and predictor-corrector (PC) sampling tractable.
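To make the closed-form score concrete, here is a minimal NumPy sketch. It relies only on the standard vMF density, p(x | mu, kappa) proportional to exp(kappa * mu·x), whose gradient along the sphere is kappa times the tangent projection of mu at x. The function names (`tangent_project`, `vmf_score`) and the values of `d` and `kappa` are illustrative choices, not taken from the paper.

```python
import numpy as np

def tangent_project(x, v):
    """Project v onto the tangent space of the unit sphere at the unit vector x."""
    return v - np.dot(x, v) * x

def vmf_score(x, mu, kappa):
    """Riemannian gradient of log vMF(x | mu, kappa) with respect to x.

    The log-density is kappa * <mu, x> plus a constant, so it depends on x
    only through the cosine similarity <mu, x>.
    """
    return kappa * tangent_project(x, mu)

# Illustrative values (assumptions, not from the paper).
rng = np.random.default_rng(0)
d, kappa = 8, 5.0
x = rng.normal(size=d)
x /= np.linalg.norm(x)
mu = rng.normal(size=d)
mu /= np.linalg.norm(mu)

s = vmf_score(x, mu, kappa)
cos_sim = float(np.dot(x, mu))
```

The score is tangent to the sphere at `x`, and its length is `kappa * sqrt(1 - cos_sim**2)`, a function of cosine similarity alone. That is the radial symmetry the scalar-ODE reduction builds on.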
The marginal velocity and marginal score on the product space (S^{d-1})^L decompose into posterior-weighted tangent sums that differ only by per-token scalar weights. The posterior is the only learned component, and it is trained with a standard cross-entropy loss. Experiments compare the vMF-based path against geodesic and Euclidean alternatives on discrete sequence tasks. The authors report that the vMF+PC combination significantly outperforms baselines on Sudoku puzzle generation and language modeling benchmarks, which suggests that spherical geometry can better capture the structure of categorical data. The work points toward generative models for discrete domains that do not require autoregressive decoding.
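The decomposition described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the class embeddings `E`, the `(L, K)` shape chosen for `weights`, and the constant `kappa` used as the score weight are all hypothetical. In particular, the velocity weights come from the paper's scalar ODE and are not reproduced here, so the caller is expected to supply them.

```python
import numpy as np

def posterior_weighted_tangent_sum(x, E, post, weights):
    """Sum over classes of post * weights * (tangent projection of e_k at x_l).

    x:       (L, d) unit vectors, one per sequence position
    E:       (K, d) unit class embeddings (hypothetical)
    post:    (L, K) posterior over classes at each position, rows sum to 1
    weights: (L, K) scalar weights; swapping these is the only difference
             between a velocity and a score in this sketch
    """
    ambient = (post * weights) @ E                       # (L, d) weighted sum
    radial = np.sum(ambient * x, axis=1, keepdims=True) * x
    return ambient - radial                              # projection is linear

def cross_entropy(logits, targets):
    """Mean negative log-likelihood of integer targets under softmax(logits)."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

def unit(a):
    return a / np.linalg.norm(a, axis=-1, keepdims=True)

# Illustrative sizes and values (assumptions).
rng = np.random.default_rng(0)
L, K, d, kappa = 4, 5, 6, 10.0
x, E = unit(rng.normal(size=(L, d))), unit(rng.normal(size=(K, d)))
logits = rng.normal(size=(L, K))          # stand-in for a network's output
post = np.exp(logits - logits.max(axis=1, keepdims=True))
post /= post.sum(axis=1, keepdims=True)

# With a vMF conditional, each class contributes kappa * P_x(e_k) to the score.
score = posterior_weighted_tangent_sum(x, E, post, np.full((L, K), kappa))
loss = cross_entropy(logits, np.array([0, 1, 2, 3]))
```

Because tangent projection is linear, the weighted sum can be formed in ambient space and projected once per position, which keeps the cost at one matrix product per sequence.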
- Spherical Flows embed categorical sequences on S^{d-1} using the von Mises-Fisher (vMF) distribution, enabling closed-form conditional score computation.
- The method reduces the continuity equation on the sphere to a scalar ODE in cosine similarity, making both ODE and predictor-corrector (PC) sampling feasible.
- On Sudoku and language modeling benchmarks, vMF+PC significantly outperforms geodesic and Euclidean alternatives, as shown in controlled experiments.
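For readers unfamiliar with predictor-corrector sampling on a sphere, the sketch below shows the generic pattern: an Euler step along a tangent velocity followed by renormalization (predictor), then a Langevin-style step along the score with tangent noise (corrector). This is a textbook-style scheme rather than the paper's specific sampler. The step sizes, the toy target `mu`, and the use of `mu` itself as a stand-in velocity are assumptions made for illustration.

```python
import numpy as np

def retract(x):
    """Map a nonzero vector back onto the unit sphere."""
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def project(x, v):
    """Tangent projection of v at the unit vector x."""
    return v - np.sum(x * v, axis=-1, keepdims=True) * x

def predictor_step(x, velocity, dt):
    """One Euler step along the tangent velocity, then renormalize."""
    return retract(x + dt * project(x, velocity))

def corrector_step(x, score, step, rng):
    """One Langevin-style step: drift along the score plus tangent Gaussian noise."""
    noise = project(x, rng.normal(size=x.shape))
    return retract(x + step * project(x, score) + np.sqrt(2.0 * step) * noise)

# Toy setup (assumed): a single vMF target with mean direction mu.
rng = np.random.default_rng(0)
d, kappa = 8, 10.0
mu = retract(rng.normal(size=d))
x0 = retract(rng.normal(size=d))

x1 = predictor_step(x0, mu, dt=0.1)                    # move toward mu
x2 = corrector_step(x1, kappa * mu, step=1e-3, rng=rng)  # refine with the score
```

Both steps return unit vectors, and the predictor step raises the cosine similarity with `mu` because the projected velocity points along the geodesic toward it.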
Why It Matters
Spherical Flows provide a principled, efficient new way to generate discrete sequences without autoregressive decoding, improving performance on structured prediction tasks.