Sinkhorn Based Associative Memory Retrieval Using Spherical Hellinger Kantorovich Dynamics
A new AI memory model uses optimal transport theory to store and retrieve complex data like point clouds with exponential capacity.
Researchers Aratrika Mustafi and Soumya Mukherjee have introduced a novel framework for associative memory, a core AI concept for storing and retrieving patterns. Their model, detailed in the paper "Sinkhorn Based Associative Memory Retrieval Using Spherical Hellinger Kantorovich Dynamics," is specifically designed for complex, modern data types like empirical measures—essentially weighted point clouds or distributions. Unlike classic models that store simple vectors, this approach treats both stored memories and user queries as probability distributions. Retrieval works by minimizing a specialized energy function based on the debiased Sinkhorn divergence, a divergence from optimal transport theory that measures the "cost" of moving one distribution onto another while correcting the bias introduced by entropic regularization.
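To make the retrieval energy concrete, here is a minimal NumPy sketch of the standard debiased Sinkhorn divergence between two weighted point clouds, S_eps(a, b) = OT_eps(a, b) − ½ OT_eps(a, a) − ½ OT_eps(b, b). This is the generic textbook construction, not the paper's code; function names, the squared-Euclidean ground cost, and the iteration count are illustrative choices.

```python
import numpy as np

def logsumexp(M, axis):
    """Numerically stable log-sum-exp along one axis."""
    m = M.max(axis=axis, keepdims=True)
    return (m + np.log(np.exp(M - m).sum(axis=axis, keepdims=True))).squeeze(axis=axis)

def sinkhorn_cost(x, a, y, b, eps=0.1, n_iter=200):
    """Entropic OT cost between weighted point clouds (x, a) and (y, b).

    Runs log-domain Sinkhorn fixed-point iterations on the dual potentials,
    then reports the primal transport cost <P, C>.
    """
    # Squared-Euclidean ground cost matrix C[i, j] = ||x_i - y_j||^2.
    C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    f = np.zeros(len(a))          # dual potential on the source cloud
    g = np.zeros(len(b))          # dual potential on the target cloud
    log_a, log_b = np.log(a), np.log(b)
    for _ in range(n_iter):
        f = -eps * logsumexp((g[None, :] - C) / eps + log_b[None, :], axis=1)
        g = -eps * logsumexp((f[:, None] - C) / eps + log_a[:, None], axis=0)
    # Entropic transport plan implied by the converged potentials.
    P = np.exp((f[:, None] + g[None, :] - C) / eps + log_a[:, None] + log_b[None, :])
    return (P * C).sum()

def sinkhorn_divergence(x, a, y, b, eps=0.1):
    """Debiased Sinkhorn divergence: subtracts the two self-transport terms,
    so S_eps(a, a) = 0 and the quantity behaves like a distance."""
    return (sinkhorn_cost(x, a, y, b, eps)
            - 0.5 * sinkhorn_cost(x, a, x, a, eps)
            - 0.5 * sinkhorn_cost(y, b, y, b, eps))
```

The debiasing terms are what let the divergence vanish on identical clouds, which is essential if its minima are to sit exactly at the stored memories.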
The retrieval process is formulated as a continuous gradient flow, called a spherical Hellinger-Kantorovich (SHK) flow. This mathematical framework allows the model to dynamically update not just the positions of data points in the cloud, but also their relative importance or weights. The authors prove that under certain conditions, this flow converges at a geometric (exponentially fast) rate to the correct memory and that the basins of attraction of different memories are disjoint. Crucially, their analysis shows the model's capacity—the number of patterns it can store—grows exponentially with the dimensionality of the data, a significant leap over many traditional models.
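The key structural idea—transporting points while reweighting mass—can be illustrated with a toy discretized step. The quadratic energy below is a deliberately simple stand-in for the paper's Sinkhorn-divergence energy, and the update rule is a generic "transport plus multiplicative reaction plus renormalization" scheme, not the authors' exact SHK discretization; all names and step sizes are illustrative.

```python
import numpy as np

def shk_step(x, w, memory, tau=0.05):
    """One explicit Euler step of a toy HK-style flow toward a stored cloud.

    Toy energy: E(x, w) = sum_i w_i * ||x_i - memory_i||^2.
    - Transport part: each particle moves down the positional gradient.
    - Reaction part: weights are rescaled multiplicatively (Hellinger-style),
      penalizing particles that sit far from the memory.
    - Spherical normalization: weights are projected back to total mass 1,
      so (x, w) always remains a probability measure.
    """
    residual = x - memory
    # Transport: gradient step on particle positions.
    x_new = x - tau * 2.0 * residual
    # Reaction: multiplicative weight update driven by per-particle energy.
    per_particle_energy = (residual ** 2).sum(axis=1)
    w_new = w * np.exp(-tau * per_particle_energy)
    # Normalize back onto the probability simplex.
    w_new /= w_new.sum()
    return x_new, w_new
```

Iterating this step from a corrupted query contracts the positions toward the memory geometrically (each step shrinks the residual by a constant factor, 1 − 2·tau here), which mirrors the flavor of the convergence guarantee, while the mass constraint is preserved at every step.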
In practical experiments using synthetic memories of Gaussian point clouds, the Sinkhorn-based model demonstrated robust recovery from corrupted or noisy queries. It significantly outperformed a standard Euclidean-based Hopfield network baseline, showcasing its potential for handling real-world, noisy, and high-dimensional data where relationships between points are as important as the points themselves. This work bridges advanced mathematical theory in optimal transport with practical machine learning architectures for memory.
- Designed for complex data like weighted point clouds, treating queries and memories as probability measures.
- Uses the debiased Sinkhorn divergence and an SHK gradient flow, enabling updates to both point locations and their weights.
- Proven to have exponential storage capacity and outperforms Euclidean baselines in recovering noisy data.
Why It Matters
Provides a robust, high-capacity framework for AI systems to remember and reason with complex, real-world data structures like 3D scenes or molecular sets.