
MEDAL framework gives t-SNE and UMAP out-of-sample mapping for rigorous validation

arXiv stat.ML May 26, 2026

⚡ New autoencoder distillation method lets you validate manifold embeddings like supervised models

Deep Dive

Low-dimensional embeddings from methods such as t-SNE and UMAP are widely used to visualize high-dimensional data, but they are hard to validate formally: they offer no way to map new samples into the embedding space (out-of-sample mapping) and no way to invert the embedding back to the original features. A team led by Irene Chang at Rice University introduces MEDAL (Manifold Embedding Distillation via Autoencoder Learning) to address both gaps. The approach trains a constrained autoencoder whose bottleneck layer is forced to reproduce a fixed teacher embedding exactly. The decoder then reconstructs the original input from the bottleneck, so the encoder provides an explicit mapping for new data and the decoder provides an approximate inverse. The reconstruction error serves as a pointwise distortion measure, which makes the embedding amenable to held-out validation.
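To make the distillation idea concrete, here is a minimal sketch and not the authors' implementation. It makes several simplifying assumptions: linear least-squares maps stand in for the neural encoder and decoder, a PCA projection stands in for the t-SNE or UMAP teacher embedding, and the data are synthetic. The helper names (`fit_linear_map`, `apply_linear_map`) are hypothetical.

```python
import numpy as np

def fit_linear_map(A, B, ridge=1e-6):
    """Ridge least-squares weights W (bias included) so that [A, 1] @ W ~= B."""
    A1 = np.hstack([A, np.ones((A.shape[0], 1))])
    gram = A1.T @ A1 + ridge * np.eye(A1.shape[1])
    return np.linalg.solve(gram, A1.T @ B)

def apply_linear_map(W, A):
    """Apply a map fitted by fit_linear_map to new rows of A."""
    A1 = np.hstack([A, np.ones((A.shape[0], 1))])
    return A1 @ W

rng = np.random.default_rng(0)

# Synthetic data (assumption): 200 points in 10-D lying near a 2-D plane.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 10))

# Stand-in teacher embedding (assumption): top-2 PCA coordinates.
# In the setting described above this would be a fixed t-SNE or UMAP embedding.
X_centered = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
Z_teacher = X_centered @ Vt[:2].T

# Encoder: trained so the bottleneck reproduces the teacher embedding.
W_enc = fit_linear_map(X, Z_teacher)
# Decoder: trained to reconstruct the input from the bottleneck coordinates.
W_dec = fit_linear_map(Z_teacher, X)

Z_hat = apply_linear_map(W_enc, X)      # explicit mapping into the embedding
X_hat = apply_linear_map(W_dec, Z_hat)  # approximate inverse

# Pointwise distortion: one reconstruction error per sample.
distortion = np.linalg.norm(X - X_hat, axis=1)
bottleneck_gap = np.abs(Z_hat - Z_teacher).max()
```

A linear encoder can match a PCA teacher almost exactly, which is why `bottleneck_gap` is close to zero here. A nonlinear teacher such as t-SNE would need a more expressive encoder, which is the role the autoencoder plays in MEDAL.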

In practice, MEDAL turns static embeddings into testable models. Users can split data into training and test sets, fit t-SNE or UMAP on the training set, then distill that embedding with MEDAL. The resulting encoder-decoder can be evaluated on held-out data, enabling quantitative comparisons between different methods and hyperparameter choices. The authors demonstrate MEDAL across multiple benchmarks and scientific case studies. Notably, it reveals biologically coherent regions that are difficult to preserve in 2D, and it detects distribution shift when new samples are mapped into a fixed reference manifold. MEDAL acts as a general validation wrapper for any existing dimension reduction technique, promising to improve reproducibility and rigor in exploratory data analysis.
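The held-out workflow described above can be sketched as follows. This is an illustration under stated assumptions rather than the MEDAL code: linear least-squares maps replace the autoencoder, PCA with `k` components replaces the teacher embedding (with `k` playing the role of a hyperparameter to compare), the data are synthetic, and the 95th-percentile shift threshold is an arbitrary choice made for the example. All function names are hypothetical.

```python
import numpy as np

def add_bias(A):
    """Append a constant column so the linear maps include an intercept."""
    return np.hstack([A, np.ones((A.shape[0], 1))])

def fit_map(A, B, ridge=1e-6):
    """Ridge least-squares weights W so that add_bias(A) @ W ~= B."""
    A1 = add_bias(A)
    return np.linalg.solve(A1.T @ A1 + ridge * np.eye(A1.shape[1]), A1.T @ B)

def pca_teacher(X, k):
    """Stand-in teacher (assumption): top-k PCA coordinates of the training set."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def distill(X_train, Z_teacher):
    """Fit an encoder onto the fixed embedding and a decoder back to the inputs."""
    return fit_map(X_train, Z_teacher), fit_map(Z_teacher, X_train)

def pointwise_error(model, X):
    """Reconstruction error of each row of X after encoding and decoding."""
    W_enc, W_dec = model
    Z = add_bias(X) @ W_enc
    X_hat = add_bias(Z) @ W_dec
    return np.linalg.norm(X - X_hat, axis=1)

rng = np.random.default_rng(0)
mixing = rng.normal(size=(2, 10))

def sample(n):
    """Synthetic data (assumption): points near a 2-D plane in 10-D."""
    return rng.normal(size=(n, 2)) @ mixing + 0.05 * rng.normal(size=(n, 10))

# 1. Split into training and held-out sets.
X_train, X_test = sample(300), sample(100)

# 2. Fit the teacher on the training set only, distill it, score on held-out data.
held_out = {}
for k in (1, 2):
    model = distill(X_train, pca_teacher(X_train, k))
    held_out[k] = pointwise_error(model, X_test).mean()

# 3. Shift check: flag samples whose error exceeds a training-set quantile.
model = distill(X_train, pca_teacher(X_train, 2))
threshold = np.quantile(pointwise_error(model, X_train), 0.95)
X_shifted = sample(100) + rng.normal(size=(100, 10))  # pushed off the plane
flag_rate_test = (pointwise_error(model, X_test) > threshold).mean()
flag_rate_shifted = (pointwise_error(model, X_shifted) > threshold).mean()
```

Comparing `held_out[1]` with `held_out[2]` is the kind of quantitative comparison the article refers to: the setting with the lower held-out error preserves more of the input. The gap between `flag_rate_shifted` and `flag_rate_test` illustrates how reconstruction error against a fixed reference can signal distribution shift.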

Key Points

MEDAL provides out-of-sample mapping and approximate inversion for t-SNE, UMAP, and other manifold embeddings
Enables held-out validation, allowing practitioners to quantitatively compare methods and tune hyperparameters
Detects distribution shift in reference manifolds and reveals biologically coherent structures not visible in 2D

Why It Matters

Makes non-linear dimension reduction scientifically rigorous with reusable, testable models for downstream validation
