TabPFN-3 scales to 1M rows with 10x faster inference

r/MachineLearning May 12, 2026

⚡ Handles 1M rows on a single GPU with zero training – 10x more than TabPFN-2.5

Deep Dive

TabPFN-3 was released today. It builds on the TabPFN line, which began with a Nature publication and now has over 3 million downloads. The model requires no training, hyperparameter search, or tuning – it predicts on tabular data in a single forward pass. The new version scales to 1 million rows on a single H100 GPU, a 10x increase over the 100K rows supported by TabPFN-2.5. Two changes make this possible: a reduced KV cache (~8 GB per million rows per estimator) and row-chunked inference. Together they make large-scale tabular inference practical on one GPU.
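To make the memory figure concrete, here is a rough back-of-the-envelope sketch in Python. It assumes the quoted ~8 GB per million rows per estimator scales linearly with both row count and estimator count, which the announcement does not state. `estimate_kv_cache_gb` and `predict_in_row_chunks` are hypothetical helpers written for illustration, not part of the TabPFN API. The second helper shows the general idea of row-chunked inference: test rows are scored in fixed-size batches, so peak memory depends on the chunk size rather than on the total number of test rows.

```python
from typing import Callable, List, Sequence

# Approximate figure quoted in the article; treated here as a constant.
GB_PER_MILLION_ROWS_PER_ESTIMATOR = 8.0


def estimate_kv_cache_gb(n_rows: int, n_estimators: int = 1) -> float:
    """Rough KV-cache size in GB.

    ASSUMPTION: the cache grows linearly with rows and with estimators.
    """
    return GB_PER_MILLION_ROWS_PER_ESTIMATOR * (n_rows / 1_000_000) * n_estimators


def predict_in_row_chunks(
    predict_fn: Callable[[Sequence], Sequence],
    rows: Sequence,
    chunk_size: int,
) -> List:
    """Apply predict_fn to consecutive slices of rows and concatenate the results.

    predict_fn is any callable that maps a batch of rows to one output per row.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    outputs: List = []
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        outputs.extend(predict_fn(chunk))
    return outputs
```

Under these assumptions, one estimator over 1 million rows needs about 8 GB of cache and eight estimators about 64 GB, before counting model weights and activations.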

Inference is 10x to 1000x faster than in previous versions, and SHAP computation is 120x faster thanks to KV caching. A new 'thinking mode' (API only) spends extra compute at inference time through a one-time fitting step, further boosting accuracy. On the TabArena benchmark, TabPFN-3 beats every non-TabPFN method by over 200 Elo, including 4-hour-tuned AutoGluon 1.5 extreme. On larger datasets, the gap more than doubles to 420 Elo. It also achieves a 93% win rate over classical ML models, natively supports up to 160 classes and calibrated quantile regression, and shows gains on adjacent tasks such as time series and relational benchmarks.
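For a sense of what these Elo gaps mean, the sketch below converts a rating gap into an expected head-to-head score. It assumes the standard logistic Elo formula with a scale of 400; the article does not say how TabArena calibrates its ratings, so the numbers are indicative only. `elo_expected_score` is a hypothetical helper name.

```python
def elo_expected_score(elo_gap: float, scale: float = 400.0) -> float:
    """Expected score of the higher-rated method under the standard Elo model.

    ASSUMPTION: logistic Elo with a 400-point scale.
    """
    return 1.0 / (1.0 + 10.0 ** (-elo_gap / scale))


for gap in (200, 420):
    print(f"{gap} Elo gap -> expected score {elo_expected_score(gap):.3f}")
```

Under that assumption, a 200 Elo gap corresponds to an expected score of roughly 0.76 and a 420 Elo gap to roughly 0.92. This is a separate quantity from the 93% win rate over classical ML models, which the article reports on its own.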

Deployment options include API access, enterprise licensing, and open-source weights (licensed permissively for research and academic evaluation). The model is available to try now, and a full model report has been published. TabPFN-3 makes state-of-the-art tabular AI accessible to anyone with a single GPU and removes the need for extensive model tuning.

Key Points

Scales to 1 million rows on a single H100 GPU, up from 100K in version 2.5
10x–1000x faster inference, with 120x speedup on SHAP via KV caching
93% win rate over classical ML and beats 4-hour-tuned AutoGluon by over 200 Elo on TabArena

Why It Matters

Zero-training tabular AI now beats hours of automated tuning with a single forward pass, putting high-performance ML within reach of any practitioner with one GPU.
