In-Context Positive-Unlabeled Learning
New transformer handles PU classification without retraining on each dataset.
Positive-unlabeled (PU) learning tackles binary classification when only a set of labeled positives is available alongside a pool of unlabeled samples. Traditional methods require dataset-specific training or iterative optimization, making them slow to deploy across many tasks. Now, Siyan Liu and colleagues present PUICL (Positive-Unlabeled In-Context Learning), a pretrained transformer that solves PU classification entirely through in-context learning. PUICL is pretrained on synthetic PU datasets generated from randomly instantiated structural causal models, exposing it to diverse feature-label relationships and class-prior configurations. At inference, it receives the labeled positives and unlabeled samples as a single input and returns class probabilities for the unlabeled rows in one forward pass, with no gradient updates or per-task fitting.
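To make that interface concrete, below is a minimal sketch of in-context PU inference with a transformer encoder. The class name `InContextPUClassifier`, the row-flag encoding, and all dimensions are illustrative assumptions, not the authors' architecture; the point is that labeled positives and unlabeled rows are packed into one sequence and every unlabeled row is scored in a single forward pass.

```python
# Sketch of the in-context PU interface (assumed API, not the authors' code):
# labeled positives and unlabeled rows share one input sequence, and a
# transformer encoder scores every unlabeled row in a single forward pass.
import torch
import torch.nn as nn

class InContextPUClassifier(nn.Module):
    def __init__(self, n_features: int, d_model: int = 64, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        # Each row is embedded from its features plus a 2-dim flag:
        # [1, 0] = labeled positive, [0, 1] = unlabeled (label unknown).
        self.embed = nn.Linear(n_features + 2, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, 1)  # positive-class logit per row

    def forward(self, x_pos: torch.Tensor, x_unl: torch.Tensor) -> torch.Tensor:
        # x_pos: (n_pos, n_features) labeled positives; x_unl: (n_unl, n_features) unlabeled rows.
        pos_flag = x_pos.new_tensor([1.0, 0.0]).expand(x_pos.size(0), 2)
        unl_flag = x_unl.new_tensor([0.0, 1.0]).expand(x_unl.size(0), 2)
        rows = torch.cat([torch.cat([x_pos, pos_flag], dim=1),
                          torch.cat([x_unl, unl_flag], dim=1)], dim=0)
        h = self.encoder(self.embed(rows).unsqueeze(0)).squeeze(0)  # joint encoding of all rows
        # Probabilities are returned only for the unlabeled rows.
        return torch.sigmoid(self.head(h[x_pos.size(0):]).squeeze(-1))

# One forward pass, no gradient updates or per-task fitting.
model = InContextPUClassifier(n_features=8)
x_pos, x_unl = torch.randn(20, 8), torch.randn(100, 8)
with torch.no_grad():
    p = model(x_pos, x_unl)  # shape (100,): P(y = 1) for each unlabeled row
```

In PUICL the weights would come from pretraining on synthetic PU tasks; here the model is untrained and serves only to show the input and output shapes.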
Tested on 20 semi-synthetic PU benchmarks derived from the UCI Machine Learning Repository, OpenML, and scikit-learn, PUICL outperforms four standard PU learning baselines in average AUC and accuracy, and is competitive on F1-score. This work demonstrates that in-context learning extends naturally beyond fully supervised tabular prediction to semi-supervised PU settings, enabling rapid deployment on new PU tasks without dataset-specific tuning. The approach promises efficiency gains for applications like anomaly detection, medical diagnosis, and fraud detection where labeled positives are scarce.
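The semi-synthetic benchmarks start from fully labeled datasets and hide most of the labels. The exact protocol is not given in this summary, so the sketch below uses a common construction, assumed here: pick a positive class, reveal a random fraction of its examples as labeled positives under a selected-completely-at-random assumption, and treat every remaining row as unlabeled. The scikit-learn dataset and the 0.3 label frequency are illustrative choices.

```python
# Deriving a semi-synthetic PU task from a fully labeled dataset (assumed
# SCAR construction; the dataset and 0.3 label frequency are illustrative).
import numpy as np
from sklearn.datasets import load_breast_cancer

rng = np.random.default_rng(0)
X, y = load_breast_cancer(return_X_y=True)    # y in {0, 1}; treat y == 1 as positive

label_frequency = 0.3                          # fraction of positives revealed as labeled
pos_idx = np.flatnonzero(y == 1)
labeled = rng.choice(pos_idx, size=int(label_frequency * len(pos_idx)), replace=False)

s = np.zeros_like(y)                           # s = 1: labeled positive, s = 0: unlabeled
s[labeled] = 1

X_pos, X_unl = X[s == 1], X[s == 0]            # inputs handed to a PU method
y_unl = y[s == 0]                              # held-back ground truth, used only for evaluation
```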
- PUICL requires only one forward pass with no gradient updates per task.
- Outperforms 4 standard PU baselines on 20 semi-synthetic benchmarks (UCI, OpenML, scikit-learn).
- Pretrained on synthetic data from randomly instantiated structural causal models (see the data-generation sketch after this list).
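A minimal sketch of how one synthetic PU pretraining task could be drawn from a randomly instantiated structural causal model, under assumptions not stated in this summary: a sparse linear-Gaussian SCM over the features, a random logistic labeling function, and a fixed fraction of positives revealed as labeled. PUICL's actual generator may differ in all of these details.

```python
# One synthetic PU pretraining task from a random SCM (illustrative assumptions:
# linear-Gaussian mechanisms, logistic labeling, fixed label frequency).
import numpy as np

def sample_scm_pu_task(n_samples=256, n_features=8, label_frequency=0.3, seed=0):
    rng = np.random.default_rng(seed)
    # Random DAG over the features: node j may depend only on nodes i < j.
    W = np.triu(rng.normal(size=(n_features, n_features)), k=1)
    W *= rng.random((n_features, n_features)) < 0.4           # keep a sparse set of edges
    X = np.zeros((n_samples, n_features))
    for j in range(n_features):                               # ancestral sampling, parents first
        X[:, j] = X @ W[:, j] + rng.normal(size=n_samples)
    # Label is a random logistic function of the features, thresholded at its median.
    beta = rng.normal(size=n_features)
    p = 1.0 / (1.0 + np.exp(-(X @ beta)))
    y = (p > np.median(p)).astype(int)                        # roughly balanced classes
    # PU masking: reveal a random fraction of positives, hide everything else.
    s = np.zeros(n_samples, dtype=int)
    pos = np.flatnonzero(y == 1)
    s[rng.choice(pos, size=int(label_frequency * len(pos)), replace=False)] = 1
    return X, s, y   # features, PU labels (context), true labels (pretraining targets)

X, s, y = sample_scm_pu_task(seed=42)
```

Sampling many such tasks with different seeds, edge densities, and label frequencies is what would expose a model to the diverse feature-label relationships and class priors described above.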
Why It Matters
Enables rapid PU classification on new tasks without per-dataset training or tuning.