A Finite Time Analysis of Thompson Sampling for Bayesian Optimization with Preferential Feedback
A new proof shows that Thompson Sampling with pairwise comparisons performs as well as standard Bayesian optimization with scalar feedback.
A new paper from Joseph Lazzaro, Davide Buffelli, Da-shan Shiu, and Sattar Vakili, accepted at AISTATS 2026, presents a finite-time analysis of Thompson Sampling (TS) for Bayesian optimization (BO) with preferential feedback—where the algorithm learns from pairwise comparisons (e.g., "A is better than B") instead of scalar scores. This is increasingly relevant for human-in-the-loop design, laboratory experiments, and scientific discovery where direct numerical feedback is impractical. The method models comparisons using a monotone link function on latent utility differences and leverages a dueling kernel induced by a base kernel.
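To make the modeling setup concrete, here is a minimal NumPy sketch of the two ingredients described above: a dueling kernel on pairs induced by a base kernel (shown here via the standard utility-difference construction), and a monotone link mapping latent utility differences to win probabilities. The RBF base kernel and logistic link are illustrative assumptions, not necessarily the paper's exact choices.

```python
import numpy as np

def rbf(x, y, lengthscale=0.5):
    # Base kernel k(x, y) on single points (illustrative choice).
    return np.exp(-np.sum((x - y) ** 2) / (2 * lengthscale ** 2))

def dueling_kernel(pair_a, pair_b, k=rbf):
    # Kernel on pairs induced by the base kernel: the covariance of the
    # latent utility differences f(x) - f(x') under a GP prior with
    # covariance k.
    (x, xp), (y, yp) = pair_a, pair_b
    return k(x, y) - k(x, yp) - k(xp, y) + k(xp, yp)

def link(z):
    # Monotone link: maps a latent utility difference f(x) - f(x') to the
    # probability that x beats x' (logistic link assumed here).
    return 1.0 / (1.0 + np.exp(-z))
```

A comparison outcome "A beats B" is then modeled as a Bernoulli draw with probability `link(f(A) - f(B))`, and the GP over pairs uses `dueling_kernel` as its covariance.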
The key theoretical contribution is a proof that the proposed TS approach achieves the same finite-time regret bounds as standard TS for conventional BO with scalar feedback—a significant step in formalizing the efficiency of preference-based optimization. The analysis exploits the anchor invariance property of TS for challenger selection and introduces a novel double-TS pairing variant. Experimental results on both synthetic benchmarks and real-world applications demonstrate the method's effectiveness, bridging theory and practice for preference-driven optimization tasks.
- First finite-time analysis proving Thompson Sampling for preferential BO matches standard BO performance bounds.
- Uses a monotone link function on latent utility differences and a dueling kernel for pairwise comparisons.
- Introduces a double-TS pairing variant that exploits anchor invariance for challenger selection.
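The double-TS pairing idea can be sketched as follows: draw two independent posterior samples of the latent utility over a finite candidate set and duel their respective argmaxes. This is a hypothetical illustration assuming a Gaussian posterior on a discrete grid; the posterior update itself (from preferential data) is not shown, and the function is not the paper's implementation.

```python
import numpy as np

def double_ts_pair(mu, Sigma, rng):
    """Double-TS pairing over a finite candidate set.

    mu, Sigma: posterior mean and covariance of the latent utility
    (assumed produced by a preferential-GP update, omitted here).
    Returns the indices of the incumbent and the challenger to duel.
    """
    # First posterior sample selects the incumbent arm.
    f1 = rng.multivariate_normal(mu, Sigma)
    i = int(np.argmax(f1))
    # An independent second sample selects the challenger. By anchor
    # invariance, shifting the utility by any constant leaves both
    # argmaxes, and hence the chosen duel, unchanged.
    f2 = rng.multivariate_normal(mu, Sigma)
    j = int(np.argmax(f2))
    return i, j
```

Only the pairwise comparison between candidates `i` and `j` is then queried, and its outcome updates the posterior for the next round.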
Why It Matters
Enables efficient Bayesian optimization when only pairwise comparisons are available, which is critical for human-in-the-loop design and scientific discovery workflows.