Research & Papers

Learning in Prophet Inequalities with Noisy Observations

New algorithms achieve a 1 - 1/e competitive ratio for online decisions with imperfect observations.

Deep Dive

Researchers Jung-hun Kim and Vianney Perchet have tackled a core challenge in online decision-making: the prophet inequality problem in a noisy, real-world setting. In the classic problem, a decision-maker sequentially observes values drawn from known distributions and must decide when to stop (for example, when to sell an asset or hire a candidate), with each acceptance being irrevocable. Their new work, accepted at ICLR 2026, addresses the practical scenario where the true rewards are hidden, observed only through noisy signals, and the underlying distributions are unknown. They propose novel algorithms that integrate learning and decision-making via lower-confidence-bound (LCB) thresholding, allowing an agent to learn the latent parameters of a linear reward model while it makes its irrevocable choices.
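To make the LCB-thresholding idea concrete, here is a minimal sketch of one plausible reading: rewards follow a linear model in observed contexts, the agent maintains a ridge-regression estimate of the latent parameter from the noisy signals, and it accepts the first item whose pessimistic (lower-confidence-bound) reward estimate clears a threshold. The function name, the ridge/LCB update, and the parameters `alpha` and `beta` are all illustrative assumptions, not the paper's actual algorithm.

```python
import numpy as np

def lcb_threshold_stop(contexts, noisy_obs, threshold, alpha=1.0, beta=1.0):
    """Accept the first item whose LCB on the predicted latent reward
    clears `threshold`; otherwise take the last item (illustrative sketch)."""
    d = contexts.shape[1]
    A = alpha * np.eye(d)                 # regularized Gram matrix
    b = np.zeros(d)
    for t, (x, y) in enumerate(zip(contexts, noisy_obs)):
        A += np.outer(x, x)               # fold the noisy signal for item t
        b += y * x                        # into the running estimate
        theta_hat = np.linalg.solve(A, b)                  # ridge estimate
        width = beta * np.sqrt(x @ np.linalg.solve(A, x))  # confidence width
        if x @ theta_hat - width >= threshold:
            return t                      # irrevocable accept
    return len(contexts) - 1              # forced to take the last item
```

The pessimism is the point: by thresholding a lower bound rather than the raw estimate, early accepts are only triggered once the learned model is confident enough.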

In the case where observations are independent and identically distributed (i.i.d.), the team proves that both an Explore-then-Decide strategy and an ε-Greedy variant achieve the sharp competitive ratio of 1 - 1/e, guaranteeing, under a mild condition, at least that fraction of the omniscient prophet's expected reward. For more general, non-identical distributions, their algorithm guarantees a competitive ratio of 1/2 against a relaxed benchmark. They further show that even with access only to a limited window of past rewards, a tight ratio of 1/2 can be maintained against the optimal benchmark. This work bridges theoretical online optimization with practical machine-learning constraints, providing robust frameworks for sequential decision-making under uncertainty.

Key Points
  • Algorithms achieve a 1 - 1/e competitive ratio for i.i.d. noisy observations, matching a key theoretical limit.
  • Guarantees a 1/2 ratio for non-identical distributions and with limited historical data access.
  • Uses LCB (lower-confidence-bound) thresholding to simultaneously learn latent parameters and make optimal stopping decisions.

Why It Matters

Provides robust AI decision frameworks for real-world scenarios like dynamic pricing, hiring, and trading where data is imperfect.