Online Scalarization in Vector-Valued Games
New algorithm lets AI players dynamically adjust preferences mid-game for better outcomes.
Deep Dive
Researchers Asadollahi, Hawkins, and Hale propose an online scalarization framework for vector-valued games in which players adapt their payoff weightings in real time. Using a bi-level bandit learning approach, the method achieves sublinear regret and boosts convergence to preferred equilibria from ~50% to ~80% in experiments. This lets players dynamically reshape their objectives over the course of repeated interactions.
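To make "scalarization" concrete, here is a minimal sketch (not from the paper) of linear scalarization: a vector-valued payoff, say one component per objective, is collapsed to a single scalar by a weight vector on the simplex. The `scalarize` function and the example payoff values are illustrative assumptions.

```python
import numpy as np

def scalarize(payoff_vec, weights):
    """Linear scalarization: normalize weights onto the simplex, then take a weighted sum."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return float(np.dot(w, payoff_vec))

# Hypothetical two-objective payoff, e.g. [efficiency, fairness].
payoff = [3.0, 1.0]
print(scalarize(payoff, [0.5, 0.5]))  # equal weighting of the two objectives
print(scalarize(payoff, [0.9, 0.1]))  # weighting that favors the first objective
```

Adapting the weight vector online is exactly what the outer learner in the proposed framework does.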
Key Points
- Bi-level framework: outer learner chooses scalarization, inner learner selects actions via bandit no-regret learning.
- Sublinear regret guarantees proven using bandit online mirror descent with stabilized importance weighting.
- Convergence to preferred equilibrium improved from ~50% to ~80% in vector-valued game experiments.
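The bi-level structure above can be sketched with a generic EXP3-style bandit in both loops: an outer learner picks a scalarization from a menu of candidate weight vectors, an inner learner picks an action, and both update from the scalarized payoff via importance-weighted exponential updates. This is a simplified stand-in for the paper's bandit online mirror descent with stabilized importance weighting; the payoff matrix, scalarization menu, and step-size choices are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def exp3_probs(weights, gamma):
    """Mix the exponential-weights distribution with uniform exploration."""
    p = weights / weights.sum()
    return (1 - gamma) * p + gamma / len(weights)

def exp3_update(weights, probs, arm, reward, gamma):
    """Importance-weighted exponential update (reward assumed in [0, 1])."""
    est = reward / probs[arm]  # unbiased estimate of the played arm's payoff
    weights[arm] *= np.exp(gamma * est / len(weights))

# Toy vector-valued payoffs: rows = actions, columns = objectives.
PAYOFFS = np.array([[0.9, 0.1],
                    [0.2, 0.8],
                    [0.5, 0.5]])

# Outer learner's menu of candidate scalarizations (weights on the simplex).
SCALARIZATIONS = np.array([[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]])

gamma = 0.1
outer_w = np.ones(len(SCALARIZATIONS))
inner_w = np.ones(len(PAYOFFS))

for t in range(2000):
    outer_p = exp3_probs(outer_w, gamma)
    k = rng.choice(len(SCALARIZATIONS), p=outer_p)  # outer: pick a scalarization
    inner_p = exp3_probs(inner_w, gamma)
    a = rng.choice(len(PAYOFFS), p=inner_p)         # inner: pick an action
    reward = SCALARIZATIONS[k] @ PAYOFFS[a]         # scalarized payoff
    exp3_update(outer_w, outer_p, k, reward, gamma)
    exp3_update(inner_w, inner_p, a, reward, gamma)
    outer_w /= outer_w.max()  # rescale to avoid overflow; ratios are preserved
    inner_w /= inner_w.max()

print("outer distribution:", np.round(exp3_probs(outer_w, gamma), 2))
print("inner distribution:", np.round(exp3_probs(inner_w, gamma), 2))
```

Over repeated rounds, the two learners co-adapt: the outer distribution concentrates on scalarizations whose best responses yield high scalarized payoff, which is the mechanism the sublinear-regret analysis formalizes.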
Why It Matters
Enables multi-agent AI systems to dynamically rebalance conflicting objectives, improving cooperation and negotiation outcomes.