Fast Strategy Solving for the Informed Player in Two-Player Zero-Sum Linear-Quadratic Differential Games with One-Sided Information
A bi-level optimization trick lets AI exploit hidden info in real-time differential games.
A team from Arizona State University (Ghimire, Xu, Ren) published a preprint on arXiv (2605.03112) tackling a classic challenge in game theory: how should an informed player act in a zero-sum differential game when the opponent holds only a public belief over the possible payoffs? Previous work showed that Nash equilibrium strategies have an atomic structure, but computing them in real time was infeasible. This paper bridges that gap by restricting attention to linear dynamics and quadratic losses (LQ games), a common modeling framework in control theory.
The key insight: the informed player's optimal strategy can be cast as a bi-level optimization problem. The outer level decides when and how to reveal private information through control actions (the "signaling" strategy), while the inner level solves a game-tree Linear-Quadratic Regulator (LQR) for the optimal closed-loop control conditioned on that signaling. The authors solve this via an adjoint-enabled backpropagation scheme: a backward LQR pass followed by a forward gradient-descent pass that improves the signaling policy. Tested on a homing-problem variant with an 8D state space, 2D action spaces, and a horizon of K=10 time steps, the algorithm solves sub-games at roughly 10 Hz. That speed enables robust, real-time game-theoretic planning under random disturbances and information asymmetry, a significant step toward practical deployment in autonomous systems.
- Bi-level optimization: outer layer optimizes signaling, inner layer solves LQR control.
- Achieves ~10 Hz sub-game solving on an 8D/2D state-action space with a K=10 horizon.
- Adjoint-enabled backpropagation replaces expensive brute-force search for equilibria.
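The bi-level structure above can be illustrated with a toy sketch. This is not the paper's algorithm: the inner level here is just a standard backward Riccati recursion for a finite-horizon LQR, and the outer level runs projected gradient descent on a single scalar signaling weight `theta` that mixes a "revealing" and a "concealing" sub-game. All matrices, the quadratic information penalty, and the two branches are illustrative assumptions, not values from the paper.

```python
import numpy as np

def lqr_backward(A, B, Q, R, Qf, K_steps):
    """Inner level: backward Riccati pass for a finite-horizon LQR.
    Returns time-varying gains K_k (with u_k = -K_k x_k) and the
    cost-to-go matrix P_0, so x0' P_0 x0 is the sub-game's value."""
    P = Qf.copy()
    gains = []
    for _ in range(K_steps):
        S = R + B.T @ P @ B
        K = np.linalg.solve(S, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1], P

# Toy 2D double-integrator system (illustrative, not from the paper).
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.array([[0.0], [0.1]])
Q, R, Qf = np.eye(2), np.array([[0.5]]), np.eye(2)
x0 = np.array([1.0, 0.0])
K_steps = 10

# Values of the two branches the signaling decision mixes between:
# "reveal" plays the cheap sub-game, "hide" carries a heavier state cost.
_, P_rev = lqr_backward(A, B, Q, R, Qf, K_steps)
_, P_hid = lqr_backward(A, B, 3.0 * Q, R, Qf, K_steps)
v_rev, v_hid = x0 @ P_rev @ x0, x0 @ P_hid @ x0

C_INFO = 5.0  # hypothetical penalty weight for leaking private information

def outer_J(theta):
    """Outer level: expected cost of mixing the branches with weight theta,
    plus a quadratic penalty standing in for the cost of revealing."""
    return theta * v_rev + (1.0 - theta) * v_hid + C_INFO * theta ** 2

# Forward pass: projected gradient descent on the signaling weight.
theta, lr = 0.5, 0.02
for _ in range(200):
    grad = (v_rev - v_hid) + 2.0 * C_INFO * theta
    theta = min(1.0, max(0.0, theta - lr * grad))

print(f"signaling weight theta = {theta:.3f}")
```

In the paper the outer variable is a full signaling policy and its gradient comes from the adjoint (backpropagation) pass through the inner LQR solution; the scalar `theta` here only conveys the nesting of the two levels.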
Why It Matters
Enables real-time strategic planning for autonomous agents facing opponents with hidden information, like drones or robots in adversarial environments.