CROSS: A Mixture-of-Experts Reinforcement Learning Framework for Generalizable Large-Scale Traffic Signal Control
New RL system clusters traffic patterns and routes them to specialized expert networks, outperforming single-policy models.
A research team led by Xibei Chen, Yifeng Zhang, and Guillaume Sartoretti has introduced CROSS, a reinforcement learning (RL) framework for large-scale adaptive traffic signal control (ATSC). Existing RL-based traffic systems rely on a single policy shared by all intersections, which lacks the representational capacity to handle diverse topologies and highly dynamic traffic patterns. CROSS addresses this with a two-stage Mixture-of-Experts (MoE) architecture. First, its Predictive Contrastive Clustering (PCC) module forecasts short-term traffic state transitions to identify latent patterns, then uses clustering and contrastive learning to refine these pattern-level representations.
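To make the clustering-plus-contrastive idea concrete, here is a minimal sketch, not the authors' implementation. It assumes each traffic state has already been encoded as a non-zero embedding vector and that cluster centroids are available; the forecasting step is omitted. The function names, the cosine-similarity measure, and the `temperature` parameter are all illustrative assumptions.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(v):
    # Assumes v is non-zero (hypothetical precondition for this sketch).
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def nearest_centroid(embedding, centroids):
    """Clustering step: assign an embedding to its most similar centroid."""
    z = normalize(embedding)
    sims = [dot(z, normalize(c)) for c in centroids]
    return max(range(len(centroids)), key=lambda i: sims[i])

def cluster_contrastive_loss(embedding, centroids, assigned, temperature=0.5):
    """Contrastive step (InfoNCE-style, an assumed choice): the loss is low
    when the embedding is close to its assigned centroid and far from the
    other centroids."""
    z = normalize(embedding)
    logits = [dot(z, normalize(c)) / temperature for c in centroids]
    peak = max(logits)  # subtract the max for numerical stability
    log_sum = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_sum - logits[assigned]

# Two hypothetical pattern centroids and one traffic-state embedding.
centroids = [[1.0, 0.0], [0.0, 1.0]]
embedding = [0.9, 0.1]
cluster = nearest_centroid(embedding, centroids)
loss_matching = cluster_contrastive_loss(embedding, centroids, cluster)
loss_mismatched = cluster_contrastive_loss(embedding, centroids, 1 - cluster)
```

In this toy case the embedding is assigned to the first centroid, and the loss for that assignment is lower than for the mismatched one, which is the signal a contrastive objective would use to tighten pattern-level representations.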
Second, the framework's Scenario-Adaptive MoE module augments a base, shared policy with multiple specialized expert networks. This allows the system to dynamically select and combine the most relevant experts for specific traffic conditions, enabling adaptive specialization and more flexible, scenario-specific control strategies. The decentralized design means each intersection's controller can operate independently while benefiting from the shared expert knowledge. In extensive experiments conducted using the SUMO traffic simulator on both synthetic and real-world datasets, CROSS demonstrated superior performance and, crucially, better generalization to unseen traffic environments compared to current state-of-the-art baselines. This represents a significant step toward more resilient and intelligent urban traffic management systems.
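The expert selection and combination described above can be sketched as follows. This is an illustration under stated assumptions, not CROSS's actual architecture: the top-k gating rule, the additive combination with the base policy's logits, and every name in the code (`top_k_gate`, `moe_action_logits`, `gate_scores`, `k`) are hypothetical choices made for the example.

```python
import math

def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def top_k_gate(gate_scores, k):
    """Select the k highest-scoring experts and renormalize their weights."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    return dict(zip(chosen, weights))

def moe_action_logits(base_logits, expert_logits, gate_scores, k=2):
    """Augment the shared policy's logits with a gated mix of expert outputs
    (additive combination is an assumption of this sketch)."""
    gate = top_k_gate(gate_scores, k)
    mixed = list(base_logits)
    for expert_index, weight in gate.items():
        for action, value in enumerate(expert_logits[expert_index]):
            mixed[action] += weight * value
    return mixed

# One intersection with two signal phases and three hypothetical experts.
base_logits = [0.0, 0.0]
expert_logits = [[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
gate_scores = [0.0, 0.0, -10.0]  # the third expert is judged irrelevant
gate = top_k_gate(gate_scores, k=2)
mixed = moe_action_logits(base_logits, expert_logits, gate_scores, k=2)
```

Here the gate keeps the first two experts with equal weight and ignores the third, so the third expert's large outputs never reach the final logits. In a decentralized setup, each intersection would run this computation on its own observations while sharing the same expert parameters.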
- Uses a Predictive Contrastive Clustering (PCC) module to identify and represent latent, short-term traffic patterns.
- Employs a Scenario-Adaptive Mixture-of-Experts (MoE) module that augments a shared base policy with multiple specialized expert networks.
- Outperformed state-of-the-art baselines in SUMO simulations on synthetic and real-world datasets, and generalized better to unseen traffic environments.
Why It Matters
This approach could lead to more adaptive, efficient city traffic systems that reduce congestion and emissions at scale.