C2T: Captioning-Structure and LLM-Aligned Common-Sense Reward Learning for Traffic–Vehicle Coordination
New AI framework distills common sense from LLMs to optimize traffic lights and autonomous vehicles.
A research team led by Yuyang Chen has introduced C2T (Captioning-Structure and LLM-Aligned Common-Sense Reward Learning), a framework for coordinating Traffic Light Controllers (TLCs) and Connected Autonomous Vehicles (CAVs). Current state-of-the-art systems use Multi-Agent Reinforcement Learning (MARL) but rely on simple hand-crafted rewards such as intersection pressure, which fail to capture complex, human-centric objectives. C2T instead learns a 'common-sense' coordination model directly from traffic dynamics, so the reward is learned rather than hand-designed.
The core innovation is distilling knowledge from a Large Language Model (LLM) into a learned intrinsic reward function. This LLM-aligned reward then guides the policy of a cooperative multi-intersection MARL system. On CityFlow-based benchmarks, C2T outperformed existing MARL baselines on traffic efficiency, safety, and an energy-related proxy. The framework is also steerable: by modifying the prompt given to the underlying LLM, operators can shift the system's priority and train distinct 'efficiency-focused' or 'safety-focused' policies without altering the core architecture. The research, accepted to the CVPR 2026 Findings Track, is a step toward more adaptive, goal-oriented urban traffic management systems.
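To make the reward-distillation idea concrete, here is a minimal sketch, not the authors' implementation. It assumes a traffic state can be captioned as text, that an LLM returns a scalar score for that caption under a given prompt, and that a small reward model is fit to those scores. The prompts, the feature set, the linear model, the `mock_llm_score` stand-in (no real LLM is called), and the `beta` mixing weight are all hypothetical choices made for illustration.

```python
import random

# Hypothetical prompt texts; the summary does not give the paper's actual prompts.
PROMPTS = {
    "efficiency": "Score this traffic scene for throughput and low delay.",
    "safety": "Score this traffic scene for low collision risk.",
}

def caption(state):
    """Describe a traffic state in text (assumed caption format)."""
    return (f"{state['queue']} queued vehicles, mean speed "
            f"{state['speed']:.1f} m/s, {state['hard_brakes']} hard-braking events")

def build_query(objective, state):
    """Text that would be sent to an LLM: objective prompt plus scene caption."""
    return PROMPTS[objective] + "\n" + caption(state)

def mock_llm_score(objective, state):
    """Stand-in for the LLM's answer to build_query(); a fixed formula, not a model."""
    if objective == "efficiency":
        return -1.0 * state["queue"] + 0.5 * state["speed"]
    return -2.0 * state["hard_brakes"] - 0.2 * state["speed"]

def features(state):
    """Normalized state features plus a bias term."""
    return [state["queue"] / 20.0, state["speed"] / 15.0,
            state["hard_brakes"] / 5.0, 1.0]

def sample_state(rng):
    """Draw a random synthetic traffic state."""
    return {"queue": rng.randint(0, 20), "speed": rng.uniform(0.0, 15.0),
            "hard_brakes": rng.randint(0, 5)}

def fit_reward(objective, n_samples=200, steps=2000, lr=0.2, seed=0):
    """Fit a linear reward model to the mock LLM scores by gradient descent on MSE."""
    rng = random.Random(seed)
    states = [sample_state(rng) for _ in range(n_samples)]
    xs = [features(s) for s in states]
    ys = [mock_llm_score(objective, s) for s in states]
    w = [0.0] * 4
    for _ in range(steps):
        grad = [0.0] * 4
        for x, y in zip(xs, ys):
            err = sum(wi * xi for wi, xi in zip(w, x)) - y
            for j in range(4):
                grad[j] += 2.0 * err * x[j] / n_samples
        w = [wi - lr * g for wi, g in zip(w, grad)]
    return w

def learned_reward(w, state):
    """Intrinsic reward predicted by the fitted model."""
    return sum(wi * xi for wi, xi in zip(w, features(state)))

def shaped_reward(extrinsic, w, state, beta=0.1):
    """Add the learned intrinsic reward to an environment reward (beta is assumed)."""
    return extrinsic + beta * learned_reward(w, state)
```

In this sketch, calling `fit_reward("efficiency")` and `fit_reward("safety")` yields two different reward models from the same code path, which mirrors the prompt-steering idea described above: only the objective given to the scorer changes. A MARL agent would then receive `shaped_reward(...)` in place of a hand-crafted signal.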
- Uses an LLM to learn reward functions for traffic control, moving beyond hand-crafted metrics like intersection pressure.
- Outperforms existing MARL baselines on CityFlow-based benchmarks across efficiency, safety, and an energy-related proxy.
- Policy focus (efficiency vs. safety) can be steered by modifying the LLM prompt, without changing the core architecture.
Why It Matters
Could pave the way for safer, more efficient, and more human-aligned coordination of traffic signals and autonomous vehicles in smart cities.