Nash-MADDPG boosts EV-to-EV energy trading volume by 62.9%
New multi-agent RL framework achieves 61.6% higher social welfare and 62.9% more trading volume than Double Auction.
A team from Monash University (Yujin Lin, Yue Yang, Hao Wang) has proposed Nash-MADDPG, a multi-agent reinforcement learning framework for vehicle-to-vehicle (V2V) energy trading. The paper, accepted at IEEE INDIN 2026 and posted on arXiv, addresses the challenge of coordinating self-interested electric vehicles (EVs) with uncertain schedules. By integrating the Nash Bargaining Solution into Multi-Agent Deep Deterministic Policy Gradient (MADDPG), Nash-MADDPG enables fair and efficient bilateral pricing. Nash-guided price proximity rewards steer each agent's learning toward bargaining-optimal strategies, eliminating the need for centralized optimization.
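To make the bargaining idea concrete, here is a minimal Python sketch of a Nash bargaining price for a single bilateral trade, plus one possible price proximity reward. It is an illustration under stated assumptions, not the paper's formulation: the function names, the linear utilities, the `seller_power` parameter, and the normalized-distance reward shape are all hypothetical. With linear utilities, maximizing the Nash product (p - s)^a (b - p)^(1 - a) over the price p gives p* = s + a(b - s), where s and b are the seller's and buyer's reservation prices and a is the seller's bargaining power.

```python
from typing import Optional


def nash_price(seller_reserve: float, buyer_reserve: float,
               seller_power: float = 0.5) -> Optional[float]:
    """Price maximizing (p - s)^a * (b - p)^(1 - a) for linear utilities.

    Assumption: each side's gain is linear in price, so the closed form
    p* = s + a * (b - s) applies. Returns None when no trade benefits both.
    """
    if not 0.0 <= seller_power <= 1.0:
        raise ValueError("seller_power must lie in [0, 1]")
    if buyer_reserve <= seller_reserve:
        return None  # no surplus to split, so no mutually beneficial price
    return seller_reserve + seller_power * (buyer_reserve - seller_reserve)


def proximity_reward(proposed_price: float, seller_reserve: float,
                     buyer_reserve: float, weight: float = 1.0) -> float:
    """Hypothetical shaping term: penalize distance from the Nash price.

    The distance is normalized by the bargaining range (b - s), so the
    penalty is 0 at the Nash price and grows linearly away from it.
    """
    target = nash_price(seller_reserve, buyer_reserve)
    if target is None:
        return 0.0  # nothing to align with when no trade is feasible
    spread = buyer_reserve - seller_reserve
    return -weight * abs(proposed_price - target) / spread


if __name__ == "__main__":
    # Illustrative numbers only: seller would accept 0.10, buyer would pay 0.30.
    print(nash_price(0.10, 0.30))
    print(proximity_reward(0.25, 0.10, 0.30))
```

In a setup like this, each agent could compute the shaping term locally from the two reservation prices of its current trade, which is one way a Nash-guided reward can avoid a central optimizer.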
Over a 30-day continuous simulation with 6 to 100 agents, Nash-MADDPG outperformed a standard Double Auction baseline by 61.6% in social welfare, 62.9% in trading volume, and 40.1% in fairness (measured by Jain's index). The system kept prices stable near the theoretical Nash benchmark and scaled robustly with population size. This approach lets EVs monetize surplus battery capacity through peer-to-peer trades, reducing strain on the power grid while ensuring equitable outcomes for all participants.
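For readers unfamiliar with the fairness metric, Jain's index over n non-negative values x_1..x_n is (sum of x_i)^2 / (n * sum of x_i^2). It equals 1 when every participant receives the same amount and falls to 1/n when a single participant receives everything. The sketch below computes it in Python; the function name and the choice of per-agent payoffs as input are assumptions, since the summary does not say which quantity the paper feeds into the index.

```python
from typing import Sequence


def jain_index(values: Sequence[float]) -> float:
    """Jain's fairness index: (sum x)^2 / (n * sum x^2).

    Ranges from 1/n (one participant gets everything) to 1 (equal shares).
    Raises ValueError for an empty or all-zero input, where it is undefined.
    """
    n = len(values)
    sum_of_squares = sum(v * v for v in values)
    if n == 0 or sum_of_squares == 0:
        raise ValueError("Jain's index is undefined for empty or all-zero input")
    total = sum(values)
    return (total * total) / (n * sum_of_squares)


if __name__ == "__main__":
    # Illustrative per-agent payoffs, not data from the paper.
    print(jain_index([1.0, 1.0, 1.0, 1.0]))  # equal shares
    print(jain_index([1.0, 0.0, 0.0, 0.0]))  # one agent takes all
```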
- Achieves 61.6% higher social welfare and 62.9% more trading volume vs. Double Auction in 30-day tests
- Improves fairness by 40.1% (Jain's index) using Nash bargaining integration in multi-agent RL
- Scales from 6 to 100 EV agents with continuous vehicle turnover and stable pricing
Why It Matters
Decentralized EV energy trading reduces grid load and lets owners profit from surplus battery capacity, enabling smarter, fairer power markets.