New PON method for FedRL accelerates training in heterogeneous environments

arXiv cs.LG May 28, 2026

⚡Solving the heterogeneity problem in federated reinforcement learning with personalized normalization

Deep Dive

Federated reinforcement learning (FedRL) lets multiple agents collaboratively train a global policy without sharing raw data, which suits privacy-sensitive applications. FedRL struggles, however, in heterogeneous environments: when state-transition dynamics differ between agents, their input distributions are non-identical and their parameter updates are imbalanced during aggregation. To address this, researchers Yiran Pang, Zhen Ni, and Xiangnan Zhong developed Personalized Observation Normalization (PON), in which each agent normalizes its raw state inputs locally using a continuously updated running mean and variance. Local features are therefore scaled consistently, so no agent's updates overshadow the others during aggregation. The authors also show that sharing normalization parameters across agents is ineffective because local input distributions differ, which is why the statistics must be personalized. The work has been accepted at the International Joint Conference on Neural Networks (IJCNN) 2025.
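
The per-agent normalization described above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes Welford's online update for the running statistics (the summary does not say which update rule the paper uses), and the class name `RunningNormalizer`, the `eps` constant, and the toy observations are all hypothetical.

```python
import math


class RunningNormalizer:
    """Per-agent running mean/variance over observation vectors.

    Assumption: Welford's online algorithm; the source only says
    "continuously updated running mean and variance".
    """

    def __init__(self, dim, eps=1e-8):
        self.count = 0
        self.mean = [0.0] * dim
        self.m2 = [0.0] * dim  # running sum of squared deviations from the mean
        self.eps = eps         # hypothetical constant guarding against division by zero

    def update(self, obs):
        """Fold one raw observation into the running statistics."""
        self.count += 1
        for i, x in enumerate(obs):
            delta = x - self.mean[i]
            self.mean[i] += delta / self.count
            self.m2[i] += delta * (x - self.mean[i])

    def variance(self):
        """Population variance per dimension (ones before any data is seen)."""
        if self.count == 0:
            return [1.0] * len(self.mean)
        return [m / self.count for m in self.m2]

    def normalize(self, obs):
        """Scale a raw observation using this agent's own statistics."""
        var = self.variance()
        return [
            (x - m) / math.sqrt(v + self.eps)
            for x, m, v in zip(obs, self.mean, var)
        ]


# Two hypothetical agents whose dynamics produce differently scaled states.
agent_a = RunningNormalizer(dim=1)
agent_b = RunningNormalizer(dim=1)
for x in [1.0, 2.0, 3.0]:
    agent_a.update([x])
for x in [100.0, 200.0, 300.0]:
    agent_b.update([x])

norm_a = agent_a.normalize([3.0])[0]
norm_b = agent_b.normalize([300.0])[0]
```

In this toy setup the two agents see raw states that differ by a factor of 100, yet after local normalization the largest observation of each maps to nearly the same value, so the shared policy receives inputs on a comparable scale. A single shared mean and variance could not fit both agents' input distributions at once.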

Experiments on heterogeneous MuJoCo tasks show that PON accelerates training and outperforms baseline methods. Because each agent maintains its own normalization statistics, PON handles the heterogeneity that hampers standard FedRL approaches. This is particularly relevant for simulation environments with varying dynamics, such as robotics, autonomous driving, and multiplayer games, where agents operate under different conditions. The method is simple: it requires only a local running mean and variance, so it can be integrated into existing FedRL frameworks without significant overhead. For practitioners working on distributed reinforcement learning, PON offers a practical way to handle heterogeneous environments, potentially enabling more robust and efficient federated learning systems in real-world applications.
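
To illustrate how personalized statistics might sit alongside an existing aggregation step, here is a rough sketch. It assumes a plain FedAvg-style element-wise mean of policy parameters, which the summary does not confirm as the paper's aggregation rule. The `Agent` class, its fields, and `aggregation_round` are hypothetical names introduced only for this example.

```python
def federated_average(local_weights):
    """Element-wise mean of the agents' policy parameter vectors.

    Assumption: unweighted FedAvg-style averaging.
    """
    n = len(local_weights)
    return [sum(ws) / n for ws in zip(*local_weights)]


class Agent:
    """Hypothetical agent holding shared policy weights and private statistics."""

    def __init__(self, weights, obs_mean, obs_var):
        self.weights = list(weights)  # sent to the server and overwritten each round
        self.obs_mean = obs_mean      # personalized: never sent or overwritten
        self.obs_var = obs_var        # personalized: never sent or overwritten


def aggregation_round(agents):
    """Average only the policy weights; normalization statistics stay local."""
    global_weights = federated_average([a.weights for a in agents])
    for agent in agents:
        agent.weights = list(global_weights)
    return global_weights


# Toy example with two agents whose observation scales differ.
agents = [
    Agent(weights=[1.0, 2.0], obs_mean=2.0, obs_var=0.5),
    Agent(weights=[3.0, 4.0], obs_mean=200.0, obs_var=5000.0),
]
global_weights = aggregation_round(agents)
```

The only difference from a conventional round in this sketch is what gets left out of the exchange: the observation mean and variance never leave the agent, so no extra communication is introduced.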

Key Points

PON allows each agent to use a locally updated running mean and variance for state normalization, avoiding the pitfalls of sharing statistics across heterogeneous environments.
The method is validated on MuJoCo tasks with heterogeneity, showing faster training convergence and better final performance compared to standard FedRL baselines.
Accepted at IJCNN 2025, the work addresses a key challenge in federated reinforcement learning: non-identical input distributions from differing transition dynamics.

Why It Matters

PON enables practical, privacy-preserving multi-agent RL in heterogeneous environments such as robotics fleets or sensor networks.
