
Accelerating Reinforcement Learning for Wind Farm Control via Expert Demonstrations

arXiv cs.SY April 28, 2026

⚡Researchers cut years of RL training to 250k steps using wake model demos...

Deep Dive

A team led by Marcus Binder Nilsen from DTU Wind and Energy Systems has developed a method to accelerate reinforcement learning (RL) for wind farm control, addressing a key barrier to real-world deployment: slow training convergence, which could otherwise mean years of suboptimal power output. Their approach, detailed in a paper submitted to the Journal of Physics: Conference Series (Torque 2026), uses expert demonstrations from a steady-state wake model (PyWake) to pretrain both the actor and critic networks of a Soft Actor-Critic (SAC) agent. By imitating the decisions of an optimizer built on domain knowledge, the agent begins online training from a competent policy rather than from scratch.
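To illustrate the pretraining idea, here is a minimal behavior-cloning sketch in Python with NumPy. It is not the authors' implementation: the paper pretrains the neural actor and critic networks of a Soft Actor-Critic agent on PyWake demonstrations, whereas this sketch fits a linear policy to a made-up expert by minimizing mean squared error. The function `toy_expert` is a hypothetical stand-in for the wake-model optimizer, and all names, dimensions, and coefficients are assumptions chosen for illustration.

```python
import numpy as np


def toy_expert(obs):
    """Hypothetical stand-in for a wake-model optimizer (NOT PyWake).

    Maps observations (wind direction offset, wind speed), both scaled
    to [-1, 1], to normalized yaw setpoints for two upstream turbines.
    The coefficients are arbitrary and purely illustrative.
    """
    wd, ws = obs[:, 0], obs[:, 1]
    yaw_1 = 0.6 * wd - 0.2 * ws
    yaw_2 = -0.4 * wd + 0.1 * ws
    return np.stack([yaw_1, yaw_2], axis=1)


def behavior_clone(obs, actions, lr=0.1, epochs=500):
    """Fit a linear policy a = obs @ W + b by gradient descent on the MSE
    between predicted actions and expert actions."""
    n, obs_dim = obs.shape
    act_dim = actions.shape[1]
    W = np.zeros((obs_dim, act_dim))
    b = np.zeros(act_dim)
    for _ in range(epochs):
        err = obs @ W + b - actions      # prediction error per sample
        W -= lr * (obs.T @ err) / n      # gradient step on the weights
        b -= lr * err.mean(axis=0)       # gradient step on the bias
    return W, b


# Collect demonstrations by querying the expert on sampled conditions.
rng = np.random.default_rng(0)
obs = rng.uniform(-1.0, 1.0, size=(512, 2))
expert_actions = toy_expert(obs)

# Pretrain the policy, then compare against an all-zero (untrained) policy.
W, b = behavior_clone(obs, expert_actions)
bc_mse = float(np.mean((obs @ W + b - expert_actions) ** 2))
untrained_mse = float(np.mean(expert_actions ** 2))
print(f"imitation error after cloning: {bc_mse:.2e}")
print(f"imitation error before cloning: {untrained_mse:.2e}")
```

After fitting, the cloned policy tracks the toy expert closely, so online fine-tuning would start from the expert's behavior rather than from arbitrary actions. Critic pretraining, which the paper also performs, is omitted from this sketch.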

In experiments on a 2x2 wind farm simulation (WindGym), pretraining eliminated the costly initial learning phase. An untrained RL agent underperformed the simple greedy zero-yaw baseline by about 12%, whereas the pretrained agent matched baseline performance from the first step. During online fine-tuning, all configurations converged to similar performance within 250,000 environment steps, ultimately exceeding a lookup-table controller by approximately 7% in power after 500,000 steps. The work suggests that injecting domain knowledge via behavior cloning can make RL practical for wind farm control, potentially saving years of real-world training time and millions in lost energy revenue.
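Note that the two percentages are measured against different references: the 12% deficit is relative to the greedy zero-yaw baseline, while the 7% gain is relative to the lookup-table controller. The arithmetic can be sketched as follows, assuming the usual definition of relative change, (P - P_ref) / P_ref. The function name and the power values are illustrative assumptions, not figures from the paper.

```python
def relative_power_change(power, reference_power):
    """Relative change in power versus a reference controller, as a fraction."""
    if reference_power <= 0:
        raise ValueError("reference_power must be positive")
    return (power - reference_power) / reference_power


# Made-up farm power values in MW, chosen only to reproduce the percentages.
greedy_mw = 10.0       # greedy zero-yaw baseline
untrained_mw = 8.8     # untrained RL agent
lookup_mw = 10.5       # lookup-table controller
finetuned_mw = 11.235  # fine-tuned RL agent

untrained_vs_greedy = relative_power_change(untrained_mw, greedy_mw)
finetuned_vs_lookup = relative_power_change(finetuned_mw, lookup_mw)
print(f"untrained vs greedy: {untrained_vs_greedy:+.1%}")
print(f"fine-tuned vs lookup table: {finetuned_vs_lookup:+.1%}")
```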

Key Points

Pretraining with PyWake optimizer demonstrations eliminates the initial 12% power loss seen in untrained RL agents.
During online fine-tuning, all Soft Actor-Critic configurations converge to similar performance within 250,000 environment steps.
The final controller achieves ~7% power gain over a lookup-table controller after 500,000 steps.
Method uses behavior cloning to transfer domain knowledge from steady-state wake models to dynamic RL agents.

Why It Matters

Pretraining on wake-model demonstrations could spare wind farms years of suboptimal operation during RL training, enabling faster deployment and potentially avoiding millions in lost energy revenue.
