Scale framework uses deep RL to cut container scheduling decision time by up to 99%
New deep RL approach cuts container scheduling decision time by up to 99% at the edge while staying near-optimal.
Serverless edge computing demands rapid, resource-efficient container scheduling in heterogeneous, dynamic environments. Existing methods like Integer Linear Programming (ILP) are accurate but too slow for real-time decisions. Researchers from multiple institutions introduce Scale, a policy-based deep reinforcement learning (DRL) framework that balances system stability and performance under dynamic workloads. Scale incorporates service-level objective (SLO) constraints, end-to-end latency, and data locality directly into the scheduling decision process, enabling it to adapt to changing conditions without manual tuning.
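To make the idea of a policy-based scheduler concrete, here is a minimal sketch of a stochastic placement policy that scores candidate nodes on the three signals named above and samples a node from a softmax distribution. It is an illustration only, not Scale's actual architecture: the feature definitions, the linear scoring, the `weights` vector, and every field name (`latency_ms`, `slo_ms`, `site`, `data_site`) are assumptions made for this example. In a real DRL system the weights would be learned by a neural network trained with policy gradients.

```python
import math
import random


def softmax(scores):
    """Turn raw scores into a probability distribution (numerically stable)."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def node_features(node, request):
    """Hypothetical features: latency relative to the SLO, SLO met, data is local."""
    latency_ms = node["latency_ms"]
    slo_met = 1.0 if latency_ms <= request["slo_ms"] else 0.0
    data_local = 1.0 if node["site"] == request["data_site"] else 0.0
    # Latency enters negatively so that slower nodes score lower.
    return [-latency_ms / request["slo_ms"], slo_met, data_local]


def placement_probs(nodes, request, weights):
    """Linear score per node, then softmax over all candidate nodes."""
    scores = [
        sum(w * f for w, f in zip(weights, node_features(node, request)))
        for node in nodes
    ]
    return softmax(scores)


def sample_node(nodes, request, weights, rng):
    """Sample a placement from the policy instead of solving an optimization."""
    probs = placement_probs(nodes, request, weights)
    return rng.choices(nodes, weights=probs, k=1)[0]


if __name__ == "__main__":
    # Made-up nodes and request, purely for illustration.
    nodes = [
        {"name": "edge-a", "site": "a", "latency_ms": 20},
        {"name": "edge-b", "site": "b", "latency_ms": 80},
    ]
    request = {"slo_ms": 50, "data_site": "a"}
    chosen = sample_node(nodes, request, [1.0, 1.0, 1.0], random.Random(0))
    print(chosen["name"])
```

Each decision here is a single scoring pass plus a sample, which is the general reason a trained policy can answer far faster than a solver that searches for an optimum on every request.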
In extensive simulations using large-scale real-world datasets from Huawei Cloud, Scale produced solutions within a factor of 1.11 to 1.15 of a state-of-the-art ILP solver, which is close to optimal, while cutting decision-making time by up to 99%. This speedup makes Scale practical for real-time container scheduling at the edge, potentially reducing resource over-provisioning and unnecessary data movement. The work suggests that deep RL can replace slower optimization methods in latency-sensitive distributed systems.
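The two headline figures can be read with simple arithmetic, sketched below. The function names and all input numbers are hypothetical, and the sketch assumes the scheduling objective is a cost to be minimized, so a factor of 1.11 means a cost 11% above the ILP solver's. The summary gives no absolute costs or timings.

```python
def optimality_ratio(drl_cost, ilp_cost):
    """Ratio of the learned scheduler's cost to the ILP cost (1.0 = same quality)."""
    return drl_cost / ilp_cost


def time_reduction(baseline_seconds, new_seconds):
    """Fraction of decision time saved relative to the baseline."""
    return 1.0 - new_seconds / baseline_seconds


def reduced_time(baseline_seconds, reduction):
    """Decision time that remains after saving the given fraction."""
    return baseline_seconds * (1.0 - reduction)


if __name__ == "__main__":
    # Hypothetical ILP cost of 100 units; factors 1.11 and 1.15 give 111 and 115.
    print(optimality_ratio(111.0, 100.0), optimality_ratio(115.0, 100.0))
    # Hypothetical 10-second baseline; a 99% reduction leaves about 0.1 seconds.
    print(reduced_time(10.0, 0.99))
```

Put differently, an "up to 99%" reduction means the decision takes as little as one hundredth of the baseline time, while the solution costs roughly 11% to 15% more than the ILP result under the minimization assumption.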
- Scale uses policy-based deep reinforcement learning to jointly optimize SLO, latency, and data locality for container scheduling.
- Achieves near-optimal scheduling, within 1.11–1.15x of an ILP solver, in simulations on real-world Huawei Cloud datasets.
- Reduces scheduling decision time by up to 99%, enabling real-time container placement in dynamic edge environments.
Why It Matters
Real-time, near-optimal container scheduling at the edge could make serverless deployments more efficient and reduce wasted cloud resources.