Risk-Aware and Stable Edge Server Selection Under Network Latency SLOs
A risk-aware decision framework tames tail latency and stops server thrashing in edge networks.
A team of researchers led by Mohan Liyanage has introduced a decision framework for dynamic edge server selection that explicitly addresses tail risk and switching stability in latency-critical applications. The framework characterizes each candidate server by a predictive mean and an uncertainty summary of its network latency, and uses these to estimate the risk of violating a service-level objective (SLO). Risk is evaluated with a tight Normal approximation, complemented by a conservative Cantelli bound. Percentile-based scoring coupled with hysteresis then stabilizes decisions and suppresses oscillatory switching under short-lived network fluctuations. The approach is designed to be lightweight and interpretable, making it suitable for real-time deployment in dynamic edge environments.
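To make the two risk estimates concrete, here is a minimal Python sketch. It assumes each server is summarized by a latency mean and standard deviation in seconds; the function names and the example numbers are hypothetical and are not taken from the paper, whose exact formulation may differ. The Normal estimate is the tail probability of a Gaussian above the SLO, and the Cantelli bound is the standard one-sided Chebyshev inequality, which holds for any distribution with that mean and variance.

```python
import math


def normal_violation_risk(mean: float, std: float, slo: float) -> float:
    """Estimate P(latency > slo), assuming latency ~ Normal(mean, std^2)."""
    if std <= 0:
        # Degenerate case: no uncertainty, so the outcome is deterministic.
        return 1.0 if mean > slo else 0.0
    z = (slo - mean) / std
    # Upper tail of the standard Normal: 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)).
    return 0.5 * math.erfc(z / math.sqrt(2))


def cantelli_violation_bound(mean: float, std: float, slo: float) -> float:
    """Distribution-free upper bound on P(latency >= slo) via Cantelli's inequality."""
    margin = slo - mean
    if margin <= 0:
        # The inequality is only informative when the SLO lies above the mean.
        return 1.0
    var = std * std
    return var / (var + margin * margin)


# Hypothetical server: mean 0.45 s, std 0.05 s, SLO 0.5 s.
tight = normal_violation_risk(0.45, 0.05, 0.5)
conservative = cantelli_violation_bound(0.45, 0.05, 0.5)
print(f"Normal estimate: {tight:.3f}, Cantelli bound: {conservative:.3f}")
```

For this hypothetical server the SLO sits one standard deviation above the mean, so the Normal estimate is about 0.16 while the Cantelli bound is 0.5. The gap shows why the two are complementary: the first is sharper when latency is roughly Gaussian, the second stays valid when it is not.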
Experimental results on a multi-server edge testbed with a strict SLO of 0.5 seconds show clear improvements over a mean-only baseline. The proposed approach reduces the deadline-miss rate from 39% to 34%, a drop of five percentage points, and cuts switching frequency from 46% to 5.5%, a relative decrease of 88%. This stability gain does not come at the cost of latency: the framework keeps average latency at approximately 0.45 seconds, below the SLO. These results indicate that explicit risk evaluation combined with stability-preserving control enables practical and robust adaptive server selection in edge computing, where network conditions are inherently variable and unpredictable.
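The drop in switching frequency comes from hysteresis: a server change is made only when a rival is better by a clear margin. The Python sketch below illustrates one way such a rule could look. It assumes a Normal latency model for the percentile score, and the function names, the 95th-percentile choice, and the 0.05 s margin are all hypothetical values chosen for illustration, not parameters reported by the authors.

```python
from statistics import NormalDist


def percentile_score(mean: float, std: float, q: float = 0.95) -> float:
    """q-th latency percentile under an assumed Normal model (lower is better)."""
    return mean + NormalDist().inv_cdf(q) * std


def select_server(current, stats, margin: float = 0.05, q: float = 0.95):
    """Pick a server, switching only when the gain exceeds the hysteresis margin.

    stats maps a server name to a (mean, std) pair of latency in seconds.
    current is the server in use, or None if no server has been chosen yet.
    """
    scores = {name: percentile_score(m, s, q) for name, (m, s) in stats.items()}
    best = min(scores, key=scores.get)
    if current not in scores:
        # No valid incumbent: take the best-scoring candidate.
        return best
    if scores[current] - scores[best] > margin:
        # The challenger is better by more than the margin, so switch.
        return best
    # Otherwise stay put, which absorbs short-lived fluctuations.
    return current


# Server "b" is only marginally better, so the incumbent "a" is kept.
print(select_server("a", {"a": (0.40, 0.05), "b": (0.39, 0.05)}))
# Server "b" is better by a wide margin, so the selection switches.
print(select_server("a", {"a": (0.60, 0.05), "b": (0.40, 0.05)}))
```

A larger margin suppresses more switches but reacts more slowly to a sustained degradation of the current server, so in practice the margin would be tuned against the SLO and the observed latency variability.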
- Reduces deadline-miss rate from 39% to 34% under a 0.5s SLO on a multi-server edge testbed.
- Cuts server switching frequency by 88% (from 46% to 5.5%) using hysteresis-based stability control.
- Combines Normal approximation and Cantelli bound for tail risk estimation, keeping average latency at ~0.45s.
Why It Matters
This framework makes edge server selection practical for real-time apps by balancing latency risk with stability, reducing costly oscillations.