New Anytime-Valid FC-RAG cuts LLM swarm bandwidth 14–57%
Statistical guarantees at every stopping time, with no extra assumptions.
A new paper on arXiv (2605.29139) from Dubey and Huo extends Federated Conformal RAG (FC-RAG) to provide anytime-valid coverage for bandwidth-limited swarms of weak language models. The original FC-RAG offered distribution-free coverage only at a fixed horizon; the new Anytime-FC-RAG guarantees validity at every stopping time, even under adaptive controls such as recalibration, per-node bandwidth escalation, and distilled-student refresh, all without extra assumptions. The key innovation is a summable per-step calibration-deviation budget that converts the marginal bound into a strict conditional bound on a calibration-good event, paired with a truncated betting e-process that is a nonnegative supermartingale on the entire probability space.
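To make the betting e-process concrete, here is a minimal Python sketch of a generic one, not the paper's construction. It assumes a stream of 0/1 miscoverage indicators, a target miscoverage rate `alpha`, and a fixed betting fraction; the function name `betting_eprocess`, its parameters, and the choice to "truncate" by clipping the bet are all illustrative assumptions. If the conditional miscoverage rate never exceeds `alpha`, each factor has expectation at most 1, so the running product is a nonnegative supermartingale.

```python
def betting_eprocess(miscovered, alpha=0.1, delta=0.05, bet=2.0):
    """Track a betting e-process over 0/1 miscoverage indicators.

    Returns (path, alarm_step): the wealth after each step and the first
    1-indexed step at which wealth reaches 1/delta (None if it never does).
    All names and defaults here are hypothetical, chosen for illustration.
    """
    # Assumed truncation: clip the bet into [0, 0.5/alpha] so that every
    # factor 1 + lam * (x - alpha) stays strictly positive for x in {0, 1}.
    lam = min(max(bet, 0.0), 0.5 / alpha)

    wealth = 1.0          # the e-process starts at 1
    path = []
    alarm_step = None
    for t, x in enumerate(miscovered, start=1):
        # Wealth grows on a miss (x = 1) and shrinks on a cover (x = 0).
        wealth *= 1.0 + lam * (x - alpha)
        path.append(wealth)
        if alarm_step is None and wealth >= 1.0 / delta:
            alarm_step = t
    return path, alarm_step


if __name__ == "__main__":
    # Five consecutive misses: wealth multiplies by 2.8 per step.
    path, alarm = betting_eprocess([1, 1, 1, 1, 1])
    print(alarm, [round(w, 3) for w in path])
```

With the defaults, each miss multiplies the wealth by 2.8, so a run of misses crosses the 1/δ = 20 threshold on the third step, while a run of covers shrinks the wealth by a factor of 0.8 per step and never raises an alarm.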
The paper provides four theoretical guarantees: time-uniform alarm validity (the probability that the e-process ever exceeds 1/δ is bounded by δ + calibration error), a Hoeffding-stitched cumulative miscoverage envelope at the same total budget, safety under any predictable controller, and training-side error propagation across unbounded Federated Probe-Logit Distillation (FPLD) refreshes via a summable training budget. Experiments on GPT-2-small + MiniLM swarms across MMLU, DBpedia, and AG News confirm the predicted alarm rate, detection delay, and envelope coverage, while achieving 14–57% bandwidth savings over fixed-high-bandwidth schedules.
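The shape of the alarm bound, δ plus a calibration term, can be illustrated with a standard argument. This is a sketch in assumed notation and not the paper's proof, which may proceed differently: $E_t$ is the e-process, $G$ is the calibration-good event, and $\varepsilon_t$ is the per-step calibration-deviation budget. Ville's inequality bounds the crossing probability of a nonnegative supermartingale started at 1, and a union bound over the steps at which calibration could fail contributes the summable term.

```latex
% Ville's inequality for a nonnegative supermartingale with E_0 = 1:
\Pr\bigl(\exists\, t \ge 1 : E_t \ge 1/\delta\bigr) \le \delta .

% One generic route to a "delta + calibration error" bound
% (assumed decomposition over the good event G and its complement):
\Pr(\text{alarm}) \le \Pr(\text{alarm} \cap G) + \Pr(G^{c})
                  \le \delta + \sum_{t \ge 1} \varepsilon_t .

% Example of a summable budget with total mass \varepsilon:
\varepsilon_t = \frac{6\,\varepsilon}{\pi^{2} t^{2}},
\qquad \sum_{t \ge 1} \varepsilon_t = \varepsilon .
```

The point of a summable budget is that the calibration term stays finite no matter how long the swarm runs, which is what allows the guarantee to hold at every stopping time instead of at one fixed horizon.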
- Anytime-FC-RAG guarantees coverage validity at every stopping time, not just a fixed horizon.
- Achieves 14–57% bandwidth savings by escalating retrieval only when the e-process triggers a warning.
- Provides four guarantees: time-uniform alarm validity, miscoverage envelope, safety under adaptive controllers, and training-error propagation.
Why It Matters
Enables efficient, reliable retrieval-augmented generation from lightweight LLM swarms with provable statistical guarantees.