CanonicalMerge: Order-Independent Cache Merging for Multi-Agent Reasoning
Byte-identical merged caches under any input permutation with CanonicalMerge
Prior work on multi-agent latent reasoning (Agent Primitives) used BagMerge, which concatenates KV-caches along the sequence axis with RoPE re-encoding. This operation is non-commutative, and the best input ordering varies unpredictably with the regime, latent-step budget, and model scale. Baquero and Brito reformulate the problem as one of convergent replicated state. Their method, CanonicalMerge, fixes the layout by content: ordering caches by mean K-norm at a middle layer renders the merged cache byte-identical under any input permutation. The state itself is a set of content-addressed latent fragments whose merge is set union. This makes it a state-based convergent replicated data type, or CvRDT (commutative, associative, idempotent, absorbing), with CanonicalMerge as its deterministic render.
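The content-based ordering described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy cache layout (`cache[layer][position]` holding one key vector), the helper names, and the SHA-256 tie-breaker for equal norms are all assumptions made for illustration, and value tensors and RoPE re-encoding are left out entirely.

```python
import hashlib
import itertools
import math
import struct

# Toy cache (assumed layout): cache[layer][position] is one key vector.
# Real KV-caches also hold values and need RoPE re-encoding after
# concatenation; both are omitted from this sketch.

def mean_k_norm(cache, layer):
    """Mean L2 norm of the key vectors at one layer."""
    keys = cache[layer]
    return sum(math.sqrt(sum(x * x for x in k)) for k in keys) / len(keys)

def to_bytes(cache):
    """Serialize a cache deterministically (little-endian float64, layer-major)."""
    flat = [x for layer in cache for key in layer for x in key]
    return struct.pack(f"<{len(flat)}d", *flat)

def bag_merge(caches):
    """Concatenate along the sequence axis in arrival order (order-dependent)."""
    n_layers = len(caches[0])
    return [[key for c in caches for key in c[layer]] for layer in range(n_layers)]

def canonical_merge(caches):
    """Concatenate in an order fixed by content, not by arrival."""
    mid = len(caches[0]) // 2
    # Primary key: mean K-norm at the middle layer. The content hash is an
    # assumed tie-breaker so that equal norms still yield one fixed order.
    ordered = sorted(
        caches,
        key=lambda c: (mean_k_norm(c, mid), hashlib.sha256(to_bytes(c)).hexdigest()),
    )
    return bag_merge(ordered)

# Three toy caches, 3 layers each, 2-dimensional keys.
a = [[[1.0, 0.0]], [[3.0, 4.0]], [[1.0, 1.0]]]
b = [[[0.0, 1.0], [2.0, 2.0]], [[1.0, 0.0], [0.0, 2.0]], [[5.0, 5.0], [6.0, 6.0]]]
c = [[[7.0, 7.0]], [[6.0, 8.0]], [[9.0, 9.0]]]

perms = [list(p) for p in itertools.permutations([a, b, c])]
canonical_outputs = {to_bytes(canonical_merge(p)) for p in perms}
bag_outputs = {to_bytes(bag_merge(p)) for p in perms}

print("distinct CanonicalMerge outputs:", len(canonical_outputs))
print("distinct BagMerge outputs:", len(bag_outputs))
```

Because the sort key depends only on each cache's contents, every permutation of the inputs reaches the same concatenation order and therefore the same bytes, whereas arrival-order concatenation yields a different layout for each permutation.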
On a partitioned-reasoning benchmark, CanonicalMerge matches the best BagMerge ordering in every regime-by-budget-by-ordering cell without knowing which order is best, trading a small, statistically insignificant accuracy margin for an unconditional structural guarantee. The behavior transfers to real multi-document QA (HotpotQA), where the closest training-free output-fusion baseline (PackLLM) loses by 45 points at matched budget, placing cache-level merging in a distinct regime from output-level fusion. At k>2, the approach transports and colocates latent traces but does not by itself compose them, motivating future work.
- CanonicalMerge orders KV-caches by mean K-norm at a middle layer, ensuring byte-identical results under any input permutation (verified for arity N≤5 and on real Qwen3-1.7B and 4B models).
- The cache state is treated as a CvRDT (commutative, associative, idempotent, absorbing) over a set of content-addressed fragments, so re-delivered duplicates are absorbed rather than re-concatenated.
- CanonicalMerge matches the best BagMerge ordering in all regimes without knowing the optimal order, and beats PackLLM by 45 points on HotpotQA.
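The CvRDT view in the points above can be sketched in a few lines of Python. This is an illustrative model only: fragments are plain byte strings, SHA-256 is an assumed content address, and `render` sorts by content id as a stand-in for the mean-K-norm ordering that CanonicalMerge uses. All function names are hypothetical.

```python
import hashlib

def fragment_id(payload: bytes) -> str:
    """Content address of a fragment (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(payload).hexdigest()

def make_state(*payloads: bytes) -> frozenset:
    """A replica state: a set of (content id, payload) pairs."""
    return frozenset((fragment_id(p), p) for p in payloads)

def merge(x: frozenset, y: frozenset) -> frozenset:
    """State-based merge is plain set union."""
    return x | y

def render(state: frozenset) -> bytes:
    """Deterministic render: a stand-in ordering by content id."""
    return b"".join(payload for _, payload in sorted(state))

s1 = make_state(b"frag-a", b"frag-b")
s2 = make_state(b"frag-b", b"frag-c")
s3 = make_state(b"frag-d")

# Commutative, associative, idempotent.
assert merge(s1, s2) == merge(s2, s1)
assert merge(merge(s1, s2), s3) == merge(s1, merge(s2, s3))
assert merge(s1, s1) == s1

# Re-delivering a state that was already merged changes nothing.
merged = merge(s1, s2)
assert merge(merged, s1) == merged

# Any delivery order renders to the same bytes.
assert render(merge(merge(s1, s2), s3)) == render(merge(s3, merge(s2, s1)))
```

Keying fragments by their content is what lets union deduplicate: the shared fragment `b"frag-b"` appears once in the merged state, whereas sequence-axis concatenation would append it a second time.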
Why It Matters
Eliminates order-tuning for multi-agent KV-cache merging, enabling deterministic, efficient reasoning across agents.