Research & Papers

CS3: Efficient Online Capability Synergy for Two-Tower Recommendation

A new framework enhances two-tower models without sacrificing speed or accuracy.

Deep Dive

A team of researchers (authors include Lixiang Wang, Shaoyun Shi, Peng Wang, Wenjin Wu, and Peng Jiang; no affiliation is given) has introduced CS3 (Capability Synergy), an online framework for enhancing two-tower recommender systems. Two-tower models are widely used for large-scale candidate retrieval because they are efficient: the user and item sides are encoded separately and compared only at the end. That same isolation limits representation capacity, embedding-space alignment, and cross-feature modeling. Prior remedies such as late interaction or knowledge distillation often increase latency or complicate online learning. CS3 addresses these limits with three mechanisms: a cycle-adaptive structure that lets each tower revise its own output through adaptive feature denoising, cross-tower synchronization that improves alignment by making each tower aware of the other, and cascade model sharing that improves cross-stage consistency by reusing knowledge from downstream models.
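To make the trade-off concrete, here is a minimal sketch of plain two-tower retrieval in pure Python. It is not the CS3 method and not the authors' code: the toy one-layer towers, the feature sizes, and the names `make_tower`, `item_index`, and `retrieve` are all assumptions made for illustration. It only shows why the baseline is fast (item embeddings are precomputed, so serving costs one user-tower pass plus dot products) and why it is limited (user and item features never meet before the final dot product).

```python
import math
import random


def make_tower(in_dim, out_dim, seed):
    """Build a toy tower (hypothetical): one linear layer, tanh, L2-normalize."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-1.0, 1.0) for _ in range(in_dim)]
               for _ in range(out_dim)]

    def tower(features):
        hidden = [math.tanh(sum(w * x for w, x in zip(row, features)))
                  for row in weights]
        norm = math.sqrt(sum(h * h for h in hidden)) or 1.0  # avoid divide-by-zero
        return [h / norm for h in hidden]

    return tower


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


# Two independent towers: each sees only its own side's features.
user_tower = make_tower(in_dim=4, out_dim=8, seed=0)
item_tower = make_tower(in_dim=3, out_dim=8, seed=1)

# Offline step: embed every candidate item once and keep the vectors.
rng = random.Random(42)
item_features = {f"item_{i}": [rng.random() for _ in range(3)]
                 for i in range(100)}
item_index = {item_id: item_tower(feats)
              for item_id, feats in item_features.items()}


def retrieve(user_features, k=5):
    """Online step: one user-tower pass, then dot products against the index."""
    user_vec = user_tower(user_features)
    scored = [(dot(user_vec, vec), item_id)
              for item_id, vec in item_index.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [(item_id, score) for score, item_id in scored[:k]]


top = retrieve([0.2, 0.9, 0.1, 0.5])
for item_id, score in top:
    print(item_id, round(score, 4))
```

In a production system the linear scan in `retrieve` would typically be replaced by an approximate nearest-neighbor index, but the structure is the same: because the only interaction between the towers is the final dot product, anything that adds cross-tower awareness has to do so without breaking the ability to precompute item embeddings.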

CS3 was evaluated offline on three public datasets and deployed online in a large-scale advertising system. In deployment it increased online ad revenue by up to 8.36% across three scenarios while keeping latency at the millisecond level, which makes it suitable for real-time serving. The framework is reported to be compatible with various two-tower architectures and to improve each of them consistently, without giving up the efficiency that makes two-tower retrieval practical. The paper is currently under review and available on arXiv (ID: 2604.22761).

Key Points
  • CS3 introduces three innovations: cycle-adaptive structure, cross-tower synchronization, and cascade model sharing.
  • Deployed in a large-scale advertising system, it boosts online ad revenue by up to 8.36%.
  • Maintains millisecond-level latency and works with various two-tower architectures.

Why It Matters

CS3 strengthens two-tower retrieval, with reported online ad revenue gains of up to 8.36%, while keeping the millisecond-level latency that real-time systems require.