Research & Papers

Domain-Adaptive Model Merging across Disconnected Modes

A new data-free method combines specialized models, preserving rare knowledge while avoiding privacy risks.

Deep Dive

A research team led by Junming Liu has introduced DMM (Domain-Adaptive Model Merging), a framework designed to solve a critical challenge in distributed AI: learning from data that cannot be centralized. In fields like healthcare and finance, privacy regulations and the heterogeneity of local data often prevent pooling information into a single training set for a comprehensive model. DMM offers an alternative: it merges the knowledge of multiple, independently trained specialist models into one unified model, eliminating the need to share sensitive raw data and sharply reducing the computational cost of retraining.

DMM operates through a three-step pipeline. First, domain-specific models are trained completely independently on their local, disconnected datasets. Second, models exhibiting high architectural or task similarity are merged using established techniques, providing a stable foundation. The third and most innovative step tackles highly divergent models: DMM synthesizes 'pseudo-data', artificial samples generated from the normalization statistics stored in the existing models, and uses it to drive a lightweight refinement that distills the rare but critical knowledge of the divergent models into the merged core. This preserves specialized capabilities without compromising the stability of the unified model.
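To make steps two and three concrete, the sketch below shows one plausible PyTorch realization under stated assumptions: the specialists share an architecture containing BatchNorm layers, similar models are merged by plain parameter averaging, and pseudo-data is synthesized by optimizing random inputs until their batch statistics match the models' stored running statistics (the strategy used by data-free inversion methods such as DeepInversion). All function names here are illustrative, not from the paper, and DMM's actual merging rules and refinement objective may differ.

    # Illustrative sketch only; not the authors' implementation.
    import torch
    import torch.nn.functional as F

    def average_merge(models):
        """Step 2 (sketch): merge similar models by uniform parameter
        averaging, one of the established techniques the paper builds on."""
        merged = models[0]
        state = {k: torch.stack([m.state_dict()[k].float() for m in models]).mean(0)
                 for k in merged.state_dict()}
        merged.load_state_dict(state)  # copy_ casts back to original dtypes
        return merged

    def synthesize_pseudo_data(model, num_samples, shape, steps=200, lr=0.1):
        """Step 3a (sketch): optimize random inputs so their per-layer batch
        statistics match the BatchNorm running statistics stored in a
        divergent model. Assumes each BN module fires once per forward pass,
        in modules() order."""
        for p in model.parameters():      # only the inputs are optimized
            p.requires_grad_(False)
        x = torch.randn(num_samples, *shape, requires_grad=True)
        opt = torch.optim.Adam([x], lr=lr)

        bns = [m for m in model.modules() if isinstance(m, torch.nn.BatchNorm2d)]
        feats = []
        hooks = [bn.register_forward_hook(lambda _m, inp, _out: feats.append(inp[0]))
                 for bn in bns]

        model.eval()
        for _ in range(steps):
            feats.clear()
            opt.zero_grad()
            model(x)
            loss = x.new_zeros(())
            for bn, f in zip(bns, feats):
                mu = f.mean(dim=(0, 2, 3))
                var = f.var(dim=(0, 2, 3), unbiased=False)
                loss = loss + F.mse_loss(mu, bn.running_mean) \
                            + F.mse_loss(var, bn.running_var)
            loss.backward()
            opt.step()

        for h in hooks:
            h.remove()
        return x.detach()

    def distill(student, teacher, pseudo_data, epochs=5, lr=1e-4, T=2.0):
        """Step 3b (sketch): lightweight refinement that distills the
        divergent teacher's knowledge into the merged student using only
        the synthesized pseudo-data."""
        opt = torch.optim.Adam(student.parameters(), lr=lr)
        teacher.eval()
        for _ in range(epochs):
            opt.zero_grad()
            with torch.no_grad():
                t_logits = teacher(pseudo_data)
            s_logits = student(pseudo_data)
            # Standard temperature-scaled KL distillation loss
            loss = F.kl_div(F.log_softmax(s_logits / T, dim=-1),
                            F.softmax(t_logits / T, dim=-1),
                            reduction="batchmean") * (T * T)
            loss.backward()
            opt.step()
        return student

Note that the sketch touches only model parameters and buffers: no raw training data is needed at any point, which is exactly the property the data-free setting requires.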

Extensive validation on both unimodal and multimodal benchmarks shows that DMM outperforms existing model merging methods, achieving state-of-the-art results. Its ability to handle 'disconnected modes' (models trained on vastly different data distributions or for different tasks) makes it particularly well suited to building robust, general-purpose AI from a federation of specialized experts. The work, accepted for presentation at ICASSP 2026, offers a practical path toward more collaborative and privacy-preserving AI development.

Key Points
  • Enables merging of AI models trained on decentralized, private data without sharing the raw data itself.
  • Uses pseudo-data synthesized from model normalization statistics to distill knowledge, preserving rare capabilities from divergent models.
  • Achieves state-of-the-art results on unimodal and multimodal benchmarks; the work has been accepted at ICASSP 2026.

Why It Matters

Enables collaborative AI development across organizations and sectors while strictly maintaining data privacy and security.