Space Network of Experts: Architecture and Expert Placement
Solar-powered space data centers could host LLMs; here's how to place experts across satellites.
A new paper from researchers at Xidian University and the Hong Kong University of Science and Technology, with collaborators from HKUST-GZ and the University of Hong Kong, introduces Space-XNet (Space Network of Experts). The framework tackles a critical challenge: deploying large language models (LLMs), specifically Mixture-of-Experts (MoE) architectures, across satellite networks with limited onboard compute and bandwidth. With space data centers becoming viable thanks to continuous solar energy harvesting, companies such as SpaceX and Google are actively pursuing the concept. Space-XNet addresses the resulting placement problem: partitioning model components across satellites in a way that reconciles a model architecture and a network topology that are fundamentally mismatched.
The proposed two-level placement strategy first handles layer placement by exploiting the ring-like communication pattern of autoregressive inference: each generated token passes through the layers in sequence, and the last layer's output loops back to the first layer to produce the next token. The satellite constellation is accordingly partitioned along the orbiting direction into subnets arranged on a ring, each hosting one MoE layer. The second level handles intra-layer expert placement by solving an optimization problem that maps experts with heterogeneous activation probabilities onto individual satellites within the same subnet. The derived strategy follows an intuitive principle: frequently activated experts should be placed on satellites with low expected routing latency. Experiments on a 1,000-satellite constellation show Space-XNet achieving at least a threefold latency reduction over conventional random and ablation-based placement baselines, bringing space-based LLM inference closer to practicality.
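To make the two levels concrete, here is a minimal Python sketch of the placement logic. It is not the paper's formulation: the actual intra-layer placement is derived from an optimization problem, whereas this stand-in uses a greedy matching that simply instantiates the stated principle, and every name here (`Satellite`, `assign_layers_to_subnets`, `place_experts`, the latency metric) is illustrative.

```python
from dataclasses import dataclass

@dataclass
class Satellite:
    sat_id: int
    expected_routing_latency_ms: float  # assumed metric: mean latency for routing
                                        # a token to this satellite within its subnet

def assign_layers_to_subnets(ring_ordered_sats: list[Satellite],
                             num_layers: int) -> list[list[Satellite]]:
    """Level 1: partition the constellation, pre-ordered along the orbiting
    direction, into num_layers contiguous subnets on a ring; subnet i hosts
    MoE layer i, and the last subnet feeds back to the first."""
    chunk = len(ring_ordered_sats) // num_layers
    return [ring_ordered_sats[i * chunk:(i + 1) * chunk] for i in range(num_layers)]

def place_experts(subnet: list[Satellite],
                  activation_probs: list[float]) -> dict[int, int]:
    """Level 2, greedy stand-in for the paper's optimization: match experts,
    from most to least frequently activated, to satellites from lowest to
    highest expected routing latency (wrapping if experts outnumber satellites)."""
    experts_hot_first = sorted(range(len(activation_probs)),
                               key=lambda e: activation_probs[e], reverse=True)
    sats_fast_first = sorted(subnet, key=lambda s: s.expected_routing_latency_ms)
    return {e: sats_fast_first[i % len(sats_fast_first)].sat_id
            for i, e in enumerate(experts_hot_first)}

# Toy usage: 8 satellites in one subnet, 8 experts with skewed activation rates.
subnet = [Satellite(i, expected_routing_latency_ms=float(10 + 3 * i)) for i in range(8)]
print(place_experts(subnet, [0.30, 0.20, 0.15, 0.12, 0.10, 0.06, 0.04, 0.03]))
# -> the 0.30-probability expert lands on the 10 ms satellite, and so on.
```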
- Two-level placement: layer assignment to orbital subnets, then expert placement by activation frequency.
- Achieves at least a 3x latency reduction over random and ablation-based strategies in a 1,000-satellite constellation.
- Framework targets MoE models, which are popular for scaling LLMs efficiently in resource-constrained environments (the gating sketch below shows why expert activations are skewed).
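Why activation probabilities are heterogeneous in the first place: an MoE layer routes each token to only the top-k of its experts, and a trained router favors some experts over others. The sketch below is the standard top-k gating computation, not code from the paper; the fixed bias that skews the logits is a stand-in for a trained router's preferences, and the sizes are toy numbers.

```python
import numpy as np

def top_k_gate(router_logits: np.ndarray, k: int = 2):
    """Select the k highest-scoring experts per token and softmax-normalize
    their scores into routing weights."""
    top = np.argsort(router_logits, axis=-1)[..., -k:]             # indices of the k best experts
    scores = np.take_along_axis(router_logits, top, axis=-1)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))  # stable softmax over the k scores
    weights /= weights.sum(axis=-1, keepdims=True)
    return top, weights  # only these k experts run for each token

# Simulate 1000 tokens routed over 8 experts; the bias fakes a trained router's skew.
rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 8)) + np.array([2.0, 1.0, 0.5, 0.0, 0.0, -0.5, -1.0, -2.0])
chosen, _ = top_k_gate(logits, k=2)
print(np.bincount(chosen.ravel(), minlength=8))  # heterogeneous per-expert activation counts
```

Those per-expert counts are exactly the activation frequencies that Space-XNet's second placement level exploits: the hot experts go to low-latency satellites.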
Why It Matters
Brings low-latency LLM inference to space data centers, enabling global AI services that depend far less on Earth-bound data-center infrastructure.