Task-Centric Personalized Federated Fine-Tuning of Language Models
New 'FedRouter' method improves generalization to unseen tasks by up to 136% (relative) and handles multi-task interference 6.1% better than existing approaches.
A research team led by Gabriel Talasso and Meghdad Kurmanji has introduced FedRouter, a novel approach to personalized federated learning (pFL) that fundamentally shifts the unit of personalization from the client to the task. Traditional pFL methods train a unique model on each client's data, an approach that struggles when that data mixes multiple, potentially interfering tasks (such as email writing and code generation) or when the model must generalize to new, unseen tasks. FedRouter addresses this by using lightweight adapters—small neural network modules attached to a base model—and clustering them by the tasks they are trained on, both locally within a client and globally across the federated network.
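The paper does not publish its clustering code here, but the core idea—grouping trained adapters into task clusters—can be sketched as follows. This is a minimal illustration, assuming each adapter is represented by its flattened weight vector and clustered with standard k-means; the function name `cluster_adapters` and the choice of representation are hypothetical, not from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_adapters(adapter_weights, n_clusters):
    """Group adapters into task clusters by flattening each adapter's
    weights into a vector and running k-means (illustrative sketch)."""
    X = np.stack([w.ravel() for w in adapter_weights])
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)
    return km.labels_, km.cluster_centers_

# Toy example: two well-separated groups of "adapters", standing in for
# adapters trained on two distinct tasks.
rng = np.random.default_rng(0)
group_a = [rng.normal(0.0, 0.1, size=(4, 4)) for _ in range(3)]
group_b = [rng.normal(5.0, 0.1, size=(4, 4)) for _ in range(3)]
labels, centers = cluster_adapters(group_a + group_b, n_clusters=2)
```

In a real system the adapters could instead be compared by their outputs or gradients rather than raw weights; the sketch only shows the clustering step itself.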
This task-centric clustering allows FedRouter to build specialized, robust models for each distinct task type. During inference, an 'evaluation router' mechanism intelligently directs a new input to the most appropriate adapter based on the learned clusters. In experiments, this architecture demonstrated significant resilience, achieving up to a 136% relative improvement in generalization to unseen tasks and performing 6.1% better in scenarios with intra-client task interference compared to existing pFL approaches. The method was evaluated on a multi-task dataset, showcasing its practical utility for real-world applications where user data is diverse and private.
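The routing step described above can be sketched in a few lines. This is an assumption-laden illustration: it supposes the router embeds an incoming sample and picks the task cluster whose center is nearest, then serves the adapter attached to that cluster. The function `route` and the use of Euclidean distance are hypothetical stand-ins for whatever the actual evaluation router does.

```python
import numpy as np

def route(sample_embedding, cluster_centers):
    """Return the index of the nearest task-cluster center, i.e. the
    specialized adapter that should handle this sample (sketch)."""
    dists = np.linalg.norm(cluster_centers - sample_embedding, axis=1)
    return int(np.argmin(dists))

# Toy cluster centers for two learned task clusters.
centers = np.array([[0.0, 0.0], [10.0, 10.0]])
chosen_near = route(np.array([0.5, -0.2]), centers)
chosen_far = route(np.array([9.0, 11.0]), centers)
```

At inference time, the selected index would simply determine which adapter is loaded onto the shared base model before generating a response.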
The work, published on arXiv, tackles two core weaknesses in current federated AI systems: poor generalization and task interference. By moving the focus to tasks, FedRouter enables more efficient and effective collaborative training of large language models (LLMs) across devices like phones and laptops without sharing raw data. This paves the way for AI assistants that can be highly personalized to a user's myriad needs without the performance degradation seen in current methods, all while maintaining strict data privacy.
- FedRouter personalizes models for tasks, not clients, using adapter clustering, improving generalization to unseen tasks by up to 136% (relative).
- It solves intra-client task interference—where multiple tasks (e.g., writing & coding) hurt performance—with 6.1% better results.
- An 'evaluation router' automatically directs test samples to the best specialized model, enabling robust multi-task AI on private data.
Why It Matters
Enables more powerful, personalized AI on private devices by solving key federated learning flaws in generalization and multi-task handling.