PEML method boosts multi-task LLM accuracy by up to 10.75%
Co-optimizing prompts and low-rank weights for single-model multi-task learning.
A team of researchers from multiple institutions has introduced PEML (Parameter-efficient Multi-task Learning), an approach to adapting a single large language model (LLM) to multiple tasks simultaneously. Existing parameter-efficient fine-tuning (PEFT) methods such as LoRA and Prefix Tuning are designed for single-task adaptation. PEML instead employs a neural architecture engineering method that jointly optimizes continuous prompts and low-rank adaptations of the model weights. This co-optimization lets the model leverage features shared across tasks while using significantly fewer total parameters than deploying a separate fine-tuned model for each task.
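To give a feel for the scale involved, the following is a minimal sketch of how many parameters are trained when a soft prompt and LoRA-style low-rank updates are tuned together. It is not the PEML architecture: the class `AdapterConfig`, its field names, and every number in the example (hidden size, layer count, rank, prompt length) are hypothetical values chosen for illustration, and the count assumes square `d x d` weight matrices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AdapterConfig:
    """Hypothetical settings for jointly tuned prompts and low-rank updates."""
    hidden_size: int       # model width d
    num_layers: int        # transformer layers that receive low-rank updates
    adapted_matrices: int  # weight matrices adapted per layer (e.g. query, value)
    lora_rank: int         # rank r of each low-rank update
    prompt_length: int     # number of trainable continuous prompt vectors


def lora_params(cfg: AdapterConfig) -> int:
    # Each adapted d x d matrix gets two factors, A (r x d) and B (d x r),
    # so it contributes 2 * d * r trainable parameters.
    per_matrix = 2 * cfg.hidden_size * cfg.lora_rank
    return cfg.num_layers * cfg.adapted_matrices * per_matrix


def prompt_params(cfg: AdapterConfig) -> int:
    # A continuous prompt is a (prompt_length x d) matrix of trainable vectors
    # prepended to the input embeddings.
    return cfg.prompt_length * cfg.hidden_size


def trainable_params(cfg: AdapterConfig) -> int:
    # Co-optimization trains both groups together; the base weights stay frozen.
    return lora_params(cfg) + prompt_params(cfg)


# Illustrative values only, not taken from the paper.
cfg = AdapterConfig(
    hidden_size=4096,
    num_layers=32,
    adapted_matrices=2,
    lora_rank=8,
    prompt_length=20,
)

print(f"low-rank parameters: {lora_params(cfg):,}")
print(f"prompt parameters:   {prompt_params(cfg):,}")
print(f"total trainable:     {trainable_params(cfg):,}")
```

Under these assumed settings the trainable set is a few million parameters, a small fraction of a multi-billion-parameter base model, which is the general reason PEFT-style methods are cheap to train and store.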
The paper reports extensive evaluations on four benchmark suites (GLUE, SuperGLUE, Massive Multitask Language Understanding (MMLU), and commonsense reasoning), comparing PEML against state-of-the-art multi-task methods including MTL-LoRA, MultiLoRA, C-Poly, and MoE. Results show an average accuracy improvement of up to 6.67%, with peak gains reaching 10.75% on individual tasks. The framework also consolidates resources: deploying a single multi-task model consumes far less memory and compute than deploying multiple single-task models, which makes it especially valuable for production environments with limited hardware.
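The resource-consolidation argument comes down to simple arithmetic, sketched below. The figures are assumptions for illustration rather than numbers from the paper: a 7-billion-parameter base model, 5 million adapter parameters, four tasks, and 2 bytes per parameter (16-bit weights). Only weight storage is counted; activations and caches are ignored.

```python
def weight_memory_gb(num_params: int, bytes_per_param: int = 2) -> float:
    """Memory needed to hold the weights, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


def separate_models_gb(base_params: int, num_tasks: int,
                       bytes_per_param: int = 2) -> float:
    # One fully fine-tuned copy of the base model per task.
    return num_tasks * weight_memory_gb(base_params, bytes_per_param)


def shared_model_gb(base_params: int, adapter_params: int,
                    bytes_per_param: int = 2) -> float:
    # One base model plus a single set of multi-task adapter parameters.
    return weight_memory_gb(base_params + adapter_params, bytes_per_param)


# Hypothetical deployment, not taken from the paper.
BASE_PARAMS = 7_000_000_000
ADAPTER_PARAMS = 5_000_000
NUM_TASKS = 4

separate = separate_models_gb(BASE_PARAMS, NUM_TASKS)
shared = shared_model_gb(BASE_PARAMS, ADAPTER_PARAMS)

print(f"separate single-task models: {separate:.2f} GB")
print(f"one shared multi-task model: {shared:.2f} GB")
print(f"reduction factor:            {separate / shared:.2f}x")
```

With these assumed numbers the shared deployment needs roughly a quarter of the weight memory, and the gap widens as the number of tasks grows.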
- Co-optimizes continuous prompts and low-rank weight adaptation for multi-task LLM fine-tuning.
- Achieves an average accuracy gain of up to 6.67% and a peak improvement of 10.75% on individual tasks across GLUE, SuperGLUE, MMLU, and commonsense reasoning.
- Outperforms existing methods such as MTL-LoRA, MultiLoRA, C-Poly, and MoE while enabling significant resource consolidation.
Why It Matters
PEML enables cost-effective multi-task LLM deployment with higher accuracy, reducing hardware needs for enterprises.