VirtualMLE: LLM Agent Automates Recommender Tuning with 60% Fewer Trials
Autonomous ML engineer uses reflection and memory to optimize recommenders faster than humans...
A new paper by Cao et al. introduces VirtualMLE, an LLM-agent framework that automates the labor-intensive process of tuning sequential recommendation (SR) models. Today, optimizing an SR model on a new dataset requires ML engineers to run trial-and-error experiments by hand. VirtualMLE replaces that with a closed-loop system: the agent executes an experiment, reflects on the results (analyzing patterns, errors, and gains), then stores concise heuristic feedback in a hierarchical memory. Over time, this memory becomes a knowledge base of effective tuning strategies that the agent can draw on in later trials.
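The execute, reflect, and store loop described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' implementation: the two-level memory layout, the names `HierarchicalMemory`, `tune`, `toy_experiment`, `toy_propose`, and `toy_reflect`, and the toy objective are all assumptions made here for illustration. In a real system, an LLM would stand in for the proposal and reflection steps, and the experiment would train and evaluate an actual SR model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional, Tuple

Config = Dict[str, float]
History = List[Tuple[Config, float]]


@dataclass
class HierarchicalMemory:
    """Assumed two-level memory: shared heuristics plus per-dataset notes."""
    general: List[str] = field(default_factory=list)
    per_dataset: Dict[str, List[str]] = field(default_factory=dict)

    def recall(self, dataset: str) -> List[str]:
        # Shared heuristics first, then notes specific to this dataset.
        return self.general + self.per_dataset.get(dataset, [])

    def store(self, dataset: str, note: str, transferable: bool = False) -> None:
        target = self.general if transferable else self.per_dataset.setdefault(dataset, [])
        target.append(note)


def tune(
    dataset: str,
    run_experiment: Callable[[Config], float],
    propose: Callable[[List[str], History], Config],
    reflect: Callable[[Config, float, float], str],
    memory: HierarchicalMemory,
    max_trials: int,
) -> Tuple[Optional[Config], float]:
    """Closed loop: propose a config, execute it, reflect, update memory."""
    best_config, best_score = None, float("-inf")
    history: History = []
    for _ in range(max_trials):
        config = propose(memory.recall(dataset), history)  # agent picks next trial
        score = run_experiment(config)                     # execution
        history.append((config, score))
        note = reflect(config, score, best_score)          # reflection
        memory.store(dataset, note)                        # memory update
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical stand-ins so the sketch runs without an LLM or a real model.
def toy_experiment(config: Config) -> float:
    # Toy objective that peaks at lr = 0.01; not a real recommender metric.
    return -abs(config["lr"] - 0.01)


def toy_propose(notes: List[str], history: History) -> Config:
    # Stand-in for the LLM: start at lr = 0.08 and halve it on each trial.
    if not history:
        return {"lr": 0.08}
    return {"lr": history[-1][0]["lr"] / 2}


def toy_reflect(config: Config, score: float, best_score: float) -> str:
    verdict = "improved on" if score > best_score else "did not improve on"
    return f"lr={config['lr']:g} {verdict} the previous best"


memory = HierarchicalMemory()
best_config, best_score = tune(
    "Beauty", toy_experiment, toy_propose, toy_reflect, memory, max_trials=5
)
print(best_config, len(memory.recall("Beauty")))
```

Marking a note as `transferable=True` would place it in the shared level, so it is also recalled for datasets the agent has not seen before. That is one simple way a memory like this could carry heuristics across projects.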
Evaluated on three Amazon SR benchmarks (Sports, Beauty, Toys) with two popular backbones, SASRec and HSTU, VirtualMLE reached competitive recommendation quality with about 60% fewer trials than manual tuning. Crucially, the cognition summaries (heuristic rules) extracted from previous datasets transferred well to unseen datasets, cutting search time by up to 40% in cross-domain scenarios. This suggests that LLM agents with reflection and memory can serve as practical virtual ML engineers, amortizing the cost of learning tuning heuristics across multiple projects. The code is publicly available, opening the door to wider adoption in both research and production environments.
- VirtualMLE uses an LLM-based agent with an execution → reflection → memory-update loop to automate SR model tuning.
- It achieved competitive accuracy on Amazon benchmarks (Sports, Beauty, Toys) using SASRec and HSTU with ~60% fewer trials than manual tuning.
- Heuristic summaries from prior datasets transfer to new datasets, accelerating optimization by up to 40%.
Why It Matters
Automates the painstaking manual tuning of recommender systems, slashing time and enabling transferable ML expertise.