Research & Papers

Hyena Operator for Fast Sequential Recommendation

New research replaces attention with long convolutions built from Legendre polynomials, achieving linear scaling on long user histories.

Deep Dive

A research team led by Jiahao Liu and Lin Li has published HyenaRec, a sequential recommendation model that addresses the computational bottleneck of attention-based systems. While attention mechanisms deliver strong accuracy, their cost grows quadratically with sequence length, so the long user histories common on real-world platforms become prohibitively expensive to process. HyenaRec tackles this by pairing polynomial-based kernel parameterization with gated convolutions: its convolutional kernels are built from Legendre orthogonal polynomials, which provide a smooth, compact basis for modeling long-term temporal dependencies in user behavior.
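As a concrete illustration, the sketch below shows how a long-convolution kernel can be expressed in a Legendre basis: a handful of learnable coefficients defines a smooth kernel of arbitrary length. This is a minimal NumPy sketch under assumed sizes (an order-8 basis over a 512-step history); the function name and the exact parameterization are illustrative assumptions, not details from the paper.

    import numpy as np

    def legendre_kernel(coeffs, length):
        # Map kernel positions 0..length-1 onto [-1, 1], the interval
        # on which the Legendre polynomials form an orthogonal basis.
        t = np.linspace(-1.0, 1.0, length)
        # legval evaluates sum_k coeffs[k] * P_k(t) at every position,
        # so len(coeffs) numbers define a smooth kernel of any length.
        return np.polynomial.legendre.legval(t, coeffs)

    # Hypothetical sizes: an order-8 basis spanning a 512-step history.
    rng = np.random.default_rng(0)
    kernel = legendre_kernel(rng.standard_normal(8), length=512)
    print(kernel.shape)  # (512,)

The compactness is the point: the number of parameters is tied to the polynomial order, not to the sequence length, so the same few coefficients can cover a history of any span.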

The architecture uses a complementary gating mechanism to capture fine-grained, short-term behavioral bursts, creating a hybrid system that balances global temporal evolution with localized user interests. This construction allows the model to scale linearly with sequence length (O(n) complexity) instead of quadratically (O(n²)). Extensive experiments on multiple real-world datasets show HyenaRec consistently outperforms attention-based, recurrent, and other baseline models in ranking accuracy.
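The hybrid structure can be sketched in the same vein. One standard way to evaluate a long convolution efficiently is via the FFT at O(n log n) cost, far below attention's O(n²); for a 10,000-interaction history, attention scores on the order of 10⁸ position pairs. The gating branch below, a short convolution followed by a sigmoid, is a hypothetical stand-in for the paper's gating mechanism, not the authors' implementation.

    import numpy as np

    def fft_long_conv(x, kernel):
        # Zero-pad to 2n so circular FFT convolution becomes ordinary
        # linear convolution; cost is O(n log n) rather than O(n²).
        n = len(x)
        fx = np.fft.rfft(x, 2 * n)
        fk = np.fft.rfft(kernel, 2 * n)
        return np.fft.irfft(fx * fk, 2 * n)[:n]

    def gated_hybrid_block(x, long_kernel, gate_filter):
        # Global branch: a long convolution tracking slow temporal
        # evolution across the whole history.
        global_branch = fft_long_conv(x, long_kernel)
        # Local branch (hypothetical gate): a short convolution plus a
        # sigmoid that can emphasize recent behavioral bursts.
        local = np.convolve(x, gate_filter, mode="same")
        gate = 1.0 / (1.0 + np.exp(-local))
        # Element-wise gating lets short-term interest modulate the
        # long-range signal, the balance described above.
        return gate * global_branch

Element-wise gating is what keeps the whole block in the convolutional regime: there is no pairwise interaction between positions, so the cost stays governed by the convolution rather than by n² comparisons.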

Notably, HyenaRec trains up to 6 times faster than attention-based models, with the advantage most pronounced in long-sequence scenarios. The model maintains this efficiency without sacrificing accuracy, demonstrating that polynomial-based kernel parameterization is a viable, scalable alternative to attention for sequential recommendation. The work has been accepted to the ACM Web Conference 2026 (WWW '26), underscoring its relevance for large-scale recommendation systems.

Key Points
  • Replaces quadratic attention with linear-scaling long convolutions whose kernels are built from Legendre polynomials
  • Achieves up to 6x faster training while maintaining or improving ranking accuracy
  • Hybrid architecture balances long-term dependencies (polynomial kernels) with short-term bursts (gated convolutions)

Why It Matters

Enables efficient modeling of years-long user histories for platforms like Netflix or Amazon, making personalized recommendations scalable.