[D] Advice on sequential recommendations architectures
Why complex AI models are losing to a simple 'most clicked' baseline.
A developer reports that advanced Transformer decoder architectures, including GPT-2 variants, fail to outperform a naive baseline for sequential user action prediction. The task is to predict a key future action, such as a purchase, from a detailed tokenized sequence of user interactions (e.g., click, button color, location). Despite experiments with contrastive heads and weighted losses, recall@k shows no significant improvement over simply predicting each user's most-clicked item.
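To make the comparison concrete, here is a minimal sketch of the "most clicked" baseline being evaluated with recall@k. The session data, function names, and k value are illustrative assumptions, not the developer's actual pipeline:

```python
from collections import Counter

def most_clicked_baseline(history):
    # Rank items by how often this user clicked them, most-clicked first
    # (hypothetical baseline matching the post's description).
    return [item for item, _ in Counter(history).most_common()]

def recall_at_k(ranked_items, target, k):
    # 1.0 if the held-out next action is in the top-k predictions, else 0.0.
    return 1.0 if target in ranked_items[:k] else 0.0

# Toy sessions: (click history, held-out next item) -- fabricated for illustration.
sessions = [
    (["a", "b", "a", "c", "a"], "a"),
    (["x", "y", "y"], "z"),
]

scores = [recall_at_k(most_clicked_baseline(hist), target, k=2)
          for hist, target in sessions]
print(sum(scores) / len(scores))  # mean recall@2 over the toy sessions
```

A learned sequence model would be scored the same way, by substituting its ranked predictions for `most_clicked_baseline(hist)`; the post's finding is that doing so yields no measurable recall@k gain.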
Why It Matters
This highlights a recurring cost-benefit gap: heavyweight sequence models add substantial training and serving complexity without beating a trivial heuristic, raising questions about over-engineered AI solutions in practical recommendation systems.