Semantics-Aware Denoising: A PLM-Guided Sample Reweighting Strategy for Robust Recommendation
New method filters accidental clicks and clickbait using semantic analysis, boosting recommendation accuracy by 2.2%.
Researchers Xikai Yang, Yang Wang, Yilin Li, and Sebastian Sun developed SAID (Semantics-Aware Implicit Denoising), a framework that uses pre-trained language models (PLMs) such as BERT to analyze the text around each interaction. SAID compares a user's interest profile against an item's description and downweights semantically inconsistent clicks (for example, accidental taps or clickbait lures) during training. Because the method only modifies the loss function rather than the model architecture, it can be dropped into existing recommenders, and it achieved up to 2.2% higher AUC in tests, with the largest gains under high noise.
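The reweighting idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes the user profile and item description have already been embedded by a PLM, scores their cosine similarity, squashes the score into a (0, 1) sample weight, and uses that weight to scale a standard per-sample binary cross-entropy loss. The function names (`semantic_weights`, `weighted_bce`) and the sigmoid weighting scheme are hypothetical choices for the sketch.

```python
import numpy as np

def semantic_weights(user_emb, item_emb, tau=1.0):
    """Hypothetical sketch: turn PLM embedding similarity into sample weights.

    user_emb, item_emb: arrays of shape (batch, dim), assumed to come from a
    pre-trained language model encoding the user profile / item description.
    """
    u = user_emb / np.linalg.norm(user_emb, axis=1, keepdims=True)
    v = item_emb / np.linalg.norm(item_emb, axis=1, keepdims=True)
    sim = np.sum(u * v, axis=1)               # cosine similarity in [-1, 1]
    return 1.0 / (1.0 + np.exp(-sim / tau))   # sigmoid -> weight in (0, 1)

def weighted_bce(preds, labels, weights, eps=1e-7):
    """Per-sample binary cross-entropy scaled by the semantic weights."""
    preds = np.clip(preds, eps, 1.0 - eps)
    per_sample = -(labels * np.log(preds) + (1 - labels) * np.log(1 - preds))
    return float(np.mean(weights * per_sample))
```

A click whose item description points the opposite way from the user's profile gets a low weight, so it contributes little to the gradient; consistent clicks keep near-full weight. This matches the paper's framing that only the loss changes, since any base recommender can supply `preds`.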
Why It Matters
Makes streaming and e-commerce recommendations more reliable by filtering out misleading user interactions like clickbait.