Research & Papers

Samsung's Ocean4Rec uses offline LLM profiles to lift VOD reranking NDCG@20 by up to 61.5%

No more LLM calls per request: precomputed OCEAN personality scores raise HR@20 by up to 67.3%

Deep Dive

A team from Samsung has published a paper on Ocean4Rec, a reranking approach for video-on-demand (VOD) recommender systems that avoids the latency and operational complexity of using LLMs at request time. Invoking an LLM per request to assess relevance complicates throughput planning, tail-latency control, and capacity isolation, so the system uses an LLM only offline. It materializes item OCEAN profiles (Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism) from content metadata, and builds user profiles in the same five-dimensional personality space via time-decayed aggregation of recently clicked and deep-linked items.
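As a rough illustration of the time-decayed aggregation described above, here is a minimal Python sketch. The summary does not give the paper's decay function, so the exponential half-life weighting, the default `half_life_days` value, and the names `build_user_profile` and `item_profiles` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Tuple

OCEAN_DIMS = 5  # Openness, Conscientiousness, Extraversion, Agreeableness, Neuroticism


def build_user_profile(
    events: List[Tuple[str, float]],        # (item_id, event time in seconds)
    item_profiles: Dict[str, List[float]],  # item_id -> precomputed 5-dim OCEAN vector
    now: float,                             # reference time in seconds
    half_life_days: float = 14.0,           # assumed value, not taken from the paper
) -> List[float]:
    """Weighted mean of item OCEAN vectors; a weight halves every half_life_days."""
    half_life_s = half_life_days * 86400.0
    total = [0.0] * OCEAN_DIMS
    weight_sum = 0.0
    for item_id, ts in events:
        vec = item_profiles.get(item_id)
        if vec is None:
            continue  # item has no offline profile yet; skip it
        age = max(0.0, now - ts)
        weight = 0.5 ** (age / half_life_s)  # assumed exponential decay
        for d in range(OCEAN_DIMS):
            total[d] += weight * vec[d]
        weight_sum += weight
    if weight_sum == 0.0:
        return [0.0] * OCEAN_DIMS  # cold-start user: neutral profile
    return [t / weight_sum for t in total]
```

Under this assumed scheme, an item clicked one half-life ago contributes half as much to the profile as an item clicked just now, and the result can be recomputed in a batch job without touching the request path.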

At serving time, Ocean4Rec performs a purely numeric reranking by joining precomputed item profiles, user profiles, base recommender scores, and catalog recency. This removes LLM inference from the request path entirely while still injecting content understanding derived from a large language model. Tested on anonymized Samsung Smart TV VOD logs with a same-candidate Top1000 temporal-holdout evaluation, Ocean4Rec achieved notable improvements: NDCG@20 rose 7.6% against an NCF generator and 61.5% against a LightGCN generator. Hit rate at 20 (HR@20) improved by 67.3% for LightGCN, though results were inconclusive for NCF due to sparse exact-item replay labels. The authors emphasize that the gains come on top of a strong industrial baseline (Base+Recency), making Ocean4Rec a practical, deployable option for high-volume, latency-sensitive VOD services.
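To make the request-time step concrete, the following Python sketch shows one way a purely numeric rerank over precomputed signals could look. The summary does not state the paper's scoring function, so the linear blend, the cosine similarity between user and item OCEAN vectors, the weights `w_base`, `w_recency`, and `w_ocean`, and the `Candidate` type are all hypothetical choices for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Candidate:
    item_id: str
    base_score: float       # score from the candidate generator (e.g. NCF or LightGCN)
    ocean: Sequence[float]  # item OCEAN profile, precomputed offline
    recency: float          # catalog recency feature, assumed scaled to [0, 1]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a > 0 and norm_b > 0 else 0.0


def rerank(
    candidates: List[Candidate],
    user_profile: Sequence[float],
    w_base: float = 1.0,     # assumed weights, not taken from the paper
    w_recency: float = 0.2,
    w_ocean: float = 0.5,
) -> List[Candidate]:
    """Sort candidates by an assumed linear blend of base, recency, and OCEAN match."""
    def score(c: Candidate) -> float:
        return (
            w_base * c.base_score
            + w_recency * c.recency
            + w_ocean * cosine(user_profile, c.ocean)
        )
    return sorted(candidates, key=score, reverse=True)
```

Setting `w_ocean=0.0` in this sketch reduces it to a base-plus-recency ordering, which is a convenient way to compare against a Base+Recency style baseline on the same candidate set. Everything here is arithmetic over values that already exist at request time, so no model inference is involved.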

Key Points
  • Ocean4Rec uses an LLM offline to generate OCEAN personality profiles from content metadata, then performs purely numeric reranking at request time with zero LLM calls
  • On Samsung Smart TV VOD logs, NDCG@20 improved by 61.5% for LightGCN and 7.6% for NCF, with HR@20 up 67.3% for LightGCN
  • Removing the LLM from the request path simplifies throughput planning, tail-latency control, and capacity isolation for production recommender systems
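For readers unfamiliar with the two metrics quoted above, here is a short Python sketch of HR@20 and NDCG@20 using the standard binary-relevance definitions. The paper's exact evaluation protocol is not described in this summary, so treating relevance as binary and the function names `hit_rate_at_k` and `ndcg_at_k` are assumptions.

```python
import math
from typing import Sequence, Set


def hit_rate_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 20) -> float:
    """1.0 if any held-out relevant item appears in the top k, else 0.0."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def ndcg_at_k(ranked: Sequence[str], relevant: Set[str], k: int = 20) -> float:
    """Binary-relevance NDCG: discounted gain divided by the best achievable gain."""
    dcg = sum(
        1.0 / math.log2(pos + 2)  # pos is 0-based, so rank 1 gets 1 / log2(2)
        for pos, item in enumerate(ranked[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / math.log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0
```

Both functions score a single user's ranked list; averaging over users gives the reported metric. The percentages quoted above are relative changes in these averages, so a 61.5% gain means the reranked NDCG@20 is 1.615 times the comparison value, not 61.5 points higher.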

Why It Matters

Eliminating LLM inference at request time enables richer recommendations at scale without sacrificing latency or operational predictability.