Knowledge graphs help small LLMs but hurt large ones in zero-shot classification

Adding per-article knowledge graphs boosts small LLMs but hurts large ones, while self-consistency decoding raises compute cost roughly fivefold without improving performance.

Deep Dive

Shahana Akter and colleagues propose a zero-shot multi-label topic classification framework that operates without labeled training data. The base framework includes four variants: article-only classification, keyword-enhanced classification, and a self-consistency decoding variant of each. They then augment each variant with a per-article knowledge graph, extracted as subject-predicate-object triples by a KGGen-like pipeline. This yields eight methods (four base, four graph-augmented), tested across 15 large language models (LLMs) and eight multi-label datasets from different domains.
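The eight-method grid can be sketched as three on/off choices (keywords, self-consistency, knowledge graph) applied to one prompt template. The following is a minimal illustration only: the function names, the prompt wording, and the triple formatting are hypothetical and are not taken from the paper.

```python
from itertools import product


def method_grid():
    """Enumerate the 2 x 2 x 2 = 8 method configurations.

    Hypothetical representation: each method is a dict of three flags.
    """
    return [
        {"keywords": kw, "self_consistency": sc, "graph": kg}
        for kg, sc, kw in product([False, True], repeat=3)
    ]


def build_prompt(article, labels, keywords=None, triples=None):
    """Assemble a zero-shot multi-label prompt (wording is an assumption)."""
    parts = [
        "Assign every applicable topic to the article.",
        "Candidate topics: " + ", ".join(labels),
    ]
    if keywords:
        # Keyword-enhanced variant: add extracted keywords to the prompt.
        parts.append("Keywords: " + ", ".join(keywords))
    if triples:
        # Graph-augmented variant: add subject-predicate-object triples.
        parts.append("Knowledge graph triples:")
        parts.extend(f"({s}; {p}; {o})" for s, p, o in triples)
    parts.append("Article: " + article)
    return "\n".join(parts)


if __name__ == "__main__":
    print(len(method_grid()), "methods")
    print(build_prompt(
        "The central bank raised interest rates on Tuesday.",
        ["economy", "politics", "sports"],
        keywords=["central bank", "interest rates"],
        triples=[("central bank", "raised", "interest rates")],
    ))
```

Treating the graph as an optional prompt section keeps the base and graph-augmented variants directly comparable, since everything else in the prompt stays the same.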

Among the base methods, keyword-enhanced classification (AK) performed best, with six of the 15 LLMs surpassing a sentence-encoder baseline. Graph augmentation, however, split the models by size: it improved small models but hurt large ones, suggesting that larger LLMs already encode sufficient relational knowledge from pretraining. Self-consistency decoding consistently failed to improve performance while increasing computational cost roughly fivefold. The study provides practical guidance on when knowledge graph augmentation is worth the investment for zero-shot classification.
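To see where the extra cost of self-consistency comes from, consider one common way to apply it to multi-label output: sample the model several times and keep the labels that a majority of samples agree on. The sketch below assumes five samples (in line with the roughly fivefold cost) and a simple majority rule. Both are assumptions for illustration, not details confirmed by the study, and the function name is hypothetical.

```python
from collections import Counter


def self_consistency_vote(samples, min_share=0.5):
    """Keep labels predicted in more than min_share of the sampled runs.

    samples: list of label collections, one per sampled model response.
    """
    # Count each label at most once per sample.
    counts = Counter(label for s in samples for label in set(s))
    threshold = min_share * len(samples)
    return sorted(label for label, c in counts.items() if c > threshold)


if __name__ == "__main__":
    # Five sampled responses mean five model calls for one article.
    samples = [
        {"politics", "economy"},
        {"politics"},
        {"politics", "sports"},
        {"economy", "politics"},
        {"politics", "economy"},
    ]
    print(self_consistency_vote(samples))  # ['economy', 'politics']
```

Each article needs one model call per sample, so the cost scales linearly with the number of samples whether or not the vote changes the final label set.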

Key Points
  • The framework has four base variants (article-only, keyword-enhanced, and a self-consistency variant of each) plus a knowledge-graph-augmented version of each.
  • Graph augmentation improves small LLMs but degrades large ones, plausibly because large models already encode relational knowledge from pretraining.
  • Self-consistency decoding adds roughly 5x compute cost without improving performance.

Why It Matters

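Zero-shot classification is vital for rapid NLP deployment because it needs no labeled training data. This study suggests that knowledge graph augmentation pays off for small models but not for large ones, and that self-consistency decoding is not worth its added cost.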