MUDY: New model boosts keyphrase extraction accuracy by 15%

arXiv cs.IR May 04, 2026

⚡ Researchers combine prompt scoring with self-attention to capture local context.

Deep Dive

A new paper from researchers Hyeongu Kang and Susik Yoon presents MUDY (Multi-Granular Dynamic Candidate Contextualization), a framework for unsupervised keyphrase extraction. Existing methods built on pre-trained language models (PLMs) tend to rank candidates by global semantic relevance and miss keyphrases whose importance is local to a specific subtopic. MUDY addresses this with two complementary components. The first is a prompt-based scorer that estimates a generation likelihood for each candidate, with candidate-aware weighting to reflect local context. The second is a self-attention-based scorer that uses multi-granular attention patterns from the PLM to assess each candidate's significance at both the document-wide and segment-specific levels. Evaluated on four real-world datasets, MUDY consistently outperforms existing state-of-the-art baselines across various top-k cutoff thresholds.
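To make the two-component idea concrete, here is a minimal Python sketch of how a prompt-likelihood score and a multi-granular attention score might be combined to rank candidates. It is not the paper's formulation: the function names, the length-normalized likelihood, the caller-supplied local weight, the equal averaging of document and segment attention, and the `alpha` mixing parameter are all assumptions chosen for illustration, and the token log-probabilities and attention matrix are toy values rather than PLM outputs.

```python
import math


def prompt_score(token_logprobs, local_weight=1.0):
    """Length-normalized generation likelihood, scaled by a local weight.

    Assumption: `local_weight` stands in for candidate-aware weighting;
    the article does not say how the paper computes it.
    """
    avg_logprob = sum(token_logprobs) / len(token_logprobs)
    return local_weight * math.exp(avg_logprob)


def attention_received(attn, span, sources):
    """Mean attention that the `sources` tokens pay to the tokens in `span`.

    attn[i][j] is the attention from token i to token j (rows sum to 1).
    """
    sources = list(sources)
    total = sum(attn[i][j] for i in sources for j in span)
    return total / len(sources)


def attention_score(attn, span, segment):
    """Average of document-wide and segment-specific attention (assumed mix)."""
    doc_level = attention_received(attn, span, range(len(attn)))
    seg_level = attention_received(attn, span, segment)
    return 0.5 * (doc_level + seg_level)


def rank_candidates(candidates, attn, alpha=0.5):
    """Rank candidates by a weighted sum of both scores (hypothetical rule)."""
    scored = []
    for c in candidates:
        p = prompt_score(c["logprobs"], c["weight"])
        a = attention_score(attn, c["span"], c["segment"])
        scored.append((alpha * p + (1 - alpha) * a, c["phrase"]))
    return [phrase for _, phrase in sorted(scored, reverse=True)]


# Toy document of four tokens split into two segments: [0, 1] and [2, 3].
attn = [
    [0.1, 0.6, 0.2, 0.1],
    [0.1, 0.7, 0.1, 0.1],
    [0.2, 0.2, 0.3, 0.3],
    [0.1, 0.1, 0.2, 0.6],
]
candidates = [
    {"phrase": "keyphrase extraction", "span": [1], "segment": [0, 1],
     "logprobs": [-0.5], "weight": 1.0},
    {"phrase": "attention", "span": [3], "segment": [2, 3],
     "logprobs": [-1.0], "weight": 1.0},
]
ranking = rank_candidates(candidates, attn)
print(ranking)
```

In this toy setup the segment-level term rewards a candidate that draws attention from its own segment even when the rest of the document largely ignores it, which is the intuition behind scoring at more than one granularity.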

The paper, accepted at SIGIR 2026, includes in-depth quantitative and qualitative analyses confirming the efficacy of its context-centric approach. By capturing both local and global saliency, MUDY enables more precise extraction of keyphrases that truly represent a document's content, even when topics shift across sections. The source code is publicly available for reproducibility, making it a practical tool for researchers and practitioners in information retrieval, text summarization, and content indexing.

Key Points

MUDY introduces prompt-based scoring with candidate-aware weighting to capture local contextual importance.
Self-attention scoring evaluates keyphrase significance at both document-wide and segment-specific granularity.
Outperforms state-of-the-art baselines on four real-world datasets across multiple top-k cutoff thresholds.
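
For readers unfamiliar with top-k evaluation, the sketch below shows one common way to score a ranked keyphrase list at several cutoffs. The article does not name the metric used in the paper, so F1@k with exact matching after lowercasing is an assumption here, and the function name and example phrases are hypothetical. Published evaluations often also stem phrases before matching, which this sketch omits.

```python
def f1_at_k(predicted, gold, k):
    """F1 between the top-k predicted phrases and the gold phrases.

    Assumption: exact string match after lowercasing and trimming.
    """
    top_k = [p.lower().strip() for p in predicted[:k]]
    gold_set = {g.lower().strip() for g in gold}
    if not top_k or not gold_set:
        return 0.0
    hits = len(set(top_k) & gold_set)
    if hits == 0:
        return 0.0
    precision = hits / len(top_k)
    recall = hits / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Made-up ranked predictions and gold keyphrases for one document.
predicted = ["keyphrase extraction", "language model", "attention",
             "dataset", "search"]
gold = ["keyphrase extraction", "attention", "unsupervised learning"]

for k in (1, 3, 5):
    print(k, round(f1_at_k(predicted, gold, k), 3))
```

Reporting several cutoffs matters because a method can look strong at k=1 and weak at k=5, or the reverse, depending on how well it orders the candidates below the top hit.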

Why It Matters

Better keyphrase extraction means more accurate document indexing, summarization, and search for professionals handling large text corpora.
