Q-Probe uses agentic probing to assess high-resolution image quality

New AI framework captures fine local degradations in high-res images while avoiding the cropping biases of existing zoom-in methods.

Deep Dive

Existing image quality assessment (IQA) models trained with reinforcement learning (RL) rely on coarse global views and miss subtle local degradations in high-resolution images. Emerging "Thinking with Images" paradigms add zoom-in mechanisms but suffer from spurious biases, such as treating the act of cropping as a sign of degradation or mistaking natural depth-of-field blur for an artifact. To address these limitations, researchers from multiple institutions propose Q-Probe, which they describe as the first agentic IQA framework designed to scale quality assessment to high resolution via context-aware probing. The system uses RL to train multimodal large language models (MLLMs) to actively probe local regions while maintaining global context.

Alongside Q-Probe, the authors introduce Vista-Bench, a new benchmark designed for fine-grained local degradation analysis in high-resolution settings. The framework employs a three-stage training paradigm that progressively aligns the model with human preferences while eliminating causal bias (the learned shortcut that a cropped view implies a degraded one) through a novel context-aware cropping strategy. According to the authors, extensive experiments show that Q-Probe achieves state-of-the-art performance in high-resolution scenarios while remaining strong across all resolution scales. The work is a step toward practical AI judging systems for high-resolution visual content.
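The summary does not spell out how the context-aware cropping strategy works, so the following is only an illustrative sketch of the general idea of probing a local region without discarding its surroundings. It assumes, hypothetically, that each probe is handed to the model as three views: the full image, a crop padded with surrounding context, and the tight local region. The names `Box`, `context_aware_box`, `probe_views`, and the `context_ratio` parameter are invented for this example and are not taken from Q-Probe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned pixel box; right and bottom are exclusive."""
    left: int
    top: int
    right: int
    bottom: int

    @property
    def width(self) -> int:
        return self.right - self.left

    @property
    def height(self) -> int:
        return self.bottom - self.top


def context_aware_box(region: Box, image_width: int, image_height: int,
                      context_ratio: float = 0.5) -> Box:
    """Expand `region` on every side by `context_ratio` of its own size,
    clamped to the image bounds. (Hypothetical padding rule, not the
    paper's actual strategy.)"""
    pad_x = int(region.width * context_ratio)
    pad_y = int(region.height * context_ratio)
    return Box(
        left=max(0, region.left - pad_x),
        top=max(0, region.top - pad_y),
        right=min(image_width, region.right + pad_x),
        bottom=min(image_height, region.bottom + pad_y),
    )


def probe_views(region: Box, image_width: int, image_height: int,
                context_ratio: float = 0.5) -> dict:
    """Return the three boxes a probe step might pass to the model:
    the whole image, the context-padded crop, and the tight region."""
    return {
        "global": Box(0, 0, image_width, image_height),
        "context": context_aware_box(region, image_width, image_height,
                                     context_ratio),
        "local": region,
    }


# Example: probe a 200x100 region near the top-left of a 4000x3000 image.
views = probe_views(Box(100, 100, 300, 200), 4000, 3000)
print(views["context"])
```

The intent of such a design would be that the model never sees a crop in isolation, so blur or truncation at the crop boundary can be checked against the wider scene before being scored as a defect.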

Key Points
  • First agentic IQA framework specifically designed for high-resolution images with fine-grained local degradation analysis.
  • Introduces Vista-Bench, a dedicated benchmark for evaluating local distortions in high-resolution settings.
  • Novel context-aware cropping strategy eliminates the "cropping-implies-degradation" bias that plagues existing zoom-in methods.

Why It Matters

Q-Probe could enable more accurate, less biased image quality assessment for high-resolution content in areas such as photography, medical imaging, and video streaming.