PanoAffordanceNet: Towards Holistic Affordance Grounding in 360° Indoor Environments
New framework tackles distorted panoramic views to let robots understand what objects are for in a full room.
A research team from Zhejiang University and other institutions has published PanoAffordanceNet, a novel AI framework designed to solve 'holistic affordance grounding' in 360° indoor spaces. Unlike current computer vision models that identify objects in narrow perspective views, this system aims to give embodied AI agents, such as future home robots, a global understanding of a room. It identifies not just what objects are, but what actions they afford (e.g., a chair is 'sit-able,' a handle is 'grasp-able') across the entire panoramic scene. This task is uniquely challenging due to severe distortions in 360° equirectangular images and the difficulty of aligning sparse, scattered object data into a coherent spatial map.
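The distortion the team targets is inherent to the equirectangular projection: each image row spans a spherical band whose true area shrinks with the cosine of its latitude, so content near the top and bottom of the panorama is strongly stretched. Below is a minimal sketch (ours, not from the paper) of the standard per-row weighting that quantifies this over-representation:

```python
import numpy as np

def equirect_latitude_weights(height: int) -> np.ndarray:
    """Per-row solid-angle weights for an equirectangular image.

    A row of pixels at latitude phi covers a spherical band whose area is
    proportional to cos(phi), so rows near the poles are stretched far
    beyond their true extent. This is the latitude-dependent distortion
    that panoramic models must compensate for.
    """
    # Map row centers to latitudes in (-pi/2, pi/2); row 0 is the top (north pole).
    phi = (0.5 - (np.arange(height) + 0.5) / height) * np.pi
    return np.cos(phi)

weights = equirect_latitude_weights(512)
print(weights[0], weights[256], weights[-1])  # ~0 at the poles, ~1 at the equator
```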
To overcome these hurdles, PanoAffordanceNet introduces two key technical innovations: a Distortion-Aware Spectral Modulator (DASM) that performs latitude-dependent calibration to correct warped object shapes, and an Omni-Spherical Densification Head (OSDH) that reconstructs a continuous, topologically correct scene from initially sparse activations. The model is trained with a multi-level constraint system combining pixel-wise, distributional, and region-text contrastive objectives, which helps maintain semantic accuracy even with limited training data. Crucially, the team also constructed and will release '360-AGD,' the first high-quality panoramic dataset specifically for affordance grounding, providing an essential benchmark for future research. Extensive experiments show the framework significantly outperforms existing methods adapted to this new task.
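The summary says only that DASM performs latitude-dependent calibration, not how. One plausible reading, sketched below purely as an illustration, applies a learnable gain to each feature row's horizontal frequency spectrum so the correction varies with latitude; the class name and all design details here are our assumptions, not the published architecture.

```python
import torch
import torch.nn as nn

class LatitudeSpectralModulator(nn.Module):
    """Hypothetical stand-in for DASM's latitude-dependent calibration.

    Assumed design: a learnable complex gain is applied to each feature
    row's horizontal frequency spectrum, so the correction applied can
    differ from row to row (i.e., with latitude).
    """

    def __init__(self, height: int, width: int):
        super().__init__()
        n_freq = width // 2 + 1  # rfft bins along the horizontal axis
        # One complex gain per (row, frequency), initialized to the identity.
        self.gain = nn.Parameter(torch.ones(height, n_freq, dtype=torch.cfloat))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (B, C, H, W) equirectangular feature map.
        spec = torch.fft.rfft(x, dim=-1)     # (B, C, H, W//2 + 1), complex
        spec = spec * self.gain              # broadcast over batch and channels
        return torch.fft.irfft(spec, n=x.shape[-1], dim=-1)

dasm = LatitudeSpectralModulator(height=64, width=128)
out = dasm(torch.randn(2, 32, 64, 128))      # shape preserved: (2, 32, 64, 128)
```

The three training constraints are likewise named but not formalized in the summary. A common instantiation, again an assumption rather than the authors' exact objective, pairs a pixel-wise BCE term with a KL divergence between normalized heatmaps and an InfoNCE-style region-text contrastive term:

```python
import torch
import torch.nn.functional as F

def multi_level_loss(pred, target, region_emb, text_emb, tau=0.07):
    """Illustrative pixel-wise + distributional + region-text contrastive loss.

    Assumed inputs: pred/target are (B, H, W) predicted logits and ground-truth
    affordance maps in [0, 1]; region_emb/text_emb are (N, D) L2-normalized
    embeddings where row i of each forms a matched region-text pair.
    """
    # Pixel-wise: per-location agreement with the ground-truth map.
    pixel = F.binary_cross_entropy_with_logits(pred, target)

    # Distributional: match where the activation mass sits across the scene.
    p = F.log_softmax(pred.flatten(1), dim=1)
    q = target.flatten(1)
    q = q / q.sum(dim=1, keepdim=True).clamp_min(1e-8)
    dist = F.kl_div(p, q, reduction="batchmean")

    # Region-text contrastive: InfoNCE over matched region/text pairs.
    logits = region_emb @ text_emb.t() / tau
    labels = torch.arange(logits.shape[0], device=logits.device)
    contrast = F.cross_entropy(logits, labels)

    return pixel + dist + contrast  # per-term weights omitted for brevity
```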
- Introduces the novel task of Holistic Affordance Grounding in 360° spaces, moving beyond object-centric, perspective-view analysis.
- Features a Distortion-Aware Spectral Modulator (DASM) to correct panoramic image warping and an Omni-Spherical Densification Head (OSDH) for scene continuity.
- Includes the release of the first panoramic affordance grounding dataset, 360-AGD, establishing a new benchmark for embodied AI research.
Why It Matters
This is foundational tech for next-gen robots that need to understand and interact with complex human environments, not just recognize isolated objects.