GARD: New diffusion framework makes multi-view 3D reconstruction robust to real-world degradation

arXiv cs.CV May 27, 2026

⚡A team of 11 researchers introduces diffusion-based denoising directly in 3D feature space to handle noisy inputs.

Deep Dive

Multi-view 3D reconstruction models have made remarkable strides under ideal conditions, but real-world inputs are often degraded by noise, blur, compression, or lighting variations, which undermines their performance. A team of 11 researchers from Korea has introduced Geometry-Aware Representation Denoising (GARD), a diffusion-based framework that operates directly on the feature representations of an existing feed-forward 3D reconstructor. Prior approaches either clean the input images before reconstruction, which loses 3D context, or rely on post-processing. GARD instead uses diffusion modeling to denoise the reconstructor's internal geometry-aware features, preserving spatial relationships and recovering accurate scene geometry.

The framework consists of a diffusion denoiser attached to the feature encoder of a pre-trained multi-view reconstruction model. It trains a noise predictor that reverses feature corruption step-by-step, conditioned on multi-view consistency. Additionally, GARD includes an auxiliary RGB decoder that reconstructs clean images from the refined features, enabling simultaneous restoration of both 3D geometry and high-quality 2D imagery. Experiments on the Depth Anything 3 (DA3) benchmark demonstrate GARD's ability to maintain reconstruction accuracy under severe synthetic and real degradations, outperforming both baseline models and image-level denoising pipelines. The work opens a new path for deploying feed-forward 3D models in uncontrolled environments like robotics, autonomous driving, and AR/VR.
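The step-by-step reversal of feature corruption described above can be illustrated with a standard DDPM-style formulation. The sketch below is not GARD's implementation: the linear noise schedule, the function names (make_schedule, add_noise, reverse_step, denoise), the feature shape, and the oracle noise predictor are all assumptions chosen for illustration. In practice a learned network, conditioned on the other views, would stand in for the oracle.

```python
import numpy as np


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed choice, not taken from the paper)."""
    betas = np.linspace(beta_start, beta_end, num_steps)
    alphas = 1.0 - betas
    alpha_bars = np.cumprod(alphas)
    return betas, alphas, alpha_bars


def add_noise(features, t, alpha_bars, rng):
    """Forward process: x_t = sqrt(abar_t) * x_0 + sqrt(1 - abar_t) * eps."""
    eps = rng.standard_normal(features.shape)
    noisy = np.sqrt(alpha_bars[t]) * features + np.sqrt(1.0 - alpha_bars[t]) * eps
    return noisy, eps


def reverse_step(x_t, t, eps_pred, betas, alphas, alpha_bars, rng):
    """One DDPM ancestral step from x_t to x_{t-1}, given predicted noise."""
    mean = (x_t - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_pred) / np.sqrt(alphas[t])
    if t == 0:
        return mean  # no noise is added on the final step
    return mean + np.sqrt(betas[t]) * rng.standard_normal(x_t.shape)


def denoise(x_T, predict_noise, betas, alphas, alpha_bars, rng):
    """Run the full reverse chain; predict_noise(x, t) is any noise predictor."""
    x = x_T
    for t in reversed(range(len(betas))):
        x = reverse_step(x, t, predict_noise(x, t), betas, alphas, alpha_bars, rng)
    return x


def demo(seed=0):
    """Corrupt hypothetical features, then restore them with an oracle predictor."""
    rng = np.random.default_rng(seed)
    betas, alphas, alpha_bars = make_schedule()
    # Hypothetical feature tensor: (views, tokens, channels).
    clean = rng.standard_normal((4, 16, 8))
    noisy, _ = add_noise(clean, len(betas) - 1, alpha_bars, rng)

    # Oracle that knows the clean features; a trained network would replace it.
    def oracle(x, t):
        return (x - np.sqrt(alpha_bars[t]) * clean) / np.sqrt(1.0 - alpha_bars[t])

    restored = denoise(noisy, oracle, betas, alphas, alpha_bars, rng)
    return float(np.abs(restored - clean).max())
```

Calling demo() returns the largest absolute difference between the restored and clean features. With the oracle predictor, the final reverse step recovers the clean features up to floating-point error, which makes the sketch easy to sanity-check. A learned predictor would only approximate this.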

Key Points

GARD performs diffusion-based denoising inside the feature space of a feed-forward 3D reconstructor, not on input images, preserving geometry-awareness.
The framework simultaneously recovers 3D scene geometry and high-quality RGB images via an additional image decoder.
Tested on the Depth Anything 3 (DA3) benchmark, GARD shows robust performance against noise, blur, and compression artifacts common in real-world deployment.

Why It Matters

GARD helps bridge the gap between ideal training conditions and real-world deployment for 3D reconstruction in robotics, VR, and autonomous driving.
