New self-supervised depth prediction technique outperforms prior self-supervised methods, targeting low-texture scenes

arXiv eess.IV May 12, 2026

⚡ Distance transform over pre-semantic contours improves depth accuracy in uniform regions across five benchmarks

Deep Dive

Self-supervised monocular depth estimation (MDE) struggles in low-texture regions because photometric losses become ambiguous there: when neighboring pixels look alike, many different depth hypotheses produce nearly the same reconstruction error. A new paper by Marwane Hariat, Antoine Manzanera, and David Filliat tackles this by applying a distance transform over pre-semantic contours, which are edge maps extracted before semantic classification. The resulting map has higher variance in uniform areas, which makes the training losses more discriminative there. The network jointly learns pre-semantic contours, depth, and ego-motion, and the authors give a theoretical argument that the distance transform is the optimal choice for this variance augmentation.
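
To make the variance argument concrete, here is a minimal toy sketch, not the authors' implementation. It assumes a hand-made 7x7 scene (a uniform wall inside a darker frame) and a hypothetical contour map that simply marks the frame pixels; the names `distance_transform`, `shift_error`, and `on_frame` are illustrative and do not come from the paper. The sketch compares raw intensities with a brute-force Euclidean distance transform of the contour map, first by variance inside the wall and then by the error caused by a one-pixel shift.

```python
import math
from statistics import pvariance


def distance_transform(contours):
    """Brute-force Euclidean distance from every pixel to its nearest contour pixel."""
    height, width = len(contours), len(contours[0])
    edges = [(r, c) for r in range(height) for c in range(width) if contours[r][c]]
    if not edges:
        raise ValueError("contour map contains no contour pixels")
    return [
        [min(math.hypot(r - er, c - ec) for er, ec in edges) for c in range(width)]
        for r in range(height)
    ]


def shift_error(row, shift):
    """Mean absolute difference between a row and itself shifted by `shift` pixels."""
    pairs = list(zip(row[:len(row) - shift], row[shift:]))
    return sum(abs(a - b) for a, b in pairs) / len(pairs)


SIZE = 7  # toy resolution, chosen for illustration only


def on_frame(r, c):
    """True for pixels on the one-pixel border of the toy scene."""
    return r in (0, SIZE - 1) or c in (0, SIZE - 1)


# Toy scene: a uniform wall (0.5) inside a one-pixel darker frame (0.1).
image = [[0.1 if on_frame(r, c) else 0.5 for c in range(SIZE)] for r in range(SIZE)]
# Hypothetical contour map: the frame pixels are simply declared to be contours.
contours = [[on_frame(r, c) for c in range(SIZE)] for r in range(SIZE)]
dt = distance_transform(contours)

# Variance of each signal over the wall pixels only.
interior = [(r, c) for r in range(1, SIZE - 1) for c in range(1, SIZE - 1)]
var_image = pvariance([image[r][c] for r, c in interior])
var_dt = pvariance([dt[r][c] for r, c in interior])

# Compare the wall pixels of the middle row against a 1-pixel horizontal shift.
mid = SIZE // 2
err_image = shift_error(image[mid][1:-1], 1)
err_dt = shift_error(dt[mid][1:-1], 1)

print(f"variance inside wall: image={var_image:.2f}, distance map={var_dt:.2f}")
print(f"1-pixel shift error:  image={err_image:.2f}, distance map={err_dt:.2f}")
```

In this toy scene the raw intensities inside the wall have zero variance, and a one-pixel misalignment costs nothing, which is the ambiguity a photometric loss faces. The distance map varies across the same pixels and penalizes the shift. The brute-force loop is only for readability; on real images an optimized routine such as `scipy.ndimage.distance_transform_edt` would be the usual choice.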

Extensive experiments on five major datasets (KITTI, Cityscapes, Waymo, NYUv2, and ScanNet) show the method surpasses all compared self-supervised techniques. The approach is particularly robust on indoor scenes (NYUv2) and autonomous driving benchmarks (KITTI, Waymo), where low-texture surfaces like walls or roads previously caused errors. This work offers a practical, label-free solution for improving depth perception in real-world applications.

Key Points

Applies distance transform on pre-semantic contours to boost depth prediction in low-texture areas
Jointly estimates contours, depth, and ego-motion in a single self-supervised framework
Outperforms competing self-supervised methods on KITTI, Cityscapes, Waymo, NYUv2, and ScanNet

Why It Matters

Enables more reliable depth estimation for autonomous driving, robotics, and AR without needing expensive labeled data.
