GaussianFlow SLAM: Monocular Gaussian Splatting SLAM Guided by GaussianFlow
New method tackles a core challenge in monocular SLAM by using optical flow as a geometric guide for 3D Gaussian Splatting.
A research team from KAIST has introduced GaussianFlow SLAM, a novel approach to Simultaneous Localization and Mapping (SLAM) that tackles a persistent challenge in computer vision and robotics. While 3D Gaussian Splatting (3DGS) has emerged as a powerful technique for creating dense, photo-realistic 3D maps from images, applying it to monocular (single-camera) SLAM has been problematic. The core issue is the lack of reliable geometric depth information from a single viewpoint, which can cause the system's optimization to fail, leading to distorted maps and inaccurate camera tracking.
GaussianFlow SLAM's key innovation is using optical flow—the pattern of apparent motion between consecutive video frames—as a geometry-aware supervisory signal. The system constrains the projected 2D motion of its 3D Gaussian primitives (termed 'GaussianFlow') to align with the observed optical flow. This alignment injects consistent structural constraints into the optimization, regularizing both the 3D scene reconstruction and the estimation of the camera's pose. The team also developed normalized error-based densification and pruning modules that actively refine the map, removing unstable Gaussians and reinforcing accurate ones.
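To make the idea concrete, here is a minimal sketch in plain Python of the two mechanisms described above: a flow-alignment loss that compares the projected motion of Gaussian centers ("GaussianFlow") against observed 2D optical flow, and a normalized-error pruning test. All function names, the pinhole intrinsics, and the threshold are illustrative assumptions, not the paper's actual implementation.

```python
# Illustrative sketch only -- project(), gaussian_flow(), and the default
# intrinsics (fx, fy, cx, cy) are hypothetical, not the paper's API.

def project(point, pose, fx=500.0, fy=500.0, cx=320.0, cy=240.0):
    """Pinhole projection of a 3D world point under a camera pose (R, t)."""
    R, t = pose
    # Camera-frame coordinates: p_c = R @ p + t
    pc = [sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3)]
    x, y, z = pc
    return (fx * x / z + cx, fy * y / z + cy)

def gaussian_flow(center, pose_prev, pose_curr):
    """Projected 2D motion of one Gaussian center between consecutive frames."""
    u0, v0 = project(center, pose_prev)
    u1, v1 = project(center, pose_curr)
    return (u1 - u0, v1 - v0)

def flow_alignment_loss(centers, pose_prev, pose_curr, observed_flow):
    """Mean L1 distance between each Gaussian's projected motion and the
    observed optical flow at its location -- the structural constraint that
    regularizes both the map and the pose estimate."""
    total = 0.0
    for c, (fu, fv) in zip(centers, observed_flow):
        gu, gv = gaussian_flow(c, pose_prev, pose_curr)
        total += abs(gu - fu) + abs(gv - fv)
    return total / len(centers)

def prune_mask(errors, threshold=1.5):
    """Normalized-error pruning (assumed z-score form): flag Gaussians whose
    per-primitive error sits more than `threshold` standard deviations above
    the mean, marking them as unstable candidates for removal."""
    mean = sum(errors) / len(errors)
    var = sum((e - mean) ** 2 for e in errors) / len(errors)
    std = var ** 0.5 or 1.0  # guard against a degenerate all-equal batch
    return [(e - mean) / std > threshold for e in errors]
```

In an optimizer loop, the gradient of such a loss with respect to both the Gaussian centers and the camera pose is what supplies the geometric supervision that plain photometric losses lack.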
In experiments on standard public datasets, the method demonstrated significant improvements. It achieved superior novel-view rendering quality and more accurate camera tracking compared to other leading monocular SLAM algorithms. By making monocular 3DGS-SLAM more robust and accurate, this work represents a meaningful step toward more capable and affordable 3D perception systems for applications like augmented reality, drones, and mobile robots, where relying on a single camera is a major advantage.
- Addresses monocular 3DGS-SLAM's geometric ambiguity by using optical flow as a supervisory signal to guide map and pose optimization.
- Introduces 'GaussianFlow', the projected motion of 3D Gaussians, which is aligned with 2D optical flow for consistent structural cues.
- Outperforms state-of-the-art methods on public benchmarks, achieving higher rendering quality and more accurate camera tracking.
Why It Matters
Enables more robust and accurate 3D mapping from a single camera, lowering the cost and complexity for AR, robotics, and autonomous navigation.