egenioussBench: A New Dataset for Geospatial Visual Localization
42 non-overlapping test images with centimeter-accurate ground truth, drawn from 2,709 smartphone photos
egenioussBench tackles a core challenge in computer vision: geospatial visual localization, the task of determining where a photo was taken by registering it against a reference 3D map. Unlike traditional benchmarks that rely on structure-from-motion (SfM) reconstructions, egenioussBench leverages deployable city-scale assets: an airborne 3D mesh and a CityGML LoD2 model. This design scales to entire cities and reflects the real-world mapping data used by urban planners and autonomous systems.
The query set consists of 2,709 smartphone images captured in an urban environment, each paired with a centimeter-accurate ground-truth pose derived from PPK (post-processed kinematic) GPS and ground control point adjustments. A co-visibility matrix computed from rendered depth is used to identify a maximum independent set of queries, yielding a test split of 42 non-overlapping images with withheld ground truth and a validation split of 412 sequential images.
The benchmark also introduces a public leaderboard that reports binning metrics at multiple pose-error thresholds alongside global statistics such as median error, RMSE, and outlier ratio, enabling fair, like-for-like comparisons between methods that use either the mesh or the LoD2 reference data. Code and data are publicly available to support research on large-scale cross-domain localization.
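To make the split construction concrete, here is a minimal sketch of how a non-overlapping test set could be selected from a co-visibility matrix. Exact maximum independent set is NP-hard, so this uses a standard greedy minimum-degree heuristic; the function name `greedy_independent_set`, the toy matrix, and the heuristic itself are illustrative assumptions, not necessarily the benchmark's actual procedure.

```python
import numpy as np

def greedy_independent_set(covis: np.ndarray) -> list[int]:
    """Greedily select images so that no two selected images co-observe.

    covis: (N, N) boolean matrix; covis[i, j] is True if images i and j
    share visible scene content (e.g. from rendered depth). Diagonal ignored.
    """
    adj = covis.copy()
    np.fill_diagonal(adj, False)
    remaining = set(range(adj.shape[0]))
    selected = []
    while remaining:
        # Pick the remaining image with the fewest conflicts (lowest degree),
        # a standard greedy heuristic for independent sets.
        i = min(remaining, key=lambda k: adj[k, list(remaining)].sum())
        selected.append(i)
        # Drop the chosen image and everything it co-observes.
        remaining -= {i} | set(np.flatnonzero(adj[i]))
    return selected

# Toy 5-image co-visibility matrix: images {0,1,2} overlap in a chain,
# images {3,4} overlap with each other only.
covis = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
], dtype=bool)
print(greedy_independent_set(covis))  # e.g. [0, 2, 3]
```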
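Along the same lines, here is a minimal sketch of a leaderboard-style evaluation, assuming pose error is split into translation error in meters and rotation error in degrees. The threshold bins, the outlier cut-off, and the names `rotation_error_deg` and `evaluate` are illustrative placeholders, not the benchmark's official values.

```python
import numpy as np

def rotation_error_deg(R_est: np.ndarray, R_gt: np.ndarray) -> float:
    """Angular difference between two 3x3 rotation matrices, in degrees."""
    cos = (np.trace(R_est.T @ R_gt) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def evaluate(t_err_m: np.ndarray, r_err_deg: np.ndarray) -> dict:
    """Bin poses by (translation, rotation) thresholds and add global stats."""
    bins = [(0.05, 1.0), (0.25, 2.0), (1.0, 5.0)]  # illustrative (m, deg) bins
    report = {
        f"recall@{t}m/{r}deg": float(np.mean((t_err_m <= t) & (r_err_deg <= r)))
        for t, r in bins
    }
    report["median_t_m"] = float(np.median(t_err_m))
    report["rmse_t_m"] = float(np.sqrt(np.mean(t_err_m ** 2)))
    # Illustrative outlier definition: worse than the coarsest bin.
    report["outlier_ratio"] = float(np.mean((t_err_m > 1.0) | (r_err_deg > 5.0)))
    return report

# Toy usage with random per-image errors for a 42-image test split;
# in practice r_err_deg would come from rotation_error_deg per image.
rng = np.random.default_rng(0)
print(evaluate(rng.exponential(0.3, 42), rng.exponential(1.5, 42)))
```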
- Dataset includes 2,709 smartphone images with centimeter-accurate ground truth (PPK + GCP)
- Uses city-scale airborne 3D mesh and CityGML LoD2 models, not SfM reconstructions
- Public leaderboard with binning metrics at multiple error thresholds for fair method comparison
Why It Matters
Bridges the gap between computer vision research and real-world geospatial mapping for autonomous navigation and urban planning.