GATA2Floor: Graph attention model counts building floors from street-view images
New model counts floors without manual labels using self-supervised features and graph reasoning.
GATA2Floor is a multi-head Graph Attention v2 (GATv2) model for counting floors in street-view facade images. It represents each facade as a graph over window and door detections with a vertical edge prior, predicts a global floor count, and softly assigns each detected element to a latent floor slot via learnable cross-attention queries. A label-free proposal mechanism based on self-supervised features and vision-language scoring supplies the detections, so the model can be trained without annotated datasets. Accepted at IEEE ICIP 2026.
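To make the graph-attention step concrete, here is a minimal single-head sketch in plain Python. It is not the authors' implementation. The GATv2 score e_ij = a^T LeakyReLU(W [h_i || h_j]) is the standard formulation, but how the vertical edge prior enters is not described in this summary, so the `vertical_prior` function (an additive log-weight that decays with vertical offset), the fully connected graph, and all toy values are assumptions made for illustration.

```python
import math

def leaky_relu(x, slope=0.2):
    return x if x > 0 else slope * x

def vertical_prior(dy, sigma=0.1):
    """Hypothetical prior: log-weight that decays with vertical offset."""
    return -abs(dy) / sigma

def gatv2_attention(i, nodes, feats, W, a, sigma=0.1):
    """Attention weights of node i over all other nodes (single head).

    GATv2 score: e_ij = a^T LeakyReLU(W [h_i || h_j]). Adding the vertical
    prior to the logit before the softmax is an assumption of this sketch.
    """
    logits = {}
    for j in range(len(nodes)):
        if j == i:
            continue
        concat = feats[i] + feats[j]  # list concatenation, i.e. [h_i || h_j]
        hidden = [leaky_relu(sum(w * x for w, x in zip(row, concat))) for row in W]
        score = sum(a_k * h_k for a_k, h_k in zip(a, hidden))
        dy = nodes[i][1] - nodes[j][1]
        logits[j] = score + vertical_prior(dy, sigma)
    # Numerically stable softmax over the neighbours of node i.
    m = max(logits.values())
    exps = {j: math.exp(v - m) for j, v in logits.items()}
    z = sum(exps.values())
    return {j: e / z for j, e in exps.items()}

# Toy facade: (x_center, y_center) of three detections, normalised to [0, 1].
nodes = [(0.2, 0.30), (0.6, 0.31), (0.2, 0.70)]
feats = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]  # identical placeholder features
W = [[0.5, 0.1, 0.5, 0.1], [0.2, 0.3, 0.2, 0.3]]  # toy projection, 2 x 4
a = [1.0, -1.0]                                   # toy attention vector
weights = gatv2_attention(0, nodes, feats, W, a)
```

Because the placeholder features are identical, the learned score is the same for every neighbour and the weights are driven by the prior alone: detection 1, which sits at almost the same height as detection 0, receives more attention than detection 2, which sits much lower on the facade.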
- GATA2Floor uses GATv2 (Graph Attention v2) to model facades as graphs over window/door detections with a vertical edge prior.
- It employs learnable cross-attention queries to softly assign detections to latent floor slots, which makes the predictions interpretable.
- A label-free proposal mechanism leverages self-supervised features and vision-language scoring, eliminating the need for annotated datasets.
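The soft assignment of detections to latent floor slots can be illustrated with a small cross-attention sketch. This is a guess at the general shape of the mechanism, not the paper's code: the scaled dot-product scoring, the choice to normalise over slots for each detection, and the names `soft_assign`, `node_embs`, and `slot_queries` are all assumptions, and the toy vectors stand in for learned embeddings and queries.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def soft_assign(node_embs, slot_queries):
    """Return an N x K matrix; row n is detection n's distribution over slots."""
    d = len(slot_queries[0])
    scale = 1.0 / math.sqrt(d)  # usual scaled dot-product factor
    assignment = []
    for h in node_embs:
        logits = [scale * sum(q_k * h_k for q_k, h_k in zip(q, h))
                  for q in slot_queries]
        assignment.append(softmax(logits))
    return assignment

# Toy values: three detection embeddings and two slot queries (hypothetical).
node_embs = [[2.0, 0.0], [1.8, 0.1], [0.0, 2.0]]
slot_queries = [[3.0, 0.0], [0.0, 3.0]]

A = soft_assign(node_embs, slot_queries)
hard = [row.index(max(row)) for row in A]  # most likely slot per detection
```

Each row of `A` sums to one, so a detection spreads its membership across slots rather than committing to one. Reading off the largest entry per row gives a per-element slot label that can be inspected alongside the predicted count, which is the sense in which such assignments are interpretable.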
Why It Matters
Because it needs no manually annotated training data, GATA2Floor could make automated building analysis scalable for urban planning, energy assessment, and emergency response.