Open-SAT: LLM-Guided Query Embedding Refinement for Open-Vocabulary Object Retrieval in Satellite Imagery
Training-free algorithm uses LLMs to align open-ended queries with satellite imagery.
Open-SAT, developed by researchers at a corporate lab, addresses the challenge of open-vocabulary object retrieval in satellite imagery. Traditional vision-language models such as CLIP struggle with natural-language queries that go beyond predefined categories. Open-SAT works in two phases: offline, it uses a VLM to compute embeddings for image tiles and stores them in a vector database; at query time, it uses an LLM to refine the text embedding with contextual details about the target object and its surroundings. A threshold-free retrieval mechanism further improves both accuracy and efficiency.
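The two phases described above can be sketched in a few lines. This is a minimal illustration, not Open-SAT's actual implementation: the `embed` function is a stand-in for a real VLM encoder, the context phrases stand in for LLM-generated refinements, and the blending weight is an assumed free parameter.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(text_or_tile):
    # Stand-in for a real VLM/CLIP-style encoder: maps any input to a
    # unit vector. (Hypothetical; the paper's actual encoder is a VLM.)
    v = rng.standard_normal(512)
    return v / np.linalg.norm(v)

# --- Offline phase: embed image tiles and store them in a vector "database"
# (here, simply a matrix of unit row vectors).
tiles = [f"tile_{i}" for i in range(100)]
tile_db = np.stack([embed(t) for t in tiles])  # shape (100, 512)

# --- Query phase: refine the query embedding with LLM-supplied context.
query_vec = embed("a small boat")
# Hypothetical refinement: the LLM proposes contextual phrases about the
# target and its surroundings; their embeddings are blended into the query.
context_vecs = np.stack([embed(p) for p in ["harbor", "water near a dock"]])
refined = query_vec + 0.5 * context_vecs.mean(axis=0)  # 0.5 is an assumption
refined /= np.linalg.norm(refined)

# Cosine similarity against all stored tiles (rows are unit vectors).
scores = tile_db @ refined
top5 = np.argsort(scores)[::-1][:5]
print([tiles[i] for i in top5])
```

With real encoders, the offline matrix would live in a vector database and the similarity search would use an approximate-nearest-neighbor index rather than a dense matrix product.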
Experimental results across three public benchmarks show that Open-SAT improves F1 scores by up to 16.04% while retrieving a comparable number of image tiles. The key innovation is its training-free nature—no fine-tuning or additional supervision is needed, making it practical for real-world deployment. By enabling more accurate natural language queries for satellite imagery, Open-SAT could power applications in disaster response, agriculture, and urban planning, where users need to find specific objects or patterns quickly.
- Open-SAT improves F1 score by up to 16.04% on three public satellite image benchmarks.
- It uses an LLM to refine query embeddings at inference time, requiring no additional training.
- The system employs a threshold-free retrieval mechanism for efficient and accurate results.
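The bullets above mention a threshold-free retrieval mechanism without detailing it. One plausible interpretation, sketched here purely as an assumption (the paper's actual mechanism is not specified in this summary), is to avoid a fixed similarity cutoff by sorting scores and cutting at the largest drop between consecutive values:

```python
import numpy as np

def threshold_free_cutoff(scores):
    """Select matches without a fixed similarity threshold: sort scores
    descending and cut at the largest gap between consecutive values.
    (Hypothetical heuristic, not Open-SAT's documented mechanism.)"""
    order = np.argsort(scores)[::-1]          # indices, best score first
    sorted_scores = scores[order]
    gaps = sorted_scores[:-1] - sorted_scores[1:]
    cut = int(np.argmax(gaps)) + 1            # keep everything above the gap
    return order[:cut]

scores = np.array([0.91, 0.88, 0.87, 0.42, 0.40, 0.39])
print(threshold_free_cutoff(scores))  # keeps the three high-scoring tiles
```

A gap-based rule like this adapts to each query's score distribution, which is one way a system could retrieve "a comparable number of image tiles" while improving F1 without per-query threshold tuning.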
Why It Matters
Enables accurate natural language search over satellite imagery without retraining, unlocking new applications in remote sensing.