A Multi-Agent Feedback System for Detecting and Describing News Events in Satellite Imagery
The new system curates a 5,000-sequence dataset by linking news articles to satellite imagery changes.
A team of researchers has introduced SkyScraper, a multi-agent AI system that automatically detects and describes news events visible in satellite imagery. The system addresses a gap in remote sensing: the lack of large, labeled datasets showing changes over multiple time steps. Finding and labeling such multi-temporal sequences by hand is prohibitively time-consuming and labor-intensive. SkyScraper's iterative workflow first geocodes global news articles to pinpoint locations, then searches for corresponding satellite image sequences that show visible changes, and finally synthesizes descriptive captions for the events.
The core innovation is an agentic feedback loop in which different AI agents work together to refine the search and description process. In experiments, this approach surfaced 5x more relevant events than standard geocoding. The researchers applied SkyScraper to a large database of news to curate a new public dataset, also called SkyScraper, which contains 5,000 multi-temporal image sequences with captions. The dataset can be used to train future computer vision models for change detection and captioning.
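To make the feedback idea concrete, here is a minimal Python sketch of how such a loop might be wired together. It is an illustration under stated assumptions, not the researchers' implementation: the names `Candidate`, `Event`, `run_feedback_loop`, and the three callables standing in for the agents are all hypothetical, and the toy agents at the bottom replace the real geocoding, change detection, and captioning steps.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Candidate:
    """A hypothetical candidate location proposed for an article."""
    lat: float
    lon: float


@dataclass
class Event:
    """A hypothetical record linking an article, a location, and a caption."""
    article: str
    location: Candidate
    caption: str


def run_feedback_loop(
    article: str,
    geocode: Callable[[str, str], Optional[Candidate]],
    has_visible_change: Callable[[Candidate], bool],
    caption: Callable[[str, Candidate], str],
    max_rounds: int = 3,
) -> Optional[Event]:
    """Geocode, check the imagery, and feed failures back to the geocoder."""
    feedback = ""  # empty on the first round: no prior attempt to learn from
    for _ in range(max_rounds):
        candidate = geocode(article, feedback)
        if candidate is None:
            return None  # the geocoding agent has no further guesses
        if has_visible_change(candidate):
            return Event(article, candidate, caption(article, candidate))
        # Assumed feedback format: tell the geocoder why the guess failed.
        feedback = (
            f"no visible change at ({candidate.lat}, {candidate.lon}); "
            "try a more specific location"
        )
    return None


# Toy stand-ins for the agents (assumptions, for demonstration only).
def toy_geocode(article: str, feedback: str) -> Optional[Candidate]:
    # First guess is coarse; after feedback, return the specific site.
    return Candidate(0.0, 0.0) if not feedback else Candidate(1.0, 1.0)


def toy_has_change(candidate: Candidate) -> bool:
    return candidate.lat == 1.0


def toy_caption(article: str, candidate: Candidate) -> str:
    return f"Change near ({candidate.lat}, {candidate.lon}): {article}"


event = run_feedback_loop(
    "New port terminal opens", toy_geocode, toy_has_change, toy_caption
)
```

In this toy setup, a single geocoding pass (`max_rounds=1`) finds nothing, while the loop recovers the event on the second round. That mirrors, in miniature, why a feedback loop could surface events that one-shot geocoding misses.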
Beyond dataset creation, the framework has direct applications in supporting investigative journalism and real-time reporting. By automating the link between textual news and visual satellite evidence, SkyScraper can help reporters quickly verify events, track long-term developments like construction or deforestation, and discover stories that might otherwise go unnoticed. The work, detailed in an arXiv preprint, demonstrates how multi-agent AI systems can tackle complex, multi-step data synthesis tasks that were previously manual.
- SkyScraper uses a multi-agent AI workflow to link news articles to satellite imagery, surfacing 5x more relevant events than standard geocoding.
- The system was used to create a new public dataset of 5,000 multi-temporal satellite image sequences with descriptive captions.
- The automated process supports journalism by providing visual evidence for news events, from disasters to industrial development.
Why It Matters
SkyScraper automates the discovery of visual evidence for global news, which can accelerate investigative reporting and supply training data for AI models.