Research & Papers

A High-Resolution Landscape Dataset for Concept-Based XAI With Application to Species Distribution Models

A new 2,103-patch dataset built from drone imagery helps explain why deep learning models predict where species occur, connecting deep learning and ecology.

Deep Dive

A research team led by Augustin de la Brosse has published a paper and dataset that apply concept-based Explainable AI (XAI) to the ecological problem of predicting species distributions. The work addresses a specific gap: deep learning models such as Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs) are increasingly used as Species Distribution Models (SDMs), but their complexity makes it hard for ecologists to understand *why* a model predicts that a species will be found in a given area. The team's answer is the first implementation of Robust TCAV (Testing with Concept Activation Vectors) for SDMs, a method that quantifies how strongly human-understandable concepts, such as 'dense forest' or 'open water', influence a model's predictions.
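To make the idea concrete, here is a minimal sketch of the basic TCAV recipe that Robust TCAV builds on, not the authors' implementation and not the Robust variant itself. A linear classifier separates layer activations of concept patches from those of random patches, its weight vector is taken as the concept activation vector (CAV), and the TCAV score is the share of inputs whose prediction increases when activations move along that vector. Everything below is an assumption made for illustration: the synthetic activations, the toy tanh prediction head, and the helper names `fit_cav` and `tcav_score`.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_cav(concept_acts, random_acts, steps=500, lr=0.1):
    """Fit a logistic regression (plain gradient descent) that separates
    concept activations from random ones; return its unit-norm weight
    vector, which plays the role of the CAV."""
    X = np.vstack([concept_acts, random_acts])
    y = np.concatenate([np.ones(len(concept_acts)), np.zeros(len(random_acts))])
    Xc = X - X.mean(axis=0)               # center features for stable training
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Xc @ w + b)))   # predicted P(concept)
        w -= lr * (Xc.T @ (p - y)) / len(y)       # gradient step on weights
        b -= lr * np.mean(p - y)                  # gradient step on bias
    return w / np.linalg.norm(w)

def tcav_score(grads, cav):
    """Fraction of inputs whose directional derivative along the CAV is positive."""
    return float(np.mean(grads @ cav > 0))

# Synthetic layer activations (hypothetical): concept patches are shifted
# along axis 0 relative to random reference patches.
dim = 8
concept_dir = np.zeros(dim)
concept_dir[0] = 1.0
random_acts = rng.normal(size=(50, dim))
concept_acts = rng.normal(size=(30, dim)) + 3.0 * concept_dir
cav = fit_cav(concept_acts, random_acts)

# Toy prediction head (hypothetical): logit = w_head . tanh(h), so the
# gradient of the logit with respect to h is (1 - tanh(h)^2) * w_head.
w_head = np.zeros(dim)
w_head[0] = 1.0
w_head[1] = 0.5
class_acts = rng.normal(size=(40, dim))
grads = (1.0 - np.tanh(class_acts) ** 2) * w_head

score = tcav_score(grads, cav)
print(f"TCAV score for the toy concept: {score:.2f}")
```

A score near 1 means the concept direction pushes the prediction up for most inputs, while a score near 0.5 suggests no consistent influence. In a real workflow the activations and gradients would come from a chosen layer of the trained CNN or ViT, and the score would be compared against CAVs trained on several random reference sets to check that it is not an artifact of one split.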

To support this method, the researchers created and released an open-access dataset derived from high-resolution drone imagery. It contains 653 labeled patches covering 15 landscape concepts, plus 1,450 random reference patches, for a total of 2,103 patches. In a case study on two aquatic insect orders (Plecoptera and Trichoptera), applying Robust TCAV to CNN and ViT models allowed the team to check model decisions against expert knowledge and to uncover novel species-habitat associations that ecologists can treat as new testable hypotheses. The publicly available code and dataset give practitioners a toolkit for making accurate but opaque ecological models more transparent and more usable in conservation policy and land management.

Key Points
  • First implementation of Robust TCAV, a concept-based XAI method, for Species Distribution Models (SDMs), making complex deep learning models more interpretable for ecologists.
  • New open-access dataset of 2,103 high-resolution landscape patches from drone imagery: 653 patches labeled across 15 human-interpretable concepts, plus 1,450 random reference patches.
  • A case study on Plecoptera and Trichoptera showed that the method can validate models against expert knowledge and surface novel species-habitat associations as testable hypotheses.

Why It Matters

The work connects high-performing deep learning models to actionable science: ecologists can check what the complex models used in conservation actually rely on, and learn from them.