Progressive Per-Branch Depth Optimization for DEFOM-Stereo and SAM3 Joint Analysis in UAV Forestry Applications
Researchers combine DEFOM-Stereo and SAM3 to create precise 3D models of individual branches from drone footage.
A research team from the University of Canterbury and Victoria University of Wellington has developed an AI pipeline that significantly improves the 3D modeling of individual tree branches from drone (UAV) imagery. The system, detailed in a new arXiv paper, tackles a critical bottleneck for autonomous forestry: producing 3D maps precise enough for robots to prune specific branches. It does so by fusing the outputs of two foundation models, DEFOM-Stereo for generating initial depth/disparity maps and Meta's Segment Anything Model 3 (SAM3) for identifying individual branch instances, and then applying a five-stage refinement process to clean the noisy data.
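To make the fusion step concrete, here is a minimal sketch of how a disparity map and per-instance segmentation masks might be combined into per-branch depth estimates. The function name, camera parameters, and median pooling are illustrative assumptions, not details from the paper.

```python
import numpy as np

def branch_depths(disparity, instance_masks, focal_px, baseline_m):
    """Convert stereo disparity to metric depth and pool it per branch instance.

    disparity      : (H, W) float array of disparities in pixels
    instance_masks : list of (H, W) boolean masks, one per segmented branch
    focal_px       : focal length in pixels (illustrative value below)
    baseline_m     : stereo baseline in metres (illustrative value below)
    """
    valid = disparity > 0                        # zero disparity means no stereo match
    depth = np.zeros_like(disparity, dtype=float)
    depth[valid] = focal_px * baseline_m / disparity[valid]

    # Median depth per instance is robust to a few stray pixels.
    per_branch = [np.median(depth[m & valid]) if (m & valid).any() else np.nan
                  for m in instance_masks]
    return depth, per_branch

# Toy example: two flat "branches" at different disparities.
disp = np.zeros((4, 4))
disp[:2] = 20.0   # near branch
disp[2:] = 10.0   # far branch
masks = [disp == 20.0, disp == 10.0]
depth, meds = branch_depths(disp, masks, focal_px=700.0, baseline_m=0.063)
```

In a real pipeline the masks would come from SAM3 and the disparity from DEFOM-Stereo; the toy arrays here only exercise the depth conversion.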
The technical core of the work is a progressive optimization scheme that systematically attacks three major error families. First, it corrects 'mask boundary contamination' from SAM3 using morphological operations. Second, it fixes segmentation inaccuracies with LAB-space color validation. Third, and most impactfully, it tackles pervasive depth noise with a cascade of robust filters, including Median Absolute Deviation (MAD) outlier detection and RGB-guided filtering. Tested on Radiata pine imagery captured with a ZED Mini stereo camera mounted on a UAV, the pipeline slashes depth variability by 82% while preserving the fine geometry of thin branches. The resulting high-fidelity point clouds are a prerequisite for autonomous pruning, and the team has open-sourced all code and data to accelerate development in robotic forestry and precision agriculture.
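The first and third error families lend themselves to a compact illustration. The sketch below, assuming a single branch mask and its aligned depth map, erodes the mask to discard contaminated boundary pixels and then rejects depth outliers with a MAD test; the helper names, iteration count, and threshold are assumptions, not the paper's tuned values.

```python
import numpy as np

def _erode(mask, iters):
    """4-connected binary erosion; pixels outside the image count as background."""
    for _ in range(iters):
        p = np.pad(mask, 1)        # pad border with False
        mask = (p[1:-1, 1:-1] & p[:-2, 1:-1] & p[2:, 1:-1]
                & p[1:-1, :-2] & p[1:-1, 2:])
    return mask

def clean_branch_depth(depth, mask, erode_iters=1, mad_k=3.0):
    """Erode the instance mask away from its boundary, where background depth
    tends to bleed into thin branches, then reject the remaining depth
    outliers with a Median Absolute Deviation (MAD) test."""
    core = _erode(mask, erode_iters)
    d = depth[core]
    med = np.median(d)
    mad = np.median(np.abs(d - med))
    if mad > 0:
        # 1.4826 * MAD estimates the standard deviation for Gaussian noise.
        keep = np.abs(d - med) <= mad_k * 1.4826 * mad
    else:
        # Degenerate case: over half the pixels share one depth value.
        keep = np.isclose(d, med)
    return d[keep]
```

The MAD test is preferred over a plain mean/standard-deviation cut because a single wildly wrong depth pixel barely moves the median, whereas it can inflate the standard deviation enough to mask itself.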
- Integrates DEFOM-Stereo and SAM3 foundation models to create initial 3D branch maps from stereo drone footage.
- Uses a five-stage depth optimization cascade (MAD, spatial consensus, RGB-guided filtering) to reduce noise by 82%.
- Produces geometrically coherent point clouds accurate enough for autonomous UAVs to position pruning tools on specific branches.
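The LAB-space color validation step can be sketched as follows: convert each masked pixel from sRGB to CIELAB, then reject pixels whose color distance (a Euclidean ΔE) from the branch's median color is too large. The function names and the ΔE threshold are illustrative assumptions; the conversion itself follows the standard sRGB-to-CIELAB formulas (D65 white point).

```python
import numpy as np

def rgb_to_lab(rgb):
    """Convert sRGB values in [0, 1] (shape (..., 3)) to CIELAB (D65 white point)."""
    rgb = np.asarray(rgb, dtype=float)
    # Undo sRGB gamma to get linear RGB.
    lin = np.where(rgb <= 0.04045, rgb / 12.92, ((rgb + 0.055) / 1.055) ** 2.4)
    m = np.array([[0.4124564, 0.3575761, 0.1804375],
                  [0.2126729, 0.7151522, 0.0721750],
                  [0.0193339, 0.1191920, 0.9503041]])
    xyz = lin @ m.T / np.array([0.95047, 1.0, 1.08883])   # normalise by white point
    eps = (6 / 29) ** 3
    f = np.where(xyz > eps, np.cbrt(xyz), xyz / (3 * (6 / 29) ** 2) + 4 / 29)
    L = 116 * f[..., 1] - 16
    a = 500 * (f[..., 0] - f[..., 1])
    b = 200 * (f[..., 1] - f[..., 2])
    return np.stack([L, a, b], axis=-1)

def color_validate(rgb_pixels, delta_e_max=15.0):
    """Keep mask pixels whose LAB colour is close to the branch's median colour.
    The ΔE threshold is an illustrative assumption, not the paper's value."""
    lab = rgb_to_lab(rgb_pixels)
    ref = np.median(lab, axis=0)
    delta_e = np.linalg.norm(lab - ref, axis=-1)          # Euclidean ΔE in LAB
    return delta_e <= delta_e_max
```

Working in LAB rather than raw RGB makes the distance roughly perceptual, so bark-colored pixels cluster tightly even under the lighting variation typical of outdoor UAV imagery.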
Why It Matters
Enables scalable, autonomous forest management and precision agriculture by giving robots the 3D vision needed for delicate tasks.