FalconApp: Rapid iPhone Deployment of End-to-End Perception via Automatically Labeled Synthetic Data
From video capture to on-device 6-DoF pose estimation in about 20 minutes, entirely on an iPhone.
FalconApp is a new iPhone application from researchers at the University of Illinois that addresses a major bottleneck in robotics perception: the need for large-scale labeled data. A user captures a short video of a rigid object with their iPhone; the app then automatically reconstructs an editable GSplat (Gaussian Splatting) asset, composites it with diverse photorealistic backgrounds, renders synthetic images with ground-truth masks and 6-DoF poses, trains a perception model on the auto-labeled data, and deploys the model back to the iPhone frontend. The entire pipeline, from capture to deployment, averages about 20 minutes per object.
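To make the auto-labeling step concrete, here is a minimal Python sketch of the compositing stage, assuming the GSplat renderer has already produced RGBA object crops with sidecar 6-DoF pose files. The `renders/`, `backgrounds/`, and `dataset/` paths and the JSON layout are hypothetical illustrations, not details from the paper.

```python
import json
import random
from pathlib import Path

import numpy as np
from PIL import Image

def composite(fg_rgba: Image.Image, bg_rgb: Image.Image):
    """Paste a rendered object at a random location on a background and
    return the composite plus its binary ground-truth mask."""
    bg = bg_rgb.copy().convert("RGB")
    # Assumes the crop fits inside the background; a real pipeline would
    # also rescale and color-augment here.
    x = random.randint(0, bg.width - fg_rgba.width)
    y = random.randint(0, bg.height - fg_rgba.height)
    bg.paste(fg_rgba, (x, y), mask=fg_rgba)        # alpha channel drives blending
    mask = np.zeros((bg.height, bg.width), dtype=np.uint8)
    alpha = np.asarray(fg_rgba)[..., 3] > 0        # object silhouette from alpha
    mask[y:y + fg_rgba.height, x:x + fg_rgba.width] = alpha.astype(np.uint8)
    return bg, mask

# Hypothetical inputs: renders/ holds RGBA crops of the object rendered from
# the reconstructed GSplat asset, each with a .json sidecar storing the pose
# it was rendered at; backgrounds/ holds photorealistic scenes.
renders = sorted(Path("renders").glob("*.png"))
backgrounds = sorted(Path("backgrounds").glob("*.jpg"))
Path("dataset").mkdir(exist_ok=True)

for i, render_path in enumerate(renders):
    fg = Image.open(render_path).convert("RGBA")
    bg = Image.open(random.choice(backgrounds))
    rgb, mask = composite(fg, bg)
    rgb.save(f"dataset/{i:06d}_rgb.png")
    Image.fromarray(mask * 255).save(f"dataset/{i:06d}_mask.png")
    # Pose labels come for free: the render was produced at a known pose.
    pose = json.loads(render_path.with_suffix(".json").read_text())
    Path(f"dataset/{i:06d}_pose.json").write_text(json.dumps(pose))
```

This is what makes the labels "automatic": each synthetic image is rendered from a known pose of the reconstructed splat, so the mask and 6-DoF label are exact byproducts of rendering rather than hand annotations.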
In experiments across five rigid objects with varying geometry and appearance, FalconApp achieved roughly 30 ms end-to-end latency on the iPhone and outperformed a standard PnP (Perspective-n-Point) baseline in pose accuracy on four of the five objects, in both simulation and real-world tests. This approach dramatically reduces the manual annotation effort traditionally required for robotics perception, making rapid prototyping and deployment on consumer hardware feasible. The paper and code are available via arXiv.
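For context on the comparison, a classical PnP estimator recovers pose from 2D-3D keypoint correspondences. The sketch below shows the standard recipe using OpenCV's `solvePnP`; the keypoints and camera intrinsics are illustrative placeholders, not values from the paper.

```python
import cv2
import numpy as np

# 3D keypoints in the object frame (meters) -- placeholder geometry.
object_points = np.array([
    [0.00, 0.00, 0.00],
    [0.10, 0.00, 0.00],
    [0.00, 0.10, 0.00],
    [0.00, 0.00, 0.10],
    [0.10, 0.10, 0.05],
    [0.05, 0.05, 0.10],
], dtype=np.float64)

# Corresponding 2D detections in the image (pixels) -- placeholders.
image_points = np.array([
    [320.0, 240.0], [410.0, 238.0], [322.0, 150.0],
    [318.0, 242.0], [415.0, 152.0], [365.0, 160.0],
], dtype=np.float64)

# Pinhole camera intrinsics (fx, fy, cx, cy) -- placeholders.
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

ok, rvec, tvec = cv2.solvePnP(object_points, image_points, K, None,
                              flags=cv2.SOLVEPNP_ITERATIVE)
R, _ = cv2.Rodrigues(rvec)  # rotation vector -> 3x3 rotation matrix
print("rotation:\n", R, "\ntranslation:", tvec.ravel())
```

A baseline like this hinges on detecting the 2D keypoints reliably in the first place, which is one place a learned end-to-end model trained on per-object synthetic data can gain accuracy.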
- End-to-end pipeline from iPhone video capture to deployed perception model in ~20 minutes per object
- Uses photorealistic auto-labeling with GSplat assets and diverse backgrounds to train mask prediction and 6-DoF pose estimation
- Achieves ~30 ms on-device latency on iPhone and beats PnP baseline on 4/5 objects in simulation and real-world tests
Why It Matters
FalconApp democratizes robotics perception by enabling rapid, annotation-free model creation from a single phone capture.