Agent Frameworks

FedGUI benchmark enables federated training for cross-platform GUI agents

arXiv cs.MA April 17, 2026

⚡ First comprehensive benchmark tackles real-world heterogeneity across mobile, web, and desktop platforms.

Deep Dive

A research team led by Wenhao Wang has introduced FedGUI, the first comprehensive benchmark designed to train and evaluate federated GUI agents across heterogeneous real-world environments. The work addresses a critical gap: traditional centralized training for agents that interact with graphical user interfaces (GUIs) faces prohibitive costs and scalability issues, while federated learning (FL) has lacked proper benchmarks to handle the complexity of different platforms. FedGUI systematically provides six curated datasets to study four crucial dimensions of heterogeneity that agents encounter in practice: cross-platform (mobile vs. web vs. desktop), cross-device, cross-operating system, and cross-data source.

Extensive experiments with FedGUI yielded two key insights. First, collaboration across platforms, such as having agents learn from both mobile and desktop interfaces, improves overall performance, extending the benefits of federated learning beyond mobile-only settings. Second, the benchmark quantifies the impact of each heterogeneity dimension and identifies the platform (e.g., Android app vs. website) and the operating system as the two most influential factors in an agent's learning and performance. The code and data are publicly available, giving the research community a standardized foundation for building more robust, scalable, and privacy-preserving GUI agents that can operate across a fragmented digital ecosystem.
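To make the federated setup more concrete, the sketch below shows weighted federated averaging (FedAvg), a standard aggregation rule in federated learning. The article does not say which algorithm FedGUI uses, so treat this as an illustrative assumption rather than the authors' method. The client names, parameter values, and dataset sizes are hypothetical.

```python
from typing import Dict, List

Params = Dict[str, List[float]]


def fed_avg(client_params: List[Params], client_sizes: List[int]) -> Params:
    """Average client parameters, weighting each client by its local dataset size.

    Generic FedAvg-style aggregation; assumed here for illustration only,
    not taken from the FedGUI code.
    """
    if not client_params or len(client_params) != len(client_sizes):
        raise ValueError("need one dataset size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total dataset size must be positive")

    aggregated: Params = {}
    for name, reference in client_params[0].items():
        aggregated[name] = [
            sum(p[name][i] * n for p, n in zip(client_params, client_sizes)) / total
            for i in range(len(reference))
        ]
    return aggregated


# Hypothetical clients, one per platform: (locally trained parameters, sample count).
clients = {
    "mobile": ({"layer": [1.0, 2.0]}, 100),
    "web": ({"layer": [3.0, 4.0]}, 300),
    "desktop": ({"layer": [5.0, 6.0]}, 100),
}

global_params = fed_avg(
    [params for params, _ in clients.values()],
    [size for _, size in clients.values()],
)
print(global_params)
```

Only parameters and sample counts reach the aggregation step; the raw interaction data stays on each client. That is the privacy property the article refers to, and also the point where differences between platforms and operating systems can pull the averaged model in conflicting directions.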

Key Points

Introduces the first benchmark (FedGUI) for federated GUI agents across mobile, web, and desktop platforms.
Provides six datasets to study four key heterogeneity types: platform, device, OS, and data source.
Finds cross-platform collaboration boosts performance and identifies platform & OS as top influencing factors.

Why It Matters

Enables development of scalable, privacy-preserving AI assistants that can reliably operate across any app or website.
