Research & Papers

Macaron-A2UI lets agents generate dynamic UIs instead of plain text

Flagship 754B model creates UI controls on the fly and outscores a frontier baseline that had full schema access

Deep Dive

Personal agents today are stuck in plain-text chat, making complex multi-step tasks cumbersome. Macaron-A2UI changes this by treating UI generation as a first-class capability: models produce lightweight, executable interface elements (buttons, forms, menus) adapted in real time to user context. The authors built a large-scale Generative UI corpus from heterogeneous dialogue sources and trained three model sizes (30B, 235B, 754B) using parameter-efficient LoRA fine-tuning followed by reward-driven reinforcement learning. This lets agents dynamically collect information, refine preferences, confirm actions, and organize multiple goals without relying on predefined schemas.
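
To make "lightweight, executable interface elements" concrete, here is a minimal sketch of how a client might validate and display such output. The digest does not describe the actual output format, so everything here is an assumption for illustration: the JSON shape, the `UIElement` fields, and the `parse_ui_payload` and `render_text_fallback` helpers are hypothetical and are not the authors' schema.

```python
import json
from dataclasses import dataclass, field
from typing import List

# Hypothetical element kinds; the source mentions buttons, forms, and menus
# only as examples and does not define a schema.
ALLOWED_KINDS = {"button", "form", "menu"}


@dataclass
class UIElement:
    kind: str                      # one of ALLOWED_KINDS
    label: str                     # text shown to the user
    options: List[str] = field(default_factory=list)  # choices, if any


def parse_ui_payload(raw: str) -> List[UIElement]:
    """Parse a model-emitted JSON list into validated UI elements."""
    elements = []
    for item in json.loads(raw):
        kind = item.get("kind")
        if kind not in ALLOWED_KINDS:
            # Reject anything the client does not know how to render.
            raise ValueError(f"unsupported element kind: {kind!r}")
        elements.append(UIElement(kind, item["label"], item.get("options", [])))
    return elements


def render_text_fallback(elements: List[UIElement]) -> str:
    """Render elements as plain text for clients without a UI layer."""
    lines = []
    for e in elements:
        if e.options:
            lines.append(f"[{e.kind}] {e.label}: {' | '.join(e.options)}")
        else:
            lines.append(f"[{e.kind}] {e.label}")
    return "\n".join(lines)
```

Under these assumptions, a travel-booking turn might emit a seat menu plus a confirm button. The client validates the payload before rendering, so an unknown element kind fails loudly instead of being shown to the user.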

The flagship 754B model achieved a 75.6 overall score on A2UI-Bench, a new controlled evaluation benchmark, without any explicit schema hints. That result surpasses the strongest frontier baseline, which had full schema access. The team is releasing all models, the benchmark dataset, and the full evaluation protocol to support further research. For professionals building personal assistants, this signals a shift from rigid text interfaces to adaptive UI that reduces friction in complex interactions like booking travel, managing projects, or configuring software, where forms generated on demand can replace several rounds of back-and-forth text.

Key Points
  • Models (30B, 235B, 754B) trained with LoRA SFT + reinforcement learning on a large corpus of heterogeneous dialogue data
  • Top model scores 75.6 on A2UI-Bench without any schema hints, beating full-schema frontier baselines
  • Enables agents to generate executable UI actions for info collection, preference refinement, and multi-goal organization
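
For readers unfamiliar with LoRA, the sketch below shows the general low-rank update it relies on: a frozen weight matrix W is adjusted by a scaled product of two small trainable matrices, B and A. This illustrates the standard technique only. The rank, scaling factor, and layers used for Macaron-A2UI are not stated in this summary, so the values and the helper names (`matmul`, `lora_effective_weight`, `trainable_params`) are illustrative assumptions.

```python
def matmul(x, y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def lora_effective_weight(w, b, a, alpha):
    """Return W + (alpha / r) * B @ A, where r is the LoRA rank.

    w: frozen weights, shape (d_out, d_in)
    b: trainable matrix, shape (d_out, r)
    a: trainable matrix, shape (r, d_in)
    """
    r = len(a)              # rank equals the number of rows in A
    scale = alpha / r
    delta = matmul(b, a)    # low-rank update, shape (d_out, d_in)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, r):
    """Parameters trained by LoRA for one (d_out x d_in) weight matrix."""
    return r * (d_out + d_in)
```

With an illustrative 4096 x 4096 layer and rank 8, `trainable_params(4096, 4096, 8)` gives 65,536 trainable values against 16,777,216 in the full matrix, which is why the approach is called parameter-efficient.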

Why It Matters

Dynamic UI generation removes text bottlenecks, letting personal agents handle complex tasks with interactive, adaptive interfaces.