DuCCAE: A Hybrid Engine for Immersive Conversation via Collaboration, Augmentation, and Evolution
Hybrid AI engine solves the latency problem for complex tasks, boosting 7-day retention to 34.2%.
A team of 16 researchers from Baidu has introduced DuCCAE (Conversation while Collaboration with Augmentation and Evolution), a hybrid engine designed to address a core problem in production AI: the trade-off between conversational responsiveness and the ability to execute complex, long-horizon tasks. Current systems struggle when a user request requires planning, web search, or media generation, because these "agentic" actions add heavy latency that breaks the flow of dialogue. DuCCAE's key innovation is a decoupled architecture that separates real-time response generation from asynchronous task execution. The two streams are synchronized through a shared state that holds the full session context and execution traces, so results from slow-running agents can be integrated back into the live conversation once they are ready.
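The decoupling described above can be illustrated with a minimal Python sketch. This is not DuCCAE's implementation: `SessionState`, `slow_agent`, `handle_turn`, and the `task:` prefix convention are all hypothetical names invented for illustration, and the slow agent is simulated with a short sleep. The sketch only shows the general pattern of a fast reply path that never waits on a background task, with a shared state object carrying results back into a later turn.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical shared state: dialogue history plus finished task results."""
    history: list = field(default_factory=list)
    finished: list = field(default_factory=list)


async def slow_agent(task: str, state: SessionState, delay: float = 0.05) -> None:
    """Stand-in for a slow agentic task (planning, search, media generation)."""
    await asyncio.sleep(delay)  # simulated latency, not a real tool call
    state.finished.append(f"result for '{task}'")


async def handle_turn(user_msg: str, state: SessionState, pending: set) -> str:
    """Fast path: build a reply at once, never awaiting the slow agent."""
    state.history.append(("user", user_msg))
    parts = []
    # Fold in anything background agents finished since the last turn.
    while state.finished:
        parts.append("Update: " + state.finished.pop(0))
    if user_msg.startswith("task:"):
        # Launch the slow work in the background and keep a reference to it.
        job = asyncio.create_task(slow_agent(user_msg[5:].strip(), state))
        pending.add(job)
        job.add_done_callback(pending.discard)
        parts.append("Working on it; we can keep talking meanwhile.")
    else:
        parts.append("Chat reply to: " + user_msg)
    reply = " ".join(parts)
    state.history.append(("assistant", reply))
    return reply


async def demo() -> list:
    state, pending = SessionState(), set()
    replies = [await handle_turn("task: plan a trip", state, pending)]
    replies.append(await handle_turn("hello", state, pending))
    await asyncio.gather(*pending)  # let the background task finish
    replies.append(await handle_turn("any news?", state, pending))
    return replies


if __name__ == "__main__":
    for line in asyncio.run(demo()):
        print(line)
```

In this sketch the second turn is answered before the background task has produced anything, and the third turn picks up the finished result from the shared state. A production system would also need to handle failures, timeouts, and per-session concurrency, none of which is covered here.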
This architecture orchestrates five subsystems (Info, Conversation, Collaboration, Augmentation, and Evolution) to enable multi-agent collaboration and continuous system improvement. The team evaluated DuCCAE with offline benchmarks on the Du-Interact dataset and with large-scale A/B testing in the live Baidu Search environment, which serves millions of users. Since deployment in June 2025, Day-7 user retention has tripled to 34.2% and the complex task completion rate has risen to 65.2%. These metrics indicate that the system maintains conversational continuity while reliably executing agentic workflows, which bears directly on user trust and engagement. The paper offers a practical blueprint for deploying scalable, agentic AI systems in industrial applications where latency and reliability are paramount.
- Decouples real-time chat from slow agentic tasks (planning, search) using a shared state for synchronization.
- Deployed in Baidu Search, it tripled 7-day user retention to 34.2% and boosted complex task completion to 65.2%.
- Provides a proven industrial architecture for scalable multi-agent systems that maintain conversational flow.
Why It Matters
It demonstrates a viable architecture for combining fast chat with slow AI agents, a major hurdle in deploying useful AI assistants at scale.