StreamPro AI shifts video models from reactive to proactive reasoning, scoring 4x better

arXiv cs.CV May 19, 2026

⚡ New benchmark forces AI to decide when to respond, not just what to say

Deep Dive

Current video understanding models follow a passive “see-then-answer” paradigm: they wait for clear evidence before responding, which reduces proactive reasoning to delayed perception. The new paper “StreamPro: From Reactive Perception to Proactive Decision-Making in Streaming Video” by Ao Li and nine co-authors tackles the harder problem of deciding when to speak, not just what to say. The authors introduce StreamPro-Bench, a benchmark that scores models on three axes: Perception Understanding, Temporal Reasoning, and Proactive Agency. The last axis measures a model’s ability to make early yet reliable decisions from incomplete streams, a critical skill for real-time applications such as live surveillance, autonomous driving, and interactive assistants.

To train such models, the team proposes a two-stage framework, also called StreamPro. First, they use CB-Stream Loss during supervised fine-tuning to mitigate the extreme imbalance between long periods of irrelevant silence and short, critical response windows. Second, they apply Group Relative Policy Optimization (GRPO) with a multi-grained reward design (turn-level and trajectory-level) that penalizes both wrong answers and poor timing, so correctness and decision delay are optimized jointly. The reported gains are large: StreamPro achieves 41.5 on its proactive benchmark, roughly four times the previous best of 10.4, while maintaining strong real-time performance (78.9 on StreamingBench-RTVU). The work signals a shift from passive video understanding to agents that can proactively engage with streaming video.
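To make the imbalance problem concrete, here is a minimal Python sketch of a class-balanced loss over per-step "stay silent" (0) versus "respond" (1) labels. It is not the paper's CB-Stream Loss, whose exact formula this summary does not give; the inverse-frequency weighting, the function names `balanced_weights` and `weighted_bce`, and the toy numbers are all assumptions chosen for illustration.

```python
import math


def balanced_weights(labels):
    """Inverse-frequency weights so both classes carry equal total weight.

    Assumption: a generic class-balancing scheme, not the paper's formula.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        return [1.0] * n  # only one class present, nothing to balance
    w_pos = n / (2.0 * n_pos)
    w_neg = n / (2.0 * n_neg)
    return [w_pos if y == 1 else w_neg for y in labels]


def weighted_bce(probs, labels, weights, eps=1e-7):
    """Weighted binary cross-entropy, normalized by the sum of weights."""
    total = 0.0
    for p, y, w in zip(probs, labels, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / sum(weights)


# Toy stream: 18 silent steps and 2 response steps (hypothetical data).
labels = [0] * 18 + [1] * 2
# A model that is confident about silence but under-predicts responses.
probs = [0.1] * 18 + [0.3] * 2

weights = balanced_weights(labels)
plain = weighted_bce(probs, labels, [1.0] * len(labels))
balanced = weighted_bce(probs, labels, weights)
print(f"unweighted loss: {plain:.4f}, balanced loss: {balanced:.4f}")
```

With unweighted cross-entropy, the 18 easy silence steps dominate the average and the two missed responses barely register. After reweighting, the same mistakes account for half of the total weight, so the loss rises and the gradient focuses on the rare response windows.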

Key Points

StreamPro-Bench introduces Proactive Agency as a new metric, measuring early decision-making under partial observations
CB-Stream Loss addresses severe supervision imbalance between silence and response signals during training
GRPO with multi-grained rewards (turn-level and trajectory-level) jointly optimizes response accuracy and timing, achieving 41.5 versus the prior best of 10.4 on the benchmark
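
The GRPO point above can be sketched in a few lines of Python. The group-relative step (subtract the group mean reward, divide by the group standard deviation) is the standard GRPO advantage. How StreamPro actually combines turn-level and trajectory-level rewards is not stated in this summary, so `trajectory_reward`, its `traj_weight` parameter, and the sample rollouts are hypothetical.

```python
import statistics


def trajectory_reward(turn_rewards, traj_reward, traj_weight=0.5):
    """Combine per-turn rewards with one whole-trajectory reward.

    Assumption: mean turn reward plus a weighted trajectory reward.
    The paper's actual combination rule may differ.
    """
    turn_part = sum(turn_rewards) / len(turn_rewards) if turn_rewards else 0.0
    return turn_part + traj_weight * traj_reward


def group_relative_advantages(rewards, eps=1e-6):
    """Standard GRPO advantage: normalize rewards within the sampled group."""
    mean = statistics.fmean(rewards)
    std = statistics.pstdev(rewards)
    return [(r - mean) / (std + eps) for r in rewards]


# Four rollouts sampled for the same streaming-video prompt (hypothetical).
# Each entry: (turn-level rewards, trajectory-level reward).
group = [
    ([1.0, 1.0, 0.0], 1.0),  # mostly well-timed turns, correct overall
    ([0.0, 1.0, 0.0], 0.0),  # one good turn, wrong overall
    ([0.0, 0.0, 0.0], 0.0),  # responded at the wrong moments throughout
    ([1.0, 1.0, 1.0], 1.0),  # every turn well-timed, correct overall
]

rewards = [trajectory_reward(turns, traj) for turns, traj in group]
advantages = group_relative_advantages(rewards)
for r, a in zip(rewards, advantages):
    print(f"reward={r:.3f} advantage={a:+.3f}")
```

Because advantages are computed relative to the other rollouts in the group, no separate value network is needed: rollouts that beat the group average are reinforced and those below it are discouraged.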

Why It Matters

Proactive video AI is essential for real-time systems that must act before full evidence appears
