Research & Papers

Say It My Way: Exploring Control in Conversational Visual Question Answering with Blind Users

A study of 418 interactions with 11 blind users finds that conversational VQA exchanges average 3 turns (sometimes up to 21) and lack basic controls such as adjustable verbosity.

Deep Dive

Researchers from the University of Maryland and the University of Washington published "Say It My Way," analyzing 418 interactions between 11 blind users and conversational visual question answering (VQA) systems. Interactions averaged 3 conversational turns (sometimes up to 21) and revealed critical system flaws: no verbosity controls, poor spatial and temporal estimation, and inaccessible camera guidance. The authors also show how users employ prompt engineering to work around these limitations, and they release a public dataset to support more accessible AI design.

Why It Matters

Highlights a major accessibility gap in AI-powered vision tools used by millions of blind and low-vision people daily.