Thinking Machines unveils real-time AI 'interaction models' for multimodal collaboration
Mira Murati's startup redefines AI interaction with continuous audio, video, and text in real time.
Thinking Machines, the AI startup founded by former OpenAI CTO Mira Murati in February 2025, unveiled what it calls interaction models on Monday. These models break from the traditional single-threaded interaction paradigm, in which the AI waits for user input and then freezes while it generates a response. Instead, interaction models continuously ingest audio, video, and text, enabling real-time perception, thinking, and response, much as humans collaborate face to face.
The company shared demos including real-time detection of animal names in a story, live speech translation, and posture correction alerts. According to Thinking Machines, this approach solves the 'bandwidth bottleneck' of current AI interfaces, letting people interact naturally rather than contorting themselves to fit rigid input-output loops. However, the models aren't publicly available yet; a limited research preview is expected in the coming months, with a wider release planned later this year. Murati's venture has already seen key team members depart, some for Meta and others back to OpenAI.
- Thinking Machines introduces 'interaction models' that process audio, video, and text continuously in real time, unlike traditional AI that waits for input.
- Examples include real-time speech translation, detecting animal mentions in stories, and monitoring posture.
- Limited research preview in coming months, wider release later this year; company founded Feb 2025 by ex-OpenAI CTO Mira Murati.
Why It Matters
Real-time multimodal interaction could unlock new AI use cases in collaboration, accessibility, and ambient computing.