Open Source

DeepSeek releases 'Thinking-with-Visual-Primitives' framework

New framework lets AI 'point' at images during chain-of-thought reasoning.

Deep Dive

Key Points
  • Framework uses coordinate points and bounding boxes as 'visual primitives' inside chain-of-thought reasoning.
  • Developed by DeepSeek in partnership with Peking University and Tsinghua University; released as open source on GitHub.
  • Improves spatial reasoning and interpretability for multimodal tasks like visual QA and object localization.
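
To make the idea of visual primitives inside a reasoning trace concrete, here is a minimal Python sketch. The `<point>` and `<box>` markup, the pixel coordinate convention, and every name in the code are assumptions invented for illustration; they are not taken from the DeepSeek release and may differ from its actual format. The sketch only shows how points and boxes embedded in reasoning text could be extracted and checked against each other.

```python
import re
from dataclasses import dataclass

# Hypothetical markup invented for this sketch; NOT the framework's real format.
POINT_RE = re.compile(r"<point>\s*(\d+)\s*,\s*(\d+)\s*</point>")
BOX_RE = re.compile(
    r"<box>\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*</box>"
)


@dataclass(frozen=True)
class Point:
    """A single pixel coordinate referenced in the reasoning text."""
    x: int
    y: int


@dataclass(frozen=True)
class Box:
    """An axis-aligned bounding box given by its top-left and bottom-right corners."""
    x1: int
    y1: int
    x2: int
    y2: int

    def contains(self, p: Point) -> bool:
        # Inclusive bounds: a point on the box edge counts as inside.
        return self.x1 <= p.x <= self.x2 and self.y1 <= p.y <= self.y2


def extract_primitives(trace: str):
    """Return all points and boxes found in a reasoning trace, in order."""
    points = [Point(int(x), int(y)) for x, y in POINT_RE.findall(trace)]
    boxes = [Box(*map(int, coords)) for coords in BOX_RE.findall(trace)]
    return points, boxes


# Example trace written by hand for this sketch (not real model output).
trace = (
    "The mug is at <box>40, 60, 120, 150</box>. "
    "Its handle is near <point>115, 100</point>, "
    "so the handle is on the right side of the mug."
)

points, boxes = extract_primitives(trace)
handle_inside_mug = boxes[0].contains(points[0])
```

Because each reasoning step carries explicit coordinates, a reader or a downstream tool can check claims such as "the handle is part of the mug" against the image geometry, which is the interpretability benefit the key points describe.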

Why It Matters

By letting a model reference exact points and regions while it reasons, the framework could improve spatial precision in robotics, navigation, and visual analysis.