Research & Papers

PhyDrawGen generates physics diagrams that obey physical laws

Outperforms GPT-5 and Gemini on 1,449 physics problems without hallucinating forces

Deep Dive

PhyDrawGen tackles a critical weakness in AI-generated diagrams: systematic violations of physics. While models like GPT-5 and Gemini produce visually plausible images, they often hallucinate force vectors, ignore conservation laws, or break geometric constraints. PhyDrawGen decouples semantic understanding from physical constraint satisfaction using a three-stage neuro-symbolic pipeline. First, a large language model extracts a typed scene graph from the problem text. Then a deterministic solver converts that graph into a Planar Straight-Line Graph (PSLG), encoding force balance, optical paths, and field topologies as exact geometric primitives. Finally, a fine-tuned Qwen-VL model runs a propose-verify loop to iteratively correct violations.
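To make the second stage concrete, here is a minimal Python sketch of how typed forces from a scene graph might be turned into PSLG vertices and segments and checked for force balance. It is an illustration under stated assumptions, not the paper's implementation: the `Force` schema, `to_pslg`, `net_force`, and `is_balanced` are hypothetical names, and the real solver also covers optical paths and field topologies, which are omitted here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Force:
    """One typed scene-graph entry (hypothetical schema, not from the paper)."""
    name: str
    body: str
    magnitude: float  # newtons
    angle_deg: float  # direction, counter-clockwise from the +x axis


def to_pslg(forces, origin=(0.0, 0.0), scale=1.0):
    """Turn the forces acting at one anchor point into PSLG vertices and segments.

    Vertex 0 is the shared anchor; each force adds one tip vertex and one
    straight segment from the anchor to that tip.
    """
    vertices = [origin]
    segments = []
    for f in forces:
        theta = math.radians(f.angle_deg)
        tip = (
            origin[0] + scale * f.magnitude * math.cos(theta),
            origin[1] + scale * f.magnitude * math.sin(theta),
        )
        vertices.append(tip)
        segments.append((0, len(vertices) - 1))
    return vertices, segments


def net_force(forces):
    """Sum the x and y components of all forces."""
    fx = sum(f.magnitude * math.cos(math.radians(f.angle_deg)) for f in forces)
    fy = sum(f.magnitude * math.sin(math.radians(f.angle_deg)) for f in forces)
    return fx, fy


def is_balanced(forces, tol=1e-6):
    """Static equilibrium check: the net force must vanish within a tolerance."""
    fx, fy = net_force(forces)
    return math.hypot(fx, fy) <= tol


# A block at rest on a horizontal table: weight down, normal force up.
block_forces = [
    Force("weight", "block", 9.8, 270.0),
    Force("normal", "block", 9.8, 90.0),
]
vertices, segments = to_pslg(block_forces)
balanced = is_balanced(block_forces)
```

Because the geometry is computed rather than drawn freehand, a diagram built from these primitives cannot show a missing or extra force without the balance check failing.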

Evaluated on a benchmark of 1,449 problems spanning mechanics, optics, and electromagnetism, PhyDrawGen significantly outperformed GPT-5-image, Gemini 2.5 Flash, and Gemini 3 Pro. The system also handles problems involving unusual objects robustly, suggesting strong generalization. The paper, under review at EMNLP 2026, indicates that combining LLM reasoning with deterministic physics solvers can fix a key limitation of generative image models. For educators, researchers, and automated tutoring systems, PhyDrawGen offers a path to trustworthy AI-generated instructional diagrams.

Key Points
  • Neuro-symbolic pipeline uses LLM scene graphs + deterministic PSLG solver to encode force balance and field topologies
  • Outperforms GPT-5-image, Gemini 2.5 Flash, and Gemini 3 Pro on 1,449 mechanics, optics, and electromagnetism problems
  • Fine-tuned Qwen-VL model runs a propose-verify loop that iteratively corrects constraint violations for robust physical accuracy
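
The propose-verify loop in the last point can be sketched in a few lines of Python. This is a toy under loud assumptions: `propose_verify`, `verify_force_balance`, and `stub_propose` are hypothetical names, and the proposer here is a deterministic stub standing in for the fine-tuned Qwen-VL model, whose actual interface is not described in this summary.

```python
def propose_verify(candidate, verify, propose, max_rounds=5):
    """Alternate verification and correction until no violations remain.

    Returns (candidate, rounds_used, ok), where ok is False if the
    round budget ran out with violations still present.
    """
    for rounds_used in range(max_rounds + 1):
        violations = verify(candidate)
        if not violations:
            return candidate, rounds_used, True
        if rounds_used == max_rounds:
            break
        candidate = propose(candidate, violations)
    return candidate, max_rounds, False


def verify_force_balance(forces, tol=1e-6):
    """Deterministic verifier: report the residual net force, if any."""
    rx = sum(fx for fx, _ in forces.values())
    ry = sum(fy for _, fy in forces.values())
    if abs(rx) <= tol and abs(ry) <= tol:
        return []
    return [("net_force", rx, ry)]


def stub_propose(forces, violations):
    """Stand-in proposer (assumption): adjust the normal force to cancel
    the reported residual. A learned model would propose the correction."""
    _, rx, ry = violations[0]
    fixed = dict(forces)
    nx, ny = fixed["normal"]
    fixed["normal"] = (nx - rx, ny - ry)
    return fixed


# Draft diagram with a normal force that is too small to balance the weight.
draft = {"weight": (0.0, -9.8), "normal": (0.0, 4.0)}
result, rounds, ok = propose_verify(draft, verify_force_balance, stub_propose)
```

The design point is that acceptance rests with the deterministic verifier rather than the proposer, so a proposal that still violates a constraint is never returned as valid.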

Why It Matters

Enables AI to generate trustworthy physics diagrams for education and research without hallucinating force vectors.