MTM Dataset: 4,654 math tutoring transcripts open to researchers
A multimodal dataset of real tutoring sessions aims to supercharge AI tutoring systems.
Researchers led by René Kizilcec, working with colleagues from multiple institutions, have released the Million Tutoring Moves (MTM) dataset, presented as a first-of-its-kind open multimodal resource for studying tutoring interactions. MTM v1 contains 4,654 math tutoring transcripts collected from a U.S.-based nonprofit online tutoring platform. The dataset is part of the broader National Tutoring Observatory (NTO) infrastructure, which aims to translate authentic tutoring interactions into actionable insights for research, practice, and the development of AI-powered educational technology.
By making these tutoring sessions systematically observable and analyzable, MTM v1 supports research on instructional processes and enables the creation of AI systems grounded in real educational interactions. The dataset is described as safe, open, large-scale, broad-coverage, and multimodal, and it is intended as a foundation for future expansions. The initiative could accelerate the science of tutoring and help build more effective AI tutors that learn from how humans actually teach math.
- MTM v1 includes 4,654 math tutoring transcripts from a U.S. nonprofit online tutoring platform.
- The dataset is part of the National Tutoring Observatory (NTO) research infrastructure.
- It aims to support AI development grounded in real educational interactions and to improve tutoring practice.
Why It Matters
Open tutoring data could enable AI tutors that learn from real human teaching, with the potential to improve educational outcomes at scale.