AI Safety

New study: LLMs are misaligned with teachers' perceptions of AI across 55 countries

Eight state-of-the-art models overestimate both the benefits and the risks of AI in education.

Deep Dive

Researchers combined representative OECD TALIS survey data from 55 countries with systematic evaluations of eight state-of-the-art large language models from four providers. They measured cross-national variation in teachers' perceived benefits and risks of AI, then benchmarked LLM responses under both general and country-specific prompting conditions, comparing higher- and lower-reasoning models. The findings reveal substantial misalignment: LLMs compress differences between countries, overestimate both benefits and risks, and improve only slightly with country-specific (identity) prompting or stronger reasoning capabilities.
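To make the three kinds of misalignment concrete, here is a minimal Python sketch of how they could be quantified. It is not the paper's method: the metric names (mean_bias, spread_ratio, rank_corr), the helper functions, and all numbers are hypothetical, chosen only to illustrate overestimation, compressed country differences, and partially preserved rankings.

```python
from statistics import mean, pstdev


def ranks(values):
    # Rank positions (1 = smallest); assumes no tied values for simplicity.
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(x, y):
    # Spearman rank correlation using the no-ties formula.
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(rx, ry))
    return 1 - 6 * d_squared / (n * (n * n - 1))


def alignment_summary(survey, llm):
    # survey and llm map country -> score on the same scale (assumed 0..1).
    countries = sorted(survey)
    s = [survey[c] for c in countries]
    m = [llm[c] for c in countries]
    return {
        # > 0 means the model overestimates on average.
        "mean_bias": mean(m) - mean(s),
        # < 1 means the model compresses differences between countries.
        "spread_ratio": pstdev(m) / pstdev(s),
        # Near 1 means the model preserves the country ranking.
        "rank_corr": spearman(s, m),
    }


# Made-up illustrative values, not data from the study.
survey = {"A": 0.30, "B": 0.50, "C": 0.70, "D": 0.90}
llm = {"A": 0.78, "B": 0.80, "C": 0.84, "D": 0.82}

summary = alignment_summary(survey, llm)
print(summary)
```

With these invented numbers the model scores sit well above the survey scores, vary about a tenth as much across countries, and still mostly preserve the country ordering. That is the pattern the summary describes, shown on toy data only.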

The misalignment matters because LLM-generated guidance and professional discourse increasingly shape how teachers learn about and discuss AI, potentially influencing their trust and adoption decisions. The paper cautions against treating LLM outputs as substitutes for direct engagement with educators when informing global AI-in-education initiatives. That said, some models (e.g., Gemini 3 Fast) partially capture cross-national ranking patterns, suggesting a complementary role in hypothesis generation and exploratory comparative analysis. The study was accepted as a full paper at the 13th ACM Conference on Learning @ Scale (L@S'26).

Key Points
  • LLMs compress cross-national differences in teacher AI perceptions across 55 countries
  • Eight models from four providers (including Gemini 3 Fast) overestimate both benefits and risks
  • Identity prompting and enhanced reasoning produced limited gains in alignment accuracy

Why It Matters

LLM-generated guidance may skew global education policy by misrepresenting teachers' actual AI perceptions.