Research & Papers

How Frontier LLMs Adapt to Neurodivergence Context: A Measurement Framework for Surface vs. Structural Change in System-Prompted Responses

New benchmark tests whether chatbots truly adjust for neurodivergent users: findings show surface-level changes but persistent harm risks

Deep Dive

A new arXiv paper introduces NDBench, a framework for measuring how frontier chat-based LLMs change their outputs when the system prompt includes neurodivergence (ND) context. The benchmark tests two frontier models under three system-prompt conditions (baseline, ND-profile assertion, and ND-profile assertion with explicit adjustment instructions), covering four canonical ND profiles and 24 prompts, including an adversarial masking strategy. Across the 576 resulting outputs, the study reports four consistent trends.
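
The 576 figure is consistent with a fully crossed design: 2 models x 3 conditions x 4 profiles x 24 prompts. The sketch below enumerates such a grid to make the arithmetic concrete. It is an illustration only: the labels are placeholders (the summary does not name the models, profiles, or prompts), and full crossing, including how the baseline condition is paired with profiles, is an assumption rather than something the paper is known to do.

```python
from itertools import product

# Placeholder labels (hypothetical); the real names are not given in this summary.
MODELS = ["model_a", "model_b"]
CONDITIONS = ["baseline", "nd_profile", "nd_profile_plus_instructions"]
PROFILES = [f"profile_{i}" for i in range(1, 5)]    # 4 ND profiles
PROMPTS = [f"prompt_{i:02d}" for i in range(1, 25)]  # 24 prompts


def build_grid():
    """Return every (model, condition, profile, prompt) cell of a fully crossed design."""
    return list(product(MODELS, CONDITIONS, PROFILES, PROMPTS))


grid = build_grid()
print(len(grid))  # 2 * 3 * 4 * 24 = 576
```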

The models adapt substantially to ND context only when given explicit instructions: outputs become longer, with more headings and more granular steps (p < 10^-8, Holm-corrected). However, the change is uneven: heading and step counts rise, while list density barely changes. Crucially, ND persona assertion alone does not suppress harmful tendencies; masking-reinforcement drops by 36–44% only in explicitly instructed cases. The authors also caution that, in their reliability analysis, only two of six harm dimensions (masking/reinforcement and validation quality) meet the inter-judge agreement threshold (alpha >= 0.67). NDBench, including prompts, outputs, and code, is publicly available to enable reproducible audits of ND awareness in future LLMs.

Key Points
  • NDBench evaluates 576 outputs from two frontier models across 3 prompt types, 4 ND profiles, and 24 prompts including adversarial masking
  • Explicit ND instructions yield longer, more structured responses (p < 10^-8, Holm-corrected), but list density barely changes, showing surface-level adaptation
  • Harmful masking-reinforcement decreases by 36–44% only under explicit instructions; ND persona assertion alone fails to reduce it
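
For readers unfamiliar with the "Holm-corrected" qualifier above, the following is a minimal sketch of the Holm step-down adjustment for multiple comparisons. It is not taken from the NDBench code. The function name `holm_adjust` and the raw p-values are made up for illustration.

```python
def holm_adjust(p_values):
    """Return Holm step-down adjusted p-values, in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is scaled by (m - k + 1).
        candidate = min(1.0, (m - rank) * p_values[idx])
        # Enforce monotonicity: an adjusted value never falls below an earlier one.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Hypothetical raw p-values, not results from the paper.
raw = [0.01, 0.04, 0.03, 0.005]
adj = holm_adjust(raw)
print(adj)
```

With these four made-up inputs the adjusted values come out at approximately 0.03, 0.06, 0.06, and 0.02. The last raw value (0.005) is scaled by 4, the next smallest by 3, and so on, with the monotonicity step lifting the adjusted value for 0.04 up to 0.06.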

Why It Matters

The results suggest that stating a user's neurodivergent profile is not enough: in the two models tested, masking-reinforcement fell only when the system prompt also gave explicit adjustment instructions.