One Language, Two Scripts: Probing Script-Invariance in LLM Concept Representations
SAE features in LLMs capture meaning, not spelling, even when scripts share zero tokens.
A new research paper accepted at the UCRL Workshop at ICLR 2026 provides compelling evidence that the internal concepts learned by large language models (LLMs) are abstract and meaning-based rather than tied to surface-level text. Researcher Sripad Karne used Serbian digraphia, in which the language is written interchangeably in the Latin and Cyrillic scripts with a near-perfect character-level mapping, as a controlled testbed. Crucially, the two scripts are tokenized entirely differently and share no tokens at all. By analyzing Sparse Autoencoder (SAE) feature activations across the Gemma model family (from 270M to 27B parameters), the study found that identical sentences in the two scripts activate highly overlapping sets of features, far exceeding random baselines.
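To make the testbed concrete, here is a minimal sketch of the Serbian Latin-to-Cyrillic character mapping (lowercase only). It is an illustration of the writing system, not the paper's preprocessing code. The digraphs lj, nj, and dž map to single Cyrillic letters, and rare words like "injekcija" (where n and j are separate letters) are why the mapping is "near-perfect" rather than strictly one-to-one.

```python
# Serbian Latin -> Cyrillic mapping (lowercase). Digraphs are listed
# separately so they can be matched before single letters.
LATIN_TO_CYRILLIC = {
    "lj": "љ", "nj": "њ", "dž": "џ",  # the only multi-character cases
    "a": "а", "b": "б", "v": "в", "g": "г", "d": "д", "đ": "ђ",
    "e": "е", "ž": "ж", "z": "з", "i": "и", "j": "ј", "k": "к",
    "l": "л", "m": "м", "n": "н", "o": "о", "p": "п", "r": "р",
    "s": "с", "t": "т", "ć": "ћ", "u": "у", "f": "ф", "h": "х",
    "c": "ц", "č": "ч", "š": "ш",
}

def to_cyrillic(text: str) -> str:
    """Transliterate lowercase Serbian Latin text to Cyrillic."""
    out, i = [], 0
    while i < len(text):
        # Try the two-character digraphs before single characters.
        pair = text[i : i + 2]
        if pair in LATIN_TO_CYRILLIC:
            out.append(LATIN_TO_CYRILLIC[pair])
            i += 2
        else:
            # Characters outside the mapping (spaces, punctuation) pass through.
            out.append(LATIN_TO_CYRILLIC.get(text[i], text[i]))
            i += 1
    return "".join(out)

print(to_cyrillic("deca se igraju u parku"))  # -> деца се играју у парку
```

The meaning is identical in both renderings, yet a subword tokenizer sees two byte sequences with no tokens in common, which is what makes the comparison such a clean probe.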
Strikingly, changing the script caused less representational divergence than paraphrasing within the same script, indicating that SAE features prioritize semantic meaning over orthographic form. Comparisons that crossed both script and paraphrase at once provided evidence against mere memorization: such combinations rarely co-occur in training data, yet still showed substantial feature overlap. The study also found that this script invariance strengthens with model scale, suggesting that larger models develop more abstract internal representations. The author proposes Serbian digraphia as a general evaluation paradigm for probing the abstractness of learned representations in AI systems, a cleaner setup than previous techniques that could not perfectly control for meaning.
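The paper's exact overlap metric is not reproduced here, but the sketch below shows one plausible way to measure it: treat the set of SAE features that fire anywhere in a sentence as its feature fingerprint, then compare fingerprints with a Jaccard index. The activation threshold, the Jaccard choice, and the synthetic activations (stand-ins for real SAE outputs over a Gemma model's activations) are all assumptions; the overlap ordering in the fake data is built in to mirror the paper's qualitative finding, not derived from it.

```python
import numpy as np

rng = np.random.default_rng(0)

def fake_sparse_acts(num_tokens: int = 12, num_features: int = 16_384,
                     p_active: float = 0.005) -> np.ndarray:
    """Synthetic stand-in for SAE activations: a sparse (tokens x features) matrix."""
    mask = rng.random((num_tokens, num_features)) < p_active
    return mask * rng.random((num_tokens, num_features))

def active_features(acts: np.ndarray, threshold: float = 0.0) -> set[int]:
    """Indices of SAE features that fire above the threshold on any token."""
    return set(np.flatnonzero(acts.max(axis=0) > threshold))

def jaccard(a: set[int], b: set[int]) -> float:
    """Overlap between two feature sets: |A ∩ B| / |A ∪ B|."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

# In the real experiment, each condition's activations would come from
# running a pretrained SAE over the model's activations for that sentence.
acts_latin = fake_sparse_acts()
# Cross-script: mostly the same features, with small perturbations.
acts_cyrillic = (acts_latin * (rng.random(acts_latin.shape) > 0.10)
                 + fake_sparse_acts(p_active=0.0005))
# Same-script paraphrase: larger perturbations than a script change.
acts_paraphrase = (acts_latin * (rng.random(acts_latin.shape) > 0.30)
                   + fake_sparse_acts(p_active=0.0015))
# Random baseline: an unrelated sentence.
acts_unrelated = fake_sparse_acts()

base = active_features(acts_latin)
print(f"cross-script overlap: {jaccard(base, active_features(acts_cyrillic)):.2f}")
print(f"paraphrase overlap:   {jaccard(base, active_features(acts_paraphrase)):.2f}")
print(f"random baseline:      {jaccard(base, active_features(acts_unrelated)):.2f}")
```

Run on the synthetic data, this prints cross-script > paraphrase >> random, the ordering the paper reports for real SAE features.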
- SAE features in Gemma models show high overlap for identical meanings in different scripts (Latin vs. Cyrillic), despite zero shared tokens.
- Script changes cause less representational divergence than paraphrasing within the same script, evidence that features capture semantics rather than surface form.
- The 'script invariance' effect strengthens with model scale across the Gemma family (270M to 27B parameters).
Why It Matters
Provides a cleaner method to evaluate whether AI models represent abstract concepts rather than surface text, which is crucial for developing more robust and interpretable systems.