AI Safety

In-context learning of representations can be explained by induction circuits

New analysis shows induction circuits, not complex reasoning, drive LLMs' ability to learn graph structures from context.

Deep Dive

AI researcher Andy Arditi published an ICLR 2026 blogpost challenging a prominent interpretation of how large language models learn from context. Revisiting Park et al.'s 2025 findings, in which LLMs such as Llama-3.1-8B process random walks on graphs and develop internal representations mirroring the graph structure, Arditi argues that the phenomenon can be explained by induction circuits: well-documented mechanisms for in-context bigram recall, first described by Elhage et al. (2021) and Olsson et al. (2022). This simpler explanation undercuts the broader claim that LLMs actively manipulate their representations to reflect concept semantics specified entirely within the context.
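To make the proposed mechanism concrete, here is a minimal sketch of the in-context bigram recall that induction circuits are described as implementing: on seeing a token, find its most recent earlier occurrence in the context and copy the token that followed it. The function and example walk below are illustrative assumptions, not Arditi's code.

```python
def induction_predict(context, current):
    """In-context bigram recall: find the most recent earlier occurrence of
    `current` in the context and return the token that followed it."""
    for i in range(len(context) - 1, 0, -1):
        if context[i - 1] == current:
            return context[i]
    return None  # token never seen before: the circuit has nothing to copy

# After 'car' was followed by 'bird' earlier in the walk, the circuit
# predicts 'bird' the next time 'car' appears.
walk = ["apple", "bird", "car", "bird", "apple", "bird", "car"]
print(induction_predict(walk, "car"))  # -> 'bird'
```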

Arditi reproduced Park et al.'s experiments on the grid tracing task, in which 16 word tokens (e.g., 'apple,' 'bird,' 'car') are arranged in a 4×4 grid and the model processes sequences of random walks over the grid. He confirmed the headline results: Llama-3.1-8B achieves ~80% accuracy after processing 1400 tokens, and PCA visualizations of layer 26 activations show neighboring grid tokens clustering in representation space. His analysis, however, demonstrates that induction circuits, which simply recall which token last followed the current token earlier in the context, suffice to explain both the performance gains and the emergent geometric structure, without requiring more sophisticated semantic-manipulation capabilities. The work provides valuable mechanistic clarity about what is actually happening inside transformer models during such in-context learning tasks.
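For a concrete picture of the setup, here is a minimal sketch of grid tracing data generation, assuming 4-neighbor (up/down/left/right) adjacency and uniform sampling over neighbors; the word list beyond 'apple,' 'bird,' and 'car' is invented for illustration, and Park et al.'s exact vocabulary and sampling details may differ.

```python
import random

# 16 words laid out on a 4x4 grid; a walk moves between adjacent cells.
WORDS = ["apple", "bird", "car", "dog", "egg", "fish", "goat", "hat",
         "ink", "jam", "key", "lamp", "moon", "nut", "owl", "pen"]
GRID = {(r, c): WORDS[4 * r + c] for r in range(4) for c in range(4)}

def neighbors(pos):
    """Cells adjacent to `pos` under 4-neighbor grid adjacency."""
    r, c = pos
    return [p for p in [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
            if p in GRID]

def random_walk(length=1400, seed=0):
    """Sample a random walk over the grid, returned as a word sequence."""
    rng = random.Random(seed)
    pos = rng.choice(list(GRID))
    walk = [GRID[pos]]
    for _ in range(length - 1):
        pos = rng.choice(neighbors(pos))
        walk.append(GRID[pos])
    return walk

print(" ".join(random_walk(20)))
```

The model is then fed such a walk as plain context and asked to continue it; accuracy is measured by whether its next-token predictions are legal grid moves.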

Key Points
  • Induction circuits (from 2021/2022 research) explain LLMs' ability to learn graph structures from context, not complex semantic manipulation
  • Reproduction shows Llama-3.1-8B reaches ~80% accuracy on the grid tracing task after processing 1400 tokens of random walks (a bigram-recall baseline for this metric is sketched after this list)
  • PCA visualizations confirm neighboring tokens cluster in activation space, but this geometry emerges from simple bigram prediction mechanisms
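To see why bigram recall suffices for the accuracy result, consider this sketch of a pure in-context bigram-recall baseline scored on legal grid moves. It combines the two sketches above into one self-contained script; token names, walk length, and the scoring rule are assumptions. This idealized recaller saturates far faster than the LLM's ~80% figure, so the point is only that the mechanism produces valid moves, not that it matches the model's learning curve quantitatively.

```python
import random

rng = random.Random(0)
cells = [(r, c) for r in range(4) for c in range(4)]
word = {p: f"w{i}" for i, p in enumerate(cells)}      # hypothetical token names
nbrs = {(r, c): [q for q in [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                 if q in word] for (r, c) in cells}
valid = {word[p]: {word[q] for q in nbrs[p]} for p in cells}  # legal next tokens

# Sample a 1400-step walk, matching the reproduction's context length.
pos = rng.choice(cells)
walk = [word[pos]]
for _ in range(1400):
    pos = rng.choice(nbrs[pos])
    walk.append(word[pos])

last_next = {}  # in-context bigram memory: token -> token that last followed it
correct = 0
for t in range(1, len(walk) - 1):
    last_next[walk[t - 1]] = walk[t]       # record the bigram just observed
    pred = last_next.get(walk[t])          # induction step: copy the last follower
    if pred is not None and pred in valid[walk[t]]:  # score as a legal grid move
        correct += 1
    if t % 200 == 0:
        print(f"after {t:4d} tokens: {correct / t:.2f} valid-prediction rate")
```

Because any recalled follower was itself a step of the walk, it is a legal neighbor by construction: once every token has appeared in context, bigram recall alone yields valid moves.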

Why It Matters

Clarifies fundamental LLM mechanisms, helping researchers build more interpretable models and avoid overestimating AI capabilities.