[P] Visualizing LM architecture and data flow with Q subspace projection
A novel technique creates interactive 3D 'MRI scans' of language models like Qwen and Mamba, revealing their internal structure.
An independent researcher operating under the handle y3i12 has developed a visualization technique that acts like an 'MRI scan' for language models. By applying what they describe as 'black magic and voodoo' (specifically, a method involving Q subspace projection), they can generate stunning 3D volumetric representations of a model's internal architecture and data flow. The resulting interactive visualizations, which the creator finds 'mesmerizing,' reveal the intricate 'structure of structures' within models such as Hugging Face's SmolLM-360M, Qwen's 0.8B-parameter model, and the Mamba-370M state-space model.
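The post does not spell out the method, so the following is a minimal sketch of one plausible reading of 'Q subspace projection': projecting each layer's hidden states onto the top right-singular vectors of that layer's attention query weights, then stacking the per-layer 3D point clouds into a volume. Everything here is an assumption rather than the author's actual code; the attribute names (`model.model.layers`, `self_attn.q_proj`) apply only to Llama-style checkpoints such as SmolLM-360M, and attention-free models like Mamba would need a different choice of basis.

```python
# Hypothetical sketch of 'Q subspace projection' (an assumption, not the
# author's published method): project hidden states onto the top-3 right
# singular vectors of each layer's query weight matrix, then stack the
# per-layer 3D point clouds into a volume.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM-360M"  # one of the models named in the post
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, output_hidden_states=True)
model.eval()

inputs = tok("The quick brown fox jumps over the lazy dog.", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).hidden_states  # (n_layers + 1) tensors of [1, seq, d]

clouds = []
for layer, h in zip(model.model.layers, hidden[1:]):
    w_q = layer.self_attn.q_proj.weight.detach().float()  # [d_q, d_model]
    # The top right-singular vectors span the input directions W_Q responds
    # to most strongly; the first three serve as a 3D 'Q subspace' basis.
    _, _, vh = torch.linalg.svd(w_q, full_matrices=False)
    basis = vh[:3].T                          # [d_model, 3]
    clouds.append(h[0].float() @ basis)       # [seq, 3] point cloud per layer

volume = torch.stack(clouds)                  # [n_layers, seq, 3] slice stack
print(volume.shape)
```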
The technique, shared via a Gist link, lets viewers explore models from different angles, uncovering what the researcher hypothesizes could be a form of 'mediator surface' or a novel interpretation of the model's 'loss landscape.' A key visual shows a complex, web-like structure emerging from the RWKV-4-430M model. This work moves beyond standard performance metrics and attention heatmaps, offering a more intuitive lens for researchers to understand how information is organized and processed inside AI architectures ranging from transformers to newer state-space designs.
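The original post ships its own interactive viewer; as a stand-in for the 'explore from different angles' aspect, the stacked point cloud from the sketch above can be rendered as a rotatable 3D scatter. Plotly is an illustrative choice here, not the renderer the researcher used.

```python
# Render the per-layer point clouds from the previous sketch as a
# rotatable 3D scatter; coloring by layer index keeps the 'MRI slices'
# visually separable.
import torch
import plotly.graph_objects as go

pts = volume.reshape(-1, 3).numpy()           # 'volume' from the sketch above
layer_idx = torch.arange(volume.shape[0]).repeat_interleave(volume.shape[1]).numpy()

fig = go.Figure(go.Scatter3d(
    x=pts[:, 0], y=pts[:, 1], z=pts[:, 2],
    mode="markers",
    marker=dict(size=3, color=layer_idx, colorscale="Viridis",
                colorbar=dict(title="layer")),
))
fig.update_layout(scene=dict(xaxis_title="q1", yaxis_title="q2", zaxis_title="q3"))
fig.show()
```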
- Creates interactive 3D 'MRI scans' of models like Qwen3.5-0.8B and Mamba-370M using Q subspace projection.
- Visualizes the internal 'structure of structures' and data flow, revealing potential 'mediator surfaces'.
- Offers a new, intuitive method for researchers to interpret complex model internals and loss landscapes beyond 2D graphs.
Why It Matters
Provides AI researchers with a powerful new diagnostic and explanatory tool for understanding the complex internal workings of modern neural networks.