A Miniature Brain Transformer: Thalamic Gating, Hippocampal Lateralization, Amygdaloid Salience, and Prefrontal Working Memory in Attention-Coupled Latent Memory
New neuroscience-inspired AI model shows a 'sharp, discontinuous phase transition' when adding a prefrontal cortex module.
Researcher Hong Jeong has published a paper on arXiv titled 'A Miniature Brain Transformer: Thalamic Gating, Hippocampal Lateralization, Amygdaloid Salience, and Prefrontal Working Memory in Attention-Coupled Latent Memory.' The proposed model extends standard transformer designs by incorporating computational analogues of key brain regions: a thalamic relay for gating, an amygdaloid salience module, lateralized hippocampal memory banks, a prefrontal cortex (PFC) working-memory buffer, and a cerebellar fast path. These components are coupled via inhibitory 'callosal' cross-talk, mimicking the communication between brain hemispheres.
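To make the coupling concrete, here is a minimal PyTorch sketch of how such modules might be wired together. Every layer choice, gating function, and coupling term below is an illustrative assumption based on this summary; the paper's actual equations, dimensions, and hyperparameters are not reproduced here.

```python
import torch
import torch.nn as nn

class MiniBrainBlock(nn.Module):
    """Hypothetical sketch of the coupled modules described in the article.

    The specific layers, the rectified inhibitory term, and the 0.95
    slow-drift factor are assumptions, not the paper's implementation.
    """

    def __init__(self, d_model: int):
        super().__init__()
        self.thalamic_gate = nn.Sequential(nn.Linear(d_model, d_model), nn.Sigmoid())
        self.salience = nn.Sequential(nn.Linear(d_model, 1), nn.Softplus())
        self.hippo_left = nn.GRUCell(d_model, d_model)    # lateralized memory banks
        self.hippo_right = nn.GRUCell(d_model, d_model)
        self.callosal = nn.Linear(d_model, d_model, bias=False)  # inhibitory cross-talk
        self.pfc = nn.GRUCell(d_model, d_model)           # working-memory buffer
        self.cerebellar = nn.Linear(d_model, d_model)     # fast feed-forward path

    def forward(self, x, h_left, h_right, h_pfc):
        gated = self.thalamic_gate(x) * x      # thalamic relay gates the input
        drive = self.salience(x) * gated       # amygdaloid salience scales the drive
        # Each hippocampal bank receives the shared drive minus an inhibitory
        # projection of the *other* bank's state (the 'callosal' coupling).
        new_left = self.hippo_left(drive - torch.relu(self.callosal(h_right)), h_left)
        new_right = self.hippo_right(drive - torch.relu(self.callosal(h_left)), h_right)
        # The PFC buffer updates slowly, producing a drifting context signal.
        new_pfc = 0.95 * h_pfc + 0.05 * self.pfc(drive, h_pfc)
        out = new_left + new_right + new_pfc + self.cerebellar(x)  # cerebellar fast path
        return out, new_left, new_right, new_pfc
```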
The central, surprising finding came from testing the model on two benchmarks: Multi-Query Associative Recall (episodic memory) and modular arithmetic (rule-based reasoning). Through systematic ablation studies, Jeong found that inhibitory coupling alone failed to induce functional lateralization in the hippocampal banks. A dramatic shift occurred only when the PFC working-memory buffer was added: at epoch 10 or 11, a 'sharp, discontinuous phase transition' collapsed a key correlation metric (P_ct) from 0.25 to ~0.002 and roughly doubled a separation metric (D_sep) from 0.251 to 0.501 in a single gradient step.
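The summary does not define P_ct or D_sep precisely. One plausible reading is that P_ct measures how correlated the two hippocampal banks' activations are (high = redundant, low = lateralized), while D_sep measures how far apart their representations sit. A hypothetical sketch of metrics along those lines:

```python
import torch

def p_ct(h_left: torch.Tensor, h_right: torch.Tensor) -> float:
    """Hypothetical cross-talk metric: mean |Pearson correlation| between
    matching units of the two banks, over tensors of shape (batch, d_model).
    A collapse from ~0.25 toward ~0 would mean the banks have decorrelated."""
    zl = (h_left - h_left.mean(0)) / (h_left.std(0) + 1e-8)
    zr = (h_right - h_right.mean(0)) / (h_right.std(0) + 1e-8)
    return (zl * zr).mean(0).abs().mean().item()

def d_sep(h_left: torch.Tensor, h_right: torch.Tensor) -> float:
    """Hypothetical separation metric: normalized distance between the
    banks' mean activation vectors; larger values indicate distinct codes."""
    mu_l, mu_r = h_left.mean(0), h_right.mean(0)
    denom = mu_l.norm() + mu_r.norm() + 1e-8
    return ((mu_l - mu_r).norm() / denom).item()
```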
This result yields a novel, falsifiable prediction: no lateralization occurs without working-memory context. The PFC buffer acts as a symmetry breaker; its slowly drifting context creates an initial asymmetry that inhibitory feedback then irreversibly amplifies. The cerebellar component accelerated the transition by one epoch, confirming its role in convergence speed. The work provides a principled, biologically inspired blueprint for designing hierarchical, persistent memory systems in next-generation sequence models, moving beyond purely engineering-driven architectures.
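The claimed mechanism, a small drifting bias amplified by mutual inhibition into a stable asymmetry, can be illustrated with a toy two-unit simulation. The dynamics, gains, and noise level below are assumptions chosen only to show the symmetry-breaking effect, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(0)
a = b = 0.5          # two initially identical "banks"
ctx = 0.0            # slowly drifting PFC-like context signal
K_INH, K_CTX, LR = 1.5, 0.05, 0.1  # assumed gains; K_INH > 1 makes symmetry unstable

for _ in range(200):
    ctx += 0.01 * rng.standard_normal()            # slow random drift
    da = LR * (1.0 + K_CTX * ctx - K_INH * b - a)  # each unit inhibits the other
    db = LR * (1.0 - K_CTX * ctx - K_INH * a - b)
    a, b = max(a + da, 0.0), max(b + db, 0.0)      # activities stay non-negative

# A tiny, transient context bias ends up as a large, stable asymmetry:
print(f"a = {a:.3f}, b = {b:.3f}")
```

With the inhibitory gain above 1, the symmetric state is unstable, so whichever unit the context nudges ahead suppresses the other and the split persists even after the bias fades, mirroring the irreversibility described above.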
- The architecture requires a PFC-like working memory buffer to achieve functional lateralization, triggering a performance 'phase transition.'
- Adding the PFC buffer caused key metrics to shift dramatically in one step (P_ct from 0.25 to ~0.002, D_sep from 0.251 to 0.501).
- The model offers a neurobiologically motivated blueprint for building persistent memory in AI, validated on episodic and rule-based tasks.
Why It Matters
This research provides a neuroscience-backed framework for building AI with more robust, human-like memory and reasoning capabilities, moving beyond black-box engineering.