Phase-Associative Memory: Sequence Modeling in Complex Hilbert Space
A 100M-parameter recurrent model built on complex-valued arithmetic achieves perplexity within roughly 10% of a matched transformer, despite a 4x computational overhead.
Researchers Gowrav Vishwakarma and Christopher J. Agostino have introduced Phase-Associative Memory (PAM), a recurrent sequence model that departs from standard real-valued architectures by operating entirely in complex Hilbert space. In PAM, all representations are complex-valued, associations accumulate in a matrix state via outer products, and retrieval uses a conjugate inner product. The team trained a ~100M-parameter version on the WikiText-103 benchmark, where it achieved a validation perplexity of 30.0. That result is notable because it comes within roughly 10% of a matched transformer's score of 27.1, even though PAM incurs a 4x arithmetic overhead from complex-number operations and runs without custom, optimized kernels.
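To make the write/read cycle concrete, here is a minimal NumPy sketch of the mechanism the paragraph describes: bind a key to a value with a complex outer product, accumulate bindings into a matrix state, and retrieve with a conjugate inner product. The function names, dimensions, and the unit-modulus key construction are illustrative assumptions, not the paper's exact formulation (which may include decay, normalization, or learned projections).

```python
import numpy as np

rng = np.random.default_rng(0)
d = 64  # hypothetical state dimension

def pam_write(M, k, v):
    # Accumulate one association into the matrix state: M += v k^H.
    # (Sketch only; the paper's update rule may add decay or gating.)
    return M + np.outer(v, np.conj(k))

def pam_read(M, q):
    # Conjugate-inner-product retrieval: M q = sum_i v_i <k_i, q>,
    # so a query phase-aligned with a stored key recalls that key's value.
    return M @ q

# Unit-modulus keys with random phases are nearly orthogonal on average,
# so superposed associations interfere only weakly with one another.
n = 8
keys = np.exp(1j * rng.uniform(0, 2 * np.pi, (n, d))) / np.sqrt(d)
vals = rng.standard_normal((n, d)) + 1j * rng.standard_normal((n, d))

M = np.zeros((d, d), dtype=complex)
for k, v in zip(keys, vals):
    M = pam_write(M, k, v)

# Querying with a stored key recovers its value up to crosstalk noise.
out = pam_read(M, keys[3])
cos = np.abs(np.vdot(vals[3], out)) / (np.linalg.norm(vals[3]) * np.linalg.norm(out))
print(f"cosine similarity to stored value: {cos:.3f}")  # close to 1
```

Because each association occupies a d×d matrix rather than a single d-dimensional vector, many bindings can coexist before interference dominates, which is the property the next paragraph's experimental journey turns on.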
The paper details an experimental journey from simpler vector-state models, where 'holographic binding' failed due to capacity limitations, to the successful matrix-state approach of PAM. The authors argue that the competitiveness of an architecture built on complex-valued superposition and conjugate retrieval aligns with recent empirical evidence suggesting that semantic interpretation in both humans and large language models (LLMs) exhibits 'non-classical contextuality.' This challenges purely classical computational formalisms for language. The work implies that the mathematical foundations of AI—specifically the choice between real and complex number systems—could have profound implications for modeling the nuanced, contextual nature of meaning.
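The capacity failure of the vector-state precursor is easy to reproduce. The paper's exact vector-state formulation is not given here, so the sketch below uses a standard stand-in, FHRR-style phasor binding (elementwise complex multiplication), to show why superposing many bindings in a single d-dimensional vector breaks down: crosstalk grows with the number of stored pairs until clean-up retrieval fails, whereas the matrix state above scales its storage with d².

```python
import numpy as np

rng = np.random.default_rng(1)
d = 256  # hypothetical vector dimension

def phasors(n):
    # Random unit-modulus complex vectors (FHRR-style codebook entries).
    return np.exp(1j * rng.uniform(0, 2 * np.pi, (n, d)))

def recall_accuracy(n_pairs):
    # Superpose n_pairs key-value bindings in ONE d-dim vector, then check
    # how often unbinding + nearest-neighbor clean-up recovers the right value.
    keys, vals = phasors(n_pairs), phasors(n_pairs)
    state = np.sum(keys * vals, axis=0)    # elementwise binding, then superposition
    hits = 0
    for i in range(n_pairs):
        est = state * np.conj(keys[i])     # conj(k) inverts the phase binding
        sims = np.abs(vals.conj() @ est)   # clean-up against the value codebook
        hits += int(np.argmax(sims) == i)
    return hits / n_pairs

for n in (4, 16, 64, 256):
    print(f"{n:4d} bindings -> recall {recall_accuracy(n):.2f}")
# Accuracy degrades as bindings pile up: the kind of capacity wall that,
# per the paper's account, motivated the move to a matrix state.
```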
- PAM is a 100M-parameter recurrent model using complex-valued representations, achieving a perplexity of 30.0 on WikiText-103.
- It performs within roughly 10% of a matched transformer's score (27.1), despite a 4x computational overhead from complex arithmetic.
- The authors connect its success to theories of 'non-classical contextuality' in semantic understanding, suggesting new AI foundations.
Why It Matters
Challenges the dominance of the transformer architecture and suggests that complex-number math may be better suited to modeling the contextual nature of language.