Use more text than one token to avoid neuralese
A viral post argues that sampling a single token per step is a 'pretty big squash' of AI potential.
A viral technical post argues that collapsing a transformer's full output vector into a single sampled token for the next input is a massive information bottleneck, calling it a 'pretty big squash.' One remedy, dubbed 'neuralese,' would pass the raw hidden-state vector forward to preserve bandwidth, but at the cost of human-readable intermediates. The post instead suggests richer text, such as multi-token sequences per step, which could widen the bottleneck while staying in natural language. This challenges the assumption that unlocking more capability requires abandoning interpretable intermediates.
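To get a feel for the scale of the squash, here is a back-of-envelope sketch in Python. The dimensions below (d_model, fp16 activations, a GPT-2-sized vocabulary) are illustrative assumptions, not figures from the post:

```python
import math

# Back-of-envelope estimate of the "squash": how many bits survive
# when a transformer's output vector is collapsed into one sampled token.
# All dimensions are illustrative assumptions, not figures from the post.

d_model = 4096          # hidden-state width (roughly a 7B-class model)
bits_per_float = 16     # fp16 activations
vocab_size = 50_257     # GPT-2-style vocabulary

vector_bits = d_model * bits_per_float   # raw bandwidth of the hidden state
token_bits = math.log2(vocab_size)       # information in one discrete token

print(f"hidden-state vector: {vector_bits:,} bits")
print(f"one sampled token:   {token_bits:.1f} bits")
print(f"compression factor:  ~{vector_bits / token_bits:,.0f}x")
```

Raw bit counts overstate the effective gap, since fp16 activations carry far less than 16 bits of usable information each, but the orders-of-magnitude mismatch is what the 'squash' metaphor is pointing at.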
Why It Matters
Rethinking this interface could deepen per-step reasoning without sacrificing the interpretability that text intermediates provide, with implications for how future LLMs are designed.