Research & Papers

[D] Probabilistic Neuron Activation in Predictive Coding Algorithm using 1 Bit LLM Architecture

A viral post argues for ditching backpropagation for a probabilistic, hardware-native AI approach.

Deep Dive

A viral Reddit post by user Sevdat proposes a radical shift in AI architecture, suggesting that predictive coding algorithms be combined with a 1-bit Large Language Model (LLM) design. The core idea is to eliminate standard backpropagation training, which the author argues is ill-suited for non-deterministic, stochastic systems. Instead, each artificial neuron would simply activate or not based on a calculated probability. This approach, the post claims, would dramatically increase computational efficiency and reduce memory usage, but it hinges on the development of new, specialized "stochastic hardware" that can natively handle randomness at the transistor level.
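
To make the activation rule concrete, here is a minimal Python sketch of a stochastic 1-bit neuron: weights are quantized to a single sign bit, and the neuron fires with a sigmoid-shaped probability of its weighted input. The sigmoid choice, the scaling, and the function names are illustrative assumptions, not details from the post, and a software RNG stands in for the physical noise source the author envisions.

```python
import numpy as np

rng = np.random.default_rng()

def quantize_weights(w):
    """Quantize real-valued weights to a single sign bit, as in 1-bit LLM schemes."""
    return np.where(w >= 0, 1, -1).astype(np.int8)

def stochastic_neuron(inputs, w_bits, temperature=1.0):
    """Fire (1) or stay silent (0) with probability given by a sigmoid of the
    1-bit weighted sum; the software RNG stands in for physical noise."""
    pre_activation = np.dot(inputs, w_bits) / np.sqrt(len(w_bits))
    p_fire = 1.0 / (1.0 + np.exp(-pre_activation / temperature))
    return int(rng.random() < p_fire)

# Toy usage: one layer of 4 stochastic neurons over a binary input vector.
x = np.array([1, 0, 1, 1], dtype=np.int8)
W = quantize_weights(rng.normal(size=(4, 4)))
print([stochastic_neuron(x, W[:, j]) for j in range(4)])  # varies between runs by design
```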

The proposed system would work by having the AI continually re-prompt itself, generating multiple outputs from a single input until a satisfactory answer is found. Memory would live in RAM, letting the model pull the relevant information and retrain its weights for a specific question on the fly, a process the author argues prevents catastrophic forgetting. The author points to the physics of the hardware itself, such as thermal noise used to probabilistically activate components, as the key to making this viable. They cite Extropic's Thermodynamic Sampling Unit (TSU) as the closest existing effort, while criticizing the industry's continued focus on scaling current architectures as stagnant and wasteful and warning of an impending "AI bubble" without such hardware advances.
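
A rough sketch of that re-prompting loop follows, under heavy assumptions: ToyModel and ToyMemory are hypothetical stand-ins, and since the post does not specify how an answer is judged satisfactory, score() here is a random placeholder.

```python
import random

class ToyMemory:
    """Stand-in for the post's RAM-resident memory store."""
    def __init__(self):
        self.entries = {}

    def lookup(self, query):
        return self.entries.get(query, [])

    def store(self, query, answer):
        self.entries.setdefault(query, []).append(answer)

class ToyModel:
    """Placeholder for a stochastic 1-bit network; every method is a stub."""
    def retrain_on(self, context):
        pass  # a real system would nudge weights toward the retrieved facts

    def generate(self, query):
        return f"candidate-{random.randint(0, 99)}"  # stochastic forward pass

    def score(self, query, candidate):
        return random.random()  # stand-in for a self-evaluation signal

def answer_query(model, memory, query, max_attempts=8, threshold=0.9):
    """Re-prompt loop: retrain on retrieved memory, then sample outputs
    until one clears the quality threshold or attempts run out."""
    model.retrain_on(memory.lookup(query))  # on-the-fly, query-specific update
    best_score, best_answer = -1.0, None
    for _ in range(max_attempts):
        candidate = model.generate(query)
        score = model.score(query, candidate)
        if score > best_score:
            best_score, best_answer = score, candidate
        if score >= threshold:  # satisfactory answer found; stop re-prompting
            break
    memory.store(query, best_answer)  # remember the accepted answer
    return best_answer

print(answer_query(ToyModel(), ToyMemory(), "why is the sky blue?"))
```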

Key Points
  • Architecture combines predictive coding with 1-bit LLMs to replace backpropagation, using probabilistic neuron activation.
  • Requires new stochastic hardware that uses physical noise (e.g., thermal noise) at the transistor level to act as neurons; a simulation sketch follows this list.
  • Aims to prevent catastrophic forgetting by storing memory in RAM for on-the-fly retraining per query.
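
As a rough illustration of the hardware idea above, the following simulation treats a probabilistic bit (p-bit) as a device that fires whenever simulated thermal noise crosses a bias level. The p-bit framing and all parameters are assumptions for illustration, not a description of Extropic's TSU.

```python
import numpy as np

rng = np.random.default_rng()

def pbit_firing_rate(bias, noise_sigma=1.0, n_samples=100_000):
    """Simulate a probabilistic bit: it fires whenever simulated thermal
    noise crosses the bias level, so the firing rate traces out a smooth,
    sigmoid-like function of the bias."""
    noise = rng.normal(0.0, noise_sigma, size=n_samples)  # stand-in for thermal noise
    return (noise < bias).mean()

for bias in (-2.0, 0.0, 2.0):
    print(f"bias={bias:+.1f} -> firing rate ~ {pbit_firing_rate(bias):.3f}")
```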

Why It Matters

Challenges the fundamental training paradigm of modern AI, proposing a path to greater efficiency and adaptability beyond simple model scaling.