Open Source

I scaled a pure Spiking Neural Network (SNN) to 1.088B parameters from scratch. Ran out of budget, but here is what I found

An indie developer successfully trained a 1.088B-parameter SNN from random initialization, achieving 93% neuron sparsity.

Deep Dive

An 18-year-old independent developer has achieved a significant milestone in neuromorphic computing by training a 1.088 billion-parameter Spiking Neural Network (SNN) language model from random initialization. This challenges the prevailing assumption, reflected in papers such as SpikeBERT, that directly training billion-parameter SNNs fails due to vanishing gradients, which has pushed researchers toward ANN-to-SNN conversion or distillation instead. The developer trained the model for 27,000 steps before running out of budget, reaching a loss of 4.4 (if that is token-level cross-entropy in nats, it corresponds to a perplexity of roughly 81) and demonstrating that pure spike-domain training at scale is feasible.
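The write-up does not reproduce the training code inline, but the standard route to training SNNs directly, rather than converting a pretrained ANN, is the surrogate-gradient method: the forward pass emits hard binary spikes, while the backward pass substitutes a smooth stand-in derivative so gradients keep flowing. Below is a minimal PyTorch sketch of a leaky integrate-and-fire (LIF) step; `SpikeFn`, `lif_step`, and the fast-sigmoid surrogate are illustrative assumptions, not the author's actual architecture.

```python
import torch

class SpikeFn(torch.autograd.Function):
    """Heaviside spike in the forward pass, fast-sigmoid surrogate in the backward."""

    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v > 0).float()                    # hard binary spike

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        return grad_out / (1.0 + v.abs()) ** 2    # smooth surrogate keeps gradients alive

def lif_step(v, x, beta=0.9, threshold=1.0):
    """One leaky integrate-and-fire step: decay, integrate input, spike, soft reset."""
    v = beta * v + x
    spikes = SpikeFn.apply(v - threshold)
    return v - spikes * threshold, spikes

# Toy check: gradients reach the input through 10 timesteps of spiking activity.
x = torch.randn(10, 4, requires_grad=True)
v, total = torch.zeros(4), torch.zeros(())
for t in range(10):
    v, s = lif_step(v, x[t])
    total = total + s.sum()
total.backward()
print(x.grad.abs().sum() > 0)                     # tensor(True)
```

Whatever surrogate the author's recipe actually uses, the experiment's claim is that such gradients remain usable at billion-parameter scale, where prior work predicted they would vanish.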

The model exhibited remarkable emergent properties. It maintained approximately 93% activation sparsity (only about 7% of neurons fire per token), which translates to dramatically reduced memory use during inference compared to dense models like GPT. Around step 25,000, it spontaneously began generating structurally correct Russian text, even though Russian was never an explicit target of its training data. And as the architecture scaled past 600 million parameters, the model autonomously shifted 39% of its activation routing into a persistent memory module, in effect learning that memory becomes more valuable at larger scales.
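That sparsity figure is why SNNs map well onto event-driven hardware: with binary spikes, a linear layer collapses to accumulating only the weight columns of the neurons that actually fired. The sketch below shows the arithmetic behind the claim; `spike_sparsity` and `event_driven_linear` are hypothetical helpers, not part of the released code.

```python
import torch

def spike_sparsity(spikes: torch.Tensor) -> float:
    """Fraction of silent neurons for one token; the post reports roughly 0.93."""
    return 1.0 - spikes.float().mean().item()

def event_driven_linear(spikes: torch.Tensor, weight: torch.Tensor) -> torch.Tensor:
    """For binary spikes, weight @ spikes reduces to summing the active columns."""
    active = spikes.nonzero(as_tuple=True)[0]    # indices of neurons that fired
    return weight[:, active].sum(dim=1)          # touch only ~7% of the weights

spikes = (torch.rand(1024) > 0.93).float()       # ~7% firing rate, as reported
weight = torch.randn(256, 1024)
assert torch.allclose(event_driven_linear(spikes, weight), weight @ spikes, atol=1e-5)
print(f"sparsity: {spike_sparsity(spikes):.2f}")  # ~0.93
```

On a GPU this gather-and-sum is not automatically faster than a dense matmul, which is why the post points toward neuromorphic parts like Intel's Loihi, where silent neurons cost almost nothing.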

While the generated text remains "janky" and well short of GPT-2-level fluency, the experiment provides a crucial proof of concept for scalable SNN training. The developer has open-sourced the complete code, architecture details, and a 12GB training checkpoint on GitHub, inviting technical feedback and collaboration, particularly on optimization for neuromorphic hardware such as Intel's Loihi chip. This work opens new pathways toward ultra-low-power AI systems that mimic the efficiency of biological brains.

Key Points
  • Trained a 1.088B-parameter Spiking Neural Network from random initialization, achieving 93% neuron sparsity (only 7% fire per token).
  • Model spontaneously generated correct Russian text at 25k steps and shifted 39% of activations to memory at scale.
  • Open-sourced code and 12GB checkpoint; demonstrates pure SNN training is possible, critical for neuromorphic hardware efficiency.

Why It Matters

Demonstrates that large-scale SNNs can be trained directly, paving the way for ultra-efficient AI on neuromorphic chips with potentially 10x lower power consumption than conventional accelerators.