Research & Papers

Field Machine: novel parallel architecture achieves O(1) inference with 1.7M tok/s

No attention, no recurrence, just cumulative sums—and 1.7M tokens per second on consumer hardware.

Deep Dive

A new architecture called the 'Field Machine' (FM) has been introduced by an independent developer (Reddit user TechnoVoyager) as a radical departure from mainstream sequence models. The core innovation is representing each token as structured 'DNA', projecting it into a high-dimensional field space, modulating it by an analytic position encoding, and then accumulating history with a single cumulative sum. This eliminates the need for attention and recurrent connections, the backbones of transformers and RNNs respectively. Inference is O(1) with respect to sequence length: the state size stays constant however long the sequence grows, and no custom CUDA kernels are required.
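To make the cumulative-sum idea concrete, here is a minimal sketch in plain Python. It is not the author's code: `token_field`, `DIM`, and `StreamingState` are hypothetical names, and the sine/cosine formula is only a placeholder for the "project, then modulate by position" step, which the summary does not specify. The sketch shows only that a prefix sum computed over the whole sequence matches a fixed-size running state updated one token at a time.

```python
import math
from itertools import accumulate

DIM = 4  # hypothetical field width, chosen only for illustration


def token_field(token_id, position):
    """Placeholder for 'project the token, then modulate by position'.

    The real projection is not described in the source; this formula is an
    assumption that just produces a deterministic vector per (token, position).
    """
    return [
        math.sin((token_id + 1) * (k + 1)) * math.cos(position / (k + 1))
        for k in range(DIM)
    ]


def add(a, b):
    """Element-wise sum of two equal-length vectors."""
    return [x + y for x, y in zip(a, b)]


def parallel_fields(tokens):
    """Whole-sequence view: every prefix sum at once (a cumulative sum)."""
    fields = [token_field(t, i) for i, t in enumerate(tokens)]
    return list(accumulate(fields, add))


class StreamingState:
    """Token-by-token view: one fixed-size vector, updated in place of history."""

    def __init__(self):
        self.state = [0.0] * DIM
        self.position = 0

    def step(self, token_id):
        self.state = add(self.state, token_field(token_id, self.position))
        self.position += 1
        return self.state


tokens = [3, 1, 4, 1, 5, 9, 2, 6]
prefix = parallel_fields(tokens)

stream = StreamingState()
streamed = [stream.step(t) for t in tokens]

# Both views agree at every position, yet the streaming state never grows.
agree = all(
    math.isclose(p, s, abs_tol=1e-12)
    for pv, sv in zip(prefix, streamed)
    for p, s in zip(pv, sv)
)
print(agree, len(stream.state))
```

The streaming object holds `DIM` numbers regardless of how many tokens it has seen, which is the sense in which the state size is constant; what the real model does with that state afterwards is outside this sketch.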

The current implementation has 23.54M parameters, uses bf16 precision, and consumes approximately 1.21GB VRAM (plus ~5GB overhead) during training. On consumer hardware, it achieves up to 1.7 million tokens per second. The model has been trained on symbolic music, where REST tokens and beat positions are part of the vocabulary, treating silence and timing as first-class citizens. The developer acknowledges the architecture is not meant to replace transformers, but rather to explore the assumption that 'history can be accumulated into a field' rather than stored explicitly. The code is available on GitHub under the MIT License.
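As a rough sanity check on these figures, the arithmetic below converts the reported numbers into per-parameter and per-token terms. It assumes only that bf16 stores each parameter in 2 bytes; it covers weight storage alone and does not attempt to explain the rest of the reported VRAM use. The variable names are illustrative.

```python
PARAMS = 23_540_000          # 23.54M parameters, as reported
BYTES_PER_BF16 = 2           # bf16 is a 16-bit format
TOKENS_PER_SECOND = 1_700_000  # reported peak throughput

# Weight storage only; activations, gradients and optimizer state are extra.
weights_mb = PARAMS * BYTES_PER_BF16 / 1_000_000

# Average time budget per token at the reported peak throughput.
microseconds_per_token = 1_000_000 / TOKENS_PER_SECOND

print(f"weights: {weights_mb:.2f} MB")
print(f"time per token: {microseconds_per_token:.3f} microseconds")
```

The weights themselves come to about 47 MB, so most of the reported 1.21GB during training must be something other than parameters; the summary does not break that figure down.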

Key Points
  • Field Machine uses cumulative sum (cumsum) over projected token fields, replacing attention and recurrence entirely.
  • Inference is O(1) in sequence length with constant state size; no custom CUDA kernels needed; runs at up to 1.7M tok/s on consumer hardware.
  • Current prototype (23.54M params) trained on symbolic music, treating silence and timing as learnable tokens.

Why It Matters

Challenges the assumption that sequence models need explicit history storage, enabling ultra-fast, constant-memory inference for long sequences.