Research & Papers

Form Follows Function: Recursive Stem Model

New AI architecture achieves 97.5% Sudoku accuracy with just ~1 hour of training on a single A100 GPU.

Deep Dive

Researcher Navid Hakimi has introduced the Recursive Stem Model (RSM), a new architecture designed to overcome key limitations in existing recursive reasoning models like HRM and TRM. These models use small, weight-shared networks to solve complex puzzles through iterative refinement but suffer from slow, unstable training that can bias them toward greedy solutions. RSM changes the training contract by fully detaching the hidden-state history, treating early iterations as "warm-up," and applying loss only at the final step. This approach, combined with a stochastic outer-transition scheme, enables stable training at greater depths, resulting in a >20x speedup and a ~5x reduction in error rate compared to TRM.
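The detach-and-final-loss idea can be sketched in a few lines of PyTorch. This is a toy illustration under assumed names (`Refiner`, `H_WARMUP`), not the paper's actual architecture: early iterations run with no gradient tracking, and the loss is applied only to the final refinement step.

```python
import torch
import torch.nn as nn

# Hypothetical tiny weight-shared refiner; the module, its size, and the
# warm-up depth are illustrative assumptions, not taken from the paper.
class Refiner(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.step = nn.Linear(dim, dim)

    def forward(self, z):
        return torch.tanh(self.step(z))

model = Refiner()
opt = torch.optim.SGD(model.parameters(), lr=0.1)

x = torch.randn(4, 16)       # toy input batch
target = torch.zeros(4, 16)  # toy target

# Warm-up iterations: the hidden-state history is fully detached,
# so no gradients flow through these early refinement steps.
H_WARMUP, z = 8, x
with torch.no_grad():
    for _ in range(H_WARMUP):
        z = model(z)

# Loss is applied only at the final step; backprop sees one iteration.
z = model(z.detach())
loss = ((z - target) ** 2).mean()
loss.backward()
opt.step()
```

Because gradients never traverse the warm-up chain, memory and compute per update stay constant regardless of how deep the warm-up runs, which is what makes training at greater depths tractable.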

The breakthrough capability is test-time scaling. While trained at a shallow recursion depth (H_train ~20), RSM can run inference for orders of magnitude more refinement steps (H_test ~20,000), allowing the model to "think" much longer without any retraining. On the Sudoku-Extreme benchmark, this approach achieved 97.5% exact accuracy after only about one hour of training on a single A100 GPU. On a 30x30 Maze-Hard task, it reached ~80% accuracy in ~40 minutes.
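The mechanics of running a refinement loop far past its training depth can be illustrated with a stand-in update rule. The contractive map below is an assumption for demonstration (not RSM's actual update); the point is that a settling iteration can be run for H_test steps, or simply until it stops changing.

```python
import numpy as np

# Stand-in refinement map: a contractive update that settles to a
# fixed point. The rule and dimensions are illustrative only.
def refine(z, A, b):
    return np.tanh(A @ z + b)

rng = np.random.default_rng(0)
A = 0.3 * rng.standard_normal((8, 8)) / np.sqrt(8)  # scaled to be contractive
b = rng.standard_normal(8)
z = rng.standard_normal(8)

H_TEST = 20_000  # far beyond a shallow "training" depth of ~20
for _ in range(H_TEST):
    z_next = refine(z, A, b)
    if np.linalg.norm(z_next - z) < 1e-9:  # trajectory has settled
        break
    z = z_next
```

In practice the loop exits long before H_TEST because the map converges; the depth budget only caps how long the model is allowed to "think."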

Furthermore, RSM's iterative settling process provides a built-in reliability metric. Non-converging trajectories signal that the model hasn't found a viable solution, acting as a guard against hallucination. Stable fixed points can then be paired with external verifiers for practical correctness checks, offering a path toward more trustworthy AI reasoning systems.
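A convergence check of this kind is straightforward to express; the criterion below (norm of successive differences under a tolerance) is an assumed, generic one, not necessarily the paper's exact test.

```python
import numpy as np

# Sketch of convergence as a reliability signal: iterate a refinement
# map and report whether the trajectory settles to a fixed point.
# max_steps and tol are illustrative assumptions.
def settles(step_fn, z0, max_steps=1000, tol=1e-8):
    z = z0
    for _ in range(max_steps):
        z_next = step_fn(z)
        if np.linalg.norm(z_next - z) < tol:
            return True, z_next   # stable fixed point: candidate answer
        z = z_next
    return False, z               # non-convergence: flag as unreliable

contractive = lambda z: 0.5 * z        # settles (to zero)
divergent = lambda z: 2.0 * z + 1.0    # never settles
ok, _ = settles(contractive, np.ones(4))
bad, _ = settles(divergent, np.ones(4))
```

Only trajectories that return `True` would be passed on to an external verifier, so non-converging runs are rejected before they can surface a hallucinated answer.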

Key Points
  • Trains >20x faster than Tiny Recursive Model (TRM) with ~5x lower error rate.
  • Enables test-time scaling: inference can run for ~20,000 steps vs. 20 trained, allowing unlimited "thinking".
  • Achieves 97.5% accuracy on Sudoku-Extreme with just ~1 hour of training on one A100.

Why It Matters

Paves the way for AI that can reason deeply on complex problems with less compute, offering built-in reliability signals against hallucinations.