Research & Papers

Bicameral Model lets two LMs coordinate via hidden states, boosting arithmetic accuracy about 2.7×

Two frozen language models communicate through a small trained neural interface instead of text, adding about 1% extra parameters.

Deep Dive

A new paper from Cedric Flamant, Udaya Ghai, and Kanna Shimizu introduces the Bicameral Model, which enables two frozen language models to coordinate through a continuous, concurrent hidden-state channel rather than serialized text exchanges. The system uses a small trainable neural interface, consisting of a translation network and a learned suppression gate (together ~1% of combined parameters), that learns a selective communication protocol solely from task loss, without a prescribed format. At every generation step, both models run in lockstep: a primary model drives the task while an auxiliary model operates external tools (calculator, Z3 solver, Python sandbox), and each model conditions on the other's activations.
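The coupling described above can be illustrated with a minimal sketch. It is not the paper's implementation: the additive update, the single linear translation layer, the scalar sigmoid gate, and every name here (GatedInterface, lockstep_step, gate_b) are assumptions made for illustration only. It shows one way a translation network plus a suppression gate could pass a hidden state from one model to another, and how both directions can be applied within the same generation step.

```python
import math
import random


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


class GatedInterface:
    """Hypothetical one-way channel from a sender's hidden state to a receiver's.

    Assumed form (not taken from the paper):
        h_recv' = h_recv + g * (W @ h_send),  g = sigmoid(w . h_send + b)
    Only W, w, and b would be trained; both language models stay frozen.
    """

    def __init__(self, d_send: int, d_recv: int, seed: int = 0):
        rng = random.Random(seed)
        # Translation network: maps the sender's space into the receiver's.
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(d_send)]
                  for _ in range(d_recv)]
        # Suppression gate: a scalar in (0, 1) that can silence the channel.
        self.gate_w = [rng.gauss(0.0, 0.1) for _ in range(d_send)]
        self.gate_b = 0.0

    def gate(self, h_send):
        score = sum(w * h for w, h in zip(self.gate_w, h_send)) + self.gate_b
        return sigmoid(score)

    def __call__(self, h_recv, h_send):
        g = self.gate(h_send)
        message = matvec(self.W, h_send)
        return [r + g * m for r, m in zip(h_recv, message)]


def lockstep_step(h_primary, h_aux, aux_to_primary, primary_to_aux):
    """One coupled step: each side reads the other's pre-update state."""
    new_primary = aux_to_primary(h_primary, h_aux)
    new_aux = primary_to_aux(h_aux, h_primary)
    return new_primary, new_aux


if __name__ == "__main__":
    h_primary = [1.0, -1.0]        # toy 2-dimensional primary hidden state
    h_aux = [0.5, 0.2, -0.3]       # toy 3-dimensional auxiliary hidden state
    a2p = GatedInterface(d_send=3, d_recv=2, seed=1)
    p2a = GatedInterface(d_send=2, d_recv=3, seed=2)
    print(lockstep_step(h_primary, h_aux, a2p, p2a))
```

In this sketch, a strongly negative gate bias drives the gate toward zero and leaves the receiver's state unchanged, which is one plausible reading of how a suppression gate could make communication selective.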

The reported gains are large. On arithmetic, coupling two 0.5B models with a calculator raised accuracy from 36% to 96%, roughly a 2.7× improvement. On ZebraLogic logic grid puzzles, coupling two 0.6B models with a Z3 solver achieved 1.7× the performance of an unaugmented baseline. On mathematical reasoning, the auxiliary model generated problem-specific code from hidden-state signals alone, without ever seeing the problem text. Because the coupling is bidirectional and bypasses text, the models can use tools without losing context to serialization, which points to a parameter-efficient approach to multi-model coordination.

Key Points
  • Uses a trainable neural interface (translation network + learned suppression gate) at only ~1% of combined model parameters.
  • Arithmetic accuracy jumped from 36% to 96% when coupling two 0.5B models with a calculator.
  • On ZebraLogic puzzles, coupling two 0.6B models with a Z3 solver achieved 1.7× the unaugmented baseline.

Why It Matters

The approach enables tool use without text serialization, potentially unlocking more capable AI systems while adding only about 1% trainable parameters on top of two frozen models.