Research & Papers

Reasoning models can't be faithful: new essay challenges LLM inference

r/MachineLearning May 26, 2026

⚡ A Substack essay argues that reasoning traces and answers come from the same operation, which it says makes faithful inference impossible.

Deep Dive

A new essay published on Substack by mauhaq argues that reasoning models, including architectures such as HRM, TRM, GRAM, AlphaProof, and Kona/Aleph, cannot perform faithful inference. The core claim is that the reasoning trace and the final answer are produced by the same generative operation, so the trace aligns with the output by construction and the model's actual reasoning cannot be separated from post-hoc rationalization. The author engages with the empirical critiques of Lanham, Turpin, and Mirzadeh and contrasts them with the architectural lineage of the models mentioned.
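To make the "same generative operation" claim concrete, here is a minimal toy sketch. It is not taken from the essay and does not model any of the architectures named above: `next_token`, `generate`, `split_trace_and_answer`, and the `ANSWER:` marker are all hypothetical names invented for illustration. It only shows the structural point that a single decoding loop emits both the trace tokens and the answer tokens, and that the split between them is applied after the fact.

```python
import random
from typing import List, Tuple

ANSWER_MARKER = "ANSWER:"  # hypothetical delimiter, chosen only for this sketch


def next_token(context: List[str], rng: random.Random) -> str:
    """Toy stand-in for one decoding step (a real model would score a vocabulary)."""
    if ANSWER_MARKER in context:
        # The same step function that wrote the trace now writes the answer.
        return "42"
    if len(context) >= 4:
        return ANSWER_MARKER
    return rng.choice(["first,", "then,", "so,"])


def generate(prompt: List[str], max_new_tokens: int = 8, seed: int = 0) -> List[str]:
    """Run one autoregressive loop; trace and answer tokens share it."""
    rng = random.Random(seed)
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(next_token(tokens, rng))
        # Stop once one token has been emitted after the marker.
        if len(tokens) >= 2 and tokens[-2] == ANSWER_MARKER:
            break
    return tokens[len(prompt):]


def split_trace_and_answer(generated: List[str]) -> Tuple[List[str], List[str]]:
    """Label tokens as 'trace' or 'answer' purely by position around the marker."""
    idx = generated.index(ANSWER_MARKER)
    return generated[:idx], generated[idx + 1:]


if __name__ == "__main__":
    out = generate(["Q:"])
    trace, answer = split_trace_and_answer(out)
    print("trace:", trace)
    print("answer:", answer)
```

Nothing inside `generate` distinguishes a reasoning step from an answer step. The distinction exists only in `split_trace_and_answer`, which is the kind of after-the-fact labeling the essay's argument turns on.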

The essay introduces a constraint-versus-influence framing to analyze how reasoning traces function. It posits that current reasoning models are designed to produce coherent narratives rather than to represent their internal inference steps faithfully. This challenges the assumption that chain-of-thought and similar reasoning traces provide transparent, interpretable insight into model decision-making. For AI developers and researchers, the implication is that improving model faithfulness may require fundamentally different architectural approaches, not just more verbose reasoning traces.

Key Points

Essay claims reasoning models produce traces and answers from the same operation, making faithful inference impossible.
Engages with architectural lineages including HRM, TRM, GRAM, AlphaProof, and Kona/Aleph, contrasting them with empirical critiques.
Introduces a constraint-vs-influence framing to analyze how traces are generated and why they cannot be faithful.

Why It Matters

The essay challenges the reliability of reasoning traces as an interpretability tool, pushing AI developers to rethink how they approach model transparency.
