Research & Papers

Researchers' SCA framework pinpoints step-level reasoning errors in black-box LLMs

SCA boosts self-correction success by up to 13.5% by exposing where LLM reasoning fails.

Deep Dive

Large language models (LLMs) excel at generating step-by-step solutions for reasoning tasks, but pinpointing exactly where a chain of thought goes wrong remains a challenge, especially for black-box (closed-source) models like GPT-4 or Claude. Existing confidence estimation methods either assess only the final answer or require access to internal probabilities, which proprietary APIs don't expose. A new paper from Xiaoou Liu and colleagues introduces Stepwise Confidence Attribution (SCA), a framework that assigns a confidence score to each individual reasoning step using only the model's generated text. SCA builds on the Information Bottleneck principle: steps that align with consensus patterns across multiple correct solutions receive high confidence, while steps that deviate from those patterns are flagged as suspicious. Because the approach needs no model internals, it can be applied to any commercial LLM.
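To make the consensus idea concrete, here is a deliberately simplified Python sketch. It is not the paper's scoring method: it swaps in plain token-overlap (Jaccard) similarity for the Information Bottleneck objective, and the function names (tokens, jaccard, step_confidences) are hypothetical, chosen only for illustration. It shows how a step that departs from what other solutions agree on can end up with a lower score, using nothing but generated text.

```python
import re


def tokens(step: str) -> set[str]:
    """Lowercase a reasoning step and split it into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", step.lower()))


def jaccard(a: set[str], b: set[str]) -> float:
    """Token-overlap similarity in [0, 1]; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def step_confidences(target: list[str], references: list[list[str]]) -> list[float]:
    """Score each step of `target` by its agreement with reference solutions.

    ASSUMPTION: this is a toy stand-in for consensus scoring, not SCA itself.
    For every reference solution we take the best-matching step, then average
    those best matches across all references.
    """
    scores = []
    for step in target:
        step_tokens = tokens(step)
        best_per_reference = [
            max((jaccard(step_tokens, tokens(ref_step)) for ref_step in ref), default=0.0)
            for ref in references
        ]
        if best_per_reference:
            scores.append(sum(best_per_reference) / len(best_per_reference))
        else:
            scores.append(0.0)  # no references means no evidence of consensus
    return scores


# Hypothetical example: two reference solutions agree, the target slips in step 2.
references = [
    ["Tom has 3 apples", "3 + 2 = 5"],
    ["Tom has 3 apples", "3 + 2 = 5"],
]
target = ["Tom has 3 apples", "3 + 2 = 6"]
scores = step_confidences(target, references)
lowest = scores.index(min(scores)) + 1
print(f"Least consistent step: {lowest}")
```

Token overlap is a crude proxy and would miss paraphrases or equivalent derivations. The point of the sketch is only that step-level scores can be computed from text alone.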

SCA comes in two variants: NIBS (non-parametric Information Bottleneck scoring) measures consistency without needing graph structures, while GIBS applies a differentiable mask over subgraphs to capture logical variability in reasoning. In extensive tests on mathematical reasoning (e.g., GSM8K) and multi-hop question answering (e.g., HotpotQA), SCA reliably identified steps that were highly correlated with incorrect answers. More importantly, feeding these step-level confidence scores back into the model to guide self-correction raised the success rate of fixing errors by up to 13.5% compared with simply telling the model its final answer was wrong. The work, accepted at ICML 2026, offers a practical way to make black-box LLMs more transparent and self-correcting without needing proprietary model access.
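The difference between answer-level and step-level feedback can be sketched as a prompt-building helper. Everything below is an assumption made for illustration: the function names (flag_steps, build_feedback_prompt), the 0.6 threshold, and the prompt wording are hypothetical and not taken from the paper. The confidence values are expected to come from some step-level scorer.

```python
def flag_steps(confidences: list[float], threshold: float = 0.6) -> list[int]:
    """Return 1-based indices of steps whose confidence is below `threshold`.

    ASSUMPTION: the 0.6 default is arbitrary, picked only for this example.
    """
    return [i for i, conf in enumerate(confidences, start=1) if conf < threshold]


def build_feedback_prompt(
    question: str,
    steps: list[str],
    confidences: list[float],
    threshold: float = 0.6,
) -> str:
    """Assemble a self-correction prompt that points at low-confidence steps.

    ASSUMPTION: the wording is hypothetical. An answer-level baseline would
    say only "your final answer was wrong" and name no step.
    """
    if len(steps) != len(confidences):
        raise ValueError("need exactly one confidence score per step")
    flagged = set(flag_steps(confidences, threshold))
    lines = [f"Question: {question}", "Your previous reasoning:"]
    for i, (step, conf) in enumerate(zip(steps, confidences), start=1):
        marker = "  <-- low confidence" if i in flagged else ""
        lines.append(f"Step {i} (confidence {conf:.2f}): {step}{marker}")
    if flagged:
        lines.append("Re-check the steps marked low confidence first, then give a corrected solution.")
    else:
        lines.append("No single step stands out; re-check the whole solution.")
    return "\n".join(lines)


# Hypothetical usage with made-up scores for a two-step solution.
prompt = build_feedback_prompt(
    "Tom has 3 apples and buys 2 more. How many does he have?",
    ["Tom has 3 apples", "3 + 2 = 6"],
    [1.0, 0.5],
)
print(prompt)
```

The resulting string would be sent back to the model as the correction request, which narrows its attention to the suspect step instead of asking it to redo the entire solution.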

Key Points
  • SCA works on any closed-source LLM by analyzing only the generated reasoning traces; no internal weights or logits are required.
  • Uses the Information Bottleneck principle to separate high-confidence consensus steps from low-confidence deviations.
  • Applied to self-correction, SCA's step-level feedback improves error-fixing success by up to 13.5% over answer-level feedback.

Why It Matters

Makes black-box LLMs more debuggable, enabling reliable multi-step reasoning and targeted self-correction in production systems.