AI Expert Argues 'Next Token Prediction' Is a Misleading Term for LLMs

LessWrong AI May 17, 2026

Calling LLMs 'glorified autocomplete' ignores how they learn deep structure through training.

Deep Dive

The author argues that describing LLMs as 'next token predictors' is misleading and inaccurate. Pre-training does use next-token prediction, but at inference time the model outputs a probability distribution over possible next tokens, and one token is sampled at random from it rather than guessed deterministically. To predict well during training, the model is forced to learn language, grammar, and content, including math and narrative understanding, which challenges the common dismissal of LLM cognition.
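
The difference between deterministic guessing and sampling can be sketched in a few lines of Python. This is a toy illustration, not the author's code or any real model's decoding loop: the vocabulary, the logits, and the helper names `softmax`, `greedy_pick`, and `sample_pick` are all assumptions made up for the example.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def greedy_pick(vocab, probs):
    """Deterministic choice: always return the single most likely token."""
    best = max(range(len(probs)), key=probs.__getitem__)
    return vocab[best]

def sample_pick(vocab, probs, rng):
    """Stochastic choice: draw one token in proportion to its probability."""
    return rng.choices(vocab, weights=probs, k=1)[0]

# Hypothetical four-token vocabulary and made-up logits.
vocab = ["eighteen", "seventeen", "twenty", "banana"]
logits = [3.0, 1.5, 1.0, -2.0]

probs = softmax(logits)
rng = random.Random(0)  # seeded only so repeated runs match

print("greedy: ", greedy_pick(vocab, probs))
print("sampled:", [sample_pick(vocab, probs, rng) for _ in range(5)])
```

The greedy picker returns the same token on every call, while the sampler can return any token with nonzero probability, which is the behavior the summary attributes to inference.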

Key Points

Pre‑training uses next‑token prediction on trillions of token pairs, but inference involves sampling from probability distributions, not deterministic guessing.
The training regime forces models to learn grammar, facts, and narrative logic—e.g., predicting 'eighteen' from a math textbook or a murderer's name from a mystery.
Calling LLMs 'next token predictors' fuels philosophical debates about cognition, but the author argues it misrepresents how models actually generate text.
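
Why the training objective rewards knowing the content can be illustrated with the standard cross-entropy loss for next-token prediction, which the summary does not name explicitly. The sketch below is a hedged toy: the prompt, the two probability tables, and the function `next_token_loss` are hypothetical and stand in for a real model's output.

```python
import math

def next_token_loss(predicted_probs, actual_token):
    """Cross-entropy at one position: -log of the probability given to the true token."""
    return -math.log(predicted_probs[actual_token])

# Hypothetical predictions for the token that follows "six times three equals".
# A model that only knows common words spreads its probability over filler.
vague_model = {"eighteen": 0.05, "twelve": 0.05, "the": 0.60, "a": 0.30}
# A model that has learned the arithmetic concentrates it on the right answer.
informed_model = {"eighteen": 0.90, "twelve": 0.05, "the": 0.03, "a": 0.02}

loss_vague = next_token_loss(vague_model, "eighteen")
loss_informed = next_token_loss(informed_model, "eighteen")

print("vague model loss:   ", loss_vague)
print("informed model loss:", loss_informed)
```

The informed model receives the lower loss, so gradient descent on this objective pushes a model toward whatever knowledge makes the true next token more probable, whether that is arithmetic or the logic of a mystery plot.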

Why It Matters

Reframes the debate on AI cognition, urging professionals to move beyond simplistic 'autocomplete' narratives.
