AI Safety

How to not do decision theory backwards

Argues against using 'verdict-level intuitions' to justify decisions, calling it a flawed methodology.

Deep Dive

In a viral post on LessWrong titled 'How to not do decision theory backwards,' Anthony DiGiovanni critiques a fundamental methodological error he sees in rationalist and philosophical circles. The error is starting with a strong, bottom-line intuition about what action to take (a 'verdict-level intuition') and then using that verdict to justify claims about how decisions should be made. He illustrates this with an exaggerated example of a friend who insists on playing a casino game simply because it 'seemed super intuitive,' then tries to justify it by adjusting probabilities to fit the desired outcome.
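The casino friend's 'backwards' move can be made concrete with a toy calculation (the stake and payout figures here are illustrative assumptions, not from the post): rather than estimating the win probability from evidence and then deciding, the friend starts from the verdict that playing is right and solves for whatever probability would make it so.

```python
# Toy illustration (assumed numbers) of justifying a gamble 'backwards':
# instead of estimating the win probability from the game's actual odds,
# one solves for the probability that makes the desired verdict come out.

def breakeven_probability(payout: float, stake: float) -> float:
    """Win probability at which the game's expected value is zero.
    EV = p * payout - (1 - p) * stake = 0  =>  p = stake / (stake + payout)."""
    return stake / (stake + payout)

stake = 100.0   # amount wagered on a loss
payout = 50.0   # net winnings on a win

p_needed = breakeven_probability(payout, stake)
# Forward reasoning: estimate p independently, then check p > p_needed.
# Backwards reasoning: assert p > p_needed because playing 'feels' right.
print(round(p_needed, 3))  # 0.667
```

The design point is the direction of inference: `p_needed` is a fact about the game, while the friend's error is treating the conclusion as fixed and adjusting the probability estimate to reach it.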

DiGiovanni argues this process is backwards. A verdict that 'I should choose A' is a claim that reasons exist for A; therefore, the justification must come from discovering and evaluating those reasons on their own merits, not from the brute intuition itself. He distinguishes between using intuitions as predictors of future reasoned conclusions and using them as direct normative expressions, advocating for the former. The methodology he proposes uses verdict-level intuitions only as clues to help discover potential reasons, which are then scrutinized independently.

The post's impact stems from its application to classic rationalist puzzles. DiGiovanni suggests that strong intuitions in scenarios like Pascal's Mugging (where a tiny probability of an astronomical payoff creates a dilemma) or a Prisoner's Dilemma with a perfect copy might lead people to adopt or reject decision theories (like certain forms of utilitarianism or causal decision theory) based on backward reasoning from a desired intuitive outcome. His framework questions whether such intuitions provide reliable evidence in contexts with poor feedback, which is often the case in these abstract philosophical problems.
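The Pascal's Mugging tension above can be sketched numerically (the probability and payoff values are illustrative assumptions, not from the post): naive expected-value maximization endorses paying the mugger, a verdict many find intuitively absurd, which is exactly the kind of strong intuition that tempts backward reasoning about decision theories.

```python
# Minimal sketch (assumed numbers) of the Pascal's Mugging dilemma:
# a tiny probability of an astronomical payoff dominates a small sure cost
# under naive expected-value maximization.

def expected_value(prob: float, payoff: float) -> float:
    """Expected utility of a gamble: probability times payoff."""
    return prob * payoff

prob_mugger_honest = 1e-12    # one-in-a-trillion chance the offer is real
astronomical_payoff = 1e20    # utility units, chosen to swamp the tiny probability
cost_of_paying = 5.0          # sure utility lost by handing over the wallet

ev_pay = expected_value(prob_mugger_honest, astronomical_payoff) - cost_of_paying
ev_refuse = 0.0

# EV maximization says pay (1e8 - 5 > 0), however implausible the offer --
# the counterintuitive verdict at the heart of the dilemma.
print(ev_pay > ev_refuse)  # True
```

On DiGiovanni's framework, the question is not whether to override this arithmetic to match the intuition, but whether the intuition points to an independent, scrutinizable reason (e.g. about unbounded utilities or unreliable testimony) that survives on its own merits.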

Key Points
  • Critiques 'backwards' methodology: justifying decision rules by appeal to a desired action verdict, rather than deriving verdicts from independently scrutinized reasons.
  • Distinguishes 'verdict-level intuitions' from valid reasons; argues intuitions should be predictors, not justifications.
  • Challenges reasoning in high-stakes rationalist dilemmas like Pascal's Mugging and cooperative problems with agent copies.

Why It Matters

Forces a methodological rethink in AI alignment, philosophy, and rationalist communities about how foundational decisions are justified.