Aligning to Virtues
A radical new proposal argues we've been aligning AI to the wrong targets.
Deep Dive
A new post argues that current AI alignment targets (consequentialist values, deontological rules, or pure obedience) all lead to dangerous power struggles. The author proposes an alternative: aligning AI to common-sense virtues like integrity, kindness, and honor, much as we would want human leaders to behave. This virtue-ethics approach is presented as a more robust and desirable long-term answer to the core safety problem, with more details promised in future posts.
Why It Matters
If the argument holds, it could fundamentally reshape how labs build safe AI, moving beyond paradigms the author regards as flawed.