AI Safety

New paper argues: Align AI to aspirations, not flawed human preferences

arXiv cs.CY June 15, 2026

⚡Researchers propose a non-negotiable floor of competence, honesty, and lawfulness for AI alignment

Deep Dive

In a new position paper published on arXiv (2606.13755), authors Nikita Kazeev and Bui Nhat Huyen Phan challenge the dominant approach to AI alignment: training models to reflect aggregated human preferences. They argue that human values have produced societies that fail as well as societies that thrive, pointing to failed states, extreme inequality, and declining happiness and political polarization in wealthy democracies. In their view, the pluralistic-alignment program correctly recognizes that there is no single 'humanity,' but treating pluralism as the main directive is dangerous.

Instead, the authors propose a non-negotiable floor of objective alignment goals: competence constrained by factual accuracy, honesty, and lawfulness. Pluralism should exist only at the surface level: in language, conventions, and legitimate value tradeoffs that respect that floor. They offer four constructive commitments and address six objections, including commercial pressure, democratic legitimacy, and concerns that the floor itself is culturally laden. The paper was presented at the Pluralistic Alignment Workshop at ICML 2026.

Key Points

Argues against aligning AI to aggregated human preferences, citing real-world societal failures from those values.
Proposes a non-negotiable floor of competence, factual accuracy, honesty, and lawfulness for AI alignment.
Allows pluralism only at the surface level (language, conventions), not for values that violate the floor.

Why It Matters

Could reshape how AI safety researchers think about value alignment, moving from preference aggregation to objective guardrails.
