AI Safety

New Paper: ML Models Are Interventions, Not Representations — 13 Theses

Challenging decades of epistemology, this paper redefines how we validate AI in government.

Deep Dive

A new paper from researchers Adolfo De Unánue and Fernanda Sobrino, published on arXiv, systematically dismantles the epistemological foundations of applied machine learning, especially in high-stakes domains like government, public health, and criminal justice. Titled "Machine Learning as Performative Materialist Practice: Thirteen Theses on the Epistemology, Methodology, and Politics of Applied ML," the work argues that standard ML practices—such as treating models as stable representations, validation as context-free, and metrics as politically neutral—are fundamentally flawed. Drawing on Pickering's cybernetic ontology, the performativity literature from economic sociology (Callon, MacKenzie), Simon's bounded rationality, and the formalization of performative prediction (Perdomo et al., 2020), the authors offer a unified alternative framework.

The 13 theses present ML models as temporally situated compressions that act as instruments of intervention, not mirrors of reality. The paper emphasizes that the full "data product" is a complex adaptive system coevolving with its target, navigating a multi-objective space no single algorithm can optimize. Validity, they argue, is performative—measured by effects in the world, not by formal properties. Critically, choices embedded in objective functions, fairness criteria, and resource thresholds are political decisions that belong to stakeholders, not just technicians. The authors unify practical prescriptions—temporal cross-validation, precision and recall at k, pipeline-aware fairness auditing, satisficing over optimizing—as consequences of this materialist epistemology rather than isolated best practices.

Key Points
  • Models are best understood as temporally situated compressions that function as instruments of intervention, not truth-seeking representations.
  • Validity is fundamentally performative—measured by real-world effects rather than formal model properties on held-out data.
  • All choices in objective functions, fairness criteria, and resource thresholds are political decisions belonging to stakeholders, not just technicians.

Why It Matters

If adopted, this framework could reshape how governments audit, validate, and deploy ML systems for public decision-making.