AI Safety

Early Academic Capital as the Causal Origin of Dropout in Constrained Educational Systems -- Evidence from Longitudinal Data and Structural Causal Models

Dropping out may start long before any course failure.

Deep Dive

A new paper by Hugo Roger Paz on arXiv (cs.CY) applies causal methods from computational social science to trace the origins of dropout in constrained engineering curricula. Analyzing longitudinal data on 16,868 students who reached their second active term, the study defines the treatment as low early academic capital: passing at most one subject by the end of the second term. Using G-estimation of structural nested mean models and marginal structural models fitted with inverse probability of treatment weighting (IPTW), the author reports a robust causal effect: low early progress raises the three-year dropout probability by 25.3 percentage points (G-estimation) and 27.4 percentage points (IPTW). That is roughly double the estimated effect of a later event, first-time repetition of a gateway course (12.7 pp).
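To make the weighting idea concrete, here is a minimal sketch of IPTW on synthetic data. It is not the paper's model: the study fits longitudinal marginal structural models, whereas this sketch uses a single time point, one hypothetical binary confounder (`weak_prep`), and invented probabilities. The function names and the built-in effect of 0.20 are illustrative assumptions only.

```python
import random


def simulate(n=20000, seed=0):
    """Build a synthetic cohort. Every number here is invented for illustration."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        weak_prep = rng.random() < 0.4            # hypothetical confounder
        p_treat = 0.6 if weak_prep else 0.2       # confounder makes treatment likelier
        low_capital = rng.random() < p_treat      # treatment: low early progress
        # Assumed true effect of treatment on dropout: +0.20
        p_drop = 0.15 + 0.20 * low_capital + 0.20 * weak_prep
        dropout = rng.random() < p_drop
        rows.append((weak_prep, low_capital, dropout))
    return rows


def naive_difference(rows):
    """Raw difference in dropout rates, ignoring confounding."""
    treated = [d for _, a, d in rows if a]
    control = [d for _, a, d in rows if not a]
    return sum(treated) / len(treated) - sum(control) / len(control)


def iptw_difference(rows):
    """Weighted difference in dropout rates using inverse probability weights."""
    # Estimate the propensity score P(treated | confounder) within each stratum.
    propensity = {}
    for level in (False, True):
        stratum = [a for x, a, _ in rows if x == level]
        propensity[level] = sum(stratum) / len(stratum)

    num1 = den1 = num0 = den0 = 0.0
    for x, a, d in rows:
        if a:
            w = 1.0 / propensity[x]               # weight for treated students
            num1 += w * d
            den1 += w
        else:
            w = 1.0 / (1.0 - propensity[x])       # weight for untreated students
            num0 += w * d
            den0 += w
    # Normalized (Hajek-style) weighted means for each arm.
    return num1 / den1 - num0 / den0


if __name__ == "__main__":
    cohort = simulate()
    print("naive:", round(naive_difference(cohort), 3))
    print("iptw: ", round(iptw_difference(cohort), 3))
```

In this toy setup the naive comparison overstates the effect because students with weak preparation are both more likely to be treated and more likely to drop out. Reweighting by the inverse of the estimated propensity score balances the confounder across the two groups, so the weighted difference lands close to the effect built into the simulation.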

The findings challenge the conventional view that isolated academic failures trigger dropout. Instead, Paz argues that dropout originates in an early misalignment between a student's progress and the temporal constraints the system imposes. According to the study, its leakage-free panel design and causal modeling framework provide strong evidence for early-stage intervention, suggesting that universities should shift resources from remedial support after failures to proactive scaffolding in the first two terms. For engineers and AI researchers, the work shows how causal inference methods can uncover hidden structural dynamics in education systems, with direct implications for the design of adaptive learning pathways and early warning systems.
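As a rough illustration of how an early warning system might encode the treatment definition above (at most one subject passed by the end of the second term), consider the sketch below. The function name, the input layout (a list of subjects passed per active term), and the choice to return `None` for students with fewer than two observed terms are all assumptions made for this example, not details from the paper.

```python
from typing import Optional, Sequence


def has_low_early_capital(
    passed_per_term: Sequence[int],
    max_passed: int = 1,
    early_terms: int = 2,
) -> Optional[bool]:
    """Flag low early academic capital from per-term pass counts.

    passed_per_term: hypothetical layout, one count per active term, in order.
    Returns None when the student has not yet completed `early_terms` terms,
    because the indicator is only defined once the second term is observed.
    """
    if len(passed_per_term) < early_terms:
        return None
    # Total subjects passed across the early window only; later terms are ignored.
    return sum(passed_per_term[:early_terms]) <= max_passed


if __name__ == "__main__":
    print(has_low_early_capital([0, 1]))      # one subject passed in two terms
    print(has_low_early_capital([1, 1]))      # two subjects passed in two terms
    print(has_low_early_capital([0]))         # second term not yet observed
```

A flag like this only marks who falls under the treatment definition. It does not estimate anyone's individual dropout risk, and the study's effect estimates describe averages across the cohort.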

Key Points
  • Low early academic capital (≤1 subject passed by the end of the second term) increases three-year dropout probability by 25.3 pp (G-estimation) and 27.4 pp (IPTW).
  • The effect is roughly twice the estimated effect of later first-time gateway-course repetition (12.7 pp).
  • The study applies causal inference methods (G-estimation, IPTW) to longitudinal records of 16,868 students.

Why It Matters

Causal analysis of early-term progress could let universities identify at-risk students in their first two terms, before course failures accumulate, rather than after.