Research & Papers

[P] preflight, a pre-training validator for PyTorch I built after losing 3 days to label leakage

Open-source CLI catches label leakage, dead gradients, and other silent failures before they waste days of debugging.

Deep Dive

A developer's costly debugging experience has led to a new open-source tool designed to save machine learning engineers from silent training failures. After wasting three days discovering that a model was learning nothing due to label leakage between the training and validation sets, its creator, Rusheel, built 'preflight', a PyTorch validator that runs 10 critical checks before any GPU time is consumed. The tool scans for fatal issues such as data contamination, problematic gradients, memory overflows, and configuration errors that cause models to fail silently.
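The label-leakage check that motivated the tool boils down to detecting overlap between the training and validation splits. A minimal sketch of that idea (the function name and hashing approach here are illustrative assumptions, not preflight's actual implementation):

```python
import hashlib

def check_label_leakage(train_samples, val_samples):
    """Return indices of validation samples that also appear in training.

    Each sample is hashed via its repr() so the comparison works for
    arbitrary (input, label) pairs, not only hashable types.
    """
    def digest(sample):
        return hashlib.sha256(repr(sample).encode()).hexdigest()

    train_hashes = {digest(s) for s in train_samples}
    return [i for i, s in enumerate(val_samples) if digest(s) in train_hashes]

# A contaminated split: the first validation sample is also in training.
train = [([0.1, 0.2], 1), ([0.3, 0.4], 0)]
val = [([0.3, 0.4], 0), ([0.5, 0.6], 1)]
print(check_label_leakage(train, val))  # [0]
```

A model evaluated on leaked samples can post excellent validation metrics while learning nothing generalizable, which is why this failure mode is so hard to spot from the training loop alone.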

Available as a pip-installable CLI, preflight operates on three severity tiers (fatal/warn/info) and integrates directly into CI/CD pipelines by exiting with code 1 on critical failures. It specifically targets the painful gap between code execution and successful model training, checking for NaNs, label leakage, incorrect channel ordering, dead gradients, class imbalance, and VRAM requirements. The developer emphasizes it's not meant to replace frameworks like pytest or Deepchecks, but to serve as an essential pre-training safeguard.
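The severity-tier and exit-code behavior described above follows a common CI-gating pattern. The sketch below shows that pattern in isolation; the `Finding` and `run_checks` names are assumptions for illustration, not preflight's real API:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    severity: str  # one of "fatal", "warn", "info"
    message: str

def run_checks(checks):
    """Run every check, print its findings, and return a process exit code."""
    findings = [f for check in checks for f in check()]
    for f in findings:
        print(f"[{f.severity.upper()}] {f.message}")
    # Returning 1 on any fatal finding lets CI halt before GPU time is spent.
    return 1 if any(f.severity == "fatal" for f in findings) else 0

# Example check: flag NaN values in a batch of inputs.
def nan_check():
    import math
    batch = [0.5, float("nan"), 1.2]
    if any(math.isnan(x) for x in batch):
        return [Finding("fatal", "NaN values found in input batch")]
    return []

code = run_checks([nan_check])
print(code)  # 1
```

In a pipeline, `sys.exit(code)` on that return value is what makes a fatal finding block the downstream training job automatically.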

The tool is in early development (v0.1.1), and the creator is actively seeking community feedback on which checks to prioritize, along with contributions. Implementing a new check requires only a passing test, a failing test, and a fix hint, which lowers the barrier for community expansion. By catching these issues before expensive training runs begin, preflight aims to prevent the all-too-common scenario in which a model appears to run perfectly but learns absolutely nothing.
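The contribution recipe (passing test, failing test, fix hint) might look like the following scaffold. Everything here, including the `Check` structure and the VRAM-budget example, is a hypothetical sketch of the pattern rather than preflight's actual contributor interface:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Check:
    name: str
    run: Callable[[dict], bool]  # True means the run config passes this check
    fix_hint: str

# Example contribution: flag a batch size whose activations would
# overflow the VRAM budget recorded in a (hypothetical) run config.
def fits_in_vram(config):
    needed = config["batch_size"] * config["bytes_per_sample"]
    return needed <= config["vram_bytes"]

vram_check = Check(
    name="vram-overflow",
    run=fits_in_vram,
    fix_hint="Reduce batch_size or enable gradient accumulation.",
)

# The two required test cases, written as plain assertions:
assert vram_check.run({"batch_size": 32,
                       "bytes_per_sample": 1_000_000,
                       "vram_bytes": 8_000_000_000})       # passing case
assert not vram_check.run({"batch_size": 32,
                           "bytes_per_sample": 1_000_000_000,
                           "vram_bytes": 8_000_000_000})   # failing case
```

Pairing each check with one config that passes and one that fails keeps contributions small and self-verifying, which is plausibly why the author chose this structure.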

Key Points
  • Catches 10 types of silent training failures including label leakage, dead gradients, and VRAM overflows
  • Exits with code 1 on fatal errors to block CI/CD pipelines automatically
  • Open-source and extensible—each new check requires just a passing test, failing test, and fix hint

Why It Matters

Prevents wasted GPU compute and days of debugging by catching silent training failures before they cost time and money.