Research & Papers

[P] Run Karpathy's Autoresearch for $0.44 instead of $24 — Open-source parallel evolution pipeline on SageMaker Spot

Parallel evolution pipeline cuts AI research costs 18x, running 25 experiments for just $0.44 total.

Deep Dive

A developer has open-sourced a parallel evolution pipeline that dramatically reduces the cost of running Andrej Karpathy's autoresearch framework, an AI agent system that autonomously modifies training code and runs experiments overnight. Built on AWS SageMaker Managed Spot Training, the pipeline executes four experiments simultaneously using a "Hurry Up and Get Idle" (HUGI) pattern: a GPU instance spins up for roughly five minutes per experiment, then terminates immediately, so no idle GPU time is ever billed. The pipeline supports several GPU types, including H100, L40S, and A10G, automatically detecting which Spot capacity is available and falling back to another instance type when needed.
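The HUGI idea maps naturally onto SageMaker's Spot training options: cap the job's runtime, bound how long to wait for Spot capacity, and let the instance terminate the moment training ends. Below is a minimal configuration sketch using the SageMaker Python SDK; the image URI, role ARN, entry hyperparameters, and fallback comment are illustrative assumptions, not the project's actual code.

```python
from sagemaker.estimator import Estimator

# Illustrative sketch only -- image URI and role ARN are placeholders.
estimator = Estimator(
    image_uri="<training-image-uri>",
    role="<sagemaker-execution-role-arn>",
    instance_count=1,
    instance_type="ml.g6e.2xlarge",  # L40S Spot; a pipeline could retry with A10G if capacity is short
    use_spot_instances=True,         # bill at the Spot rate instead of on-demand
    max_run=10 * 60,                 # hard runtime cap: ~5 min of training plus margin
    max_wait=30 * 60,                # max time to wait for Spot capacity (must be >= max_run)
    hyperparameters={"embedding_lr": 0.02},  # hypothetical knob for illustration
)

# Fire-and-forget: the caller doesn't block, and SageMaker tears the
# instance down as soon as the job finishes, so idle time is never billed.
estimator.fit(wait=False)
```

Because `max_run` bounds each job at minutes rather than hours, the worst-case cost per experiment is fixed up front, which is what makes running dozens of experiments for under a dollar plausible.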

In a real-world test, the pipeline ran 25 autonomous ML experiments across 5 generations for just $0.44 using L40S Spot instances (ml.g6e.2xlarge in us-east-1), versus approximately $24 for the same work on an H100. It also finished in 2.3x less wall-clock time (3.5 hours vs. 8 hours) while identifying EMBEDDING_LR as the most sensitive hyperparameter, improving validation bits-per-byte from 1.0656 to 1.0643. The write-up also shares practical lessons: Spot capacity varies significantly across regions, and architecture rankings discovered on cheap L40S GPUs ($0.04/experiment) transfer reliably to expensive H100s for production training.
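The 25-experiment run described above (5 generations, 4-way parallelism, selecting on validation bits-per-byte) can be sketched as a plain generational loop. This is a toy simulation under stated assumptions, not the project's code: `run_experiment` stands in for a 5-minute Spot job and scores candidates with a quadratic bowl around a hypothetical EMBEDDING_LR optimum, mimicking the sensitivity the post reports.

```python
import random
from concurrent.futures import ThreadPoolExecutor

POOL_SIZE = 4           # max concurrent Spot jobs, as in the post
GENERATIONS = 5         # 5 generations x 5 candidates = 25 experiments
CANDIDATES_PER_GEN = 5

def run_experiment(params):
    """Stand-in for one short Spot training job.

    Returns a mock validation bits-per-byte (lower is better); the
    quadratic bowl around embedding_lr = 0.02 is a made-up stand-in
    for the real training run."""
    return 1.0643 + 40 * (params["embedding_lr"] - 0.02) ** 2

def mutate(params, rng):
    """Perturb the parent's learning rate to form a child candidate."""
    child = dict(params)
    child["embedding_lr"] *= rng.uniform(0.8, 1.25)
    return child

def evolve():
    rng = random.Random(0)
    best = {"embedding_lr": 0.05}        # deliberately off-optimum start
    best_bpb = run_experiment(best)
    with ThreadPoolExecutor(max_workers=POOL_SIZE) as pool:
        for _ in range(GENERATIONS):
            children = [mutate(best, rng) for _ in range(CANDIDATES_PER_GEN)]
            # Launch up to POOL_SIZE experiments at once, like parallel Spot jobs.
            scores = list(pool.map(run_experiment, children))
            i = min(range(CANDIDATES_PER_GEN), key=scores.__getitem__)
            if scores[i] < best_bpb:     # keep the generation's winner
                best, best_bpb = children[i], scores[i]
    return best, best_bpb

best_params, best_bpb = evolve()
```

In the real pipeline the thread pool would submit SageMaker jobs rather than call a local function, but the selection logic, generation count, and parallelism are the same shape.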

The entire project was developed through conversational AI coding with Claude Code in a single 13-hour session and is documented as an 8-chapter "vibe coding" tutorial. The approach puts sophisticated AI research within reach of developers who lack expensive H100 hardware, democratizing autonomous machine learning experimentation through cloud cost-optimization techniques.

Key Points
  • Runs 25 autonomous ML experiments for $0.44 total using SageMaker Spot instances (18x cheaper than H100)
  • Runs 4 experiments in parallel for a 2.3x wall-clock speedup using the HUGI pattern (zero idle GPU cost)
  • Includes 8-chapter tutorial built via Claude Code in 13 hours, making AI research accessible

Why It Matters

Democratizes AI research by making autonomous experimentation affordable and accessible without expensive hardware.