Developer Tools

Five Fatal Assumptions: Why T-Shirt Sizing Systematically Fails for AI Projects

A new paper reveals why traditional agile estimation breaks down for LLM and multi-agent system development.

Deep Dive

A new research paper from Raja Soundaramourty, Ozkan Kilic, and Ramu Chenchaiah, published on arXiv, delivers a critical analysis of why traditional agile estimation methods are fundamentally flawed for AI projects. The paper, 'Five Fatal Assumptions: Why T-Shirt Sizing Systematically Fails for AI Projects,' argues that the simplicity of T-shirt sizing (XS, S, M, L, XL) leads to systematic failure when applied to initiatives involving large language models (LLMs) and multi-agent systems.

The argument rests on five assumptions that underpin traditional estimation: linear effort scaling, repeatability from prior experience, effort-duration fungibility, task decomposability, and deterministic completion criteria. The authors demonstrate that AI development systematically breaks these rules through non-linear performance jumps, complex interaction surfaces, and 'tight coupling' of components: a small change in data or a prompt can cascade through an entire LLM-based application, making upfront estimates highly unreliable.

To address this, the researchers propose 'Checkpoint Sizing,' a more adaptive framework that replaces a single upfront estimate with iterative decision gates where scope and feasibility are continuously reassessed against empirical learnings during development. This shifts planning from a predictive exercise to a learning-oriented process, acknowledging the uncertainty and experimentation inherent in modern AI development. The paper targets engineering managers, technical leads, and product owners responsible for planning and delivering AI initiatives, giving them an evidence-backed rationale for changing their estimation practices.
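To make the decision-gate idea concrete, here is a minimal sketch in Python. The paper does not publish an implementation of Checkpoint Sizing, so every name, gate, and threshold below is hypothetical; the sketch only illustrates the shape of the process, where each gate re-evaluates feasibility from evidence gathered so far instead of trusting one upfront estimate.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Checkpoint:
    """One decision gate: a named predicate over empirical evidence."""
    name: str
    passes: Callable[[dict], bool]

def run_checkpoints(evidence: dict, gates: list[Checkpoint]) -> str:
    """Walk the gates in order; a failed gate triggers a scope
    reassessment rather than a schedule slip."""
    for gate in gates:
        if not gate.passes(evidence):
            return f"reassess scope at '{gate.name}'"
    return "proceed to delivery"

# Hypothetical gates for an LLM feature; thresholds are placeholders.
gates = [
    Checkpoint("prototype quality", lambda e: e["eval_score"] >= 0.7),
    Checkpoint("latency budget", lambda e: e["p95_latency_ms"] <= 2000),
    Checkpoint("regression stability", lambda e: e["prompt_regressions"] == 0),
]

result = run_checkpoints(
    {"eval_score": 0.74, "p95_latency_ms": 1800, "prompt_regressions": 2},
    gates,
)
print(result)  # the third gate fails, so scope is reassessed
```

In this toy run the prompt-regression gate fails even though the earlier gates pass, modeling the paper's point that completion criteria for AI work are not deterministic: a team only learns mid-development whether the current scope remains feasible.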

Key Points
  • Identifies five assumptions (e.g., linear scaling, deterministic completion) that hold for traditional software but fail for AI/LLM projects.
  • Highlights unique AI challenges like non-linear performance jumps and 'tight coupling' where small data changes cascade.
  • Proposes 'Checkpoint Sizing,' an iterative approach with decision gates to reassess scope based on development learnings.

Why It Matters

Provides a framework for more accurate AI project planning, preventing costly overruns and misaligned expectations for teams building with LLMs.