Statistical Software Engineering with Tuned Variables
Argues AI systems are not static code, but evolving spaces of tuned variables requiring statistical governance.
In a new position paper, researcher Nimrod Busany challenges traditional software engineering paradigms for AI systems. The paper argues that the core artifact of an AI-enabled system is not static code plus settings, but a dynamic, version-controlled 'governed program space.' This space encompasses domains, structural constraints, evaluation assets, and a statistical release gate. The central thesis is that choices like model selection, prompt structure, and operational thresholds are not fixed assignments but 'tuned variables'—program variables that must be actively maintained and governed as external conditions like API changes, data drift, and shifting business objectives evolve.
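The idea of tuned variables can be made concrete with a minimal sketch (all names and fields here are hypothetical, not from the paper): each knob that teams usually bury in config files — model selection, prompt structure, operational thresholds — is instead treated as a first-class, version-controlled variable of the program space.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TunedVariables:
    """Hypothetical snapshot of a governed program space:
    each field is a tuned variable, not a fixed assignment."""
    model: str               # model selection
    prompt_template: str     # prompt structure
    accept_threshold: float  # operational threshold
    version: str             # snapshot id for version control

# Two candidate points in the program space, tracked side by side
# so a release gate can compare them on evidence.
baseline = TunedVariables(
    model="model-a", prompt_template="Summarize: {text}",
    accept_threshold=0.80, version="v12")
candidate = TunedVariables(
    model="model-b", prompt_template="Summarize briefly: {text}",
    accept_threshold=0.85, version="v13")
```

Freezing each snapshot and giving it a version identifier is what lets the space be "version-controlled": a promotion is then a recorded transition from one snapshot to another, not an in-place mutation.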
Building on prior Software Engineering for AI (SE4AI) work, the paper positions this governed space as the primary object of engineering focus. The 'statistical' aspect is key: promoting changes relies on sampled evaluation sets, estimated evidence, effect-size margins, and confidence thresholds rather than deterministic pass/fail checks. This framework addresses the reality that AI systems operate in a fluid world where a configuration valid today may be obsolete tomorrow. It provides a formal structure for the continuous, evidence-based tuning that teams often perform ad hoc, aiming to bring rigor to the maintenance of modern, non-deterministic software.
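A statistical release gate of the kind described can be sketched as follows. This is an illustrative implementation under assumed mechanics (bootstrap resampling of per-example scores; the paper does not prescribe this specific procedure): the candidate is promoted only if it beats the baseline by at least an effect-size margin, with the stated confidence, estimated from a sampled evaluation set.

```python
import random

def release_gate(baseline_scores, candidate_scores,
                 margin=0.02, confidence=0.95,
                 n_boot=2000, seed=0):
    """Return True iff the candidate's mean score exceeds the
    baseline's by at least `margin` in at least `confidence` of
    bootstrap resamples of the sampled eval set."""
    rng = random.Random(seed)
    n, m = len(baseline_scores), len(candidate_scores)
    wins = 0
    for _ in range(n_boot):
        # Resample each eval set with replacement and compare means.
        b = sum(rng.choice(baseline_scores) for _ in range(n)) / n
        c = sum(rng.choice(candidate_scores) for _ in range(m)) / m
        if c - b >= margin:
            wins += 1
    return wins / n_boot >= confidence

# A clear improvement passes the gate; an identical system does not.
release_gate([0.7] * 50, [0.8] * 50)  # True
release_gate([0.8] * 50, [0.8] * 50)  # False
```

The key contrast with a deterministic pass/fail check is that the gate's decision is an estimate with an explicit margin and confidence level, so it can be re-run as evaluation sets, models, or objectives drift.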
- Redefines AI systems as a 'governed program space' of tuned variables like prompts and model choices, not static code.
- Proposes statistical governance using sampled evaluations and confidence thresholds for system updates, not fixed assignments.
- Addresses real-world challenges like API changes, data drift, and evolving cost/safety objectives that break traditional SE models.
Why It Matters
Provides a formal engineering framework for the continuous, evidence-based maintenance required by real-world AI applications.