Context Before Code: An Experience Report on Vibe Coding in Practice
A 2026 arXiv paper finds AI-generated code often fails on critical production constraints like multi-tenancy and access control.
A research team from the University of Jyväskylä has published a pivotal experience report on arXiv titled 'Context Before Code: An Experience Report on Vibe Coding in Practice.' The paper details their hands-on application of conversational 'vibe coding'—using tools like GitHub Copilot or ChatGPT—to build two production systems: a multi-project agent learning platform and an academic retrieval-augmented generation (RAG) system. Their key finding is that while AI dramatically accelerates initial scaffolding and integration, it consistently fails to infer and implement critical architectural constraints unless they are explicitly defined upfront.
The study reveals a significant shift in the software engineering workflow: effort moves away from writing boilerplate code and toward meticulously specifying and auditing constraints. The researchers identified recurring 'non-delegation zones', such as multi-tenancy isolation, role-based access control (RBAC), memory management policies, and asynchronous processing, where AI-generated code proved insufficient for production-grade reliability. For their agent platform, which required isolated projects with structured memory, the AI could not autonomously enforce the necessary isolation rules; likewise, the RAG system, which needed citation-grounded answers, required explicit prompting for architectural guardrails.
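To make the 'non-delegation zones' concrete, here is a minimal sketch of the kind of explicit tenant isolation and RBAC enforcement the report argues must be specified by humans rather than left for the model to infer. All names (ROLE_PERMISSIONS, RequestContext, fetch_agent_memory, the agent_memory table) are hypothetical illustrations, not the paper's actual code or schema.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission mapping; a real system would load this from config.
ROLE_PERMISSIONS = {
    "owner": {"read", "write", "configure"},
    "member": {"read", "write"},
    "viewer": {"read"},
}

@dataclass
class RequestContext:
    user_id: str
    project_id: str  # tenant boundary: every query is scoped to exactly one project
    role: str

def check_permission(ctx: RequestContext, action: str) -> None:
    """RBAC gate: deny unless the caller's role explicitly allows the action."""
    if action not in ROLE_PERMISSIONS.get(ctx.role, set()):
        raise PermissionError(f"role '{ctx.role}' may not '{action}'")

def fetch_agent_memory(db, ctx: RequestContext, limit: int = 50):
    """Every read is filtered by project_id server-side, so one tenant can never
    see another tenant's agent memory, regardless of what the caller requests."""
    check_permission(ctx, "read")
    return db.execute(
        "SELECT * FROM agent_memory WHERE project_id = ? "
        "ORDER BY created_at DESC LIMIT ?",
        (ctx.project_id, limit),
    ).fetchall()
```

The point of the sketch is that the isolation rule lives in the data-access layer as a non-negotiable filter; in the report's experience, this is precisely the kind of constraint AI assistants did not add unless it was spelled out in the prompt or specification.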
This report provides one of the first empirical looks at 'vibe coding' under real production constraints, moving beyond simple code snippets to complex system design. It concludes that success hinges on 'context before code': developers must invest heavily in upfront architectural design and explicit constraint definition. The AI acts as a powerful accelerator for implementation, but the responsibility for system integrity, security, and scalability firmly remains with human engineers who must verify and enforce these critical production boundaries.
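As an illustration of the citation-grounding guardrail the RAG system required, here is a minimal sketch of an answer path that refuses to respond without retrieved, citable sources. The retriever.search and llm.complete interfaces, and the passage attributes (text, source), are assumed for the example and are not from the paper.

```python
def answer_with_citations(question: str, retriever, llm) -> dict:
    """Guardrail: answer only from retrieved passages and refuse to answer
    when nothing relevant was retrieved or the draft cites nothing."""
    passages = retriever.search(question, top_k=5)
    if not passages:
        return {"answer": None, "citations": [],
                "reason": "no supporting sources retrieved"}

    # Number the sources so the model can cite them as [1], [2], ...
    context = "\n\n".join(f"[{i + 1}] {p.text}" for i, p in enumerate(passages))
    prompt = (
        "Answer ONLY from the numbered sources below and cite them as [n]. "
        "If the sources do not contain the answer, say so.\n\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    draft = llm.complete(prompt)

    # Reject drafts that carry no citation markers at all.
    cited = [i + 1 for i in range(len(passages)) if f"[{i + 1}]" in draft]
    if not cited:
        return {"answer": None, "citations": [],
                "reason": "draft lacked citations"}
    return {"answer": draft,
            "citations": [passages[i - 1].source for i in cited]}
```

The check is deliberately crude; the report's broader claim is that someone has to decide such a boundary exists and verify it holds, and that this responsibility stays with the human engineers rather than the assistant.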
- The team built a multi-project agent platform and a RAG system using vibe coding, finding it accelerated scaffolding but failed on implicit constraints.
- Critical 'non-delegation zones' for production AI include multi-tenancy, access control, and memory policies, requiring explicit human specification.
- Engineering effort shifts from writing boilerplate to defining and auditing architectural constraints, emphasizing 'context before code'.
Why It Matters
This study provides a crucial reality check for teams adopting AI coding, highlighting where human architectural oversight remains non-negotiable for building reliable systems.