Quality-Driven Agentic Reasoning for LLM-Assisted Software Design: Questions-of-Thoughts (QoT) as a Time-Series Self-QA Chain
New 'Questions-of-Thoughts' method improves AI-generated API, data communication, and file system designs, with quality-score gains often exceeding 40% for larger models.
Researchers Yen-Ku Liu and Yun-Cheng Tsai have introduced a novel framework called Questions-of-Thoughts (QoT) to tackle persistent quality issues in AI-assisted software development. The method acts as an inference-time scaffold that transforms a user's goal into a structured, ordered sequence of engineering steps. Crucially, it incorporates a time-series self-questioning chain, where the LLM verifies constraints and reduces omission errors at each step, maintaining a lightweight reasoning record to stabilize subsequent design decisions. This approach directly targets common failure modes like incomplete implementations, weak modularization, and inconsistent security practices that plague current LLM-generated code.
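The flow described above (goal, ordered steps, per-step self-questioning, running record) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `LLM` callable, the prompt wording, the `PLAN`/`ASK`/`ANSWER` tags, and every class and function name here are hypothetical, and `stub_llm` stands in for a real model so the sketch runs offline.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical interface: any function that maps a prompt to a text reply.
LLM = Callable[[str], str]


@dataclass
class QARecord:
    """One self-QA entry, tied to the engineering step it checked."""
    step: str
    question: str
    answer: str


@dataclass
class QoTState:
    """Goal, ordered steps, and the lightweight reasoning record."""
    goal: str
    steps: List[str] = field(default_factory=list)
    record: List[QARecord] = field(default_factory=list)


def plan_steps(llm: LLM, goal: str) -> List[str]:
    """Ask the model for an ordered list of engineering steps, one per line."""
    raw = llm(f"PLAN\nGoal: {goal}\nList the engineering steps in order, one per line.")
    return [line.strip() for line in raw.splitlines() if line.strip()]


def run_qot(llm: LLM, goal: str, questions_per_step: int = 2) -> QoTState:
    """Walk the steps in order; at each one, self-question and log the answers."""
    state = QoTState(goal=goal, steps=plan_steps(llm, goal))
    for step in state.steps:
        for _ in range(questions_per_step):
            # Earlier Q&A pairs are fed back in so later checks can build on them.
            history = "\n".join(
                f"[{r.step}] Q: {r.question} A: {r.answer}" for r in state.record
            )
            question = llm(
                f"ASK\nGoal: {goal}\nStep: {step}\nRecord so far:\n{history}\n"
                "Which constraint or possible omission should be checked next?"
            )
            answer = llm(f"ANSWER\nStep: {step}\nQuestion: {question}")
            state.record.append(QARecord(step, question, answer))
    return state


def stub_llm(prompt: str) -> str:
    """Canned replies for demonstration only; replace with a real model call."""
    if prompt.startswith("PLAN"):
        return "Define endpoints\nAdd authentication\n"
    if prompt.startswith("ASK"):
        return "Is every input validated?"
    return "Yes, via schema checks."


state = run_qot(stub_llm, "Design a REST API for file uploads")
```

Keeping the record as a flat, ordered list is one simple way to make earlier checks visible to later steps; how the actual framework structures and prunes its record is not specified here.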
The team evaluated QoT across three backend engineering domains: API Design, Data Communication, and File Systems. They scored the AI-generated artifacts with a custom, ISO/IEC-inspired quality rubric measuring Scalability, Completeness, Modularity, and Security. The improvements are capacity-dependent: larger, more capable models like GPT-4 and Claude 3.5 Opus saw consistent and significant quality gains with QoT, with domain-wise score improvements often exceeding 40%. Smaller models, however, sometimes faced trade-offs due to tight context windows and planning budgets. The researchers have released their full artifact, including prompts, scoring guidelines, and reproduction scripts, to support further applied AI and software engineering research.
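To make the scoring arithmetic concrete, here is a small sketch of how a four-dimension rubric score and a relative gain could be computed. The unweighted mean, the 1-to-5 scale, the function names, and the example numbers are all assumptions for illustration; they are not the paper's scoring guidelines or its results.

```python
from statistics import mean
from typing import Mapping

# The four rubric dimensions named in the article.
DIMENSIONS = ("scalability", "completeness", "modularity", "security")


def rubric_score(scores: Mapping[str, float]) -> float:
    """Unweighted mean over the four dimensions (equal weighting is an assumption)."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing rubric dimensions: {missing}")
    return mean(scores[d] for d in DIMENSIONS)


def relative_gain_percent(baseline: float, with_qot: float) -> float:
    """Percentage change of the QoT score relative to the baseline score."""
    if baseline <= 0:
        raise ValueError("baseline score must be positive")
    return 100.0 * (with_qot - baseline) / baseline


# Placeholder scores on an assumed 1-to-5 scale, invented for this example.
baseline_scores = {"scalability": 2, "completeness": 3, "modularity": 2, "security": 3}
qot_scores = {"scalability": 3, "completeness": 3, "modularity": 3, "security": 3}

baseline = rubric_score(baseline_scores)
with_qot = rubric_score(qot_scores)
gain = relative_gain_percent(baseline, with_qot)
```

A "domain-wise" improvement in this sketch would mean computing `gain` separately for the artifacts of each domain rather than pooling them.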
- QoT is a scaffold that creates ordered engineering steps and a self-QA chain for LLMs, improving design verification.
- Tested on API Design, Data Communication, and File System tasks, it raised quality scores for larger models on an ISO/IEC-inspired rubric, with domain-wise gains often exceeding 40%.
- The researchers have released the full artifact, including prompts, scoring guidelines, and reproduction scripts, so others can reproduce the results for AI-assisted software engineering research.
Why It Matters
QoT offers a structured, inference-time method for improving the reliability and quality of AI-generated software architecture and code, with the clearest gains on larger models.