[P] VizPy: DSPy-compatible prompt optimizer that learns from failures automatically.
New prompt optimizer automatically learns from failure patterns, delivering 29% gains on complex QA tasks without manual tweaking.
VizOps has launched VizPy, a prompt optimization framework designed to work seamlessly with the popular DSPy library for building language model pipelines. Unlike traditional manual prompt engineering, VizPy automatically analyzes failure cases to improve system performance, offering two distinct optimization approaches depending on task requirements. The ContraPrompt method targets multi-hop question answering and classification tasks: it mines failure-to-success pairs to extract reasoning rules, and VizOps reports benchmark improvements including a 29% gain on HotPotQA and an 18% gain on GDPR-Bench over prior methods such as GEPA.
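VizOps has not published ContraPrompt's internals, but the core idea of failure-to-success mining can be sketched in a few lines: compare runs of the same program before and after a change, and keep the examples that flipped from wrong to right as raw material for rule extraction. Everything below is a hypothetical illustration, not VizPy code; the function name and data shapes are invented for this sketch.

```python
# Hypothetical sketch of failure-to-success pair mining (not VizPy's actual code).
# `before` and `after` map each question to (answer, is_correct) from two runs of
# the same QA program; the pairs that flipped to correct become candidate input
# for a rule-extraction step that turns them into reusable instructions.

def mine_failure_to_success_pairs(before, after):
    """Return (question, failed_answer, correct_answer) for flipped examples."""
    pairs = []
    for question, (old_answer, old_ok) in before.items():
        new_answer, new_ok = after.get(question, (None, False))
        if not old_ok and new_ok:
            pairs.append((question, old_answer, new_answer))
    return pairs

before = {"Who directed Alien?": ("James Cameron", False),
          "Capital of France?": ("Paris", True)}
after = {"Who directed Alien?": ("Ridley Scott", True),
         "Capital of France?": ("Paris", True)}

print(mine_failure_to_success_pairs(before, after))
# -> [('Who directed Alien?', 'James Cameron', 'Ridley Scott')]
```

In a real optimizer, a follow-up LLM call would contrast each pair and propose a general reasoning rule to fold back into the prompt; this stub only shows the pairing step.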
For generation and mathematical reasoning tasks where simple retries don't converge, VizPy provides PromptGrad, which takes a gradient-inspired approach to failure analysis. Both methods integrate as drop-in replacements in existing DSPy workflows through simple API calls like optimizer.compile(program, trainset=trainset), eliminating the need for tedious manual prompt tweaking. This represents a shift toward more automated, data-driven optimization of language model systems, particularly valuable for developers building complex AI applications that require consistent performance across varied inputs.
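The drop-in claim amounts to VizPy exposing the same compile interface that DSPy optimizers such as dspy.BootstrapFewShot use. Since VizPy itself is not available here, the sketch below uses a stand-in `ContraPrompt` class to show only the call shape; the class body and the `QAProgram` stub are invented so the snippet runs without either library installed.

```python
# Illustration of the DSPy-style drop-in call shape; `ContraPrompt` is a
# stand-in, not the real VizPy import.

class ContraPrompt:
    """Hypothetical optimizer exposing a DSPy-style compile() interface."""

    def __init__(self, metric):
        # Scoring function: (example, prediction) -> bool, as in DSPy optimizers.
        self.metric = metric

    def compile(self, program, trainset):
        # A real optimizer would run `program` on `trainset`, mine its
        # failures, and return an improved copy; this stub only tags it.
        program.optimized = True
        return program

class QAProgram:
    """Stand-in for a DSPy module (e.g. a dspy.Module subclass)."""
    optimized = False
    def __call__(self, question):
        return "stub answer"

optimizer = ContraPrompt(metric=lambda ex, pred: ex["answer"] == pred)
compiled = optimizer.compile(QAProgram(), trainset=[{"question": "q", "answer": "a"}])
print(compiled.optimized)  # -> True
```

The point is the single `optimizer.compile(program, trainset=trainset)` call: swapping in a different optimizer leaves the rest of an existing DSPy pipeline untouched.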
- ContraPrompt method delivers 29% accuracy gains on HotPotQA benchmark for multi-hop QA
- Drop-in DSPy compatibility requires minimal code changes: optimizer.compile(program, trainset=trainset)
- Two specialized approaches: ContraPrompt for classification/QA, PromptGrad for generation/math tasks
Why It Matters
Automates prompt engineering, one of the most tedious parts of AI development, saving developers hours of manual iteration while delivering measurable accuracy gains.