[P] VizPy: DSPy-compatible prompt optimizer that learns from failures automatically.
New prompt optimizer automatically learns from failure patterns, delivering 29% gains on complex QA tasks without manual tweaking.
VizOps has launched VizPy, a prompt optimization framework designed to work seamlessly with the popular DSPy library for building language model pipelines. Unlike traditional manual prompt engineering, VizPy automatically analyzes failure cases to improve system performance, offering two distinct optimization approaches depending on task requirements. The ContraPrompt method targets multi-hop question answering and classification tasks: it mines failure-to-success pairs to extract reasoning rules, and VizOps reports benchmark improvements including a 29% gain on HotPotQA and an 18% gain on GDPR-Bench over prior methods such as GEPA.
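VizOps has not published ContraPrompt's internals, but the core idea of failure-to-success mining can be sketched in a few lines: compare runs of the same program before and after a change, and keep the examples that flipped from wrong to right as raw material for rule extraction. Everything below is a hypothetical illustration, not VizPy code; the function name and data shapes are invented for this sketch.

```python
# Hypothetical sketch of failure-to-success pair mining (not VizPy's actual code).
# `before` and `after` map each question to (answer, is_correct) from two runs of
# the same QA program; the pairs that flipped to correct become candidate input
# for a rule-extraction step that turns them into reusable instructions.

def mine_failure_to_success_pairs(before, after):
    """Return (question, failed_answer, correct_answer) for flipped examples."""
    pairs = []
    for question, (old_answer, old_ok) in before.items():
        new_answer, new_ok = after.get(question, (None, False))
        if not old_ok and new_ok:
            pairs.append((question, old_answer, new_answer))
    return pairs

before = {"Who directed Alien?": ("James Cameron", False),
          "Capital of France?": ("Paris", True)}
after = {"Who directed Alien?": ("Ridley Scott", True),
         "Capital of France?": ("Paris", True)}

print(mine_failure_to_success_pairs(before, after))
# -> [('Who directed Alien?', 'James Cameron', 'Ridley Scott')]
```

In a real optimizer, a follow-up LLM call would contrast each pair and propose a general reasoning rule to fold back into the prompt; this stub only shows the pairing step.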
For generation and mathematical reasoning tasks where simple retries don't converge, VizPy provides PromptGrad, which takes a gradient-inspired approach to failure analysis. Both methods integrate as drop-in replacements in existing DSPy workflows through simple API calls like optimizer.compile(program, trainset=trainset), eliminating the need for tedious manual prompt tweaking. This represents a shift toward more automated, data-driven optimization of language model systems, particularly valuable for developers building complex AI applications that require consistent performance across varied inputs.
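The drop-in claim amounts to VizPy exposing the same compile interface that DSPy optimizers such as dspy.BootstrapFewShot use. Since VizPy itself is not available here, the sketch below uses a stand-in `ContraPrompt` class to show only the call shape; the class body and the `QAProgram` stub are invented so the snippet runs without either library installed.

```python
# Illustration of the DSPy-style drop-in call shape; `ContraPrompt` is a
# stand-in, not the real VizPy import.

class ContraPrompt:
    """Hypothetical optimizer exposing a DSPy-style compile() interface."""

    def __init__(self, metric):
        # Scoring function: (example, prediction) -> bool, as in DSPy optimizers.
        self.metric = metric

    def compile(self, program, trainset):
        # A real optimizer would run `program` on `trainset`, mine its
        # failures, and return an improved copy; this stub only tags it.
        program.optimized = True
        return program

class QAProgram:
    """Stand-in for a DSPy module (e.g. a dspy.Module subclass)."""
    optimized = False
    def __call__(self, question):
        return "stub answer"

optimizer = ContraPrompt(metric=lambda ex, pred: ex["answer"] == pred)
compiled = optimizer.compile(QAProgram(), trainset=[{"question": "q", "answer": "a"}])
print(compiled.optimized)  # -> True
```

The point is the single `optimizer.compile(program, trainset=trainset)` call: swapping in a different optimizer leaves the rest of an existing DSPy pipeline untouched.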
- ContraPrompt method delivers 29% accuracy gains on HotPotQA benchmark for multi-hop QA
- Drop-in DSPy compatibility requires minimal code changes: optimizer.compile(program, trainset=trainset)
- Two specialized approaches: ContraPrompt for classification/QA, PromptGrad for generation/math tasks
Why It Matters
Automates prompt engineering, one of the most tedious parts of AI development, saving developers hours of manual iteration while delivering measurable accuracy gains.