ProcureGym: A Multi-Agent Markov Game Framework for Modeling National Volume-based Drug Procurement
Researchers built a multi-agent AI simulation using data from 2,267 firms and 325 drugs to test procurement strategies.
A team of researchers from Chinese institutions has introduced ProcureGym, a multi-agent simulation platform designed to model China's National Volume-Based Procurement (NVBP) system for pharmaceuticals. The framework treats the drug procurement process as a Markov game: a sequential decision-making environment in which multiple participants (here, pharmaceutical firms) interact strategically. What sets ProcureGym apart is its grounding in real-world data: it incorporates seven actual NVBP rounds, covering 325 drugs and the bidding behavior of 2,267 firms. This data-driven approach provides a realistic environment for testing procurement strategies and policy changes before they affect the real market.
Within this simulated marketplace, the researchers evaluated several types of agents: reinforcement learning (RL) agents, large language model (LLM)-based agents, and traditional rule-based algorithms. In their experiments, RL agents consistently outperformed the others, matching actual historical auction winners more closely and earning higher simulated profits. Further analysis identified two factors that dominate strategic outcomes for competing firms: the maximum valid bidding price set by regulators and the total procurement volume. ProcureGym thus gives policymakers and pharmaceutical companies a computational tool for simulating the potential effects of different pricing rules, volume allocations, and firm strategies, moving beyond purely theoretical models toward data-informed decision-making.
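To make the Markov-game framing more concrete, here is a minimal Python sketch of a single procurement round. It is not ProcureGym's code or its actual rule set: the `Firm` class, the `run_round` function, the lowest-bids-win rule, and the equal split of volume among winners are all simplifying assumptions chosen for illustration. It only shows how a regulator's price ceiling and the total procurement volume can feed directly into each firm's payoff.

```python
from dataclasses import dataclass


@dataclass
class Firm:
    name: str
    unit_cost: float  # hypothetical per-unit production cost


def run_round(firms, bids, max_valid_price, total_volume, num_winners):
    """Simulate one simplified round (illustrative rules, not ProcureGym's)."""
    # Assumption: bids above the regulator's price ceiling are invalid.
    valid = [(bids[f.name], f) for f in firms if bids[f.name] <= max_valid_price]
    # Assumption: the lowest valid bids win.
    valid.sort(key=lambda pair: pair[0])
    winners = valid[:num_winners]

    profits = {f.name: 0.0 for f in firms}
    if winners:
        # Assumption: winners split the procurement volume equally.
        share = total_volume / len(winners)
        for price, firm in winners:
            profits[firm.name] = (price - firm.unit_cost) * share
    return profits


# Toy data, invented for this example.
firms = [Firm("A", 1.0), Firm("B", 1.5), Firm("C", 0.75)]
bids = {"A": 2.0, "B": 3.5, "C": 1.5}
profits = run_round(firms, bids, max_valid_price=3.0,
                    total_volume=1000, num_winners=2)
print(profits)
```

In this toy setup, firm B's bid exceeds the ceiling and earns nothing, while A and C share the volume. Lowering `max_valid_price` or changing `total_volume` shifts every firm's payoff, which is the kind of sensitivity the researchers report for the real system. A full Markov game would repeat such rounds, with each agent updating its bidding strategy from the observed outcomes.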
- Built on 7 rounds of real NVBP data covering 325 drugs and 2,267 pharmaceutical firms
- Tests RL, LLM-based, and rule-based agents; RL agents best matched historical auction winners and earned the highest simulated profits
- Identifies maximum valid bid price and procurement volume as dominant strategic factors
Why It Matters
ProcureGym enables data-driven policy testing and strategic planning for multibillion-dollar drug procurement markets, reducing reliance on real-world trial and error.