Agent Frameworks

A Role-Based LLM Framework for Structured Information Extraction from Healthy Food Policies

A new framework uses three specialized AI agents to extract structured data from 608 complex policy documents.

Deep Dive

A team of researchers led by Congjing Zhang has proposed a novel framework that uses a role-based approach to improve how large language models (LLMs) extract structured information from complex policy documents. The system tackles a common problem in legal and health policy analysis: standard LLM approaches often produce misinformation, hallucinations, misclassifications, and omissions due to the structural diversity and inconsistency of source texts. To solve this, the framework mimics an expert workflow by assigning three distinct, specialized roles to a single Llama-3.3-70B model. Each role—Policy Analyst, Legal Strategy Specialist, and Food System Expert—is guided by a custom prompt containing structured domain knowledge, such as explicit definitions of legal mechanisms and classification criteria.
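The role-based workflow described above can be sketched as a single model queried three times, once per specialized prompt. This is an illustrative sketch only: the role names come from the article, but the prompt wording, the `call_llm` stub, and the `extract` helper are hypothetical stand-ins, not the authors' actual prompts or code.

```python
# Illustrative sketch of the role-based pipeline: one model, three role
# prompts. Role names follow the article; prompt text is a hypothetical
# placeholder for the structured domain knowledge each role receives.

ROLES = {
    "Policy Analyst": (
        "You are a Policy Analyst. From the policy text, extract metadata: "
        "title, jurisdiction, year, and policy type. Return JSON."
    ),
    "Legal Strategy Specialist": (
        "You are a Legal Strategy Specialist. Classify the policy's legal "
        "mechanism according to the explicit definitions supplied in your "
        "instructions. Return JSON."
    ),
    "Food System Expert": (
        "You are a Food System Expert. Identify which parts of the food "
        "system (e.g. production, retail, procurement) the policy targets. "
        "Return JSON."
    ),
}

def call_llm(system_prompt: str, document: str) -> dict:
    """Stand-in for a call to a single Llama-3.3-70B endpoint.
    A real implementation would send system_prompt plus the document to
    the model and parse its JSON reply; here we just echo the inputs."""
    return {"system": system_prompt, "document": document}

def extract(policy_text: str) -> dict:
    """Run the same document through each specialized role in turn,
    collecting one structured answer per role."""
    return {role: call_llm(prompt, policy_text)
            for role, prompt in ROLES.items()}
```

Because all three roles share one underlying model, specialization lives entirely in the prompts, which keeps the pipeline cheap to run while still separating concerns the way a human expert team would.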

The researchers rigorously evaluated their framework using 608 healthy food policies from the Healthy Food Policy Project (HFPP) database. They compared its performance against standard baseline methods including zero-shot prompting, few-shot prompting, and chain-of-thought (CoT) reasoning. The results demonstrated that the role-based framework achieved superior performance, particularly in complex reasoning tasks required for accurate metadata extraction, mechanism classification, and identification of legal approaches. This structured, multi-agent methodology offers a more reliable and transparent path to automating the tedious process of analyzing thousands of pages of unstructured policy data, which is critical for researchers, advocates, and policymakers.

Key Points
  • Uses three specialized AI roles (Policy Analyst, Legal Strategy Specialist, Food System Expert) within a single Llama-3.3-70B model to mimic expert workflows.
  • Tested on 608 real-world healthy food policies, outperforming standard zero-shot, few-shot, and chain-of-thought baselines.
  • Reduces LLM hallucinations and errors by incorporating structured domain knowledge and definitions into role-specific prompts.

Why It Matters

Automates the labor-intensive analysis of complex legal documents, enabling faster, more accurate policy research and compliance tracking.