Multi-agent LLM framework extracts knowledge graphs for Ethernet switch testing with up to 0.99 correctness

arXiv cs.SE May 20, 2026

⚡A new approach turns semi-structured manuals into testable knowledge with near-perfect extraction correctness...

Deep Dive

A team of researchers led by Rongqi Pan has introduced a multi-agent LLM framework that extracts structured knowledge graphs (KGs) from Ethernet switch configuration manuals (ESCMs), a notoriously difficult document type because of semi-structured formatting, implicit step attributes, and complex cross-section dependencies. Their work, published on arXiv (paper 2605.19180), focuses on system testing automation but is intended as a general framework adaptable to other industrial domains. The approach combines a fine-grained KG schema with an iterative Extract-Evaluate-Improve (EEI) loop: LLMs first extract candidate facts, the facts are then evaluated against ground truth, and finally the extraction prompts are refined for hard cases.
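To make the shape of the EEI loop concrete, here is a minimal Python sketch. It is not the authors' implementation: the function names (`eei_loop`, `toy_extract`, `toy_evaluate`, `toy_improve`), the triple format, the 0.97 threshold, the round limit, and the toy scoring rule (share of reference facts recovered) are all assumptions made for illustration, and the three agents are stand-in functions rather than LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

Fact = Tuple[str, str, str]  # hypothetical (subject, relation, object) triple


@dataclass
class EEIResult:
    facts: List[Fact]
    correctness: float
    prompt: str
    rounds: int


def eei_loop(
    manual_text: str,
    prompt: str,
    extract: Callable[[str, str], List[Fact]],
    evaluate: Callable[[str, List[Fact]], Tuple[float, List[Fact]]],
    improve: Callable[[str, List[Fact]], str],
    threshold: float = 0.97,  # assumed stopping score, not from the paper
    max_rounds: int = 3,      # assumed iteration budget
) -> EEIResult:
    """Extract, evaluate, and refine the prompt until the score is good enough."""
    facts = extract(manual_text, prompt)
    score, feedback = evaluate(manual_text, facts)
    rounds = 1
    while score < threshold and rounds < max_rounds:
        prompt = improve(prompt, feedback)       # Improve: refine the prompt
        facts = extract(manual_text, prompt)     # Extract again
        score, feedback = evaluate(manual_text, facts)  # Evaluate again
        rounds += 1
    return EEIResult(facts, score, prompt, rounds)


# --- Toy stand-ins for the three agents (purely illustrative) ---
MANUAL = "Step 1: enable port\nStep 2: set vlan 10\n(implied) save config"
REFERENCE = {
    ("manual", "has_step", "enable port"),
    ("manual", "has_step", "set vlan 10"),
    ("manual", "has_step", "save config"),
}


def toy_extract(manual_text: str, prompt: str) -> List[Fact]:
    facts = []
    for line in manual_text.splitlines():
        if line.startswith("Step"):
            facts.append(("manual", "has_step", line.split(": ", 1)[1]))
        elif line.startswith("(implied)") and "implicit" in prompt:
            # Implicit steps are only picked up once the prompt asks for them.
            facts.append(("manual", "has_step", line[len("(implied) "):]))
    return facts


def toy_evaluate(manual_text: str, facts: List[Fact]) -> Tuple[float, List[Fact]]:
    found = set(facts) & REFERENCE
    missing = sorted(REFERENCE - found)
    return len(found) / len(REFERENCE), missing  # toy score plus feedback


def toy_improve(prompt: str, feedback: List[Fact]) -> str:
    return prompt + " Also capture implicit steps." if feedback else prompt


result = eei_loop(MANUAL, "Extract configuration steps.",
                  toy_extract, toy_evaluate, toy_improve)
print(result.rounds, result.correctness)
```

In this toy run the first pass misses the implicit step, the evaluator reports it as missing, and the refined prompt recovers it on the second round. In a real pipeline each stand-in would presumably wrap an LLM call, and the evaluator's feedback would drive manual-specific prompt changes.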

Tested on 50 real-world ESCMs, the framework achieved average extraction correctness scores between 0.97 and 0.99 across three extraction tasks using the original prompts. For challenging manuals, the EEI loop further raised correctness through manual-specific prompt refinement. LLM judgments also showed substantial agreement with human evaluators: Cohen's kappa values exceeded 0.72 for all tasks. Industrial testers reported that the resulting knowledge graphs enabled the generation of useful and correct test case specifications (TCSs), paving the way for more automated system testing in networking and beyond.
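For readers unfamiliar with the agreement metric, Cohen's kappa compares observed agreement p_o with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). Values between 0.61 and 0.80 are conventionally read as "substantial" agreement. The sketch below computes it for two raters; the function name `cohens_kappa` and the label lists are made up for illustration and are not data from the paper.

```python
from collections import Counter
from typing import Hashable, Sequence


def cohens_kappa(rater_a: Sequence[Hashable], rater_b: Sequence[Hashable]) -> float:
    """Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items where both raters gave the same label.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal label frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1.0 - p_e)


# Illustrative labels only (1 = "extraction judged correct", 0 = "incorrect").
llm_judge = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]
human = [1, 1, 1, 1, 1, 0, 0, 0, 0, 1]
kappa = cohens_kappa(llm_judge, human)
print(round(kappa, 3))
```

Here the raters agree on 8 of 10 items (p_o = 0.8) while chance agreement is 0.52, which gives a kappa of about 0.58. That is noticeably lower than the raw 80% agreement, which is why kappa is the more conservative figure to report.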

Key Points

Multi-agent LLM framework uses an iterative Extract-Evaluate-Improve loop to handle complex, semi-structured documents
Achieves 0.97–0.99 correctness on 50 real-world Ethernet switch configuration manuals
LLM judgments agree substantially with human evaluators (Cohen's kappa ≥0.72), and generated KGs support downstream test case generation

Why It Matters

Turns messy technical manuals into machine-usable knowledge graphs, enabling automated system testing and reducing manual test design effort.
