Code2UML uses 5 Claude agents to auto-generate UML from code
91.5% syntactic validity across 12 repos and 4 languages with zero LLM calls for compaction
A new paper from researchers at the University of Bucharest introduces Code2UML, an agentic system that uses large language models to generate UML diagrams automatically from source code. The architecture employs five specialized agents built on the Claude Agent SDK (PlannerAgent, AnalyzerAgent, DiagramAgent, CorrectorAgent, and DependencyAnalyzerAgent), each handling a distinct subtask. The key innovation is a deterministic, importance-weighted compaction layer for the intermediate representation (IR): it transforms the full project IR into diagram-specific views that are guaranteed to fit within token limits, makes no LLM calls, and completes in milliseconds.
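The compaction idea described above can be illustrated with a minimal sketch. This is not the paper's implementation: the `Entity` fields, the importance weights, the per-entity token costs, and the greedy selection rule are all assumptions made for illustration. It only shows how a purely deterministic pass can pick the most important IR entities while staying under a token budget, with no model call involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    # All fields are hypothetical; the paper's IR schema is not given in this summary.
    name: str
    importance: float  # assumed weight, e.g. derived from how connected the entity is
    token_cost: int    # assumed estimate of tokens needed to serialize the entity


def compact(entities, token_budget):
    """Greedy, deterministic selection under a token budget.

    Entities are ranked by importance (highest first), with ties broken by
    name so the same input always yields the same view.
    """
    ranked = sorted(entities, key=lambda e: (-e.importance, e.name))
    kept, used = [], 0
    for entity in ranked:
        if used + entity.token_cost <= token_budget:
            kept.append(entity)
            used += entity.token_cost
    return kept, used


# Hypothetical project IR with made-up weights and costs.
project_ir = [
    Entity("OrderService", 0.9, 120),
    Entity("OrderRepository", 0.7, 80),
    Entity("StringUtils", 0.1, 60),
    Entity("PaymentGateway", 0.8, 150),
]

view, used_tokens = compact(project_ir, token_budget=300)
print([e.name for e in view], used_tokens)
```

Because the ranking and the budget check are plain arithmetic, the result never exceeds the budget and is reproducible across runs, which is the property the article attributes to the compaction layer.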
Code2UML was evaluated on 12 open-source repositories in four programming languages (Java, JavaScript, PHP, Python) and seven UML diagram types, producing 84 observations (12 repositories × 7 diagram types), each assessed on five automated metrics. Results showed high syntactic validity (mean 91.5%, with component and deployment diagrams reaching 100%), strong relationship precision (mean 0.858), and consistent structural quality (mean 81.7/100, with cross-language variance of only 3.1 points). Entity recall averaged 0.313, which the authors attribute to deliberately prioritizing architecturally important entities over exhaustive coverage. A sensitivity analysis found that quality scores remain stable across project sizes from 31 to 4,578 IR entities, suggesting that quality does not degrade as codebases grow.
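To make the combination of high relationship precision and low entity recall concrete, here is a small sketch using the standard set-based definitions of precision and recall. The paper's exact metric definitions are not given in this summary, so the formulas, the function names, and the toy data below are assumptions.

```python
def precision(generated: set, reference: set) -> float:
    """Share of generated items that also appear in the reference."""
    return len(generated & reference) / len(generated) if generated else 0.0


def recall(generated: set, reference: set) -> float:
    """Share of reference items that made it into the generated output."""
    return len(generated & reference) / len(reference) if reference else 0.0


# Hypothetical reference: 10 entities found in the code, 4 relationships.
reference_entities = {f"Class{i}" for i in range(10)}
reference_relationships = {
    ("Class0", "Class1"),
    ("Class1", "Class2"),
    ("Class3", "Class4"),
    ("Class5", "Class6"),
}

# Hypothetical compact diagram: only 3 entities and 2 relationships are drawn.
diagram_entities = {"Class0", "Class1", "Class2"}
diagram_relationships = {("Class0", "Class1"), ("Class1", "Class2")}

entity_recall = recall(diagram_entities, reference_entities)
relationship_precision = precision(diagram_relationships, reference_relationships)
print(entity_recall, relationship_precision)
```

In this toy case the diagram shows only a fraction of the entities, so recall is low, yet every relationship it draws is correct, so precision is high. That is the pattern the reported numbers describe.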
- Five specialized agents (Planner, Analyzer, Diagram, Corrector, DependencyAnalyzer) built on the Claude Agent SDK automate the full UML generation pipeline
- Deterministic IR compaction runs in milliseconds with zero LLM calls, ensuring token-limit compliance even for large codebases (tested up to 4,578 entities)
- Achieved mean scores of 91.5% syntactic validity, 0.858 relationship precision, and 81.7/100 structural quality across 84 observations in 4 languages
Why It Matters
Automates tedious software documentation, making UML generation practical for real-world codebases without manual trimming and without spending extra LLM calls on fitting the code into the context window.