Researchers' deterministic AI workflow hits 75% accuracy on HS tariff classification
New agentic workflow beats LLMs on complex multi-dimensional rule reasoning for tariff codes.
A team of researchers has introduced a deterministic agentic workflow for Harmonized System (HS) tariff classification, a high-stakes task that maps product descriptions to specific six- or eight-digit codes under complex priority rules. Unlike self-planning agents, the workflow runs a fixed six-stage pipeline in which language model calls are confined to narrow, interpretable steps. Each decision is decomposed into structured outputs that cite chapter or section notes verbatim, so every step can be traced back to the rule text it relies on. Evaluated on HSCodeComp at the six-digit level, the system achieved 75.0% top-1 and 91.5% top-3 accuracy with Qwen3.6-plus. Run with the open-weight Qwen3.6-27B-FP8 in non-thinking mode, it reached 84.2% four-digit and 77.4% six-digit top-1 agreement with the frontier model. A manual audit of 226 disagreements found that a nontrivial fraction of ground-truth labels may themselves deviate from the General Interpretive Rules, which means benchmark accuracy alone does not fully capture whether a prediction follows the rules.
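To make the reported metrics concrete, here is a minimal sketch of how top-k accuracy and prefix-level agreement are commonly computed for HS codes. The function names and the toy codes are illustrative assumptions; they are not taken from the researchers' evaluation code or from HSCodeComp.

```python
def normalize(code: str) -> str:
    """Keep digits only, so '8471.30' and '847130' compare equal."""
    return "".join(ch for ch in code if ch.isdigit())


def top_k_accuracy(ranked_predictions, labels, k, digits=6):
    """Share of items whose label prefix appears among the first k predictions."""
    if not labels:
        raise ValueError("labels must not be empty")
    hits = 0
    for ranked, gold in zip(ranked_predictions, labels):
        gold_prefix = normalize(gold)[:digits]
        candidates = [normalize(c)[:digits] for c in ranked[:k]]
        hits += gold_prefix in candidates
    return hits / len(labels)


def prefix_agreement(codes_a, codes_b, digits):
    """Share of items where two systems' top-1 codes share the first `digits` digits."""
    if not codes_a:
        raise ValueError("codes_a must not be empty")
    matches = sum(
        normalize(a)[:digits] == normalize(b)[:digits]
        for a, b in zip(codes_a, codes_b)
    )
    return matches / len(codes_a)


# Toy data for illustration only (not benchmark data).
ranked_predictions = [
    ["847130", "847141", "847150"],
    ["610910", "610990", "611020"],
    ["392410", "392490", "732393"],
    ["950300", "950450", "950490"],
]
labels = ["8471.30", "6109.90", "6911.10", "9503.00"]

top1 = top_k_accuracy(ranked_predictions, labels, k=1)
top3 = top_k_accuracy(ranked_predictions, labels, k=3)
```

Under this definition, four-digit agreement can never be lower than six-digit agreement for the same pair of systems, which is consistent with the 84.2% and 77.4% figures above.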
- Achieves 75.0% top-1 and 91.5% top-3 accuracy on HS tariff codes at the six-digit level
- Uses a deterministic six-stage pipeline with narrow LLM calls and no self-planning agents
- Manual audit reveals some benchmark labels may violate tariff rules; all adjudication records released
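As a rough illustration of what a structured, citation-backed stage output could look like, the sketch below models one decision and checks that its quoted passage really appears verbatim in the cited note. The `StageDecision` fields, the stage name, the note identifier, and the note text are all hypothetical placeholders; the actual schema and stages of the workflow are not described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageDecision:
    stage: str       # hypothetical stage name
    choice: str      # the option selected at this stage
    cited_note: str  # identifier of the chapter or section note relied on
    quote: str       # passage the model claims to quote verbatim


def _squash(text: str) -> str:
    """Collapse runs of whitespace so line wrapping does not break matching."""
    return " ".join(text.split())


def citation_is_verbatim(decision: StageDecision, notes: dict) -> bool:
    """Return True only if the quote is a literal substring of the cited note."""
    source = notes.get(decision.cited_note)
    if source is None or not decision.quote.strip():
        return False
    return _squash(decision.quote) in _squash(source)


# Placeholder note text, invented for this example (not real HS legal text).
notes = {
    "chapter-XX-note-1": "This chapter does not cover:\n(a) articles of placeholder material.",
}

good = StageDecision(
    stage="heading-selection",
    choice="XX.01",
    cited_note="chapter-XX-note-1",
    quote="does not cover: (a) articles of placeholder material",
)
paraphrased = StageDecision(
    stage="heading-selection",
    choice="XX.01",
    cited_note="chapter-XX-note-1",
    quote="excludes articles of placeholder material",
)

good_ok = citation_is_verbatim(good, notes)
paraphrased_ok = citation_is_verbatim(paraphrased, notes)
```

A literal substring check like this is deliberately strict: a paraphrased citation fails, which is the property that makes such outputs easy to audit by hand.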
Why It Matters
Reliable, interpretable tariff classification can cut trade costs and legal risk, a practical use of AI in logistics and customs.