From Language to Action in Arabic: Reliable Structured Tool Calling via Data-Centric Fine-Tuning
A new 270M-parameter model cuts structural parse failures in Arabic tool calling from 87% to below 1%.
A team of 25 researchers, primarily from AISA, has developed a framework for reliable Arabic function calling in AI agents. Their model, AISA-AR-FunctionCall, is built on a 270M-parameter FunctionGemma backbone and targets a critical weakness in existing systems: severe structural instability when processing Arabic. The approach is data-centric: the team audited the dataset, repaired tool schemas, and restructured prompts to be tool-aware, then applied full-parameter supervised fine-tuning. On a held-out test set, the rate of parse failures (cases where the model fails to produce valid structured output) fell from 87% to below 1%. Function name accuracy improved more than eightfold, and argument alignment saw substantial gains across Arabic dialects and domains.
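To make the reported metrics concrete, here is a minimal sketch of how parse failure rate and function name accuracy could be computed. It assumes each model output is a JSON object with `name` and `arguments` keys; that format, the helper names `parse_call` and `evaluate`, and the sample data are illustrative assumptions, not the team's actual evaluation harness or FunctionGemma's native output syntax. Exact-match on arguments stands in for the article's "argument alignment", whose precise definition is not given here.

```python
import json


def parse_call(raw):
    """Return (name, arguments) if raw is a well-formed tool call, else None.

    Assumed format (hypothetical): {"name": str, "arguments": dict}.
    """
    try:
        obj = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(obj, dict):
        return None
    name = obj.get("name")
    args = obj.get("arguments")
    if not isinstance(name, str) or not isinstance(args, dict):
        return None
    return name, args


def evaluate(outputs, references):
    """Score raw model outputs against reference calls, over all examples."""
    total = len(references)
    parse_failures = name_hits = arg_hits = 0
    for raw, ref in zip(outputs, references):
        parsed = parse_call(raw)
        if parsed is None:
            parse_failures += 1  # structural failure: nothing usable to score
            continue
        name, args = parsed
        if name == ref["name"]:
            name_hits += 1
            if args == ref["arguments"]:
                arg_hits += 1  # right function and identical arguments
    return {
        "parse_failure_rate": parse_failures / total,
        "function_name_accuracy": name_hits / total,
        "argument_exact_match": arg_hits / total,
    }


# Made-up sample: one truncated output, one wrong function, one wrong argument.
outputs = [
    '{"name": "get_weather", "arguments": {"city": "الرياض"}}',
    '{"name": "get_weather", "arguments": {"city": "جدة"',
    '{"name": "book_flight", "arguments": {"city": "القاهرة"}}',
    '{"name": "get_weather", "arguments": {"city": "دبي"}}',
]
references = [
    {"name": "get_weather", "arguments": {"city": "الرياض"}},
    {"name": "get_weather", "arguments": {"city": "جدة"}},
    {"name": "get_weather", "arguments": {"city": "القاهرة"}},
    {"name": "get_weather", "arguments": {"city": "أبوظبي"}},
]
metrics = evaluate(outputs, references)
```

Note that a parse failure counts against every downstream metric in this sketch, which is why a high failure rate caps function name accuracy no matter how good the model's underlying tool choice is.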
This research marks a significant step toward practical agentic AI for the Arabic-speaking world. The team's error analysis found that once structural collapse was resolved, the remaining errors were dominated by semantic misalignment, suggesting that serialization stability and decision-level reasoning are separable problems. They also explored a reasoning-augmented LoRA (Low-Rank Adaptation) variant that introduces explicit intermediate reasoning steps before tool invocation, a promising direction for further refinement. All datasets and models have been publicly released under the AISA framework, giving developers and researchers an open resource for building Arabic AI applications that reliably translate natural language commands into code or API calls.
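For readers unfamiliar with LoRA, the general idea is to leave a pretrained weight matrix W frozen and learn a low-rank update, so the effective weight becomes W + (alpha / r) * B A, where r is the rank. The sketch below illustrates that standard formulation in plain Python; the function names and toy numbers are assumptions for illustration and say nothing about the rank, scaling, or target layers the team actually used.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, lora_b, lora_a, alpha):
    """Return W + (alpha / r) * B @ A.

    w:      d_out x d_in frozen pretrained weight
    lora_b: d_out x r trainable matrix
    lora_a: r x d_in trainable matrix
    """
    rank = len(lora_a)
    scale = alpha / rank
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_params(d_out, d_in, rank):
    """Parameters trained by LoRA for a single d_out x d_in layer."""
    return rank * (d_out + d_in)


# Toy 2x2 layer with a rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
lora_b = [[1.0], [2.0]]
lora_a = [[0.5, -0.5]]
adapted = lora_effective_weight(w, lora_b, lora_a, alpha=2.0)
```

For a 1024 x 1024 layer, a rank-8 adapter trains 16,384 parameters instead of 1,048,576, which is the main practical contrast with the full-parameter fine-tuning used for the base model.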
- Cuts parse failure rate from 87% to under 1% through data-centric fine-tuning on a 270M-parameter model.
- Improves function name accuracy more than eightfold and strengthens argument alignment across Arabic dialects.
- Publicly releases all datasets and models, enabling robust Arabic AI agents for tool calling and automation.
Why It Matters
Reliable structured output is a prerequisite for production AI agents. This work brings that within reach for the 400+ million Arabic speakers, opening the door to automation and complex task execution in Arabic.