Open Source

Build an Agent That Thinks Like a Data Scientist: How We Hit #1 on DABStep with Reusable Tool Generation

NVIDIA's new AI agent automates complex data analysis with reusable tools, achieving top benchmark performance.

Deep Dive

A research team from NVIDIA, the KGMON (Kaggle Grandmasters of NVIDIA) LLM Agent Research Team, has developed an architecture for autonomous data analysis agents called the NVIDIA KGMON (NeMo Agent Toolkit) Data Explorer. The system targets structured, tabular data, which standard web-search-based AI agents often cannot reach. Its headline result is a new state of the art on the Data Agent Benchmark for Multi-step Reasoning (DABStep): first place on the leaderboard, with a reported 30x speedup over the Claude Code baseline. The team presents this result as validation of its core strategy, which is to separate foundational knowledge building from rapid inference.
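The split between knowledge building and inference can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: `ToolLibrary`, `build_library`, `answer`, and the sample rows are hypothetical names and data, not taken from the team's code or from the NeMo Agent Toolkit API. It shows only the general idea of building reusable helpers once and calling them cheaply afterwards.

```python
from typing import Callable, Dict, List


class ToolLibrary:
    """Hypothetical registry of reusable helper functions (illustration only)."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., object]] = {}

    def register(self, name: str, fn: Callable[..., object]) -> None:
        self._tools[name] = fn

    def call(self, name: str, *args, **kwargs):
        return self._tools[name](*args, **kwargs)

    def names(self) -> List[str]:
        return sorted(self._tools)


def build_library() -> ToolLibrary:
    """Knowledge-building phase: run once, write and register helpers."""
    lib = ToolLibrary()

    def mean_by_group(rows, key, value):
        # Average `value` per distinct `key` across a list of dict rows.
        totals: Dict[str, float] = {}
        counts: Dict[str, int] = {}
        for row in rows:
            group = row[key]
            totals[group] = totals.get(group, 0.0) + row[value]
            counts[group] = counts.get(group, 0) + 1
        return {group: totals[group] / counts[group] for group in totals}

    lib.register("mean_by_group", mean_by_group)
    return lib


def answer(lib: ToolLibrary, rows) -> Dict[str, float]:
    """Inference phase: reuse an existing helper instead of rewriting it."""
    return lib.call("mean_by_group", rows, key="merchant", value="amount")


# Made-up sample data, purely for the demonstration.
ROWS = [
    {"merchant": "A", "amount": 10.0},
    {"merchant": "A", "amount": 20.0},
    {"merchant": "B", "amount": 5.0},
]

LIBRARY = build_library()   # done once, up front
print(answer(LIBRARY, ROWS))  # each later question only pays for the call
```

In this sketch the cost of writing and checking `mean_by_group` is paid once in `build_library`, so each later question needs only a lookup and a function call.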

The architecture is built on the NVIDIA NeMo Agent Toolkit and uses a different agent loop for each use case. For open-ended exploratory data analysis (EDA), it pairs a ReAct agent with Jupyter Notebook tools for bidirectional interaction. For complex, rule-based tabular Q&A, such as the tasks in the DABStep benchmark, it uses a Tool Calling Agent. This agent works with a specialized suite of tools that includes a stateful Python interpreter, a retriever, and a file structure detector. A notable feature is the integration of a Vision-Language Model (VLM) that automatically interprets data visualizations and suggests improvements, turning plots into textual insights.
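To make "stateful Python interpreter" concrete, here is a minimal sketch of such a tool using only the Python standard library. It is an assumption about the general technique, not the team's implementation: the class name `StatefulPythonTool` and its `run` method are hypothetical, and a real deployment would need sandboxing, because `exec` runs arbitrary code.

```python
import contextlib
import io
import traceback


class StatefulPythonTool:
    """Hypothetical tool: runs code snippets in one persistent namespace."""

    def __init__(self) -> None:
        # Variables defined by one call stay available to the next call.
        self._namespace = {"__name__": "__agent__"}

    def run(self, code: str) -> str:
        """Execute `code` and return captured stdout, or the error text."""
        buffer = io.StringIO()
        try:
            with contextlib.redirect_stdout(buffer):
                exec(code, self._namespace)  # unsafe outside a sandbox
        except Exception:
            # Return the error as text so an agent can read it and retry.
            buffer.write(traceback.format_exc(limit=1))
        return buffer.getvalue()


tool = StatefulPythonTool()
tool.run("total = sum([1, 2, 3])")   # first call defines state
print(tool.run("print(total * 2)"))  # second call reuses it
```

Because the namespace persists, an agent can load a table in one step and query it in later steps without reloading it. Errors come back as text rather than as raised exceptions, so they can be fed into the next turn.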

Key Points
  • Achieved 1st place on the DABStep benchmark with a 30x speedup over the Claude Code baseline.
  • Uses a dual-agent architecture: ReAct agent for open-ended EDA and a Tool Calling Agent for structured tabular Q&A.
  • Integrates a Vision-Language Model (VLM) to automatically generate textual descriptions and improvement suggestions for data plots.

Why It Matters

The approach automates complex, multi-step data science workflows, which can significantly accelerate data exploration and analysis for professionals.