Developer Tools

SQL-Commenter: Aligning Large Language Models for SQL Comment Generation with Direct Preference Optimization

Researchers' new method trains LLaMA-3.1-8B to generate expert-level SQL comments using human feedback.

Deep Dive

A research team led by Lei Yu has introduced SQL-Commenter, a novel method for training large language models to generate high-quality, natural language comments for complex SQL queries. The system is built on Meta's LLaMA-3.1-8B model and addresses two core challenges in automated SQL documentation: the lack of datasets representing real-world, complex queries, and LLMs' insufficient grasp of SQL-specific semantics. The researchers first constructed a comprehensive dataset of complex SQL queries paired with expert-verified comments, then performed continual pre-training on a large SQL corpus to enhance the model's syntax understanding.

After supervised fine-tuning, the team applied Direct Preference Optimization (DPO), a technique that uses human preference data to align the model's outputs with high-quality reference comments. The preference-based loss teaches the model fine-grained semantic distinctions between merely adequate comments and expert-level ones. In evaluations on the Spider and Bird benchmarks, SQL-Commenter significantly outperformed state-of-the-art baselines, including the larger Qwen3-14B model. It achieved average improvements of 9.29, 4.99, and 13.23 percentage points on the BLEU-4, METEOR, and ROUGE-L metrics, respectively. Human evaluators also rated its comments as superior in correctness, completeness, and naturalness.
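To make the alignment step concrete, here is a minimal sketch of the standard DPO loss for a single preference pair. This is illustrative only, not the team's actual training code: the function name, the log-probability inputs, and the `beta` value are assumptions, and a real implementation would operate on batched tensors from the policy and a frozen reference model.

```python
import math

def dpo_loss(pi_logp_chosen, pi_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair (illustrative sketch).

    Inputs are summed log-probabilities of the human-preferred (chosen)
    and dispreferred (rejected) comment under the policy being trained
    (pi_*) and a frozen reference model (ref_*). beta controls how far
    the policy may drift from the reference.
    """
    # Log-ratio of policy vs. reference for each completion
    chosen_logratio = pi_logp_chosen - ref_logp_chosen
    rejected_logratio = pi_logp_rejected - ref_logp_rejected
    # Logistic loss on the margin between the two log-ratios:
    # -log sigmoid(beta * margin)
    margin = beta * (chosen_logratio - rejected_logratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Policy already favors the preferred comment -> small loss
low = dpo_loss(-10.0, -14.0, -12.0, -12.0)
# Policy favors the rejected comment -> larger loss, stronger update
high = dpo_loss(-14.0, -10.0, -12.0, -12.0)
```

Minimizing this loss pushes the policy to assign relatively more probability to the comments human annotators preferred, without needing an explicit reward model.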

Key Points
  • Built on LLaMA-3.1-8B and trained with a new dataset of expert-verified SQL comments
  • Uses Direct Preference Optimization (DPO) with human feedback to align model outputs
  • Outperforms baselines including Qwen3-14B, with gains of up to 13.23 points on ROUGE-L and wins in human evaluations

Why It Matters

Automates a critical but tedious task for data engineers, improving code readability, maintainability, and team onboarding.