Research & Papers

Adaptive Table Retrieval Beats Fixed Top-k for Text-to-SQL

New method dynamically selects tables per query, outperforming rigid top-k retrieval.

Deep Dive

Retrieving the right tables for a natural language query is a critical bottleneck in text-to-SQL systems. Current approaches retrieve a fixed number k of tables with the highest similarity scores, but the optimal number varies widely per query: some queries need just one table, while others require many. A fixed top-k strategy therefore tends to underserve complex queries (missing essential tables) or overserve simple ones (pulling in irrelevant tables as noise), and either failure degrades downstream SQL accuracy. To address this, researchers from KAIST (Taehee Kim, Seungbin Yang, Jihwan Kim, Jaegul Choo) developed an adaptive table retrieval method that decides per query how many tables to retrieve.

The core innovation combines two components: an adaptive thresholding mechanism, which selects the tables whose similarity scores exceed a query-dependent cutoff, and a sliding-window reranking algorithm, which efficiently processes large table corpora without brute-force scanning. Together they eliminate the need to predefine k. Evaluated on three major benchmarks (Spider, BIRD, and Spider 2.0), the method consistently outperforms top-k baselines in retrieval precision and recall as well as downstream text-to-SQL execution accuracy. The work, accepted at ACL 2026 Findings, shows that query-aware adaptive retrieval is both practical and performant, and it is positioned as a drop-in improvement for text-to-SQL pipelines that retrieve tables before generating SQL.
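To make the contrast with fixed top-k concrete, here is a minimal Python sketch of threshold-based selection. It is an illustration under stated assumptions, not the authors' implementation: the cutoff rule (a fraction of the query's top similarity score), the `ratio` and `min_tables` parameters, the function names, and the example scores are all hypothetical, since the summary does not specify how the query-dependent cutoff is computed.

```python
from typing import Dict, List


def fixed_top_k(scores: Dict[str, float], k: int) -> List[str]:
    """Baseline: always return the k highest-scoring tables."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:k]]


def adaptive_select(
    scores: Dict[str, float], ratio: float = 0.8, min_tables: int = 1
) -> List[str]:
    """Keep every table whose score reaches a query-dependent cutoff.

    ASSUMPTION: the cutoff is `ratio` times this query's best score.
    This rule is a stand-in chosen for illustration; the paper's actual
    thresholding rule is not described in the summary above.
    """
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    if not ranked:
        return []
    cutoff = ranked[0][1] * ratio
    kept = [name for name, score in ranked if score >= cutoff]
    # Fall back to the top-ranked tables if the cutoff rejects everything
    # (possible when the best score is negative).
    if len(kept) < min_tables:
        kept = [name for name, _ in ranked[:min_tables]]
    return kept


# Hypothetical similarity scores for two queries.
simple_query = {"singer": 0.91, "concert": 0.42, "stadium": 0.38, "album": 0.30}
complex_query = {"orders": 0.84, "customers": 0.81, "products": 0.79, "payments": 0.35}

print(fixed_top_k(simple_query, k=2))   # includes one low-scoring table
print(adaptive_select(simple_query))    # keeps only the clear match
print(fixed_top_k(complex_query, k=2))  # drops a table scored close to the top
print(adaptive_select(complex_query))   # keeps all three close scores
```

With these made-up scores, k=2 is too many for the first query and too few for the second, while the threshold adapts because it is computed from each query's own score distribution.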

Key Points
  • Replaces fixed top-k retrieval with an adaptive threshold and sliding-window reranking
  • Outperforms fixed-k baselines on Spider, BIRD, and Spider 2.0 benchmarks
  • Accepted at ACL 2026 Findings; code and data are publicly available

Why It Matters

Enables more robust text-to-SQL systems by matching the number of retrieved tables to each query, reducing both missing evidence and noise.