Research & Papers

[P] ibu-boost: a GBDT library where splits are *absolutely* rejected, not just relatively ranked

New gradient-boosting library replaces relative ranking with absolute thresholds, eliminating min_gain_to_split tuning.

Deep Dive

ibu-boost is an experimental gradient-boosted decision tree (GBDT) library that changes how splits are selected during tree construction. Instead of always choosing the candidate with the highest gain, which can produce useless splits on noisy data, it implements a "screening transform" based on the "Screening Is Enough" paper. Each split candidate is tested against an absolute threshold; when none provides a meaningful improvement, all are rejected and the node automatically becomes a leaf, without the traditional min_gain_to_split hyperparameter that needs per-dataset tuning.
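
The mechanics fit in a few lines. The sketch below is illustrative only, not ibu-boost's actual API or the paper's exact statistic: each candidate's gain is tested against a hypothetical absolute bar (here a Bonferroni-style estimate of the largest gain pure noise would produce across the candidate set), and the node becomes a leaf when nothing clears it.

```python
import numpy as np

def select_split(gains, noise_level, n_candidates):
    """Absolute-threshold screening (illustrative sketch, not ibu-boost's API).

    A conventional GBDT keeps the argmax of `gains` no matter how small it
    is; here every candidate must clear an absolute bar. `noise_level` and
    the Bonferroni-style threshold are hypothetical stand-ins for the
    paper's screening statistic.
    """
    threshold = noise_level * np.sqrt(2.0 * np.log(max(n_candidates, 2)))
    best = int(np.argmax(gains))
    if gains[best] <= threshold:
        return None  # all candidates rejected -> leaf, no min_gain_to_split needed
    return best

gains = np.array([0.02, 0.05, 0.01])
print(select_split(gains, noise_level=0.1, n_candidates=gains.size))  # None: make a leaf
```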

The library currently supports MSE regression and binary log-loss, with both standard per-node splits and CatBoost-style oblivious splits. Its Triton GPU kernels deliver a 3.15x end-to-end speedup on an RTX 4060 Ti over CPU execution, and 51x at the kernel level against NumPy reference implementations. Benchmarks on the California Housing dataset show a 12% RMSE gap to LightGBM, but the developer hypothesizes that ibu-boost will shine on high-dimensional or noisy data, where traditional GBDTs tend to overfit through spurious splits.
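
For context on the oblivious variant: in CatBoost-style trees, a single (feature, threshold) pair is shared by every node at a given depth, so the candidate's gain is summed across the whole level. A minimal sketch of that selection step, using an assumed XGBoost-style second-order gain rather than ibu-boost's code:

```python
import numpy as np

def node_gain(g, h, right, lam=1.0):
    """Second-order split gain for one node's samples (XGBoost-style)."""
    score = lambda gs, hs: (gs.sum() ** 2) / (hs.sum() + lam)
    return score(g[right], h[right]) + score(g[~right], h[~right]) - score(g, h)

def best_oblivious_split(X, grad, hess, node_id, candidates, lam=1.0):
    """Pick one (feature, threshold) shared by every node at this depth.

    `node_id[i]` is the current leaf of sample i; the candidate's gain is
    summed over all those nodes, which is what makes the tree oblivious:
    depth d costs only d splits and yields 2**d leaves.
    """
    best = None
    for f, thr in candidates:
        right = X[:, f] > thr
        gain = 0.0
        for nid in np.unique(node_id):
            m = node_id == nid
            gain += node_gain(grad[m], hess[m], right[m], lam)
        if best is None or gain > best[0]:
            best = (gain, f, thr)
    return best  # (gain, feature, threshold), to be screened as above
```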

ibu-boost represents an early alpha implementation with plans to make threshold parameters learnable in future releases. The library includes built-in diagnostics like accept_rate monitoring and parameter search tools, positioning it as both a practical tool and research platform for exploring alternative split selection methodologies in gradient boosting.
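
The summary does not spell out how accept_rate is defined; a natural reading, sketched below with assumed names, is the fraction of node decisions where at least one candidate survived screening.

```python
def accept_rate(decisions):
    """Fraction of nodes where screening accepted a split (assumed definition).

    `decisions` holds one entry per node visited during growth: a split
    index when a candidate cleared the absolute threshold, or None when
    the node was turned into a leaf. A rate near 0 means the threshold is
    rejecting almost everything; near 1.0 means screening rarely binds.
    """
    accepted = sum(d is not None for d in decisions)
    return accepted / max(len(decisions), 1)

print(accept_rate([0, None, 3, None, 1]))  # 0.6
```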

Key Points
  • Replaces relative split ranking with absolute-threshold rejection, using the screening transform from the "Screening Is Enough" paper
  • Eliminates min_gain_to_split hyperparameter tuning—nodes automatically become leaves when no candidate meets threshold
  • Delivers 3.15x GPU speedup on RTX 4060 Ti with Triton kernels and 51x faster kernel operations versus NumPy

Why It Matters

Could reduce overfitting in GBDTs on noisy data while eliminating a major hyperparameter tuning burden for data scientists.