GPU-Accelerated Genetic Programming for Symbolic Regression with Beagle Framework
New GPU framework processes genetic programming tasks 100x faster than CPU-based competitors like PySR.
A research team including Nathan Haut, Ilya Basin, and Wolfgang Banzhaf has introduced the Beagle framework, a novel software system designed to execute Genetic Programming (GP) tasks directly on GPUs. Currently focused on symbolic regression—the process of discovering mathematical expressions that best fit a dataset—Beagle processes entire populations of candidate solutions and training cases in parallel to maximize throughput on modern GPU hardware. This architectural shift represents a significant departure from traditional CPU-based evolutionary computation tools.
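The core idea of evaluating an entire population over all training cases at once can be pictured as producing one large prediction matrix, the shape of work a GPU kernel churns through in parallel. Below is a minimal NumPy sketch of that layout; it is an illustration only, not Beagle's actual API, and the candidate functions are made-up examples.

```python
import numpy as np

def evaluate_population(candidates, X):
    """Evaluate every candidate expression over every training case.

    candidates: list of vectorized callables (hypothetical stand-ins for
                GP expression trees); X: (n_cases,) array of input values.
    Returns an (n_pop, n_cases) matrix of predictions -- on a GPU, each
    entry would be computed by an independent parallel thread.
    """
    return np.stack([f(X) for f in candidates])

X = np.linspace(0.0, 1.0, 5)
population = [
    lambda x: x * x,          # candidate 1: x^2
    lambda x: np.sin(x),      # candidate 2: sin(x)
    lambda x: 2.0 * x + 1.0,  # candidate 3: 2x + 1
]
preds = evaluate_population(population, X)
print(preds.shape)  # one row per candidate, one column per training case
```

The key design point is that fitness evaluation dominates GP runtime, so flattening the population-by-cases loop nest into one batched computation is where GPU hardware pays off.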
In their benchmarking study, the team compared Beagle's performance against two established CPU-based systems: StackGP and the popular PySR framework. Using the standardized Feynman Symbolic Regression dataset under identical time constraints, Beagle demonstrated substantially superior performance. The framework supports multiple fitness functions, including point-to-point error and correlation-based evaluation, providing flexibility across different regression tasks. This GPU acceleration reduces complex symbolic regression experiments that previously took hours or days to minutes, dramatically expanding the practical scope of interpretable machine learning research.
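The two fitness styles mentioned above can be sketched concretely. The article does not give Beagle's exact formulas, so the sketch below assumes root-mean-square error for "point-to-point error" and Pearson correlation for "correlation-based evaluation", both computed for a whole population of predictions at once.

```python
import numpy as np

def rmse_fitness(preds, y):
    """Point-to-point error (assumed RMSE): preds is (n_pop, n_cases),
    y is (n_cases,). Lower values are better."""
    return np.sqrt(np.mean((preds - y) ** 2, axis=1))

def correlation_fitness(preds, y):
    """Pearson correlation of each candidate's predictions with the
    targets. Invariant to linear rescaling, so a candidate with the right
    shape scores well even before its constants are tuned."""
    pc = preds - preds.mean(axis=1, keepdims=True)
    yc = y - y.mean()
    return (pc @ yc) / (np.linalg.norm(pc, axis=1) * np.linalg.norm(yc))

y = np.array([0.0, 1.0, 2.0, 3.0])
preds = np.array([[0.0, 1.0, 2.0, 3.0],    # exact match
                  [0.1, 1.1, 2.1, 3.1]])   # right shape, offset by 0.1
print(rmse_fitness(preds, y))         # first candidate: 0; second: ~0.1
print(correlation_fitness(preds, y))  # both correlate perfectly (1.0)
```

The contrast in the example shows why supporting both matters: the offset candidate is penalized by pointwise error but scores a perfect correlation, since correlation rewards discovering the correct functional form independent of scale and offset.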
While currently specialized for symbolic regression, the Beagle framework's architecture suggests potential for broader applications in evolutionary computation. The performance gains highlight how specialized hardware acceleration can revitalize classical AI techniques, making them competitive with modern neural networks for certain interpretability-critical applications. This development is particularly relevant for scientific domains where discovering compact, human-readable equations from experimental data provides more insight than black-box predictions.
- Beagle executes Genetic Programming for symbolic regression directly on GPU hardware, maximizing parallel throughput
- Benchmarks show it significantly outperforms CPU-based systems StackGP and PySR on the Feynman dataset under equal time budgets
- The framework supports multiple fitness functions including point-to-point error and correlation-based evaluation
Why It Matters
Enables discovery of interpretable mathematical models from data at speeds previously impossible, accelerating scientific research.