A Benchmark of Classical and Deep Learning Models for Agricultural Commodity Price Forecasting on A Novel Bangladeshi Market Price Dataset

Study finds Prophet fails catastrophically while simple models beat complex AI for commodity forecasting.

Deep Dive

A research team from Bangladesh has published a benchmark comparing classical and deep learning models for agricultural commodity price forecasting. Their work introduces AgriPriceBD, a new dataset containing 1,779 daily retail mid-prices for five key Bangladeshi commodities (garlic, chickpea, green chilli, cucumber, and sweet pumpkin), spanning July 2020 to June 2025. The dataset was extracted from government reports using an LLM-assisted digitization pipeline, addressing a scarcity of machine-learning-ready data for South Asian agricultural markets.

The researchers evaluated seven forecasting approaches: three classical models (naïve persistence, SARIMA, and Prophet) and four deep learning architectures (BiLSTM, Transformer, Time2Vec-enhanced Transformer, and Informer). Their findings, validated with Diebold-Mariano statistical significance tests, show that simple models often outperformed complex ones. Naïve persistence dominated for near-random-walk commodities, while Facebook's Prophet failed systematically because its assumptions are incompatible with the observed price dynamics. More alarmingly, the Informer model produced erratic predictions with variance up to 50 times that of the ground truth, suggesting that sparse-attention Transformers require substantially larger training sets than small agricultural datasets can provide.
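To make the comparison concrete, here is a minimal Python sketch of a naïve persistence baseline, a basic Diebold-Mariano statistic, and a prediction-to-truth variance ratio. It is an illustration under stated assumptions, not the paper's code: the prices are a synthetic random walk, the flat comparison forecast is hypothetical, and the test uses absolute-error loss, one-step horizons, and a normal approximation with no autocorrelation or small-sample correction. The summary does not say which variant the authors used.

```python
import math
import random
import statistics


def naive_persistence(series):
    """One-step-ahead forecast: the next price equals the current price."""
    return series[:-1]  # forecasts for series[1:], aligned index by index


def mean_absolute_error(actual, predicted):
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def diebold_mariano(actual, pred_a, pred_b):
    """Simplified DM statistic under absolute-error loss (assumed variant).

    A negative statistic means forecast A has lower average loss than B.
    Returns (statistic, two-sided p-value from the normal approximation).
    """
    d = [abs(a - pa) - abs(a - pb) for a, pa, pb in zip(actual, pred_a, pred_b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((x - mean_d) ** 2 for x in d) / n  # lag-0 term only (horizon 1)
    stat = mean_d / math.sqrt(var_d / n)
    p_value = math.erfc(abs(stat) / math.sqrt(2))  # equals 2 * (1 - Phi(|stat|))
    return stat, p_value


def variance_ratio(predicted, actual):
    """Variance of the predictions divided by variance of the ground truth."""
    return statistics.pvariance(predicted) / statistics.pvariance(actual)


# Synthetic random-walk prices (NOT data from AgriPriceBD).
random.seed(0)
prices = [100.0]
for _ in range(499):
    prices.append(prices[-1] + random.gauss(0, 1))

actual = prices[1:]
persistence = naive_persistence(prices)
flat = [prices[0]] * len(actual)  # hypothetical competitor: constant forecast

mae_persistence = mean_absolute_error(actual, persistence)
mae_flat = mean_absolute_error(actual, flat)
stat, p_value = diebold_mariano(actual, persistence, flat)

print(f"MAE persistence={mae_persistence:.3f}, MAE flat={mae_flat:.3f}")
print(f"DM statistic={stat:.2f}, p-value={p_value:.4g}")
print(f"variance ratio (persistence vs truth)={variance_ratio(persistence, actual):.3f}")
```

On a series that behaves like a random walk, the persistence forecast is hard to beat, which is the pattern the study reports for its near-random-walk commodities. A variance ratio far above 1 would flag the kind of erratic predictions attributed to Informer.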

One particularly striking finding was the poor performance of Time2Vec temporal encoding on green chilli prices: it increased mean absolute error (MAE) by 146.1%, a statistically significant degradation (p<0.001). This shows that advanced temporal embeddings don't necessarily translate to better performance on real-world, noisy economic data. The study concludes that commodity price forecastability is fundamentally heterogeneous, and that researchers must match model complexity to dataset characteristics rather than defaulting to state-of-the-art deep learning approaches.
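For readers unfamiliar with Time2Vec, the sketch below shows the standard form of the embedding (one linear component plus several sine components) and the arithmetic behind a relative MAE change. Several things here are assumptions for illustration only: the frequencies and phases are drawn at random rather than learned, the embedding size is arbitrary, and the two MAE values are hypothetical numbers picked to reproduce a 146.1% increase. The summary also does not state the baseline for that figure; it is presumably the same Transformer without Time2Vec.

```python
import math
import random


def time2vec(tau, omega, phi):
    """Time2Vec embedding of a scalar time value tau.

    Component 0 is linear: omega[0] * tau + phi[0].
    Components 1..k are periodic: sin(omega[i] * tau + phi[i]).
    """
    linear = omega[0] * tau + phi[0]
    periodic = [math.sin(w * tau + p) for w, p in zip(omega[1:], phi[1:])]
    return [linear] + periodic


def relative_mae_change(mae_baseline, mae_variant):
    """Percentage change in MAE of a variant relative to a baseline."""
    return 100.0 * (mae_variant - mae_baseline) / mae_baseline


# In a real model omega and phi are learned; random values stand in here.
random.seed(0)
k = 4  # number of periodic components (arbitrary choice)
omega = [random.uniform(-1, 1) for _ in range(k + 1)]
phi = [random.uniform(-1, 1) for _ in range(k + 1)]

embedding = time2vec(10.0, omega, phi)  # embed day index 10
print("embedding length:", len(embedding))

# Hypothetical MAE values, not taken from the paper.
change = relative_mae_change(mae_baseline=10.0, mae_variant=24.61)
print(f"relative MAE change: {change:.1f}%")
```

An increase of that size means the error with the encoding is roughly two and a half times the error without it, which is why the result reads as a regression rather than noise.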

Key Points
  • Introduced AgriPriceBD: 1,779 daily retail mid-prices for 5 Bangladeshi commodities (garlic, chickpea, green chilli, cucumber, sweet pumpkin) from July 2020 to June 2025, created via LLM-assisted digitization of government reports.
  • Found Prophet fails systematically and Informer produces erratic predictions (variance up to 50x the ground truth), while simple naïve persistence often beats complex AI models.
  • Released all code, models, and data publicly to support agricultural forecasting research in developing economies.

Why It Matters

The study provides evidence that, when data are limited, simple models often outperform complex AI for real-world economic forecasting.