CN-Buzz2Portfolio: A Chinese-Market Dataset and Benchmark for LLM-Based Macro and Sector Asset Allocation from Daily Trending Financial News

A new dataset tests whether LLMs such as GPT-4 can turn daily trending news into profitable ETF portfolios.

Deep Dive

A team of researchers from Tsinghua University and the Shanghai Artificial Intelligence Laboratory has introduced CN-Buzz2Portfolio, a dataset and benchmark designed to evaluate Large Language Models (LLMs) as autonomous financial decision-making agents. The benchmark addresses the lack of reproducible testing for AI in finance: live trading is risky and cannot be replayed, while existing datasets are limited to simple stock-picking. CN-Buzz2Portfolio instead focuses on macro and sector asset allocation. AI agents must analyze a realistic stream of daily trending financial news from the Chinese market and allocate capital across broad asset classes and sectors through Exchange-Traded Funds (ETFs), which reduces the noise introduced by individual stock volatility.

The dataset simulates a rolling investment horizon from 2024 to mid-2025, forcing models to distill actionable investment logic from high-exposure public narratives, not pre-filtered corporate news. The researchers propose a structured "Tri-Stage CPA Agent Workflow" (Compression, Perception, Allocation) to standardize evaluation. In extensive experiments, they tested nine leading LLMs, uncovering major differences in how models like GPT-4 and Claude interpret macroeconomic trends and translate them into portfolio weights. All data, code, and experimental results are publicly released to advance the field of sustainable and reproducible AI-driven finance research.
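To make the three stages concrete, here is a minimal Python sketch of how a Compression, Perception, Allocation pipeline could be wired together. It is not the authors' implementation: every function name, the keyword-to-ETF map, the ETF labels, and the cash fallback are hypothetical, and simple string rules stand in for the LLM calls that the benchmark would actually evaluate.

```python
from dataclasses import dataclass

# Illustrative sketch only. All names here are assumptions, not the
# benchmark's real interfaces; string rules stand in for LLM calls.


@dataclass
class Signal:
    asset: str    # hypothetical ETF or asset-class label
    score: float  # non-negative conviction score


def compress(headlines: list[str], max_items: int = 3) -> list[str]:
    """Compression: shrink the day's trending headlines to a short digest.

    Stand-in logic: drop blanks and duplicates, then truncate.
    """
    digest: list[str] = []
    for headline in headlines:
        text = headline.strip()
        if text and text not in digest:
            digest.append(text)
    return digest[:max_items]


def perceive(digest: list[str], keyword_map: dict[str, str]) -> list[Signal]:
    """Perception: turn the digest into per-asset signals.

    Stand-in logic: count keyword hits per mapped asset.
    """
    scores = {asset: 0.0 for asset in keyword_map.values()}
    for item in digest:
        lowered = item.lower()
        for keyword, asset in keyword_map.items():
            if keyword in lowered:
                scores[asset] += 1.0
    return [Signal(asset, score) for asset, score in scores.items()]


def allocate(signals: list[Signal], cash_label: str = "CASH") -> dict[str, float]:
    """Allocation: convert signals into portfolio weights that sum to 1.

    With no positive signal, the sketch holds everything in cash.
    """
    total = sum(s.score for s in signals)
    if total == 0:
        return {cash_label: 1.0}
    return {s.asset: s.score / total for s in signals if s.score > 0}


if __name__ == "__main__":
    headlines = [
        "Central bank cuts rates",
        "Central bank cuts rates",  # duplicate, removed during compression
        "Chip makers rally on export data",
        "Gold hits record high",
    ]
    keyword_map = {"rates": "BOND_ETF", "chip": "TECH_ETF", "gold": "GOLD_ETF"}
    weights = allocate(perceive(compress(headlines), keyword_map))
    print(weights)
```

Keeping each stage a separate function with a plain data hand-off is one way to swap in a different model at a single stage while holding the others fixed, which is the kind of controlled comparison a standardized workflow is meant to allow.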

Key Points
  • Benchmarks nine LLMs (including GPT-4 and Claude) on translating news into ETF portfolio allocations.
  • Uses a "Tri-Stage CPA Agent Workflow" (Compression, Perception, Allocation) to standardize how agents turn news into portfolio weights.
  • Provides a public dataset of Chinese trending financial news from 2024 to mid-2025 for reproducible testing.

Why It Matters

Provides a standardized, reproducible test for AI financial agents, moving beyond risky live trading or simplistic stock-picking benchmarks.