Research & Papers

CrossAlpha benchmark enables cross-market factor research from annual reports

New benchmark uses PCA-whitened firm-pair scores from 10,700 annual reports across 5 markets

Deep Dive

CrossAlpha, a new benchmark from researchers at the National University of Singapore, targets cross-market factor research: using firm-level signals from one or more source markets to predict returns in a target market. Existing benchmarks do not support disclosure-to-return evaluation across markets, because filings differ in language and regulation and because similarity measures can be biased by shared reporting components. CrossAlpha addresses this with three components: Disclosure Distillation (standardizes heterogeneous filings into 10-category English business descriptions), Residual Schema Graph Construction (builds PCA-whitened cross-market firm-pair scores from schema-level disclosures), and Timing-Aligned Evaluation (pairs the graph with 11 years of daily OHLCV data under feasible cross-market execution protocols). The benchmark covers approximately 3,600 firms and 10,700 firm-year reports from the United States, Japan, Taiwan, South Korea, and Hong Kong, and releases about 19 million directed firm-pair scores.
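To make the idea of PCA-whitened firm-pair scores concrete, here is a minimal sketch, not the benchmark's actual pipeline. It assumes each firm's disclosure has already been turned into a numeric vector (the `emb` array and `market` labels are hypothetical inputs), whitens those vectors so that dominant shared directions no longer inflate similarity, and then scores source-market firms against target-market firms by cosine similarity.

```python
import numpy as np

def pca_whiten(X, eps=1e-8):
    """Center X (rows = firms) and rescale each principal axis to unit variance."""
    Xc = X - X.mean(axis=0, keepdims=True)
    cov = Xc.T @ Xc / (len(Xc) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)
    keep = eigvals > eps  # drop numerically degenerate directions
    return Xc @ eigvecs[:, keep] / np.sqrt(eigvals[keep])

def cross_market_scores(emb, market, source, target):
    """Cosine similarity between whitened vectors of source and target firms.

    Returns (source indices, target indices, score matrix), where
    scores[i, j] relates source firm src[i] to target firm tgt[j].
    """
    Z = pca_whiten(emb)
    Z = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    src = np.where(market == source)[0]
    tgt = np.where(market == target)[0]
    return src, tgt, Z[src] @ Z[tgt].T
```

Whitening here is fitted on all firms jointly; whether the benchmark fits it per market, per year, or per schema category is not stated in this summary, so treat that choice as an assumption.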

In experiments, disclosure-derived cross-market peers significantly outperformed domestic text, industry-code, and return-correlation peers in the US-to-Japan setting, achieving an information coefficient information ratio (ICIR) of 0.39 compared to 0.07–0.18 for the baselines. CrossAlpha is open-sourced and designed to be reusable, providing a return-grounded benchmark for cross-market financial NLP. It lets quantitative analysts and researchers systematically evaluate how well annual report disclosures from one market predict stock returns in another, opening new avenues for global factor investing and cross-border portfolio strategies.
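For readers unfamiliar with the metric, ICIR is conventionally the mean of the per-period information coefficient (IC) divided by its standard deviation. The sketch below uses a daily Spearman rank IC between a signal and next-session target-market returns. The array layout and function names are illustrative assumptions, and the benchmark's exact convention (rank versus Pearson IC, any annualization) is not specified in this summary.

```python
import numpy as np

def rank(x):
    """Ordinal ranks 0..n-1 (ties broken by position; fine for continuous data)."""
    order = np.argsort(x)
    r = np.empty(len(x), dtype=float)
    r[order] = np.arange(len(x))
    return r

def daily_ic(signal, fwd_ret):
    """Spearman rank correlation per day; rows = days, columns = firms."""
    return np.array([
        np.corrcoef(rank(s), rank(r))[0, 1]
        for s, r in zip(signal, fwd_ret)
    ])

def icir(signal, fwd_ret):
    """Mean daily IC divided by its sample standard deviation (not annualized)."""
    ic = daily_ic(signal, fwd_ret)
    return ic.mean() / ic.std(ddof=1)
```

In a cross-market setting, `signal[t]` would be built only from information available before the target market's next session, with `fwd_ret[t]` the return over that session. That alignment is what keeps the evaluation feasible to trade.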

Key Points
  • CrossAlpha covers about 3,600 firms and 10,700 firm-year annual reports from the US, Japan, Taiwan, South Korea, and Hong Kong
  • It releases about 19 million directed firm-pair scores built from PCA-whitened residual schema graphs
  • Cross-market disclosure-derived peers achieved an ICIR of 0.39 vs 0.07–0.18 for domestic text, industry-code, and return-correlation baselines in the US-to-Japan setting

Why It Matters

CrossAlpha enables systematic, return-grounded evaluation of cross-market factor strategies built on annual report disclosures across different languages and regulatory regimes.