Research & Papers

ICML 2026 - Heavy score variance among various batches? [D]

Researchers report wildly different average scores across review batches, raising questions about the fairness of one of AI's top conferences.

Deep Dive

The peer review process for ICML 2026, one of artificial intelligence's most prestigious conferences, is facing criticism over significant inconsistencies in scoring across batches of submitted papers. Researchers on platforms like Reddit report that while some batches show average scores as high as 3.75, others struggle to reach a 3.5 average. Variance of this magnitude suggests potential systemic issues in how papers are distributed to reviewers and how scoring standards are applied across different subfields of AI research.
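The reported gap between batch averages (roughly 3.5 versus 3.75) is easier to judge against a baseline. Below is a minimal Monte Carlo sketch of how far batch means would spread under purely random assignment; every parameter (submission count, batch size, score spread) is an assumed placeholder, not an actual ICML figure.

```python
import numpy as np

# Hypothetical parameters -- placeholders, not ICML's real numbers.
N_PAPERS = 5000     # total submissions
BATCH_SIZE = 250    # papers per review batch
SCORE_SD = 0.75     # assumed std. dev. of a paper's average score
N_TRIALS = 2000     # Monte Carlo repetitions

rng = np.random.default_rng(0)
n_batches = N_PAPERS // BATCH_SIZE

spreads = np.empty(N_TRIALS)
for t in range(N_TRIALS):
    # Null hypothesis: every paper's score is i.i.d. noise around a
    # common mean, so any batch-to-batch gap is pure chance.
    scores = rng.normal(loc=3.6, scale=SCORE_SD, size=N_PAPERS)
    batch_means = scores.reshape(n_batches, BATCH_SIZE).mean(axis=1)
    spreads[t] = batch_means.max() - batch_means.min()

print(f"Typical max-min gap between batch means: {spreads.mean():.3f}")
print(f"95th percentile of the chance gap:       {np.percentile(spreads, 95):.3f}")
# If the observed gap (e.g., 3.75 - 3.5 = 0.25) exceeds these values,
# random assignment alone is unlikely to explain it.
```

The point of the exercise: chance gaps shrink as batches grow, so whether a 0.25-point spread signals a real batch effect depends on batch sizes and score distributions that the community cannot currently see.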

The core concern centers on whether ICML's review process adequately accounts for what researchers are calling 'batch effects'—where papers assigned to certain reviewer groups receive systematically different scores due to reviewer harshness, expertise mismatch, or domain-specific standards. This isn't just an academic debate; acceptance at top-tier conferences like ICML directly impacts researchers' careers, funding opportunities, and the dissemination of important AI breakthroughs. The community is questioning whether the current system creates an uneven playing field where a paper's fate depends as much on which batch it lands in as on its actual scientific merit.

Conference organizers have long faced the challenge of calibrating thousands of reviewers across diverse AI subfields, from theoretical machine learning to applied computer vision. Some variance is expected, but the reported magnitude of the batch-to-batch differences suggests that current calibration mechanisms may be insufficient. Researchers are calling for greater transparency in how ICML handles score normalization and whether post-review adjustments are made to ensure fairness across all submitted work.
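As a concrete illustration of what such calibration can look like, here is a minimal sketch of per-reviewer z-score normalization, one common technique discussed in the peer-review literature. The data layout and function are hypothetical, and nothing here implies this is ICML's actual procedure.

```python
from collections import defaultdict

# Hypothetical review tuples: (reviewer_id, paper_id, raw_score).
# This schema is an assumption for illustration only.
reviews = [
    ("r1", "pA", 3.0), ("r1", "pB", 2.5), ("r1", "pC", 3.5),  # harsh reviewer
    ("r2", "pA", 4.5), ("r2", "pB", 4.0), ("r2", "pC", 5.0),  # lenient reviewer
]

def normalize_scores(reviews):
    """Rescale each reviewer's scores to zero mean and unit variance,
    so a harsh reviewer's 3.5 and a lenient one's 5.0 become comparable."""
    by_reviewer = defaultdict(list)
    for reviewer, _, score in reviews:
        by_reviewer[reviewer].append(score)

    stats = {}
    for reviewer, scores in by_reviewer.items():
        mean = sum(scores) / len(scores)
        std = (sum((s - mean) ** 2 for s in scores) / len(scores)) ** 0.5
        stats[reviewer] = (mean, std if std > 0 else 1.0)  # guard constant scorers

    return [(paper, round((score - stats[rev][0]) / stats[rev][1], 2))
            for rev, paper, score in reviews]

print(normalize_scores(reviews))
# Both reviewers rank pC highest; after normalization their scores coincide
# exactly, even though the raw numbers differ by 1.5 points.
```

In this toy example the two reviewers agree perfectly once their personal scales are removed, which is exactly the kind of adjustment batch-level fairness would require if harsh and lenient reviewers cluster unevenly across batches.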

Key Points
  • Researchers report batch averages varying from below 3.5 to above 3.75 on ICML's scoring scale
  • Questions center on reviewer calibration, domain differences, and conference normalization procedures
  • Acceptance decisions at top AI conferences directly impact careers and research dissemination

Why It Matters

Fair peer review determines which AI research gets published and funded, and it ultimately shapes the field's direction.