PapersWithCode revival adds multi-metric leaderboards and paper lineage

r/MachineLearning May 24, 2026

⚡Track SOTA across AI domains with leaderboards now supporting WER, FPS, and more.

Deep Dive

One week after reviving paperswithcode.co, Hugging Face's Niels Rogge has rolled out a slate of updates that make the SOTA tracking platform more flexible. The biggest addition is support for multiple metrics per benchmark: the Open ASR Leaderboard now shows both Word Error Rate (WER) and Inverse Real-Time Factor (RTFx), while the Object Detection leaderboard reports frames per second (FPS) alongside mean average precision (mAP). Researchers can therefore compare models on accuracy and speed together rather than on a single score. The platform also now accepts paper submissions from sources beyond arXiv, including GitHub repos, blog posts, and bioRxiv. When a paper is submitted, AI automatically enriches it with task tags, method tags, and links to evaluations.
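For readers unfamiliar with the two ASR metrics, here is a minimal sketch of how they are conventionally defined: WER as word-level edit distance divided by the number of reference words (lower is better), and RTFx as audio duration divided by processing time (higher is better). The function names are illustrative and are not taken from the platform or from the Open ASR Leaderboard code, which may normalize text differently before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1        # reference word missing from hypothesis
            insertion = curr[j - 1] + 1   # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def inverse_real_time_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Seconds of audio transcribed per second of compute (RTFx)."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


if __name__ == "__main__":
    wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")
    rtfx = inverse_real_time_factor(audio_seconds=60.0, processing_seconds=0.5)
    print(f"WER = {wer:.3f}, RTFx = {rtfx:.1f}")
```

A model can rank well on one metric and poorly on the other, which is why showing both on one leaderboard is useful.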

Another key feature is paper lineage: a banner above the abstract shows predecessor or follow-up papers, visible for entries like Mamba-3, DINOv2, and GLM-4.5. New popular methods such as Gated DeltaNet, Kimi Delta Attention, and Mamba-2 have been added, each with a list of citing papers. For social sharing, each benchmark includes a "copy image" button for scatter plots and tables. Finally, over 3,000 evaluations have been loaded, starting with all models supported in the Transformers library. They appear at the bottom of each paper page (e.g., Qwen 3.6). Rogge plans to continue adding features and has opened a Discord channel for feedback.

Key Points

Leaderboards now support multiple metrics per benchmark, e.g., WER and RTFx for ASR, FPS and mAP for object detection.
Papers can be submitted from non-arXiv sources like GitHub, blog posts, and bioRxiv, with AI auto-enrichment for tasks and methods.
Over 3,000 evaluations have been added, covering all models supported in Hugging Face's Transformers library.

Why It Matters

Multi-metric leaderboards make it easier to compare AI models on both accuracy and speed across diverse tasks, which helps with research and deployment decisions.
