HackerRank's open-source ATS gives wildly inconsistent resume scores (66-99)

Hacker News June 29, 2026

⚡Same resume, six different scores from 66 to 99 in 100 runs.

Deep Dive

HackerRank open-sourced its ATS, 'hiring-agent', on GitHub. In one experiment, the same resume scored between 66 and 99 across 100 runs; with a pass cutoff of 85, it failed 65% of the time. Technical skills scores were highly consistent (8/10 in 98 of the 100 runs), but project scores varied wildly because of LLM non-determinism, even at temperature 0.1 or 0. Experience scores were perfectly consistent (25/25 in every run) but uninformative, because the rubric does not differentiate between levels of experience. The tool weights open source (35 pts) and projects (30 pts) above experience (25 pts), and the author warns that it acts as a luck filter, not a quality filter.
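The failure-rate arithmetic in that experiment can be sketched in a few lines of Python. This is only an illustration, not the author's code: the function name `summarize_runs` is hypothetical, the example scores are made up rather than taken from the experiment, and it assumes a resume fails when its score is strictly below the cutoff.

```python
def summarize_runs(scores, cutoff):
    """Summarize repeated scores of one unchanged resume against a pass cutoff."""
    if not scores:
        raise ValueError("need at least one score")
    # Assumption: a run fails when the score is strictly below the cutoff.
    failures = sum(1 for s in scores if s < cutoff)
    return {
        "runs": len(scores),
        "min": min(scores),
        "max": max(scores),
        "distinct": len(set(scores)),
        "failure_rate": failures / len(scores),
    }


# Hypothetical scores for illustration only; NOT the experiment's data.
example_scores = [66, 72, 85, 91, 99, 72, 85, 66, 91, 72]
summary = summarize_runs(example_scores, cutoff=85)
print(summary)
```

Fed the 100 real scores with a cutoff of 85, the same calculation would give the 65% failure rate reported above. The point is that the resume never changes between runs, so every failure comes from scoring noise alone.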

Key Points

The same resume scored 66–99 across 100 runs at temperature 0.1; with a cutoff of 85, the unchanged resume fails 65% of the time.
Project scoring varied wildly because the LLM judges 'architectural complexity' unreliably; experience scoring maxed out even for a single internship.
Open source contributions (35 pts) and projects (30 pts) outweigh experience (25 pts), potentially undervaluing seasoned engineers.
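
One way to see which rubric category drives the instability is to compare the per-category spread across runs. The sketch below is a rough illustration under stated assumptions: the 10-point cap for technical skills is inferred from the "8/10" figure, the category keys and the `category_spread` helper are hypothetical names, and the per-run breakdowns are invented for the example.

```python
from statistics import pstdev

# Maximum points per category as described in this summary.
# Assumption: the 10-point cap for technical skills is inferred from "8/10".
MAX_POINTS = {
    "open_source": 35,
    "projects": 30,
    "experience": 25,
    "technical_skills": 10,
}


def category_spread(runs):
    """Return (min, max, population std dev) per category across runs."""
    spread = {}
    for category in MAX_POINTS:
        values = [run[category] for run in runs]
        spread[category] = (min(values), max(values), pstdev(values))
    return spread


# Hypothetical per-run breakdowns for illustration only; NOT real data.
runs = [
    {"open_source": 20, "projects": 12, "experience": 25, "technical_skills": 8},
    {"open_source": 28, "projects": 27, "experience": 25, "technical_skills": 8},
    {"open_source": 24, "projects": 18, "experience": 25, "technical_skills": 8},
]

for category, (lo, hi, sd) in category_spread(runs).items():
    print(f"{category:17s} min={lo:2d} max={hi:2d} stdev={sd:.2f}")
```

A category with zero spread, such as experience pinned at 25/25, is stable but cannot separate candidates. A category with a large spread on an unchanged resume is measuring noise rather than the resume.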

Why It Matters

Automated hiring tools using LLMs risk replacing bias with randomness instead of fairness.
