How did Anthropic measure AI's "theoretical capabilities" in the job market?
A viral chart suggests LLMs could handle at least 80% of tasks across many jobs, but the underlying data is speculative.
Anthropic's recent report on AI's labor market impact featured a viral chart comparing 'observed exposure' to 'theoretical capability' of LLMs across 22 job categories. The striking blue 'theoretical' area suggests LLM-based systems could perform at least 80% of individual tasks in fields from Arts & Media to Management, Legal, and Finance. This graphic has fueled widespread discussion about AI's potential to automate vast swaths of the economy.
However, the 'theoretical capability' metric is based on a speculative August 2023 study co-authored by OpenAI and University of Pennsylvania researchers, not on Anthropic's own testing. The study asked human annotators who were familiar with AI, but not with the specific jobs, to judge whether the most powerful LLM available in 2023 (such as GPT-4) or future LLM-powered software could reduce the time a task takes by at least 50%. The researchers acknowledged fundamental limitations, including the subjectivity of the labeling and an 'unclear logic for aggregating tasks.' The metric is therefore an informed guess about where AI could boost productivity, not a prediction of full job takeover.
- Chart's 'theoretical' figures come from a 2023 study, not from current empirical testing of models such as Claude.
- Metric asks whether an LLM or future 'LLM-powered software' could cut task time by at least 50%, not whether it could replace workers.
- Human annotators were familiar with AI, not with the jobs they rated; the researchers listed this subjective labeling among the study's fundamental limitations.
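To make the aggregation problem concrete, here is a minimal Python sketch of how per-task judgments of this kind might be rolled up into a single percentage for a job category. It is not the study's or Anthropic's actual method: the `TaskLabel` type, the `exposure_share` function, the `weight` field, and the sample rows are all hypothetical and invented for illustration. It shows only that the same labels can yield different headline figures depending on how tasks are weighted, which is one way an 'unclear logic for aggregating tasks' can matter.

```python
from dataclasses import dataclass


@dataclass
class TaskLabel:
    """One annotator judgment about one task (hypothetical schema)."""
    category: str        # job category, e.g. "Legal"
    task: str            # short task description
    time_cut_50: bool    # judged: could an LLM or LLM-powered software cut time by at least 50%?
    weight: float = 1.0  # assumed importance of the task within the job (made up here)


def exposure_share(labels, category, weighted=False):
    """Share of a category's tasks judged exposed, optionally weighted by importance."""
    rows = [label for label in labels if label.category == category]
    if not rows:
        raise ValueError(f"no labels for category {category!r}")
    total = sum(label.weight if weighted else 1.0 for label in rows)
    exposed = sum(
        label.weight if weighted else 1.0 for label in rows if label.time_cut_50
    )
    return exposed / total


# Invented example data: three minor tasks judged exposed, one major task judged not.
labels = [
    TaskLabel("Legal", "summarize filings", True, weight=1.0),
    TaskLabel("Legal", "draft routine letters", True, weight=1.0),
    TaskLabel("Legal", "search case law", True, weight=1.0),
    TaskLabel("Legal", "argue in court", False, weight=3.0),
]

print(exposure_share(labels, "Legal"))                 # counts every task equally
print(exposure_share(labels, "Legal", weighted=True))  # counts tasks by assumed importance
```

With these invented rows, counting tasks equally gives 75% exposure, while weighting by the assumed importance gives 50%, even though no individual judgment changed.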
Why It Matters
The chart's origins highlight the gap between AI hype and measured impact, and they are a reminder for professionals to scrutinize sensational claims.