LLMs Compress 'Slightly' and 'Drastically' into Just 5 Numeric Levels
Claude Haiku maps 10 intensity words to only 5 distinct numeric outputs in resource allocation tasks.
A new study from Georgia Tech reveals that language models struggle to translate nuanced intensity words into distinct numeric actions. Daniel Tabach designed a controlled experiment where Claude Haiku received natural-language instructions containing one of 10 English degree modifiers (from slightly to drastically) and had to allocate a numeric resource. Across 6,620 runs at two temperatures, the model compressed the 10 words into just 5 distinct median outputs, with lower-tier words like slightly, a bit, and somewhat all mapping to the same value. Spearman correlation between word rank and numeric output was strong (ρ=0.845), so the overall ordering of the words was largely preserved, but the real story lies in the model's dependence on context.
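To make the two measurements concrete, here is a minimal sketch of how one might count distinct median outputs per word and compute a Spearman rank correlation. The data in `toy_runs` is invented purely for illustration and is not from the study; the helper names (`average_ranks`, `pearson`, `spearman`) are hypothetical, and the study's own analysis code may differ.

```python
from statistics import median


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the tie-averaged ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy data (NOT from the study): word intensity rank -> numeric outputs.
toy_runs = {
    1: [5, 5, 6],
    2: [5, 5, 5],
    3: [5, 6, 5],
    4: [10, 10, 12],
    5: [10, 9, 10],
    6: [20, 20, 25],
}

# Compression: how many distinct median outputs do the words map to?
medians = {rank: median(outs) for rank, outs in toy_runs.items()}
distinct_levels = len(set(medians.values()))

# Ordering: does a higher word rank still tend to mean a larger output?
ranks = [r for r, outs in toy_runs.items() for _ in outs]
outputs = [o for outs in toy_runs.values() for o in outs]
rho = spearman(ranks, outputs)

print(f"{len(toy_runs)} words -> {distinct_levels} distinct medians, rho={rho:.3f}")
```

In this toy set, six words collapse to three medians while the rank correlation stays high, which shows how a strong ρ and heavy compression can coexist.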
When the current system state was provided, the model's interpretation became almost entirely state-driven: grouping by starting allocation captured 78.2% of variance, while the word itself accounted for only 7.9%. Near operational boundaries, distinct behavioral modes emerged: weak words produced small hedging adjustments, strong words like drastically pushed to the local maximum, and very strong words like enormously caused the model to abstain entirely. These patterns persisted across temperature settings, confirming that the compression and state-dependence are structural, not just sampling noise. The findings have direct implications for any application requiring LLMs to execute numeric instructions with precision.
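One common way to express "variance captured by a grouping" is eta-squared, the between-group sum of squares divided by the total sum of squares. The sketch below assumes that definition; the brief does not say which statistic the study used. The records are invented for illustration, and `eta_squared` and `group_by` are hypothetical helper names.

```python
from collections import defaultdict


def eta_squared(groups):
    """Share of total variance captured by group means (SS_between / SS_total)."""
    all_values = [v for vals in groups.values() for v in vals]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_between = sum(
        len(vals) * (sum(vals) / len(vals) - grand_mean) ** 2
        for vals in groups.values()
    )
    return ss_between / ss_total


def group_by(records, key_index):
    """Collect the adjustment (last field) of each record under the chosen key."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[-1])
    return groups


# Toy records (NOT from the study): (starting_state, word, adjustment).
records = [
    ("low", "slightly", 4),
    ("low", "drastically", 6),
    ("mid", "slightly", 10),
    ("mid", "drastically", 12),
    ("high", "slightly", 1),
    ("high", "drastically", 1),
]

eta_state = eta_squared(group_by(records, 0))  # grouped by starting state
eta_word = eta_squared(group_by(records, 1))   # grouped by intensity word

print(f"state: {eta_state:.1%} of variance, word: {eta_word:.1%}")
```

With these made-up numbers, grouping by starting state captures far more variance than grouping by word, the same qualitative pattern the study reports.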
- 10 intensity words collapsed into 5 median outputs; lower-tier words (slightly, somewhat) were indistinguishable in numeric actions.
- State context explained 78.2% of variance vs. just 7.9% for word choice; near capacity, word distinctions vanished.
- At feasibility limits, three distinct behavioral modes appear: weak words hedge, strong words like 'drastically' hit the ceiling, and very strong words like 'enormously' trigger abstention.
Why It Matters
If LLMs can't distinguish 'slightly' from 'somewhat' in numeric tasks, precision-dependent applications need guardrails.