[R] I probed 6 open-weight LLMs (7B-9B) for "personality" using hidden states — instruct fine-tuning is associated with measurable behavioral constraints
New research suggests your AI assistant has a measurable personality fingerprint — one that fine-tuning may lock in.
A new study probed 6 open-weight LLMs (7B-9B parameters) via their hidden states and found each has a measurable, consistent behavioral fingerprint across 7 axes such as warm/cold and verbose/concise. Instruct fine-tuning significantly reduces a model's behavioral variability, making it less steerable: Llama 3.1 8B Instruct was the most constrained (60% pass rate on the study's steerability benchmark), while DeepSeek LLM 7B Chat was the most independent. The probing method was highly reproducible (intraclass correlation coefficient, ICC > 0.75).
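To give a feel for the reproducibility claim: an ICC > 0.75 means that when you score each model repeatedly on a behavioral axis, the variation between models dwarfs the run-to-run noise within a model. Below is a minimal sketch of that check using the one-way ICC(1,1) formula with made-up illustrative scores — not the paper's code or data.

```python
# Sketch of a reproducibility check like the one described in the post:
# score each model several times on one behavioral axis (e.g. warm/cold),
# then compute a one-way intraclass correlation coefficient, ICC(1,1).
# All numbers below are hypothetical stand-ins, not the study's data.

def icc_1_1(scores):
    """scores: list of per-model score lists, each with k repeated runs."""
    n = len(scores)        # number of models (targets)
    k = len(scores[0])     # repeated probe runs per model
    grand = sum(sum(row) for row in scores) / (n * k)
    means = [sum(row) / k for row in scores]
    # between-model mean square: how far model averages sit from the grand mean
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    # within-model mean square: run-to-run noise around each model's own mean
    msw = sum((x - means[i]) ** 2
              for i, row in enumerate(scores) for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# three hypothetical models, four repeated runs each, scores in [0, 1]
runs = [
    [0.81, 0.79, 0.83, 0.80],   # consistently "warm"
    [0.32, 0.35, 0.30, 0.33],   # consistently "cold"
    [0.55, 0.57, 0.54, 0.56],   # consistently mid-range
]
print(round(icc_1_1(runs), 3))  # well above the 0.75 reproducibility bar
```

With stable per-model scores like these, the ICC lands near 1; if the runs were as noisy within a model as across models, it would drop toward 0.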
Why It Matters
This points to a hidden trade-off: fine-tuning for safety and instruction-following may also constrain an AI's creative flexibility and adaptability.