
Researchers test if AI can match human creativity in Picbreeder replication

arXiv cs.AI May 26, 2026

⚡Can frontier VLMs like GPT-4V replace humans in open-ended creative tasks?

Deep Dive

A team of researchers from New York University, Google DeepMind, and other institutions published a study on arXiv that asks a provocative question: *Can AI-driven agents replicate human open-ended creativity?* The team, led by Sam Earle and including authors from Google Research and NYU, took Picbreeder, a well-known interactive platform where users evolve images through generational selection, and replaced the human participants with frontier vision-language models (VLMs). The goal was to test whether artificial agents could generate novel, meaningful outputs without human guidance, a property known as *open-endedness*.

The team found *qualitative differences* between AI-generated outputs and the historical human baseline. To quantify these differences, they used metrics including phylogenetic complexity (a measure of the diversity of evolutionary paths), visual and semantic salience, and novelty. They also tested the effect of adding exploratory noise, behavioral diversity among agents, and a memory mechanism (narrative momentum) meant to support long-term creative trajectories. These additions improved results, but the AI-driven runs still lacked the diversity and depth characteristic of human-driven Picbreeder. The findings suggest that current VLMs may not inherently possess the same kind of open-ended creative capacity as humans, at least not without human-like guidance or richer interaction models.
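To make the setup concrete, here is a minimal sketch of a Picbreeder-style selection loop in which a pluggable selector stands in for the human (or VLM) judge, with an exploratory-noise knob. It is an illustration under stated assumptions, not the authors' implementation: the genome is a plain list of floats rather than an image-generating network, and `evolve`, `mutate`, `toy_selector`, and every parameter name are hypothetical.

```python
import random
from typing import Callable, List

Genome = List[float]
# A selector sees the candidates and returns the index of the one to breed next.
Selector = Callable[[List[Genome]], int]


def mutate(parent: Genome, rng: random.Random, sigma: float = 0.1) -> Genome:
    """Return a child genome with Gaussian noise added to every gene."""
    return [gene + rng.gauss(0.0, sigma) for gene in parent]


def evolve(
    select: Selector,
    generations: int,
    population_size: int,
    noise: float,
    seed: int = 0,
    genome_length: int = 4,
) -> List[Genome]:
    """Run one lineage of interactive evolution and return the chosen parents.

    With probability `noise` the selector is ignored and a random candidate
    is kept instead (an assumed, simplified form of exploratory noise).
    """
    rng = random.Random(seed)
    parent: Genome = [0.0] * genome_length
    lineage = [parent]
    for _ in range(generations):
        candidates = [mutate(parent, rng) for _ in range(population_size)]
        if rng.random() < noise:
            parent = rng.choice(candidates)
        else:
            parent = candidates[select(candidates)]
        lineage.append(parent)
    return lineage


def toy_selector(candidates: List[Genome]) -> int:
    """Hypothetical stand-in for a judge: prefers the largest gene sum."""
    return max(range(len(candidates)), key=lambda i: sum(candidates[i]))


if __name__ == "__main__":
    for step, genome in enumerate(evolve(toy_selector, 5, 6, noise=0.2)):
        print(step, [round(gene, 3) for gene in genome])
```

In a real replication the selector would wrap a call to a vision-language model that is shown rendered images; the fixed preference used here is only a placeholder, and it illustrates how a judge with a narrow, stable taste can steer a lineage in one direction unless noise or other sources of diversity intervene.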

Key Points

Researchers from NYU and Google replicated Picbreeder using VLMs instead of humans, testing AI's capacity for open-ended creativity.
The AI system showed 'qualitative differences' from human outputs, with lower diversity and novelty under metrics like phylogenetic complexity.
Adding exploratory noise, agent diversity, and memory improved results, but AI still lagged behind human-driven evolution.

Why It Matters

This study challenges assumptions about AI autonomy in creative domains and highlights current gaps between machine and human open-ended discovery.
