DesignSense: A Human Preference Dataset and Reward Modeling Framework for Graphic Layout Generation
New vision-language model trained on 10,235 human-annotated layout pairs beats proprietary models by wide margins.
A research team including Varun Gopal, Rishabh Jain, and others has introduced DesignSense, a human preference dataset and reward modeling framework built specifically for AI-generated graphic layouts. The work addresses a gap in existing tooling: current text-to-image preference models fail to evaluate the nuanced spatial arrangements that define layout quality. The team's answer includes DesignSense-10k, a dataset of 10,235 human-annotated preference pairs created through a five-stage curation pipeline that generates coherent layout transformations. Human annotators labeled each pair with a 4-class scheme (left better, right better, both good, both bad), which captures the subjective nature of aesthetic judgment better than a forced binary choice would.
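To make the 4-class annotation scheme concrete, here is a minimal Python sketch of how one preference pair might be represented. The class names, field names, and file paths are hypothetical and chosen for illustration only; the article does not describe the dataset's actual schema.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class Preference(Enum):
    """The four annotation classes named in the article."""
    LEFT_BETTER = "left better"
    RIGHT_BETTER = "right better"
    BOTH_GOOD = "both good"
    BOTH_BAD = "both bad"


@dataclass(frozen=True)
class LayoutPair:
    """One annotated pair. Field names are assumptions, not the real schema."""
    left_image: str
    right_image: str
    label: Preference


def label_distribution(pairs):
    """Count how many pairs fall into each of the four classes."""
    counts = Counter(pair.label for pair in pairs)
    return {label: counts.get(label, 0) for label in Preference}


if __name__ == "__main__":
    # Made-up example records, not taken from DesignSense-10k.
    demo = [
        LayoutPair("a_left.png", "a_right.png", Preference.LEFT_BETTER),
        LayoutPair("b_left.png", "b_right.png", Preference.BOTH_BAD),
    ]
    for label, count in label_distribution(demo).items():
        print(label.value, count)
```

Keeping "both good" and "both bad" as separate labels, rather than folding them into a single tie, lets a model distinguish two acceptable layouts from two that should both be rejected.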
The team trained a specialized vision-language model (VLM) classifier on this data. It substantially outperforms existing open-source and proprietary models, achieving a 54.6% improvement in Macro F1 score over the strongest proprietary baseline. The analysis shows that even frontier VLMs are unreliable on this four-class task, which underlines the need for domain-specific preference models. In practice, using DesignSense as a reward model during reinforcement learning (RL) training improves a layout generator's win rate by about 3%. Inference-time scaling (generating multiple candidates and selecting the best one via DesignSense) adds a 3.6% quality boost, a tangible improvement for real-world AI design tools.
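Macro F1, the headline metric here, is the unweighted mean of the per-class F1 scores, so a rarely used class such as "both bad" counts as much as a common one. The sketch below computes it from scratch; the label strings and the toy predictions are invented for illustration and are not results from the paper.

```python
def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1 scores.

    A class with no true and no predicted examples scores 0.0 here;
    other conventions exist for that edge case.
    """
    scores = []
    for cls in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == cls and p == cls)
        fp = sum(1 for t, p in pairs if t != cls and p == cls)
        fn = sum(1 for t, p in pairs if t == cls and p != cls)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN), equivalent to the harmonic mean
        # of precision and recall.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    classes = ["left better", "right better", "both good", "both bad"]
    # Toy data, not from the paper.
    truth = ["left better", "left better", "right better", "both good", "both bad"]
    preds = ["left better", "right better", "right better", "both good", "both good"]
    print(macro_f1(truth, preds, classes))
```

Because every class carries equal weight, a model that never predicts one of the four labels is penalized heavily, which is one plausible reason general-purpose VLMs score poorly on this task.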
- Introduces DesignSense-10k, a dataset of 10,235 human-annotated preference pairs for evaluating graphic layouts
- The trained DesignSense VLM classifier beats proprietary models with a 54.6% improvement in Macro F1 score
- Using the model as an RL reward raises a layout generator's win rate by about 3%, and selecting among multiple candidates at inference time adds a 3.6% quality boost
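The candidate-selection idea can be sketched as a simple knockout over pairwise judgments. This is an illustrative assumption, not the authors' procedure: the article does not say how DesignSense ranks candidates, and `judge` below is a hypothetical stand-in for a call to the trained classifier that returns one of the four labels.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def select_best(candidates: Sequence[T], judge: Callable[[T, T], str]) -> T:
    """Pick one candidate using sequential pairwise comparisons.

    `judge(left, right)` is a hypothetical callable that returns one of
    "left better", "right better", "both good", or "both bad".
    The current champion is replaced only when the challenger is
    judged strictly better; ties keep the earlier candidate.
    """
    if not candidates:
        raise ValueError("need at least one candidate")
    champion = candidates[0]
    for challenger in candidates[1:]:
        if judge(champion, challenger) == "right better":
            champion = challenger
    return champion


if __name__ == "__main__":
    # Toy judge that compares numbers instead of layouts.
    def toy_judge(left, right):
        if left > right:
            return "left better"
        if right > left:
            return "right better"
        return "both good"

    print(select_best([3, 7, 5, 7], toy_judge))
```

This design needs one judge call per additional candidate, so cost grows linearly with the number of layouts generated. A real system might also discard a winner that is only ever judged "both bad", which this sketch does not do.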
Why It Matters
DesignSense helps AI generate graphic designs that better match human aesthetic preferences, improving tools for marketers and designers.