Research & Papers

TactileEval: A Step Towards Automated Fine-Grained Evaluation and Editing of Tactile Graphics

New system achieves 85.7% accuracy across 30 evaluation tasks and can generate targeted corrections via GPT-powered image editing.

Deep Dive

A research team led by Adnan Khan has introduced TactileEval, a novel AI pipeline designed to automate the quality assessment and editing of tactile graphics—raised-line images used by blind and visually impaired (BVI) learners. Currently, these graphics require meticulous expert validation, a bottleneck in educational material production. TactileEval addresses this by establishing a five-category quality taxonomy (covering view angle, part completeness, background clutter, texture separation, and line quality) based on expert feedback from the TactileNet dataset. The team then gathered 14,095 structured annotations via Amazon Mechanical Turk, spanning 66 object classes organized into six families, to train their evaluation model.
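The five-category taxonomy and family grouping lend themselves to a simple annotation schema. Below is a minimal sketch of what one such annotation record might look like; all names (the class `TactileAnnotation`, the example class "giraffe" and family "animals", the exact category labels) are hypothetical illustrations, not taken from the paper's released code.

```python
from dataclasses import dataclass

# The paper's five quality categories (labels paraphrased from the article;
# the exact names in the released dataset may differ).
QUALITY_CATEGORIES = [
    "view_angle",
    "part_completeness",
    "background_clutter",
    "texture_separation",
    "line_quality",
]

@dataclass
class TactileAnnotation:
    """One crowd-sourced rating of a tactile graphic (hypothetical schema)."""
    image_id: str
    object_class: str  # one of the 66 object classes
    family: str        # one of the six object families
    scores: dict       # quality category -> worker rating

    def is_complete(self) -> bool:
        """True if every quality category received a rating."""
        return all(c in self.scores for c in QUALITY_CATEGORIES)

# Example record (values illustrative only).
ann = TactileAnnotation(
    image_id="img_0001",
    object_class="giraffe",
    family="animals",
    scores={c: 1 for c in QUALITY_CATEGORIES},
)
print(ann.is_complete())  # -> True
```

Structuring each of the 14,095 annotations this way is one straightforward route to the per-category, per-family training labels the evaluation model needs.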

A Vision Transformer (ViT-L/14) trained on this data achieves 85.70% overall test accuracy across 30 distinct evaluation tasks, and the consistent difficulty ordering across those tasks suggests the taxonomy captures meaningful perceptual structure. Building on this automated evaluation, the researchers present an editing pipeline that routes classifier scores through family-specific prompt templates, which are then used to generate targeted corrections via a GPT-powered image-editing model (gpt-image-1), moving from diagnosis to actionable repair. The complete code, data, and models are publicly available, a significant step toward scalable, high-quality tactile graphic production.
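The routing step of that editing pipeline can be sketched in a few lines: low classifier scores select the quality issues to fix, and the object family selects the prompt template. Everything here is an assumption for illustration; the template wording, threshold, and function names are hypothetical, not the paper's actual implementation.

```python
# Score-to-prompt routing sketch (templates and threshold are illustrative).
FAMILY_TEMPLATES = {
    "animals": "Redraw this tactile graphic of an animal: {issues}. Keep raised lines clean.",
    "vehicles": "Redraw this tactile graphic of a vehicle: {issues}. Keep raised lines clean.",
}

def build_edit_prompt(family: str, scores: dict, threshold: float = 0.5) -> str:
    """Turn low per-category classifier scores into a family-specific edit prompt."""
    failing = [cat for cat, s in scores.items() if s < threshold]
    issues = ", ".join(f"fix {cat.replace('_', ' ')}" for cat in failing)
    return FAMILY_TEMPLATES[family].format(issues=issues)

# Only categories scoring below the threshold appear in the prompt.
prompt = build_edit_prompt("animals", {"line_quality": 0.2, "view_angle": 0.9})
print(prompt)
# The resulting prompt, plus the original image, would then be sent to an
# image-editing model such as gpt-image-1.
```

The appeal of this design is that the classifier never edits anything itself: it only produces a diagnosis, and the prompt template translates that diagnosis into an instruction the editing model can act on.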

Key Points
  • Automates expert validation of tactile graphics with a ViT-L/14 model achieving 85.70% accuracy on 30 tasks.
  • Trained on 14,095 structured annotations across 66 object classes, organized using a novel five-category quality taxonomy.
  • Includes an automated editing pipeline that uses classifier scores and prompt templates to generate corrections via GPT-powered image editing.

Why It Matters

This technology could dramatically scale up the creation of accessible educational materials for blind and visually impaired learners worldwide.