Doc To The Future: Infomorphs for Interactive, Multimodal Document Transformation and Generation

A new research paper introduces 'infomorphs': modular AI transformations that users steer to restructure documents across formats.

Deep Dive

The paper 'Doc To The Future: Infomorphs for Interactive, Multimodal Document Transformation and Generation', authored by Balasaravanan Thoravi Kumaravel and posted to arXiv, introduces a framework for AI-assisted knowledge work. It targets a key limitation of current Generative AI tools: users have little control over multimodal inputs and outputs during document synthesis. As a solution, the paper proposes 'infomorphs', which are modular, AI-augmented transformations that users can steer to restructure information across formats such as reports, slides, and spreadsheets. The approach replaces monolithic AI generation with composable, transparent workflows.

The paper instantiates this idea in DocuCraft, a canvas-based interface on which users visually compose chains of infomorphs. Each infomorph performs one specific operation, such as extracting pages from a PDF, summarizing content with an LLM, or reformatting data for a spreadsheet, drawing on Generative AI at each stage while keeping the human in the loop. The paper demonstrates DocuCraft through example scenarios spanning common knowledge tasks, highlighting its support for fluid, cross-document, and cross-modal transformations. The work opens avenues for more transparent and modular interaction paradigms in AI-assisted information work.
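
To make the chaining idea concrete, here is a minimal Python sketch of steps composed into a pipeline whose intermediate results stay inspectable. It is an illustration only and not DocuCraft's implementation or API: the `Infomorph` class, the helper functions, and the `run_chain` function are all hypothetical names, a "document" is simplified to a list of page strings, and the summarization step is a word-truncation stand-in rather than a real LLM call.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: a "document" is modeled as a plain list of page strings.
Document = List[str]


@dataclass
class Infomorph:
    """One named transformation step (hypothetical, for illustration)."""
    name: str
    transform: Callable[[Document], Document]


def extract_pages(start: int, end: int) -> Infomorph:
    # Keep the pages in the half-open index range [start, end).
    return Infomorph(f"extract[{start}:{end}]", lambda doc: doc[start:end])


def summarize(max_words: int) -> Infomorph:
    # Stand-in for an LLM summary: keep the first max_words words per page.
    return Infomorph(
        f"summarize[{max_words}]",
        lambda doc: [" ".join(page.split()[:max_words]) for page in doc],
    )


def to_rows() -> Infomorph:
    # Reformat each page as a "row_number,text" line, as if for a spreadsheet.
    return Infomorph(
        "to_rows",
        lambda doc: [f"{i + 1},{page}" for i, page in enumerate(doc)],
    )


def run_chain(doc: Document, chain: List[Infomorph]) -> List[Tuple[str, Document]]:
    """Apply each step in order and record every intermediate result,
    so each stage can be inspected or adjusted before the next one runs."""
    trace = [("input", doc)]
    for step in chain:
        doc = step.transform(doc)
        trace.append((step.name, doc))
    return trace


pdf = [
    "cover page",
    "revenue grew ten percent this year",
    "costs fell five percent overall",
    "appendix",
]
trace = run_chain(pdf, [extract_pages(1, 3), summarize(3), to_rows()])
for name, result in trace:
    print(name, result)
```

Because `run_chain` returns the output of every stage rather than only the final one, a user could swap or re-parameterize a single step (for example, a different page range) and rerun the chain, which is the kind of steering the infomorph concept describes.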

Key Points
  • Introduces 'infomorphs': modular, user-steerable AI transformations that restructure information across formats such as reports, slides, and spreadsheets
  • Presents DocuCraft, a canvas-based interface for visually chaining infomorph operations like extraction, summarization, and reformatting
  • Enables fluid, human-in-the-loop workflows that combine Generative AI with user intent and context, addressing control limitations in current tools

Why It Matters

The paper offers a blueprint for more controllable, transparent AI-assisted knowledge work, in which users direct each step of document synthesis instead of accepting the output of a black-box generator.