Research & Papers

Grandes Modelos de Linguagem Multimodais (MLLMs): Da Teoria à Prática

An academic book chapter, now on arXiv, details the future of AI that sees, hears, and reasons.

Deep Dive

A Portuguese-language book chapter, accepted for Webmedia 2025 and posted to arXiv as a preprint, offers a comprehensive practical guide to Multimodal Large Language Models (MLLMs). It covers core fundamentals, surveys the major models, and walks through hands-on techniques for building multimodal AI pipelines with tools such as LangChain and LangGraph. The authors also provide supplementary online material for developers, positioning the chapter as a resource for moving MLLMs from theoretical research into applied, functional systems.
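To make the pipeline idea concrete, here is a minimal sketch of the kind of multimodal input such tools work with. It builds an OpenAI-style chat message that pairs text with a base64-encoded image; the function name and the placeholder image bytes are illustrative (not from the chapter), though LangChain's `HumanMessage` accepts this same content-list format.

```python
import base64

def build_multimodal_message(text: str, image_bytes: bytes) -> dict:
    """Combine a text prompt and an image into one OpenAI-style chat message.

    Illustrative helper (not from the chapter): LangChain's HumanMessage
    wraps the same {"type": ..., ...} content-list structure shown here.
    """
    # Images travel inline as a base64 data URL alongside the text part.
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{b64}"},
            },
        ],
    }

# Placeholder bytes stand in for a real PNG file read from disk.
msg = build_multimodal_message("Describe this chart.", b"\x89PNG-placeholder")
print(len(msg["content"]), msg["content"][0]["type"], msg["content"][1]["type"])
```

In a full pipeline, this message would be handed to a vision-capable chat model, and frameworks like LangGraph would route the model's answer into subsequent reasoning steps.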

Why It Matters

This guide lowers the barrier for developers to build the next generation of AI that can process and reason across text, images, and audio.