Image & Video

Ostris/AI-Toolkit now supports HiDream O1 training with token-only input

No text embeddings are needed: tokens go straight into the model.

Deep Dive

Ostris's GitHub repo for HiDream-O1-Image includes a note: disable caching text embeddings because "There are not text embeddings. Tokens go directly in." ComfyUI checkpoint versions are available, and a test ComfyUI workflow exists, but as of the post, there is no official workflow template.
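The note about disabling cached text embeddings can be pictured as a single flag flip in a training config. The sketch below is only an illustration: the dictionary layout, the `cache_text_embeddings` key, the `hidream_o1` identifier, and the step count are all assumptions made for the example, not details confirmed by the post or by the AI-Toolkit documentation.

```python
import copy

# Hypothetical config layout. Every key and value here is an assumption
# for illustration, not taken from the AI-Toolkit documentation.
example_config = {
    "model": {"arch": "hidream_o1"},  # placeholder identifier
    "train": {"cache_text_embeddings": True, "steps": 1000},  # placeholder values
}


def disable_text_embedding_cache(config: dict) -> dict:
    """Return a copy of config with text-embedding caching turned off."""
    updated = copy.deepcopy(config)  # leave the caller's config untouched
    updated.setdefault("train", {})["cache_text_embeddings"] = False
    return updated


patched = disable_text_embedding_cache(example_config)
print(patched["train"]["cache_text_embeddings"])
```

The helper returns a modified copy rather than editing in place, so the original config can still be reused for models that do rely on a text encoder.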

Key Points
  • Ostris AI-Toolkit now supports HiDream O1 training with direct token input, skipping text embeddings
  • Text-embedding caching should be disabled because there are no text embeddings to cache; tokens feed straight into the model
  • ComfyUI checkpoints and a test workflow are available, but no official template yet

Why It Matters

Eliminating text embeddings simplifies fine-tuning pipelines and could lower hardware requirements for custom image model training.