Inpainting with LTXV 2.3. Results after two weeks of R&D.
Two weeks of testing yield up to 5K resolution locally, but stubborn flickering and precision issues remain.
DOGMA, a design firm specializing in AI-assisted video production for TV ads, shows, and movies (including a recent Netflix release), reports on two weeks of R&D using LTXV 2.3 for inpainting, a process that makes up 50% of their work and 100% of their Netflix deliverables. They praise the open-source model's democratization efforts and note it can reach up to 5K resolution locally, with an HDR LoRA exceeding expectations. The results, they claim, have “very little to envy from closed-source Seedance 2.”
Two inpainting approaches were tested. Method 1 uses Multi Guide with two reference frames (first and last) and an LTXV latent mask, but it introduces flickering and mismatch near the reference frames, essentially acting as a denoise tool without precision. Method 2 employs a specialized LoRA (ltx23_inpaint_masked_r2v_rank32_v1_3000steps) that takes a video with the area to be inpainted marked in magenta, plus a tiny 200px reference frame. It works well for faces (e.g., replacing Trump's face) but fails on precise details because the reference window is so small. Adding Multi Guide partially recovers precision but reintroduces flickering. The team concludes the model is powerful but requires careful setup for professional-grade results.
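To make Method 2's input format concrete, here is a minimal sketch of preparing the masked video: flood-filling the region to be inpainted with pure magenta on every frame. This assumes the LoRA consumes standard uint8 RGB frames; the function name, the rectangular box, and the dummy clip are illustrative only and not part of the LTXV toolchain.

```python
import numpy as np

# Assumption: pure magenta (255, 0, 255) marks the area to be inpainted.
MAGENTA = np.array([255, 0, 255], dtype=np.uint8)

def paint_inpaint_region(frames: np.ndarray, box: tuple) -> np.ndarray:
    """Fill a rectangular region with magenta on every frame.

    frames: (T, H, W, 3) uint8 video array.
    box: (y0, x0, y1, x1) in pixel coordinates, exclusive of y1/x1.
    """
    out = frames.copy()
    y0, x0, y1, x1 = box
    out[:, y0:y1, x0:x1] = MAGENTA  # broadcasts over frames and channels
    return out

# Hypothetical example: an 8-frame 256x256 black clip, masking the center 64x64 patch.
video = np.zeros((8, 256, 256, 3), dtype=np.uint8)
masked = paint_inpaint_region(video, (96, 96, 160, 160))
```

In a real workflow the box would come from a tracked matte rather than a static rectangle, and the accompanying reference frame would be downscaled to the LoRA's 200px window, which is exactly the step DOGMA identifies as the precision bottleneck.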
- LTXV 2.3 can be pushed to 5K resolution locally, rivaling closed-source Seedance 2.
- Method 1 (no LoRA) uses Multi Guide frames but causes flickering and mismatch near the reference frames.
- Method 2 (inpainting LoRA) works well for faces but fails for precise details due to a tiny 200px reference window.
Why It Matters
Open-source video inpainting inches closer to production viability but still needs workarounds for precise edits in movies and ads.