Image & Video

Trying to accomplish realism with Ernie Turbo - here's what I learned

A user finds that the default ComfyUI workflow for Ernie Turbo translates prompts into Chinese, skewing outputs toward Asian subjects.

Deep Dive

A hands-on test of Baidu's Ernie Turbo image generation model has uncovered significant quirks in its standard implementation. The user employed the default workflow from ComfyUI, a popular node-based interface for AI image generation. This workflow includes a built-in 'Prompt Enhancer' that, among other functions, automatically translates user prompts into Chinese. This translation step introduces a strong bias, causing the model to generate images featuring Asian subjects even when a different ethnicity is explicitly specified in the original English prompt. To counter this, the tester had to completely bypass the enhancer and feed prompts directly to the sampler in plain English.
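The bypass described above can be sketched as a small rewiring step on a workflow exported in ComfyUI's API (JSON) format, where each node is keyed by an id and a linked input is written as `[source_node_id, output_index]`. This is only an illustration under stated assumptions: the node ids, the `HypotheticalPromptEnhancer` class name, and the exact wiring are made up for the example and are not taken from the actual Ernie Turbo workflow, which the tester edited in the ComfyUI graph rather than in code.

```python
import copy


def bypass_node(workflow, node_id, replacement_text):
    """Return a copy of `workflow` with `node_id` removed and every input
    that was linked to it replaced by the literal `replacement_text`."""
    patched = copy.deepcopy(workflow)  # leave the caller's workflow untouched
    patched.pop(node_id, None)
    for node in patched.values():
        for name, value in node["inputs"].items():
            # In API-format workflows a link is [source_node_id, output_index].
            is_link = isinstance(value, list) and len(value) == 2
            if is_link and value[0] == node_id:
                node["inputs"][name] = replacement_text
    return patched


# Toy workflow fragment. Node "1" stands in for the prompt enhancer; its
# class name is hypothetical. Node "3" (the text-encoder model loader) is
# omitted for brevity.
workflow = {
    "1": {
        "class_type": "HypotheticalPromptEnhancer",
        "inputs": {"prompt": "portrait photo of a Norwegian fisherman"},
    },
    "2": {
        "class_type": "CLIPTextEncode",
        "inputs": {"text": ["1", 0], "clip": ["3", 0]},
    },
}

english_prompt = "portrait photo of a Norwegian fisherman, 35mm film"
patched = bypass_node(workflow, "1", english_prompt)
```

After the call, the text-encoding node in `patched` receives the plain English prompt directly, and the enhancer node is gone. The original `workflow` dictionary is left unchanged so the two variants can be compared side by side.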

Beyond the bias issue, the evaluation highlighted several technical limitations. The model tends to produce a 'plasticky' look, which can be mitigated by adding specific photographic terms to the prompt, such as 'point-and-shoot film camera' or '35mm film.' It also exhibits a grid-pattern artifact reminiscent of early versions of other models, and it struggles with intricate patterns like bike wheels or guitar inlays. It recognizes brands and logos well, but its seed variance is extremely low: the eight images in a single batch looked nearly identical. When used as a refiner for images from other models, results were merely 'okay,' leading the user to conclude that, for now, competing models like Z-Image-Turbo (ZIT) or Klein deliver better and more predictable results.
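The low seed variance was judged by eye in the original test. One rough way to put a number on it is the mean absolute pixel difference averaged over every pair of images in a batch. The sketch below is not the tester's method. The function name is made up for illustration, and it assumes each image has already been flattened to an equal-length sequence of grayscale values in the 0 to 255 range. A perceptual metric would be a better judge of visual similarity, but this crude score is enough to tell a near-duplicate batch from a varied one.

```python
from itertools import combinations


def mean_pairwise_difference(images):
    """Average, over all image pairs, of the mean absolute pixel difference.

    `images` is a list of equal-length flat sequences of pixel values.
    A score near 0 means the batch is close to pixel-identical.
    """
    pairs = list(combinations(images, 2))
    if not pairs:
        raise ValueError("need at least two images to compare")
    total = 0.0
    for a, b in pairs:
        if len(a) != len(b):
            raise ValueError("all images must have the same number of pixels")
        total += sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return total / len(pairs)


# Tiny 4-pixel stand-ins for real images.
near_identical_batch = [
    [10, 10, 10, 10],
    [10, 11, 10, 10],
    [10, 10, 10, 12],
]
varied_batch = [
    [0, 0, 0, 0],
    [255, 255, 255, 255],
]

low_score = mean_pairwise_difference(near_identical_batch)   # 0.5
high_score = mean_pairwise_difference(varied_batch)          # 255.0
```

Running the same prompt across several seeds and comparing the score against the same measurement on another model would show whether the batches really are unusually similar, rather than relying on a visual impression alone.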

Key Points
  • The default ComfyUI workflow for Ernie Turbo translates prompts to Chinese, creating a strong bias toward Asian subjects in outputs.
  • The model exhibits technical flaws like a 'plasticky' look, grid-pattern artifacts, and very low seed variance, so images generated in the same batch look nearly identical.
  • For practical use, the tester currently prefers rival open-source models Z-Image-Turbo or Klein for more reliable and controllable results.

Why It Matters

This shows how a hidden workflow step, here an automatic prompt translation, can introduce bias, and that cutting-edge AI image models still struggle with consistency and fine detail.