An actual example of "If you don't run it, you don't own it": Gemma 4 beats both ChatGPT and Gemini Chat
Open-source model outperforms closed rivals with 100% pass rate in Chinese novel translation test
A Reddit user conducted a rigorous translation test on Chinese novel chapters, comparing open-source and closed-source AI models. The test required models to maintain consistent character names across chapters, a task that demands strong contextual reasoning. The results were surprising: Google's open-source Gemma 4 31B (at Q4 quantization) achieved a 100% pass rate with natural-sounding translations, beating GPT-5.3, Gemini Chat, and Qwen models.
GPT-5.3 failed 20% of queries, mixing up character names and producing unnatural translations, a regression from earlier GPT-4o versions that passed all tests. The Qwen models (3 Max, 3.6 Plus) had their responses automatically deleted by censorship filters even though the text contained no NSFW content. Gemini Chat partially passed but misgendered characters (e.g., calling a lady a lord). The test underscores how closed models can degrade over time through silent updates and A/B testing, while a local open-source model like Gemma 4 delivers consistent quality.
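The core check in the test, the same character must be rendered with the same English name in every chapter, is easy to automate. A minimal sketch of such a consistency checker is below; the chapter texts, the character name 林月, and its candidate renderings are hypothetical examples, not data from the original Reddit post.

```python
import re

def check_name_consistency(chapters, renderings):
    """Return characters whose translated name varies across chapters.

    chapters: list of translated chapter texts.
    renderings: dict mapping a source-language name to the English
    renderings a model might plausibly produce for it.
    """
    inconsistent = {}
    for source_name, variants in renderings.items():
        seen = set()
        for text in chapters:
            for variant in variants:
                # Whole-word match so "Lin" alone does not count as "Lin Yue".
                if re.search(r"\b" + re.escape(variant) + r"\b", text):
                    seen.add(variant)
        if len(seen) > 1:  # same character rendered in two different ways
            inconsistent[source_name] = sorted(seen)
    return inconsistent

# Toy example: the model renders the same character two different ways.
chapters = [
    "Chapter 1: Lin Yue bowed to the elder.",
    "Chapter 2: Moon Lin drew her sword.",
]
renderings = {"林月": ["Lin Yue", "Moon Lin"]}
print(check_name_consistency(chapters, renderings))
# {'林月': ['Lin Yue', 'Moon Lin']}
```

A model passes this check when the returned dict is empty, i.e., every tracked character keeps a single rendering across all chapters.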
- Gemma 4 31B achieved 100% pass rate in Chinese novel translation, beating GPT-5.3's 80%
- GPT-5.3 degraded from earlier GPT-4o versions, with 20% failure rate on character name consistency
- Qwen 3 Max and 3.6 Plus were censored and auto-deleted responses despite no NSFW content
Why It Matters
Open-source models like Gemma 4 can match or exceed closed models, offering reliable, uncensored alternatives for specialized tasks.