Open Source

r/LocalLLaMA June 01, 2026

⚡ The new 70B-parameter open-source model beats GPT-4 on math benchmarks.

Deep Dive

This article was submitted by Reddit user themixtergames.

Key Points

70B-parameter model with Hopper-optimized FP8 training runs 3x faster than Nemotron 2.
Inference cost reduced by 40% using novel quantization; accuracy remains above 98%.
Open-source weights released; scores 92% on GSM8K, 89% on HumanEval, 95% on MMLU.

NVIDIA brings enterprise-grade open-source AI with GPT-4-class performance at half the cost per token.