Developer Tools

Run NVIDIA Nemotron 3 Super on Amazon Bedrock

The 120B-parameter open model offers a 256K-token context window and up to 5x higher throughput for agentic AI.

Deep Dive

NVIDIA's latest open model, Nemotron 3 Super, has launched on Amazon Bedrock as a fully managed service, joining the existing Nemotron Nano lineup. This 120-billion parameter hybrid Mixture of Experts (MoE) model combines Transformer and Mamba architectures, activating only 12B parameters during inference for efficiency. It delivers up to 5x higher throughput than previous Nemotron models while supporting a 256K-token context window and seven languages including English, Chinese, and Japanese.
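
To make the sparse-activation idea concrete, here is a minimal, generic top-k mixture-of-experts routing sketch in PyTorch. The expert count, dimensions, and routing scheme are illustrative assumptions, not Nemotron 3 Super's actual hybrid Transformer-Mamba design; the point is simply that a router sends each token to only a few experts, so only a fraction of the layer's parameters does work per token.

```python
# Generic top-k MoE routing sketch (illustrative only, not Nemotron's design):
# each token is routed to k of E experts, so only a subset of parameters is
# active for any given token.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    def __init__(self, d_model=256, d_ff=1024, num_experts=8, k=2):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, num_experts)   # gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(num_experts)
        )

    def forward(self, x):                                # x: (tokens, d_model)
        scores = self.router(x)                          # (tokens, num_experts)
        weights, idx = scores.topk(self.k, dim=-1)       # keep only k experts per token
        weights = F.softmax(weights, dim=-1)
        out = torch.zeros_like(x)
        for slot in range(self.k):                       # run only the selected experts
            for e in range(len(self.experts)):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot:slot + 1] * self.experts[e](x[mask])
        return out

tokens = torch.randn(4, 256)
print(TopKMoE()(tokens).shape)                           # torch.Size([4, 256])
```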

Key technical innovations include a latent MoE architecture, which lets the model draw on 4x more experts at the same inference cost for better specialization, and multi-token prediction, which significantly boosts throughput for long reasoning sequences. The model achieves leading accuracy on benchmarks such as AIME 2025, SWE-bench, and RULER, making it particularly well suited to agentic AI systems and multi-agent workflows where complex reasoning and planning are required.

Developers can access Nemotron 3 Super immediately through the Amazon Bedrock console and APIs without managing any infrastructure, using it for applications ranging from software development and financial analysis to cybersecurity threat hunting and retail optimization. The model's open weights, datasets, and recipes allow customization and deployment on private infrastructure when stricter security is required. This availability marks a significant expansion of high-performance open models in the managed cloud AI space.
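
For a sense of what an invocation might look like in practice, here is a minimal sketch using the AWS SDK for Python (boto3) and Bedrock's Converse API. The model identifier below is a placeholder assumption, not a confirmed ID; the actual value should be taken from the Bedrock console or the AWS CLI.

```python
# Illustrative call to Amazon Bedrock's Converse API via boto3. The modelId is
# a placeholder assumption -- look up the real Nemotron 3 Super identifier in
# the Bedrock console or with `aws bedrock list-foundation-models`.
import boto3

client = boto3.client("bedrock-runtime", region_name="us-east-1")

response = client.converse(
    modelId="nvidia.nemotron-3-super",  # placeholder, not a confirmed ID
    messages=[
        {"role": "user",
         "content": [{"text": "Outline a plan to triage a suspicious login alert."}]}
    ],
    inferenceConfig={"maxTokens": 512, "temperature": 0.2},
)

print(response["output"]["message"]["content"][0]["text"])
```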

Key Points
  • 120B-parameter hybrid MoE model with a 256K-token context window and support for seven languages
  • Delivers up to 5x higher throughput than previous Nemotron models via its latent MoE architecture
  • Available as fully managed service on Amazon Bedrock for multi-agent and reasoning applications

Why It Matters

Gives enterprises a high-performance open model for complex agentic workflows without the overhead of managing infrastructure.