Viral Wire

OpenAI Temporarily Resets Codex Rate Limits to Boost GPT-5.5 Building

OpenAI temporarily boosts Codex capacity, fueling a GPT-5.5 building frenzy.

Deep Dive

OpenAI temporarily reset Codex rate limits to accelerate development work around GPT-5.5, signaling an aggressive push to advance its model capabilities. The move was highlighted in the latest AINews roundup, which otherwise noted a relatively quiet day in AI news apart from significant model releases from NVIDIA, Poolside, and Alec Radford. The reset is meant to give developers extra capacity to build and test on the Codex platform, potentially speeding up the iteration cycle around GPT-5.5.
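The report doesn't detail the developer-side mechanics, but the practical meaning of a rate limit is that clients must throttle and retry. Here is a minimal sketch of exponential backoff using the OpenAI Python SDK; the model id `gpt-5.5-codex` is a placeholder assumption, not an identifier confirmed by the story:

```python
# Hypothetical sketch: retry a request when a rate limit is hit,
# doubling the wait between attempts. The model id below is a
# placeholder -- no such public id is confirmed by this report.
import time

from openai import OpenAI, RateLimitError

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def complete_with_backoff(prompt: str, max_retries: int = 5) -> str:
    """Call the chat completions endpoint, backing off on rate limits."""
    delay = 1.0
    for _ in range(max_retries):
        try:
            resp = client.chat.completions.create(
                model="gpt-5.5-codex",  # placeholder model id (assumption)
                messages=[{"role": "user", "content": prompt}],
            )
            return resp.choices[0].message.content
        except RateLimitError:
            # Sleep, then double the delay: 1s, 2s, 4s, ...
            time.sleep(delay)
            delay *= 2
    raise RuntimeError(f"still rate limited after {max_retries} retries")
```

With a freshly reset limit, the backoff path simply fires less often; the calling code doesn't change, which is why a quota reset translates directly into faster iteration.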

In other news, NVIDIA launched Nemotron 3 Nano Omni, a 30B-parameter multimodal MoE model with a 256K context window, designed for agentic workloads across text, image, video, audio, and documents; it was immediately distributed across platforms including OpenRouter, LM Studio, and Ollama. vLLM 0.20 shipped with major improvements, including a TurboQuant 2-bit KV cache for 4× KV capacity and a fused RMSNorm kernel for a 2.1% latency improvement. Poolside released Laguna XS.2, a 33B-total/3B-active MoE coding model under Apache 2.0 that can run on a single GPU. And DeepSeek V4 MegaMoE serving results showed NVIDIA's B300 running up to 8× faster than the H200.
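Two of those figures are easy to sanity-check with standard accounting. A minimal sketch follows, with made-up transformer dimensions (the layer count and head sizes below are illustrative assumptions, not published specs): it shows why a 2-bit KV cache gives 4× capacity against an 8-bit baseline (it would be 8× against FP16), and what 33B-total/3B-active means for per-token compute.

```python
# Back-of-envelope checks on two numbers from the roundup. All model
# dimensions here are illustrative assumptions, not published specs.

def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                       bits_per_elem: int) -> float:
    """Bytes of KV cache one token occupies: keys + values, every layer."""
    return 2 * n_layers * n_kv_heads * head_dim * bits_per_elem / 8

# Hypothetical shape (assumption): 48 layers, 8 KV heads, 128-dim heads.
kv_8bit = kv_bytes_per_token(48, 8, 128, bits_per_elem=8)
kv_2bit = kv_bytes_per_token(48, 8, 128, bits_per_elem=2)
print(f"8-bit KV: {kv_8bit/1024:.0f} KiB/token, "
      f"2-bit KV: {kv_2bit/1024:.0f} KiB/token")
print(f"capacity gain: {kv_8bit/kv_2bit:.0f}x")  # -> 4x vs. an 8-bit cache
# (Against an FP16 cache the same math would give 8x.)

# MoE sizing: Laguna XS.2 is reported as 33B total / 3B active, so each
# token touches only ~1/11 of the weights, and per-token compute scales
# with the active count (~2 FLOPs per active parameter per token).
total, active = 33e9, 3e9
print(f"active fraction: {active/total:.1%}, "
      f"~{2*active/1e9:.0f} GFLOPs/token (rough forward-pass estimate)")
```

The KV arithmetic is also why 2-bit caches matter at long context: at 256K tokens it is the cache, not the weights, that caps batch size on a given GPU.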

Key Points
  • OpenAI temporarily reset Codex rate limits to boost GPT-5.5 development efforts
  • NVIDIA's Nemotron 3 Nano Omni is a 30B multimodal MoE with 256K context, available across 10+ platforms
  • vLLM 0.20 introduces a TurboQuant 2-bit KV cache (4× KV capacity) and fused RMSNorm (2.1% latency improvement)

Why It Matters

OpenAI's Codex reset signals accelerated GPT-5.5 development, while new model releases from NVIDIA and Poolside and inference-stack gains in vLLM push both capability and serving efficiency forward.