Google Caps Gemini 3.1 Pro Quota, Makes Flash-Lite Free After User Complaints
Complex prompts won't drain your quota as fast with Google's new cap on single requests.
Google has revised its compute-based Gemini usage limits just weeks after introducing them at I/O 2026. Under the compute-based system, quota consumption depends on each prompt's complexity, the model used, the tools invoked, and the chat length. Users reported that complex prompts, especially those involving large files or Deep Research, drained their daily allowance too quickly, leading Vice President Josh Woodward to announce emergency changes.
Key adjustments include a cap on how much quota a single Gemini 3.1 Pro request can consume, so that one heavy task can't eat up the entire budget. All Gemini 3.1 Flash-Lite prompts are now free and won't count against usage limits. Google also clarified that failed requests will not be charged and promised more granular usage dashboards to help users track which tasks consume the most quota. Additionally, Omni video generation issues have been fixed, and AI Ultra users now get double the Omni output.
- Single Gemini 3.1 Pro requests now have a quota cap to prevent complex prompts from draining users' allowances.
- All Gemini 3.1 Flash-Lite prompts are free and don't count against usage limits.
- Failed requests won't be charged, and Google will add detailed usage breakdowns to the dashboard.
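To make the three rules above concrete, here is a minimal Python sketch of how per-request charging could work under them. Google has not published its cost formula or the cap value, so every number, the model identifier strings, and the names `PRO_REQUEST_CAP`, `Request`, `charged_units`, and `remaining_quota` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical values: the real cap and cost units are not public.
PRO_REQUEST_CAP = 20.0                    # assumed max units one Pro request may consume
PRO_MODEL = "gemini-3.1-pro"              # illustrative identifier, not an official model ID
FREE_MODELS = {"gemini-3.1-flash-lite"}   # illustrative identifier, not an official model ID


@dataclass
class Request:
    model: str
    raw_cost: float         # assumed uncapped compute cost, in quota units
    succeeded: bool = True  # failed requests are not charged


def charged_units(req: Request) -> float:
    """Return the quota units charged for one request under the three rules."""
    if not req.succeeded:
        return 0.0          # rule: failed requests are not charged
    if req.model in FREE_MODELS:
        return 0.0          # rule: Flash-Lite prompts are free
    if req.model == PRO_MODEL:
        return min(req.raw_cost, PRO_REQUEST_CAP)  # rule: per-request cap on Pro
    return req.raw_cost     # other models: charged as-is in this sketch


def remaining_quota(allowance: float, requests: list[Request]) -> float:
    """Subtract the charged units of every request from the allowance."""
    return allowance - sum(charged_units(r) for r in requests)


if __name__ == "__main__":
    history = [
        Request(PRO_MODEL, raw_cost=75.0),                  # heavy prompt, capped at 20
        Request("gemini-3.1-flash-lite", raw_cost=5.0),     # free
        Request(PRO_MODEL, raw_cost=10.0, succeeded=False), # failed, not charged
        Request(PRO_MODEL, raw_cost=8.0),                   # under the cap, charged 8
    ]
    print(remaining_quota(100.0, history))  # 72.0
```

In this toy history, the heavy prompt would have consumed 75 of 100 units without a cap. With the assumed cap it consumes 20, which is the kind of predictability the changes are meant to provide.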
Why It Matters
The changes make Gemini usage more predictable and fair for professionals who rely on complex AI tasks, reducing the risk of unexpectedly exhausting their quota.