Qwen 3.6 35B A3B requires 10–20GB VRAM depending on quantization, pushing most users to cloud GPUs?

Qwen 3.6 35B A3B requires 10–20GB VRAM depending on quantization, pushing most users to cloud GPUs.

Estimated cloud costs?

$0.20–$1.60/hour on platforms like Vast.ai or RunPod using A100 or H100 GPUs.

For continuous use (24/7), monthly costs range from $300–$600; per-token serverless APIs are cheaper for sporadic use.

Open Source

r/LocalLLaMA May 04, 2026

⚡As hardware evolves, devs ask: how much to rent cloud GPUs for Qwen 3.6 35B?

Deep Dive

A Reddit user asks how much it will cost to host the Qwen 3.6 35B A3B model for coding until they upgrade their hardware.

Key Points

Qwen 3.6 35B A3B requires 10–20GB VRAM depending on quantization, pushing most users to cloud GPUs.
Estimated cloud costs: $0.20–$1.60/hour on platforms like Vast.ai or RunPod using A100 or H100 GPUs.
For continuous use (24/7), monthly costs range from $300–$600; per-token serverless APIs are cheaper for sporadic use.

Affordable cloud hosting for 35B MoE models opens coding AI to users without high-end GPUs.