Reddit user weighs upgrade to 8x 3090s for better LLM hosting
A hobbyist AI builder seeks advice on 192GB VRAM setups for models beyond Qwen 3.6 27B.
A Reddit user (anitamaxwynnn69) posted a detailed hardware upgrade question about their current 4x 3090 rig, which hosts Qwen 3.6 27B with a 128K context in full precision. They want a mid-tier upgrade that delivers a noticeable improvement in the models they can run, without spending $10k+ on a B6000. The two main candidates are adding four more 3090s (8 cards total, 192GB VRAM, ~$4k) or buying a single RTX B5000 (48GB VRAM, ~$4,200). They ask whether the B5000's VRAM-per-dollar math holds up against four more 3090s, and whether model providers (like those behind DSv4 or MiniMax M2.7) are targeting the 192GB tier for future releases.
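The VRAM-per-dollar comparison can be worked through with a few lines of arithmetic. The sketch below uses only the rough prices and capacities quoted in the post; the option labels and the `usd_per_gb` helper are illustrative names, and real street prices will vary.

```python
# Figures are the poster's rough estimates, not verified market prices.
options = {
    "4x RTX 3090 (added)": {"vram_gb": 4 * 24, "price_usd": 4000},
    "1x RTX B5000": {"vram_gb": 48, "price_usd": 4200},
}


def usd_per_gb(vram_gb, price_usd):
    """Dollars spent per gigabyte of added VRAM."""
    return price_usd / vram_gb


for name, spec in options.items():
    print(f"{name}: +{spec['vram_gb']} GB at ${usd_per_gb(**spec):.2f}/GB")

# Total VRAM if the four extra 3090s join the existing four.
total_vram_gb = 8 * 24
print(f"8x 3090 total: {total_vram_gb} GB")
```

On these numbers the extra 3090s come in at roughly half the cost per gigabyte of the B5000. The calculation ignores power draw, slot count, and architecture differences.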
Beyond cost, the user highlights two technical constraints: running DSv4 on the 3090s' Ampere architecture may be painful, and with 8 cards the slowest PCIe link would be 4.0 x8. The rig is for personal tinkering rather than heavy production; the user codes for a living and enjoys building machines. They plan to power the expanded setup from two separate circuits and power-limit each card to 220W. The post has sparked community discussion on whether 192GB VRAM setups are future-proof for open-source models like Qwen, Llama, and potential MoE architectures.
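The power and PCIe figures are easy to sanity-check. The sketch below takes the 8-card count and 220W limit from the post; the circuit rating (120 V, 15 A, derated to 80% for continuous load) is an assumption for illustration, since the post does not say what circuits the user has. The PCIe estimate uses the PCIe 4.0 rate of 16 GT/s per lane with 128b/130b encoding and gives a theoretical ceiling per direction, not measured throughput.

```python
GPU_COUNT = 8
POWER_LIMIT_W = 220  # per-card limit stated in the post

# GPU draw only; CPU, drives, fans, and PSU losses come on top.
gpu_power_w = GPU_COUNT * POWER_LIMIT_W

# ASSUMPTION: two 120 V / 15 A circuits at 80% continuous load.
CIRCUITS = 2
circuit_budget_w = 120 * 15 * 0.8
headroom_w = CIRCUITS * circuit_budget_w - gpu_power_w


def pcie_gb_per_s(gt_per_s, lanes, encoding=128 / 130):
    """Theoretical one-direction PCIe bandwidth in GB/s."""
    return gt_per_s * lanes * encoding / 8


print(f"GPU power at limit: {gpu_power_w} W")
print(f"Headroom across {CIRCUITS} assumed circuits: {headroom_w:.0f} W")
print(f"PCIe 4.0 x8:  {pcie_gb_per_s(16, 8):.2f} GB/s per direction")
print(f"PCIe 4.0 x16: {pcie_gb_per_s(16, 16):.2f} GB/s per direction")
```

Under these assumptions the GPUs alone draw 1760W at their limits, more than a single assumed circuit could sustain, which is consistent with the plan to split the load across two.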
- Adding 4 more 3090s costs ~$4k for 96GB VRAM; an RTX B5000 costs $4,200 for only 48GB.
- DSv4 may underperform on Ampere (3090s) due to architecture limitations.
- With 8 cards, the slowest PCIe link would be 4.0 x8, which could limit inter-GPU communication.
Why It Matters
This decision reflects a broader trade-off for hobbyist AI builders: more VRAM per dollar from older cards like the 3090, or a newer architecture with less memory for a similar price.