Media & Culture

Skymizer's HTX301 runs 700B LLMs locally at 240W using older LPDDR4 tech

A budget PCIe card from a tiny company can handle 700-billion-parameter AI models...

Deep Dive

Skymizer, a relatively unknown player, has unveiled the HTX301, a PCIe AI accelerator that runs counter to the prevailing approach to AI hardware. According to the company, the card can handle language models with up to 700 billion parameters, a scale normally associated with accelerators such as Nvidia's H100 or AMD's MI300X, yet it consumes just 240 watts and relies on decade-old 28-nanometer process technology and standard LPDDR4 or LPDDR5 memory. This is a stark contrast to the industry norm of pairing cutting-edge nodes with expensive HBM (High Bandwidth Memory) to serve models of this size. The HTX301 plugs into a standard PCIe slot, making it accessible for workstations or servers without specialized cooling or power infrastructure.
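To put the 700-billion-parameter figure in perspective, the back-of-the-envelope sketch below estimates how much memory the weights alone would occupy at a few common precisions. The article does not state the precision the HTX301 uses or how much memory it carries, so the bit widths here are assumptions chosen for illustration, and the function name is hypothetical.

```python
def weight_footprint_gb(params_billion: float, bits_per_weight: int) -> float:
    """Memory needed for the model weights alone, in GB (10^9 bytes).

    Ignores the KV cache, activations, and any runtime overhead.
    """
    total_bytes = params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Assumed precisions; the source does not say which one the card uses.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_footprint_gb(700, bits):.0f} GB")
```

Even at 4 bits per weight, the weights come to roughly 350 GB, which suggests why memory capacity, and therefore the cost per gigabyte of that memory, matters so much for a card aimed at models of this size.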

The implications are significant for the AI hardware landscape. While Nvidia and AMD battle over premium, high-power accelerators, Skymizer targets a neglected segment: cost-sensitive inference workloads. The HTX301's use of commodity memory and mature silicon dramatically lowers the bill of materials, potentially making local, private inference of massive models viable for small teams or edge deployments. However, benchmarks and real-world performance remain unverified; the trade-off for such low power may be lower throughput or higher latency than HBM-based rivals deliver. If Skymizer delivers on its claims, it could force incumbents to reconsider their pricing and architecture strategies as demand for on-premises AI grows.
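The throughput trade-off can be made concrete with a simple bandwidth-bound estimate: when generating one token at a time, a dense model has to read all of its weights from memory for every token, so memory bandwidth divided by weight size gives a rough ceiling on tokens per second. The sketch below is a minimal illustration under stated assumptions. The bandwidth tiers are hypothetical placeholders, not specifications of the HTX301 or any other product, and the 350 GB weight size assumes a dense 700-billion-parameter model stored at 4 bits per weight.

```python
def max_decode_tokens_per_s(bandwidth_gb_s: float, gb_read_per_token: float) -> float:
    """Upper bound on single-stream decode speed when memory bandwidth is the bottleneck."""
    return bandwidth_gb_s / gb_read_per_token


# Assumption: dense 700B-parameter model at 4 bits per weight = 350 GB read per token.
WEIGHTS_GB = 350.0

# Hypothetical bandwidth tiers for illustration only; not vendor specifications.
for label, bandwidth in (("commodity-class", 500.0), ("HBM-class", 3000.0)):
    bound = max_decode_tokens_per_s(bandwidth, WEIGHTS_GB)
    print(f"{label} ({bandwidth:.0f} GB/s): at most {bound:.1f} tokens/s")
```

The ceiling scales linearly with bandwidth, which is why HBM commands a premium. Batching several requests together or using a mixture-of-experts model, where only part of the weights is read per token, raises the effective bound, so real results could differ considerably from this estimate.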

Key Points
  • Supports LLMs with up to 700 billion parameters locally on a single PCIe card.
  • Consumes only 240W, using 28nm chips and LPDDR4/LPDDR5 memory instead of expensive HBM.
  • Challenges Nvidia and AMD with a low-power, low-cost alternative for large-scale inference.

Why It Matters

Could enable cost-effective local inference of massive LLMs, reducing dependence on expensive, power-hungry data center GPUs.