Open Source

A few early (and somewhat vague) LLM benchmark comparisons between the M5 Max MacBook Pro and other laptops - Hardware Canucks

Hardware Canucks' early tests show the M5 Max running Llama 3.1 405B roughly 2x faster than the M4 Max.

Deep Dive

Hardware Canucks has released some of the first performance benchmarks for Apple's new M5 Max chip, specifically testing its capabilities for running large language models (LLMs) locally. The early results, while noted as preliminary, show the M5 Max running the massive 405-billion-parameter Llama 3.1 model at approximately twice the speed of the previous-generation M4 Max chip. This generational leap in AI inference performance signals Apple's continued focus on optimizing its silicon for machine learning workloads, a critical battleground for professional laptops.

Beyond raw speed, the benchmarks also highlight Apple's traditional strength in performance-per-watt. The tests suggest the M5 Max delivers this 2x performance boost while maintaining a substantial efficiency lead over competing high-end x86 laptops from manufacturers like ASUS and MSI. For AI developers, researchers, and content creators, this translates to the ability to run more complex models—like code generators or large multimodal models—directly on a portable machine without relying on cloud APIs, potentially speeding up workflows and reducing costs.
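As a concrete illustration of that workflow, a minimal local-inference sketch using the open-source mlx-lm package might look like the following. The model ID, prompt, and generation settings are illustrative assumptions; the benchmarks do not specify which toolchain was used.

    # Minimal local-inference sketch with mlx-lm (Apple's MLX LLM utilities).
    # Model ID and settings are illustrative assumptions, not details from
    # the Hardware Canucks tests.
    from mlx_lm import load, generate

    # Load a quantized model from the Hugging Face Hub into unified memory.
    model, tokenizer = load("mlx-community/Meta-Llama-3.1-8B-Instruct-4bit")

    # Run a single prompt entirely on-device; no cloud API is involved.
    prompt = "Write a Python function that parses an ISO 8601 timestamp."
    response = generate(model, tokenizer, prompt=prompt, max_tokens=256)
    print(response)

The same pattern applies to other local runtimes such as llama.cpp or Ollama; the point is that the entire loop runs in the laptop's unified memory.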

The video analysis points to the M5 Max's increased Neural Engine core count and memory bandwidth as likely contributors to these gains. While comprehensive testing across a wider suite of models and tasks is still needed, these initial numbers position the new MacBook Pro as a potentially dominant machine for on-device AI. This performance push accelerates the trend of powerful AI moving from the data center to the laptop, changing the tools available to tech professionals.
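To see why memory bandwidth matters so much here: generating each token of a dense model requires streaming roughly the entire set of weights through memory, so decode throughput is approximately bandwidth divided by model size. Here is a rough back-of-envelope sketch using purely illustrative numbers, not measured M5 Max specifications or Hardware Canucks' results:

    # Back-of-envelope estimate of decode speed for a dense LLM.
    # All figures are illustrative assumptions, not measured M5 Max specs.
    params = 405e9                  # Llama 3.1 405B parameters
    bytes_per_param = 0.5           # ~4-bit quantization
    model_bytes = params * bytes_per_param       # ~202.5 GB of weights

    bandwidth = 500e9               # hypothetical unified-memory bandwidth, bytes/s

    # Each new token reads roughly the full weight set once, so:
    tokens_per_second = bandwidth / model_bytes
    print(f"~{tokens_per_second:.1f} tokens/s")  # ~2.5 tokens/s

Under this simple model, doubling effective bandwidth roughly doubles tokens per second, which is consistent with the video citing memory bandwidth as a likely contributor to the 2x result.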

Key Points
  • M5 Max runs Llama 3.1 405B 2x faster than the M4 Max in early tests.
  • Shows strong performance-per-watt advantage over high-end x86 laptops from ASUS and MSI.
  • Enables faster local inference for large models, reducing reliance on cloud APIs.

Why It Matters

Faster local AI unlocks new workflows for developers and creators, making powerful models portable and cost-effective.