Apple unveils M5 Pro and M5 Max, citing up to 4× faster LLM prompt processing than M4 Pro and M4 Max

New Neural Engine handles 48 trillion operations per second, targeting on-device inference for GPT-4o-class LLMs.

Deep Dive

Apple has officially announced the M5 Pro and M5 Max, the next evolution in its Apple Silicon lineup, with a primary focus on dramatically accelerating on-device artificial intelligence workloads. The company cites up to a 4× improvement in Large Language Model (LLM) prompt-processing speed over the previous-generation M4 Pro and M4 Max chips. The leap is powered by a completely redesigned Neural Engine, now capable of 48 TOPS (trillion operations per second), plus CPU and GPU cores with enhanced support for AI matrix operations. The announcement positions Apple's high-end Macs as premier platforms for developers and professionals working with local AI models, competing directly with AI PCs built on NVIDIA and Qualcomm silicon.
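A quick back-of-the-envelope check, using only the figures cited in this article rather than any official Apple breakdown, puts those numbers in perspective: the Neural Engine's jump from the M4 generation's 38 TOPS (see Key Points below) to 48 TOPS is only about a 1.26× gain in raw throughput, so most of the claimed 4× prompt-processing speedup would have to come from elsewhere in the chip.

    # Rough sanity check using only the figures cited in this article.
    m4_tops = 38.0   # M4 Neural Engine (see Key Points below)
    m5_tops = 48.0   # M5 Neural Engine
    raw_ratio = m5_tops / m4_tops    # ~1.26x from raw TOPS alone
    claimed = 4.0                    # Apple's "up to 4x" prompt-processing claim
    residual = claimed / raw_ratio   # ~3.2x implied by non-Neural-Engine changes
    print(f"raw TOPS gain: {raw_ratio:.2f}x; implied factor from other changes: {residual:.2f}x")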

The technical upgrades extend beyond raw TOPS. Apple has implemented a larger, more efficient memory subsystem and new machine-learning accelerators within the CPU to speed up inference for open-weight models such as Meta's Llama 3. As a result, complex AI tasks, such as generating code, editing video with AI-powered tools, or running a local chatbot, complete in a fraction of the time, all while preserving user privacy by keeping data on the device. The M5 Pro and M5 Max are expected to debut in upcoming MacBook Pro and Mac Studio models this fall, underlining Apple's commitment to making the Mac a powerhouse for generative AI.
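For developers who want to try this workflow today, Apple's open-source MLX framework already runs quantized open-weight models on Apple Silicon. The sketch below is a minimal example, assuming the mlx-lm Python package (pip install mlx-lm) and a sample 4-bit Llama 3 checkpoint from the mlx-community hub; nothing in it is specific to the M5, which would simply run the same code faster.

    # Minimal on-device inference sketch with Apple's MLX framework.
    # Assumes: pip install mlx-lm; the model ID below is one example
    # 4-bit quantized checkpoint from the mlx-community hub.
    from mlx_lm import load, generate

    # Downloads the weights on first run; after that, everything stays local.
    model, tokenizer = load("mlx-community/Meta-Llama-3-8B-Instruct-4bit")

    prompt = "Summarize the benefits of on-device LLM inference."
    response = generate(model, tokenizer, prompt=prompt, max_tokens=200)
    print(response)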

Key Points
  • Next-gen Neural Engine delivers 48 TOPS, up from the M4 generation's 38 TOPS.
  • Claims up to 4× faster LLM prompt processing vs. M4 Pro/Max on GPT-4o-class workloads.
  • Targets on-device AI to reduce cloud dependency, enhancing speed and privacy for Mac pros.

Why It Matters

Enables professional-grade, private AI work directly on Macs, challenging cloud-dependent workflows and AI PC rivals.