I technically got an LLM running locally on a 1998 iMac G3 with 32 MB of RAM
A 260K parameter model runs on 25-year-old hardware using custom memory management and endian-swapping.
Developer Maddie Reese has achieved a remarkable technical feat by running a local large language model on a 1998 iMac G3 with only 32 MB of RAM. The project uses Andrej Karpathy's 260K-parameter TinyStories model, a scaled-down implementation of Meta's Llama 2 architecture that fits in a roughly 1 MB checkpoint. To get it working on the 25-year-old hardware, Reese cross-compiled the code on a modern Mac mini using Retro68 (a GCC-based toolchain targeting classic Mac OS), then transferred the files to the iMac via FTP over Ethernet.
Several significant technical challenges had to be overcome. The PowerPC processor required endian-swapping the model weights from little-endian to big-endian format. Memory management proved particularly difficult: Mac OS 8.5 gives applications a tiny memory partition by default, forcing Reese to use the MaxApplZone() and NewPtr() functions from the Mac Memory Manager to secure enough heap space. Additionally, the model's grouped-query attention configuration (4 key-value heads instead of 8) caused pointer alignment issues that produced NaN outputs until corrected.
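The article doesn't include the project's source, but the byte-swapping step it describes follows a common pattern: read the little-endian float32 checkpoint, then reverse the byte order of each 32-bit value in place for the big-endian PowerPC. The function names below (swap_u32, byteswap_f32_buffer) are hypothetical, a minimal sketch rather than the project's actual code.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Reverse the byte order of a 32-bit value (little-endian <-> big-endian). */
static uint32_t swap_u32(uint32_t v) {
    return ((v & 0x000000FFu) << 24) |
           ((v & 0x0000FF00u) << 8)  |
           ((v & 0x00FF0000u) >> 8)  |
           ((v & 0xFF000000u) >> 24);
}

/* Swap an array of float32 weights in place after reading the
   little-endian checkpoint on a big-endian PowerPC machine. */
static void byteswap_f32_buffer(float *buf, size_t count) {
    size_t i;
    for (i = 0; i < count; i++) {
        uint32_t bits;
        memcpy(&bits, &buf[i], sizeof(bits)); /* copy avoids strict-aliasing issues */
        bits = swap_u32(bits);
        memcpy(&buf[i], &bits, sizeof(bits));
    }
}
```

The same routine would be applied to every weight tensor as it is loaded, before any inference math touches the values.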
The final implementation uses static buffers for the KV cache and run state to avoid malloc failures on the limited 32 MB system. While the output is necessarily short due to hardware constraints, the project demonstrates that even extremely limited hardware can run modern model architectures with sufficient optimization. The program reads prompts from prompt.txt, tokenizes them using byte-pair encoding, runs inference, and writes continuations to output.txt, all on hardware that predates the smartphone era by nearly a decade.
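A rough sketch of what that static-buffer approach and prompt.txt/output.txt flow could look like is shown below. The dimension constants are assumptions based on typical 260K TinyStories configurations (in the real program they would come from the checkpoint header), and the inference step is omitted; only the file names come from the article.

```c
#include <stdio.h>

/* Assumed dimensions for a 260K TinyStories-class model; placeholders,
   not the project's actual configuration. */
#define DIM        64
#define N_LAYERS   5
#define N_KV_HEADS 4
#define HEAD_SIZE  (DIM / 8)               /* 8 query heads assumed */
#define KV_DIM     (N_KV_HEADS * HEAD_SIZE)
#define SEQ_LEN    512

/* Static buffers sidestep malloc failures in the small classic Mac OS heap. */
static float key_cache[N_LAYERS * SEQ_LEN * KV_DIM];
static float value_cache[N_LAYERS * SEQ_LEN * KV_DIM];
static float hidden_state[DIM];

int main(void) {
    char prompt[256];
    FILE *in = fopen("prompt.txt", "r");
    FILE *out = fopen("output.txt", "w");
    if (!in || !out) return 1;

    /* Read the prompt, then tokenize and run inference (omitted here),
       and write the generated continuation back out. */
    if (fgets(prompt, sizeof(prompt), in)) {
        fprintf(out, "(generated continuation for: %s)\n", prompt);
    }

    fclose(in);
    fclose(out);
    return 0;
}
```

Because the buffers are fixed-size globals, their memory is reserved when the application launches rather than requested at runtime, which is what makes the approach robust on a machine with so little headroom.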
- Runs Andrej Karpathy's 260K-parameter TinyStories model (Llama 2 architecture) from a checkpoint of just 1 MB
- Required endian-swapping the model weights from little-endian to big-endian for the PowerPC processor
- Used hand-rolled memory management (MaxApplZone() + NewPtr()) to expand the tiny default heap partition and fit within the machine's 32 MB of RAM under Mac OS 8.5 (see the sketch below)
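For readers unfamiliar with the classic Mac OS Memory Manager, the sketch below shows how the two calls named above are typically combined: MaxApplZone() grows the application heap to its full partition, and NewPtr() allocates a nonrelocatable block from it. The header name and checkpoint size are assumptions (this would only build with a classic Mac toolchain such as Retro68), and the code is illustrative rather than the project's own.

```c
#include <MacMemory.h>   /* classic Mac OS Memory Manager (Universal Interfaces / Retro68) */
#include <stdio.h>

#define CHECKPOINT_BYTES (1024L * 1024L)   /* ~1 MB model checkpoint (illustrative) */

int main(void) {
    Ptr weights;

    /* Grow the application heap to the full partition before any large
       allocations; the default partition is far too small for the model. */
    MaxApplZone();

    /* Allocate a nonrelocatable block from the Memory Manager
       instead of relying on malloc. */
    weights = NewPtr(CHECKPOINT_BYTES);
    if (weights == NULL || MemError() != noErr) {
        printf("Could not allocate %ld bytes for the model\n", CHECKPOINT_BYTES);
        return 1;
    }

    /* ... load the checkpoint into `weights` and run inference ... */

    DisposePtr(weights);
    return 0;
}
```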
Why It Matters
Demonstrates AI's potential on edge devices and resource-constrained hardware, pushing the boundaries of where models can run.