Monday, March 24, 2025

Thinking Different, Thinking Slowly: LLMs on a PowerPC Mac

There is something incredibly satisfying about breathing new life into old hardware. Vintage computing is one of my favorite hobbies, and the challenge of coaxing modern software onto systems designed decades ago is a puzzle I cannot resist. Lately I have been diving into the world of large language models (LLMs), and a question began to gnaw at me: could I bring the cutting edge of AI to the nostalgic glow of my trusty 2005 PowerBook G4? Armed with a 1.5GHz processor, a full gigabyte of RAM, and a limiting 32-bit address space, I embarked on an experiment that actually yielded results. I managed to run LLM inference on this classic piece of Apple history, proving that even yesteryear's hardware can have a taste of tomorrow's AI.

[Photo: PowerBook G4 running TinyStories 110M Llama2 LLM inference]
I started by reviewing Andrej Karpathy's llama2.c project, a brilliant implementation of Llama2 LLM inference in a single file of vanilla C. No accelerators here: performance is traded for simplicity, which makes it easy to understand how inference is carried out.
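
To give a flavor of that simplicity, here is roughly what the hot loop looks like: a plain matrix-vector multiply in portable C, in the spirit of the matmul routine in llama2.c (a minimal sketch, not a verbatim copy).

```c
#include <stddef.h>

// Matrix-vector multiply: xout = W @ x, where W is (d,n) row-major.
// Nearly all of the inference time is spent in loops like this one.
void matmul(float *xout, const float *x, const float *w, int n, int d) {
  for (int i = 0; i < d; i++) {
    float val = 0.0f;
    for (int j = 0; j < n; j++) {
      val += w[(size_t)i * n + j] * x[j];
    }
    xout[i] = val;
  }
}
```

Every call is a dense walk over the weights, so on hardware like the G4 performance lives and dies by memory bandwidth and whatever the compiler can squeeze out of this inner loop.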

I forked the core implementation into a project I have named ullm. The core algorithm remains the same, but I spent time improving a few aspects of the code so that it would stand up to abuse a little better.
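
As one example of the kind of hardening I mean, here is a minimal sketch (my own illustration; the UllmStatus type and alloc_buffer helper are hypothetical names, not necessarily how ullm is structured) of propagating status codes to the caller rather than exiting the process on an allocation failure, as llama2.c's helpers do:

```c
#include <stdio.h>
#include <stdlib.h>

// Hypothetical status type for illustration; ullm's actual error
// handling may differ.
typedef enum {
  ULLM_OK = 0,
  ULLM_OUT_OF_MEMORY,
  ULLM_IO_ERROR,
} UllmStatus;

// Allocate an activation buffer, reporting failure to the caller
// instead of printing and calling exit() the way llama2.c does.
UllmStatus alloc_buffer(float **out, size_t count) {
  float *buf = calloc(count, sizeof(float));
  if (buf == NULL) {
    return ULLM_OUT_OF_MEMORY;
  }
  *out = buf;
  return ULLM_OK;
}

int main(void) {
  float *logits = NULL;
  if (alloc_buffer(&logits, 32000) != ULLM_OK) {
    fprintf(stderr, "failed to allocate logits buffer\n");
    return EXIT_FAILURE;
  }
  // ... run inference ...
  free(logits);
  return EXIT_SUCCESS;
}
```

On a 32-bit machine with a single gigabyte of RAM, allocation failures are a real possibility rather than a theoretical one, so plumbing like this pays for itself.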