
Inference Without a Round Trip
WebGPU shipped, quantization matured, and small-but-useful models exist. The browser is an LLM runtime now, and the consequences are bigger than the novelty.
I design software systems for a living and build personal projects to stay sharp. This is where I write about both and whatever else has caught my attention. If that sounds interesting, stick around.
RECENT WRITING

WebGPU shipped, quantization matured, and small-but-useful models exist. The browser is an LLM runtime now, and the consequences are bigger than the novelty.

A 2017 in-place optimization in the kernel's crypto subsystem turned out to be a 4-byte page-cache write primitive. Disclosed last week, exploitable from any unprivileged shell, fixed by a single mainline commit.
SIDE PROJECTS