
Inference Without a Round Trip
WebGPU shipped, quantization matured, and small-but-useful models exist. The browser is an LLM runtime now, and the consequences are bigger than the novelty.
10 MAY 202612 min read
READTAGS // PERFORMANCE
3 posts filed under this tag.

WebGPU shipped, quantization matured, and small-but-useful models exist. The browser is an LLM runtime now, and the consequences are bigger than the novelty.

Why most websites running WordPress don't need it, how Hugo wins on performance, cost, security, and complexity, and why LLMs just closed the last gap that was keeping WordPress in play.

Cheng Lou's new 15KB library rewrites the rules of text measurement. Pure arithmetic, no DOM thrashing, and it's already going viral.