
Inference Without a Round Trip
WebGPU shipped, quantization matured, and small-but-useful models exist. The browser is an LLM runtime now, and the consequences are bigger than the novelty.
10 MAY 2026 // 12 min read
TAGS // LLM