benchmarks.
2 writings found
Latest Archives
When Benchmarks Break: A Laptop Model Drew Better Pelicans Than Claude Opus
A quantized 21GB model running locally outperformed Anthropic's flagship on SVG generation. What this tells us about AI benchmarks and model comparison.
When Benchmark Performance Stops Meaning What We Think It Means
A quantized local model outdraws Claude Opus 4.7 at pelicans on bicycles. What does that tell us about AI benchmarks? Probably nothing good.
Prev
Page 1 of 1 Next