llm.
25 writings found
Latest Archives
Why Google's AI Still Can't Spell 'Strawberry' Right
The search giant's latest AI errors expose a fundamental flaw in how LLMs process text
Datasette Agent Is the Bridge Between LLMs and Your Data
A technical look at how Datasette Agent brings AI querying to SQLite databases
Datasette Agent: Where Three Years of LLM Development Finally Clicked
The convergence of conversational AI and data querying opens new possibilities for personal data assistants and extensible tooling.
Datasette Agent: When Data Tools Meet Conversational AI
Three years of LLM work finally converges with Datasette. What this means for building AI systems around your data.
LLM 0.32a0: Rethinking Abstractions for Modern Language Models
Simon Willison's LLM library gets a major refactor to handle multi-modal inputs, streaming typed responses, and the messy reality of frontier models.
LLM 0.32a0: Why Your Prompt/Response Mental Model is Already Obsolete
Simon Willison's LLM library gets a major refactor to handle multi-modal inputs, streaming message parts, and the messy reality of modern AI models
LLM 0.32a0: When Your Abstraction Meets Reality's Complexity
Simon Willison's LLM library gets a major refactor to handle the messy, multi-modal world of modern AI models. Here's why abstractions always break.
Permission-Hungry Agents and the Return to First Principles
ThoughtWorks Radar 34 reveals AI's paradox: we're racing forward while rediscovering software fundamentals, and our security models aren't ready.
The Permission-Hungry Dilemma: When AI Agents Want Access to Everything
ThoughtWorks Radar 34 highlights a fundamental tension: the most useful AI agents need broad access, but our security guardrails haven't caught up yet.
Epic's AI NPCs in Fortnite: When Your Quest Giver Can Go Off-Script
Epic Games lets developers create AI-powered Fortnite characters with conversation capabilities. But there are some very specific rules about what they can't be.
When Benchmarks Break: A Laptop Model Drew Better Pelicans Than Claude Opus
A quantized 21GB model running locally outperformed Anthropic's flagship on SVG generation. What this tells us about AI benchmarks and model comparison.
When Benchmark Performance Stops Meaning What We Think It Means
A quantized local model outdraws Claude Opus 4.7 at pelicans on bicycles. What does that tell us about AI benchmarks? Probably nothing good.
Meta's Muse Spark: A Tooled-Up Return to Frontier Models
Meta launches Muse Spark with 16 built-in tools, visual grounding, and Code Interpreter. But where's the open source promise?
Meta's Muse Spark: A Developer's First Look at the Tool Arsenal
Meta returns to frontier models with Muse Spark. I got my hands dirty with its 16 tools, from visual grounding to Python sandboxes, and here's what matters.
Meta's Muse Spark: A Tool-Heavy Return to the Frontier Model Race
Meta releases Muse Spark with 16 tools, Code Interpreter, visual grounding, and Meta content search. But is a hosted-only model what we really wanted?