Tag: llm

18 writings found

Latest Archives

Permission Hungry Agents and the Return to First Principles

ThoughtWorks Radar 34 reveals AI's paradox: we're racing forward while rediscovering software fundamentals, and our security models aren't ready.

The Permission Hungry Dilemma: When AI Agents Want Access to Everything

ThoughtWorks Radar 34 highlights a fundamental tension: the most useful AI agents need broad access, but our security guardrails haven't caught up yet.

Epic's AI NPCs in Fortnite: When Your Quest Giver Can Go Off-Script

Epic Games lets developers create AI-powered Fortnite characters with conversation capabilities. But there are some very specific rules about what they can't be.

When Benchmarks Break: A Laptop Model Drew Better Pelicans Than Claude Opus

A quantized 21GB model running locally outperformed Anthropic's flagship on SVG generation. What this tells us about AI benchmarks and model comparison.

When Benchmark Performance Stops Meaning What We Think It Means

A quantized local model outdraws Claude Opus 4.7 at pelicans on bicycles. What does that tell us about AI benchmarks? Probably nothing good.

Meta's Muse Spark: A Tooled-Up Return to Frontier Models

Meta launches Muse Spark with 16 built-in tools, visual grounding, and Code Interpreter. But where's the open source promise?

Meta's Muse Spark: A Developer's First Look at the Tool Arsenal

Meta returns to frontier models with Muse Spark. I got my hands dirty with its 16 tools, from visual grounding to Python sandboxes, and here's what matters.

Meta's Muse Spark: A Tool-Heavy Return to the Frontier Model Race

Meta drops Muse Spark with 16 tools, Code Interpreter, visual grounding, and Meta content search. But is a hosted-only model what we really wanted?

When Machines Write Code, Humans Must Learn to Judge

As LLMs generate more code, teams face cognitive surrender and debt proliferation. The future isn't about writing code; it's about verification.

Building macOS Apps Without Knowing Swift: What Vibe Coding Actually Teaches Us

I built two monitoring tools for my M5 MacBook using Claude and GPT without writing Swift myself. The results work, but should they?

LLMs Don't Actually Care About Your Tech Stack

Modern coding agents work surprisingly well with new tools and private codebases, challenging the assumption that they're biased toward mainstream tech.

Why Coding Agents Might Not Lock Us Into Boring Technology After All

Modern LLMs can learn new tools on the fly through documentation and examples. The feared training data bias might be less of an issue than we thought.

LLMs Don't Care About Your Tech Stack Anymore

Modern coding agents work surprisingly well with new and obscure tools. The fear that AI would lock us into boring, popular tech seems outdated.

The Map That Became the Territory: AI, Specifications, and What We Mean When We Say 'I Built This'

On AI agents, observability, bespoke software, and the uncomfortable question of who actually built what when LLMs generate our code.

Just-in-Time Tests: When AI Writes Your Tests Right Before Deployment

Meta's Catching JiTTests uses LLMs to auto-generate tests on demand, targeting regressions without maintenance overhead. A radical shift in testing philosophy.

View all writings →