Tag: artificial-intelligence

64 writings found

Latest Archives

Permission-Hungry Agents and the Return to First Principles

ThoughtWorks Radar 34 reveals AI's paradox: we're racing forward while rediscovering software fundamentals, and our security models aren't ready.

DeepSeek V4: The Frontier Model That Costs Almost Nothing

DeepSeek drops V4 models that rival GPT and Gemini at a fraction of the cost. The efficiency gains are staggering, and they might run on my laptop.

Facebook's Hybrid Search: When Keywords Meet Neural Embeddings

Meta rebuilt Facebook Groups search by merging traditional keyword matching with dense vector embeddings, then used Llama 3 to validate the results at scale.

Reading Code in the Age of AI: Why Human Review Still Matters

ThoughtWorks Radar reveals AI's paradox: tools generate complexity faster than we can understand it. Time to revisit fundamentals.

The Permission-Hungry Dilemma: When AI Agents Want Access to Everything

ThoughtWorks Radar 34 highlights a fundamental tension: the most useful AI agents need broad access, but our security guardrails haven't caught up yet.

Meta's AI Agents Are Now Fixing Their Own Performance Problems

How Meta built a unified AI platform that automates finding and fixing performance issues, recovering hundreds of megawatts without scaling headcount.

Claude's System Prompt Evolution: What Opus 4.7 Tells Us About AI Behavior Design

Anthropic's latest system prompt reveals a shift toward proactive AI behavior. I dig into what these changes mean for developers building with Claude.

PyCon US 2026: Why This Matters for Python and AI Engineering

PyCon returns to California with new AI and security tracks. What this shift means for the Python community and the future of technical conferences.

When Benchmarks Break: A Laptop Model Drew Better Pelicans Than Claude Opus

A quantized 21GB model running locally outperformed Anthropic's flagship on SVG generation. What this tells us about AI benchmarks and model comparison.

When Benchmark Performance Stops Meaning What We Think It Means

A quantized local model outdraws Claude Opus 4.7 at pelicans on bicycles. What does that tell us about AI benchmarks? Probably nothing good.

The Virtue of Laziness: Why AI Threatens What Makes Us Good Engineers

LLMs lack the programmer's essential virtue of laziness. Without constraints, they generate complexity instead of elegant abstractions.

Meta's Muse Spark: A Tooled-Up Return to Frontier Models

Meta launches Muse Spark with 16 built-in tools, visual grounding, and Code Interpreter. But where's the open source promise?

Meta's Muse Spark: They're Back in the Frontier Game (And the Tools Are Wild)

Meta drops Muse Spark with 16 powerful tools including visual grounding, Python sandbox, and Meta content search. Are they back in the race?

AI-Assisted Development: The Taste Problem

Why coding with AI agents works brilliantly for implementation but falls apart for API design. Lessons from building real systems with Claude.
