artificial-intelligence.
64 writings found
Latest Archives
Permission Hungry Agents and the Return to First Principles
ThoughtWorks Radar 34 reveals AI's paradox: we're racing forward while rediscovering software fundamentals, and our security models aren't ready.
Permission Hungry Agents and the Return to First Principles
ThoughtWorks' latest radar reveals AI isn't just pushing us forward; it's forcing us back to fundamentals like clean code and security basics.
DeepSeek V4: The Frontier Model That Costs Almost Nothing
DeepSeek drops V4 models that rival GPT and Gemini at a fraction of the cost. The efficiency gains are staggering, and they might run on my laptop.
Facebook's Hybrid Search: When Keywords Meet Neural Embeddings
Meta rebuilt Facebook Groups search by merging traditional keyword matching with dense vector embeddings, then used Llama 3 to validate the results at scale.
Reading Code in the Age of AI: Why Human Review Still Matters
ThoughtWorks Radar reveals AI's paradox: tools generate complexity faster than we can understand it. Time to revisit fundamentals.
The Permission Hungry Dilemma: When AI Agents Want Access to Everything
ThoughtWorks Radar 34 highlights a fundamental tension: the most useful AI agents need broad access, but our security guardrails haven't caught up yet.
Meta's AI Agents Are Now Fixing Their Own Performance Problems
How Meta built a unified AI platform that automates finding and fixing performance issues, recovering hundreds of megawatts without scaling headcount.
Claude's System Prompt Evolution: What Opus 4.7 Tells Us About AI Behavior Design
Anthropic's latest system prompt reveals a shift toward proactive AI behavior. I dig into what these changes mean for developers building with Claude.
PyCon US 2026: Why This Matters for Python and AI Engineering
PyCon returns to California with new AI and security tracks. What this shift means for the Python community and the future of technical conferences.
When Benchmarks Break: A Laptop Model Drew Better Pelicans Than Claude Opus
A quantized 21 GB model running locally outperformed Anthropic's flagship on SVG generation. What this tells us about AI benchmarks and model comparison.
When Benchmark Performance Stops Meaning What We Think It Means
A quantized local model outdraws Claude Opus 4.7 at pelicans on bicycles. What does that tell us about AI benchmarks? Probably nothing good.
The Virtue of Laziness: Why AI Threatens What Makes Us Good Engineers
LLMs lack the programmer's essential virtue of laziness. Without constraints, they generate complexity instead of elegant abstractions.
Meta's Muse Spark: A Tooled-Up Return to Frontier Models
Meta launches Muse Spark with 16 built-in tools, visual grounding, and Code Interpreter. But where's the open-source promise?
Meta's Muse Spark: They're Back in the Frontier Game (And the Tools Are Wild)
Meta drops Muse Spark with 16 powerful tools including visual grounding, Python sandbox, and Meta content search. Are they back in the race?
AI-Assisted Development: The Taste Problem
Why coding with AI agents works brilliantly for implementation but falls apart for API design. Lessons from building real systems with Claude.