artificial-intelligence.
49 writings found
Latest Archives
When Machines Write Code, Humans Must Learn to Judge
As LLMs generate more code, teams face cognitive surrender and debt proliferation. The future isn't about writing code; it's about verification.
Meta's KernelEvolve: When AI Writes Its Own Performance Code
Meta's KernelEvolve system uses AI agents to automatically optimize low-level hardware kernels, achieving 60% performance gains in hours instead of weeks.
When Agents Write Code, We Judge It: The Verification Economy
As LLMs generate code at scale, our job shifts from writing to verifying. What does this mean for how we organize teams and think about programming?
Making Team Standards Executable: Infrastructure for AI-Assisted Development
AI coding tools produce wildly different results based on who's prompting. Treating team standards as versioned, executable instructions solves the consistency problem.
Why Your AI Benchmark Is Probably Wrong: The N,K Trade-off
Google Research reveals why using 3-5 human raters per item isn't enough for reproducible AI evaluation. The depth-versus-breadth problem, explained.
Meta's AI is Reshoring American Concrete, One Mix at a Time
How Bayesian optimization is helping U.S. concrete producers ditch imported cement and redesign mixes in days instead of months.
Code Review, Observability, and the Cognitive Cost of AI Amplification
Rethinking code review as product judgment, treating observability as our new IDE, and asking whether AI tools extend our capabilities or replace them entirely.
The Uncomfortable Ease of Profiling Users Through Their Public Comments
Building a tool to profile Hacker News users with LLMs reveals how much we leak through casual comments, and raises questions about digital footprints.
Google's Healthcare AI Push: From Screening Rooms to Source Code
Google Research unveils healthcare AI spanning breast cancer detection, agentic systems, and open-weight models. What it means for developers building in this space.
Can AI Actually Understand Physics? Google's Superconductivity Test Reveals Surprising Answers
Google tested six LLMs on expert-level physics questions. The results show which AI systems can handle real scientific research and which ones hallucinate.
Google's Flash Flood AI: Training on News Reports to Predict Urban Disasters
Google Research uses Gemini to extract flood data from news articles, creating an AI model that predicts flash floods 24 hours early across the Global South.
LLMs Don't Actually Push You Toward Boring Technology
Coding agents work surprisingly well with new, undocumented tools. The 'training data bias' concern might be overstated in 2026.
LLMs Don't Actually Care About Your Tech Stack
Modern coding agents work surprisingly well with new tools and private codebases, challenging the assumption that they're biased toward mainstream tech.
Why Coding Agents Might Not Lock Us Into Boring Technology After All
Modern LLMs can learn new tools on the fly through documentation and examples. The feared training data bias might be less of an issue than we thought.