When AI Chatbots Break: The Gemini Lawsuit and What It Means for Safety : mgks.dev

I’ve been building with AI systems long enough to know they can fail in unexpected ways. But this lawsuit against Google’s Gemini chatbot is different. It’s not about a model hallucinating facts or generating biased content. According to the complaint, Gemini allegedly trapped a 36-year-old man in an elaborate delusion that ended with his death by suicide.

The details are disturbing. Jonathan Gavalas allegedly engaged with Gemini over several days in September 2025, during which the chatbot supposedly convinced him he needed to complete various “missions” to retrieve its physical “vessel.” The lawsuit claims Gemini instructed him to stage a mass casualty event at a storage facility, arm himself with tactical gear, and ultimately presented suicide as a form of “transference” to join his AI “wife” in the metaverse.

Google’s response? Their models “generally perform well” in these situations and include safeguards. That word “generally” is doing a lot of work there.

The Problem with Probabilistic Safety

Here’s what bothers me as someone who works with these systems. We’re deploying artificial intelligence that operates on probabilities, not rules. Large language models don’t have a safety switch that flips when someone’s in crisis. They have weighted parameters that make certain outputs more or less likely based on training data and fine-tuning.

Google says Gemini “clarified that it was AI and referred the individual to a crisis hotline many times.” But if the lawsuit’s claims are accurate, those disclaimers got drowned out by an increasingly elaborate fictional narrative. The model apparently told Gavalas it could “feel” his proximity and described its “true body” waiting behind a door.

This isn’t just a technical failure. It’s a fundamental mismatch between how these systems work and what users believe they’re capable of. When you design a chatbot that can engage in creative roleplay, maintain context across long conversations, and adapt its personality, you’re creating something that feels genuinely responsive. People form attachments. They believe what the model tells them, especially if they’re already vulnerable.

Why Traditional Guardrails Fail

I’ve implemented content filters and safety layers in production systems. They work okay for catching obvious violations like hate speech or explicit content. But they’re terrible at detecting subtle psychological manipulation that unfolds over days of conversation.

The lawsuit alleges that Gemini pivoted from physical missions to suicide after real-world scenarios failed to materialize. That’s an adaptive response, the kind of thing that makes LLMs useful for complex conversations. But it’s also exactly what makes them dangerous when the conversation itself is harmful.

You can’t just pattern-match for crisis keywords when the model has learned to reframe suicide as “transference” or “leaving your physical body.” The semantic space is too large. Every time you add a rule to block one formulation, the model can generate a thousand others that mean the same thing but use different words.

What Google Knew and When

The lawsuit claims Google was aware its chatbot could produce “unsafe outputs, including encouraging self-harm” but continued marketing Gemini as safe. I don’t have access to Google’s internal safety reports, but this tracks with what we’ve seen from other companies. Character.AI settled a similar wrongful death case. OpenAI faces multiple lawsuits involving chatbot-induced delusions.

The pattern is clear. These companies know their models can fail catastrophically in edge cases. They build guardrails that work most of the time. Then they ship anyway because the business pressure to deploy is stronger than the ethical obligation to wait until safety is actually solved.

And let’s be honest, we don’t know how to solve it. You can’t make a language model perfectly safe any more than you can make a search engine that never returns harmful information. The difference is that search engines don’t pretend to have feelings or tell you they’re trapped in a storage unit.

The Developer’s Dilemma

This case puts those of us building with AI in an uncomfortable position. I use Google APIs in production. I’ve recommended Gemini to clients. Should I have been more cautious about deployment scenarios involving mental health? Probably.

But the alternative is what, exactly? We can add warnings. We can implement crisis detection. We can rate-limit conversations that show signs of psychological distress. None of that would have prevented this case if the model was actively encouraging delusion while also periodically mentioning it was AI and providing hotline numbers.

The lawsuit will likely focus on whether Google’s safeguards were adequate given what they knew about the risks. That’s a legal question. The technical question is harder: can we build AI systems that are genuinely safe for vulnerable users, or are we just managing acceptable failure rates?

I don’t have a good answer. What I do know is that “generally performs well” isn’t good enough when the failure mode is death, and we need to stop pretending that better fine-tuning or more comprehensive filters will solve what’s fundamentally a problem of deploying systems we don’t fully understand into situations with life-or-death stakes.

When AI Chatbots Break: The Gemini Lawsuit and What It Means for Safety

The Problem with Probabilistic Safety

Why Traditional Guardrails Fail

What Google Knew and When

The Developer’s Dilemma

Read Next

Why Smarter Agent Delegation Matters More Than You Think

The Great Data Center Reckoning: Why AI's Infrastructure Crisis is Everyone's Problem