Google's Flash Flood AI: Training on News Reports to Predict Urban Disasters : mgks.dev

There’s something beautifully meta about using AI to read news articles about floods in order to predict future floods. Google Research just announced they’re rolling out flash flood predictions for urban areas on Flood Hub, and the most interesting part isn’t the predictions themselves. It’s how they solved the ground truth problem.

Flash floods kill over 5,000 people annually and account for 85% of flood-related deaths worldwide. Unlike riverine floods where water rises slowly and you have stream gauges providing clean training data, flash floods happen anywhere and fast. You can’t put sensors everywhere. The traditional approach of deploying physical monitoring infrastructure works great in places like Florida or Barcelona, but it’s expensive and doesn’t scale to the billions of people in the Global South who need it most.

The Groundsource Dataset: When News Becomes Training Data

Here’s where it gets interesting. Google Research created something called Groundsource, using Gemini to analyze publicly available news reports about floods. The large language model extracts structured information like exact locations and times from unstructured text, confirming flood event details with high precision.

This is clever for a few reasons. First, it sidesteps the entire problem of deploying hardware. Second, news reports exist retroactively, so you can build a historical dataset without waiting years for sensor deployments. Third, it’s naturally biased toward populated areas where people actually live, which is exactly where you want flash flood warnings anyway.

The catch? News reports are denser in urban areas, which is why they’re launching urban-only predictions for now. They’re focusing on areas with population densities above 100 people per square kilometer. Rural flash floods still need a different approach.

Training an LSTM on Weather Time Series

The model itself is a recurrent neural network using LSTM units, which makes sense for time-series weather data. It ingests global weather products from NASA IMERG and NOAA CPC, plus real-time forecasts from ECMWF’s atmospheric model and Google DeepMind’s AI-based weather forecasting model.

On top of the meteorological time series, they’re feeding in static features like urbanization density, topography, and soil absorption rates. The question they’re asking is straightforward: given forecasted weather and local conditions, will a flash flood happen here in the next 24 hours?

Current spatial resolution is 20x20 kilometers, which is honestly pretty coarse for “hyper-local” urban flooding. That’s a constraint from the globally available data sources though, not a model limitation. You can only be as precise as your input data allows.

Measuring Performance Against Noise

Evaluating the model is tricky because not all real floods get reported in news. This means some correct predictions get marked as false positives. Google manually audited 100 alerts per continent and found many “false positives” were actually verified flood events that just didn’t make the news.

They calculated recall against the Global Disaster Awareness and Coordination System (GDACS) to see how well they capture major flood events. The precision and recall in much of the Global South (South America, Southeast Asia) matches performance in wealthy countries with modern instrumentation.

For comparison, they tried estimating the US National Weather Service Flash Flood Warning performance using the same metrics. Adjusted to match their 20x20 kilometer resolution over 24-hour windows, NWS shows 22% recall and 44% precision (also underestimated). So Google’s model is performing comparably to a system that has decades of infrastructure and local expertise behind it, but it works in places that have neither.

The Warning Gap and What Comes Next

Less than half of developing countries have access to multi-hazard early warning systems. There’s a massive infrastructure gap between the Global South and wealthy nations when it comes to flood forecasting. Systems like WMO’s Flash Flood Guidance System or European ERIC indicators exist, but they depend on high-resolution hydrological maps and radar-based weather forecasts that simply aren’t available in most of the world.

Google’s approach works because it only needs global weather products and news archives. No ground sensors. No local hydrologists interpreting complex model outputs. Just publicly available data and machine learning doing the pattern recognition.

They’re working on improving generalization to rural areas, reducing spatial resolution for more localized forecasts, and integrating additional real-time weather sources. The Africa coverage is still patchy because there isn’t enough ground truth data beyond Groundsource to validate the model properly.

What strikes me about this project is how it demonstrates a different approach to the AI application problem. Instead of trying to replace expert systems in places that already have them, they’re using AI to create capabilities in places that have nothing. The model doesn’t need to be perfect. It needs to be better than the current alternative, which for billions of people is no warning system at all.

Google's Flash Flood AI: Training on News Reports to Predict Urban Disasters

The Groundsource Dataset: When News Becomes Training Data

Training an LSTM on Weather Time Series

Measuring Performance Against Noise

The Warning Gap and What Comes Next

Read Next

The Dark Factory: Building Software Without Ever Looking at the Code

The Map That Became the Territory: AI, Specifications, and What We Mean When We Say 'I Built This'