Cooperative Institute for Research to Operations in Hydrology

Developing and benchmarking data assimilation methods on a standardized testbed

Principal Investigator: Chaopeng Shen
Research Team: Grey Nearing, Martyn Clark, Andrew Wood, Ming Pan, Seann Reed
Institutions: Pennsylvania State University, University of California - Davis, University of Calgary, Colorado School of Mines, University of California - San Diego, Middle Atlantic River Forecast Center
Start Date: June 1, 2023 | End Date: May 31, 2025
Research Theme:

Data assimilation (DA) is a widely used and highly beneficial technique for improving forecast quality: it absorbs information from recent observations to update or correct model states so that the model produces better near-term forecasts. There is currently no standard community testbed for evaluating DA methods in hydrology across different models and climatic regimes. Streamflow data assimilation has traditionally been challenging because of time delays and poorly understood space-time and intervariable covariances. Recently, machine learning (ML) and physics-informed ML hydrologic models (more specifically, “differentiable” models) have excelled at simulating hydrologic variables. ML and physics-informed ML methods also work through different mechanisms than traditional ones: they can produce novel gradient information that gives rise to new opportunities. In addition, several ML-based DA methods proposed for streamflow have shown great promise. The project will develop DA methods for ML and differentiable hydrologic models, and benchmark them against traditional DA methods and traditional hydrologic process models. We will build out the current testbed project (led by Co-PI Wood) to include a focus on DA methods, so that different hydrologic models can be coupled modularly to different DA methods and benchmarked. We will address several outstanding research questions, including: (i) how to leverage observations of related watershed state variables at different scales, e.g., soil moisture, snow water equivalent, and water storage, to improve streamflow forecasts and vice versa; (ii) how to use in situ streamflow observations to improve forecasting of streamflow everywhere in the river network. The testbed setting consists of large-sample datasets and several regional-scale river networks (to be designed together with regional forecasting centers and coordinated with other projects), with inputs and evaluation cases prepared.
We aim to develop approaches that integrate with the Next Generation Water Resources Modeling Framework (NextGen) so that they can be easily adapted to operations in future versions of the National Water Model (NWM). While neural network training must take place in a specialized, differentiable environment, the trained networks for the DA pipeline will comply with the standards of the Basic Model Interface (BMI).
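
To make the idea of modular coupling concrete, the following is a minimal illustrative sketch and not the project's method or code. It assumes a hypothetical one-state "bucket" model (`ToyBucketModel`) whose method names (`initialize`, `update`, `get_value`, `set_value`) mirror BMI conventions with deliberately simplified signatures, and a hypothetical `assimilate_flow` function that applies a scalar Kalman-style update to correct the storage state from a single streamflow observation. All parameter values are made up for illustration.

```python
class ToyBucketModel:
    """Hypothetical linear-reservoir model; method names mirror BMI
    conventions, but the signatures are simplified for illustration."""

    def initialize(self, storage: float = 10.0, k: float = 0.1) -> None:
        # storage: water held in the bucket (mm); precip: forcing (mm/day)
        self.values = {"storage": storage, "precip": 0.0}
        self.k = k  # outflow coefficient (1/day), assumed known

    def update(self) -> None:
        # Advance one step: add precipitation, drain a fraction k as flow.
        s = self.values["storage"]
        self.values["storage"] = s + self.values["precip"] - self.k * s

    def get_value(self, name: str) -> float:
        if name == "flow":
            # Diagnostic streamflow implied by the current storage.
            return self.k * self.values["storage"]
        return self.values[name]

    def set_value(self, name: str, value: float) -> None:
        self.values[name] = value


def assimilate_flow(model, obs_flow: float, state_var: float,
                    obs_var: float, h: float) -> float:
    """Scalar Kalman-style update of storage from one flow observation.

    h is the linear observation operator (flow = h * storage).
    Returns the updated (posterior) variance of the storage state.
    """
    innovation = obs_flow - model.get_value("flow")
    gain = state_var * h / (h * h * state_var + obs_var)
    model.set_value("storage", model.get_value("storage") + gain * innovation)
    return (1.0 - gain * h) * state_var


# Usage: the DA step only touches the model through get_value/set_value,
# so a different model exposing the same methods could be swapped in.
model = ToyBucketModel()
model.initialize(storage=10.0, k=0.1)
prior_flow = model.get_value("flow")
posterior_var = assimilate_flow(model, obs_flow=1.5, state_var=4.0,
                                obs_var=0.01, h=0.1)
posterior_flow = model.get_value("flow")
model.update()  # continue the forecast from the corrected state
```

Because the update function interacts with the model only through the getter and setter, the same DA routine could in principle be paired with any model that exposes its states this way, which is the kind of modular pairing the testbed is meant to benchmark.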