Cooperative Institute for Research to Operations in Hydrology

CIROH Training and Developers Conference 2026 Abstract

Authors: Leul Agonafer, Andy Wood, Qiushi Wei – Colorado School of Mines 

Title:   Benchmarking Meteorological Forcing Datasets for Hydrologic Applications using LSTM-based Streamflow and Snow Modeling  

Presentation Type:   Poster Presentation 

Abstract:  Gridded meteorological forcing datasets are a foundational input for hydrological models, but few studies have systematically benchmarked the available datasets against one another or characterized performance differences across hydroclimatic regimes. This study evaluates how forcing dataset choice affects model skill across two prediction tasks: streamflow at 671 CAMELS basins and snow water equivalent (SWE) at 800 SNOTEL stations in the western United States. Identically configured LSTM models are trained within the NeuralHydrology framework so that the forcing dataset is the sole experimental variable. Datasets evaluated to date include PRISM, nClimGrid, GridMET, NLDAS-3, and GPEP, with a half dozen additional datasets currently being incorporated. Model skill was assessed using NSE, KGE, and their decomposed components, with results stratified by elevation and aridity quartiles to characterize regime-dependent performance. PRISM achieved the highest median streamflow NSE (0.78), followed by GridMET (0.74) and nClimGrid (0.70). PRISM and GridMET maintained consistent performance across all elevation quartiles, while nClimGrid, NLDAS-3, and GPEP improved monotonically with elevation, suggesting sensitivity to low-elevation data quality. Skill for all datasets degraded in arid basins, and low-elevation arid basins emerged as the regime where dataset choice matters most, with performance gaps between datasets widening substantially relative to humid conditions. GPEP underperformance in arid basins was driven primarily by reduced correlation rather than bias or variability errors, pointing to event timing as the primary weakness. For SWE prediction, PRISM and nClimGrid outperformed other datasets. The study provides evidence-based guidance for forcing dataset selection in hydrologic applications, including streamflow forecasting.
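
For readers unfamiliar with the skill metrics named in the abstract, the following is a minimal sketch of NSE and of KGE with its three components (correlation r, variability ratio alpha, bias ratio beta). The abstract does not say which KGE formulation was used; this sketch assumes the original Gupta et al. (2009) form, and the function names and the toy series are illustrative only, not the authors' code or the NeuralHydrology API.

```python
import math


def _mean(x):
    return sum(x) / len(x)


def _std(x):
    # Population standard deviation (divide by n).
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


def nse(obs, sim):
    """Nash-Sutcliffe efficiency: 1 minus squared error over observed variance."""
    m = _mean(obs)
    num = sum((o - s) ** 2 for o, s in zip(obs, sim))
    den = sum((o - m) ** 2 for o in obs)
    return 1.0 - num / den


def kge_components(obs, sim):
    """KGE (assumed 2009 formulation) and its decomposed components.

    r     : linear correlation (sensitive to event timing)
    alpha : ratio of simulated to observed standard deviation (variability)
    beta  : ratio of simulated to observed mean (bias)
    """
    mo, ms = _mean(obs), _mean(sim)
    so, ss = _std(obs), _std(sim)
    cov = sum((o - mo) * (s - ms) for o, s in zip(obs, sim)) / len(obs)
    r = cov / (so * ss)
    alpha = ss / so
    beta = ms / mo
    kge = 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    return {"kge": kge, "r": r, "alpha": alpha, "beta": beta}


# Toy series (not data from the study): the simulation doubles every observation,
# so timing is perfect (r = 1) while variability and bias are both off by a factor of 2.
obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 4.0, 6.0, 8.0]
print(nse(obs, sim))
print(kge_components(obs, sim))
```

Reading the components separately is what allows a statement such as "underperformance was driven by reduced correlation rather than bias or variability errors": a low r with alpha and beta near 1 points to timing errors, whereas the toy case above shows the opposite pattern.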