Convolutional neural networks for precipitation phase determination to enhance western US snowpack evolution within NextGen
Research Team Members
Objective:
Errors in estimated precipitation phase accumulate into errors in estimated snow water equivalent (SWE) and snow depth, which in turn create uncertainty in peak SWE, snow cover duration, downstream runoff, and other hydrological processes. Because a notable proportion of runoff in the western Contiguous United States (CONUS) occurs as warm-season snowmelt from mountainous regions, accurate prediction of precipitation phase is vital. The National Water Model (NWM), designed to predict streamflow for 2.7 million river reaches, estimates precipitation phase from surface temperature. Numerous studies of precipitation phase determination from surface temperatures show significant spatial and temporal variability across the western CONUS; additionally, surface air temperature cannot account for physical processes above the surface. Here, we propose predicting precipitation phase with a random forest (RF) driven by High-Resolution Rapid Refresh (HRRR) model output at the surface and aloft, to improve SWE and snow depth estimates in the NWM. Target precipitation phase values are derived indirectly from SNOw TELemetry (SNOTEL) observations and taken directly from observations reported to the crowdsourced Mountain Rain or Snow (MRoS) project.
Approach:
Our research plan contains three phases: (1) evaluate the seasonal atmospheric column characteristics associated with each precipitation type (rain, snow, and mixed); (2) train, with hyperparameter tuning, random forest models (RFs) on observationally derived estimates of precipitation phase, using two sources: one derived from the SNOw TELemetry (SNOTEL) network and a second based on crowdsourced citizen science (the Mountain Rain or Snow, or MRoS, dataset); and (3) compare the test-set predictions of the RFs and the precipitation phase output of the HRRR model to the observed values.
RFs will be trained on hourly HRRR output at the grid point nearest each precipitation phase observation. The SNOTEL and MRoS precipitation phase data sets differ in observation type and in their sampling density across elevations. Therefore, we will train one RF on SNOTEL, one on MRoS, and a third on a combination of both data sets. This will allow us to evaluate how predictable each data set is and whether combining two observational precipitation phase data sets is viable. Lastly, the test results of the RFs and the HRRR model output will be compared to the observed precipitation phase. The aim is for the RFs to predict the precipitation phase found at SNOTEL sites and/or observed by MRoS more accurately than the HRRR model.
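The nearest-grid-point matching described above can be illustrated with a minimal sketch. It assumes grid points are available as plain (latitude, longitude) pairs and uses a brute-force great-circle search; the function names `haversine_km` and `nearest_grid_index` and the sample coordinates are hypothetical, not part of the project code. In practice the HRRR grid is stored as two-dimensional latitude/longitude arrays on a projected grid, so a spatial index would likely be preferable for large numbers of observations.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_grid_index(obs_lat, obs_lon, grid_points):
    """Return the index of the (lat, lon) grid point closest to the observation."""
    return min(
        range(len(grid_points)),
        key=lambda i: haversine_km(obs_lat, obs_lon, *grid_points[i]),
    )


# Hypothetical grid points and observation location (illustrative values only).
grid = [(39.0, -120.0), (39.0, -119.9), (39.1, -120.0)]
idx = nearest_grid_index(39.08, -120.01, grid)
```

The returned index would then be used to pull the hourly model fields at that grid point for the time of the observation.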
Impact:
Within the NWM framework, we anticipate that improvements in snow water storage will enhance medium to long-range forecasting skill to encourage the use of the NWM for water supply forecasting applications.
Abstract:
Accurate precipitation phase determination remains a critical need for hydrologic models like the National Water Model (NWM), which tends to underestimate snow water equivalent (SWE) due in part to errors in the input data sets. Currently, precipitation phase for the National Water Model is based on semi-empirical formulations in a regional forecast model. To address this, we are developing machine learning models, including Random Forests (RFs), to enhance precipitation phase prediction across the western U.S., supporting better snowpack representation within the NextGen framework.
Our approach integrates atmospheric fields from NOAA’s 3-km High-Resolution Rapid Refresh (HRRR) model with target precipitation phase values derived from two complementary observational datasets. First, we apply the Snow Rain Ratio (SNRR) method to SNOw TELemetry (SNOTEL) site data (2008–2020), which quantifies precipitation phase on a continuous scale (0 = rain, 1 = snow). Second, we incorporate categorical precipitation phase observations, converted to SNRR values, from the citizen-science-based Mountain Rain or Snow (MRoS) project. Together, these datasets allow training over a wide range of elevations and weather regimes, with SNOTEL favoring high-altitude locations and MRoS capturing valley conditions.
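As a rough illustration of how both observation types could be placed on the same 0-to-1 scale, the sketch below makes two assumptions that the text above does not specify. First, it assumes the SNRR is computed as the increase in SWE divided by the increase in accumulated precipitation over the same interval, clipped to [0, 1]. Second, it assumes a simple lookup for the categorical MRoS reports, with "mixed" mapped to 0.5 purely for illustration. The function names, the `min_precip_mm` parameter, and the mapping values are hypothetical.

```python
def snow_rain_ratio(delta_swe_mm, delta_precip_mm, min_precip_mm=0.0):
    """Assumed SNRR form: SWE gain / precipitation gain, clipped to [0, 1].

    Returns None when precipitation does not exceed min_precip_mm, since the
    ratio is undefined without measurable precipitation.
    """
    if delta_precip_mm <= min_precip_mm:
        return None
    ratio = delta_swe_mm / delta_precip_mm
    return max(0.0, min(1.0, ratio))


# Hypothetical mapping from categorical reports to the SNRR scale.
# The 0.5 value for "mixed" is an illustrative assumption.
CATEGORY_TO_SNRR = {"rain": 0.0, "mixed": 0.5, "snow": 1.0}


def category_to_snrr(label):
    """Convert a categorical phase label to a value on the 0 (rain) to 1 (snow) scale."""
    return CATEGORY_TO_SNRR[label.strip().lower()]
```

Clipping handles cases where sensor noise makes the SWE change negative or larger than the precipitation change; whether the project treats such cases by clipping or by discarding them is not stated here.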
Separate RF models are trained on each precipitation phase dataset and their combination, allowing evaluation of phase predictability and cross-dataset performance. RF predictions will be compared to HRRR model output and tested against held-out observations. Input features include temperature, humidity, wind, and vertical velocity across vertical levels. RF skill will be compared to more computationally elaborate Convolutional Neural Network (CNN) methods used in our prior work.
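The three-model design (one RF per dataset plus one on their combination) might be set up along the following lines. This is a sketch only: it uses scikit-learn's `RandomForestRegressor` on synthetic stand-in data with two made-up features, not real HRRR fields or SNOTEL/MRoS targets, and the hyperparameters, split fraction, and error metric are illustrative assumptions rather than the project's choices.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)


def make_synthetic(n, temp_mean):
    """Synthetic stand-in for model features and SNRR targets (not real data)."""
    temp = rng.normal(temp_mean, 4.0, n)   # surface temperature, deg C
    rh = rng.uniform(40.0, 100.0, n)       # relative humidity, %
    features = np.column_stack([temp, rh])
    # Colder conditions give values nearer 1 (snow); noise mimics observation error.
    target = np.clip(0.5 - 0.25 * temp + rng.normal(0.0, 0.1, n), 0.0, 1.0)
    return features, target


X_snotel, y_snotel = make_synthetic(400, -2.0)  # colder, higher-elevation stand-in
X_mros, y_mros = make_synthetic(400, 2.0)       # warmer, valley stand-in

datasets = {
    "SNOTEL": (X_snotel, y_snotel),
    "MRoS": (X_mros, y_mros),
    "combined": (np.vstack([X_snotel, X_mros]),
                 np.concatenate([y_snotel, y_mros])),
}

scores = {}
for name, (X, y) in datasets.items():
    # Hold out a quarter of each dataset for testing.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=0)
    rf = RandomForestRegressor(n_estimators=100, random_state=0)
    rf.fit(X_train, y_train)
    scores[name] = mean_absolute_error(y_test, rf.predict(X_test))
```

The same held-out observations could also be scored against a baseline phase estimate, so that the RF error and the baseline error are compared on identical samples.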
We will assess the impact of improved precipitation phase forcing on snowpack accumulation and melt using multiple modeling frameworks: Noah-MP (used by NWM), iSnobal, and the CIROH machine-learning-based snow model SWEMLv2.0. Case studies in the Sierra Nevada and Upper Colorado River basins will benchmark snow model outputs against high-resolution observations from Airborne Snow Observatories, Inc. (ASO). Within the NWM framework, we anticipate that improvements in snow water storage will enhance medium to long-range forecasting skill to encourage the use of the NWM for water supply forecasting applications.
Figure 1. Adapted from Johnson et al. AGU Fall 2024 Conference. While seasonally accumulated precipitation has the greatest feature importance in the SWEMLv2.0 architecture, coarse-resolution precipitation products in the Sierra Nevada mountain range are the leading cause of prediction error. The above figure illustrates this with the error (right plot) showing grid lines that match the NLDAS precipitation grid, e.g., the blue area of overprediction aligning with the precipitation grid.