Authors: Daniel P. Ames, Guilherme da Silva, Jacob Anderson, Iman Maghami, Kel Markert – Brigham Young University; James Halgren, Arpita Patel – University of Alabama
Title: Leveraging Google Cloud Data Services for Improving Access to Current and Historical National Water Model Forecasts
Abstract: The US National Water Model (NWM) generates massive quantities of data for 2.7M stream segments across the nation. These data include streamflow predictions at various lead times, produced as temporally stacked NetCDF files with one national-scale file per prediction time step. Because of this data structure, extracting a single forecast time series for a single river reach is a non-trivial task: the user must generally download gigabytes of NetCDF files and then process each file in sequence to retrieve one value from the 2.7M values it contains. Cloud data services such as those provided by Google Cloud Services can be leveraged to simplify access to and retrieval of individual forecasts for specific study areas, stream segments, or regions. We are working with Google engineers to design a relational database table structure in Google BigQuery, together with a workflow for importing NWM forecasts into this indexed, rapidly searchable database. We are also developing a REST interface for querying the data on BigQuery, ultimately giving researchers and water managers the ability to pass a single reach ID, multiple IDs, or a geographic region into the cloud and quickly retrieve data for a specified forecast period and type. These queries will be processed in the cloud, eliminating the need to download and process NetCDF files and taking advantage of cloud speed, computational efficiency, and data storage.
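
Illustration: the access pattern described above can be sketched as a "long" table with one row per reach, forecast type, reference time, and valid time, so that a forecast for one or more reaches becomes a single filtered query rather than a scan of many national-scale files. The sketch below is only a rough illustration under stated assumptions. It uses Python's built-in sqlite3 module as a local stand-in for BigQuery, and the table name, column names, reach IDs, and streamflow values are all hypothetical and do not come from the project's actual schema or data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory table with a hypothetical long-format forecast layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE forecasts (
            reach_id       INTEGER NOT NULL,  -- stream segment identifier
            reference_time TEXT    NOT NULL,  -- when the forecast was issued
            valid_time     TEXT    NOT NULL,  -- time the prediction applies to
            forecast_type  TEXT    NOT NULL,  -- e.g. 'short_range' (assumed label)
            streamflow     REAL    NOT NULL   -- predicted flow (made-up values)
        )
        """
    )
    # Index on the columns a per-reach lookup filters by.
    conn.execute(
        "CREATE INDEX idx_reach ON forecasts "
        "(reach_id, forecast_type, reference_time)"
    )
    rows = [
        (101, "2024-01-01T00:00", "2024-01-01T01:00", "short_range", 12.5),
        (101, "2024-01-01T00:00", "2024-01-01T02:00", "short_range", 13.0),
        (102, "2024-01-01T00:00", "2024-01-01T01:00", "short_range", 4.2),
        (102, "2024-01-01T00:00", "2024-01-01T02:00", "short_range", 4.0),
        (101, "2024-01-01T01:00", "2024-01-01T02:00", "short_range", 12.9),
    ]
    conn.executemany("INSERT INTO forecasts VALUES (?, ?, ?, ?, ?)", rows)
    return conn


def get_forecast(conn, reach_ids, forecast_type, reference_time):
    """Return (reach_id, valid_time, streamflow) rows for the given reaches."""
    if not reach_ids:
        return []
    # One placeholder per reach ID keeps the query parameterized.
    placeholders = ",".join("?" for _ in reach_ids)
    sql = (
        "SELECT reach_id, valid_time, streamflow FROM forecasts "
        f"WHERE reach_id IN ({placeholders}) "
        "AND forecast_type = ? AND reference_time = ? "
        "ORDER BY reach_id, valid_time"
    )
    params = [*reach_ids, forecast_type, reference_time]
    return conn.execute(sql, params).fetchall()


if __name__ == "__main__":
    demo = build_demo_db()
    for row in get_forecast(demo, [101, 102], "short_range", "2024-01-01T00:00"):
        print(row)
```

A REST endpoint of the kind mentioned above could wrap a function like the hypothetical get_forecast, accepting a single reach ID or a list of IDs along with the forecast type and period, and returning only the matching rows.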