Cooperative Institute for Research to Operations in Hydrology

NextGen Simulation Workshop

NextGen Simulation Development Tools

Day 2 Session 2 (1:30 PM MDT)

Presenters:

Josh Cunningham
The University of Alabama

Jordan Laser
Lynker

Matching the hydrological model with the dominant regional process requires a framework in which different models can be controlled and can communicate efficiently. The NextGen National Water Model Framework gives the hydrologic community fine-grained control over orchestrating multiple models. We are excited to see community contributions to the development of models running within the framework, ultimately improving our capacity for national hydrologic prediction. However, the highly configurable nature of the framework comes with a large number of complex input options. This up-front complexity calls for a simple interface that provides a baseline simulation capacity. In addition, once exploratory simulations begin to indicate a research direction, creating the input files and managing the potentially millions of files associated with each simulation, all of which must be versioned to ensure reproducibility, can be computationally expensive and time-consuming if not done efficiently. The complexity of running the NextGen framework for in-depth analysis motivates a compact tool with features like those of ngen-datastream.

To prepare simple baseline NextGen framework configurations for exploratory simulation, and to manage complex longer-term simulations for deep analysis in a relatively efficient and reproducible fashion, we have created a pair of complementary, community-accessible tools called NextGen_data_preprocessor and ngen-datastream. These tools automate the process of collecting and formatting input data for NextGen, orchestrating the NextGen run through NextGen in a Box (NGIAB), and handling outputs for these two different but related cases. The preprocessor is designed for simple one-off explorations. The datastream supports more in-depth metadata services that track configuration changes across a global network of NextGen runs. These metadata also inform operating cost through compute optimization. The datastream is designed to scale within cloud-based HPC architecture, while still being deployable on a laptop.

Tools Internal to ngen-datastream:

  1. Next Generation Water Resource Modeling Framework Hydrofabric – geopackages that contain all tables, spatial data, and lookups relevant to a hydrofabric data model. https://github.com/NOAA-OWP/hydrofabric
  2. hfsubset – Subsets a geopackage out of the Lynker-spatial CONUS hydrofabric for a user-defined spatial domain. Maintained alongside the hydrofabric.
    https://github.com/LynkerIntel
  3. nwmurl – python tool to provide forcing file URLs. https://github.com/CIROH-UA/nwmurl
  4. forcingprocessor – python tool to convert existing forcing netCDFs to NextGen catchment-level CSVs. Collects impactful runtime metadata. Easy to install and use, cloud native, and scalable.
  5. NextGen in a Box (NGIAB) – ready-to-run, containerized, and cloud-friendly version of the NextGen framework, packaged with scripts to help prepare data and get you modeling quickly.
  6. merkdir – Merkle-tree-based hashing algorithm that efficiently signs and timestamps all files in a run directory. This allows for detailed versioning of NextGen runs.
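
To make the versioning idea behind item 6 concrete, here is a minimal sketch of Merkle-tree hashing over a run directory. It is not merkdir's actual algorithm or interface. The function names `hash_file` and `merkle_root`, the choice of SHA-256, and the entry encoding are all assumptions made for illustration, and signing and timestamping are left out.

```python
import hashlib
from pathlib import Path


def hash_file(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def merkle_root(directory: Path) -> str:
    """Hash a directory as a Merkle tree node (illustrative, not merkdir).

    A directory's hash covers its sorted (kind, name, child hash) entries,
    so a change to any file changes every hash on the path up to the root.
    """
    digest = hashlib.sha256()
    for child in sorted(directory.iterdir(), key=lambda p: p.name):
        if child.is_dir():
            kind, child_hash = "dir", merkle_root(child)
        else:
            kind, child_hash = "file", hash_file(child)
        # Hypothetical entry encoding: NUL-separated fields, one per line.
        entry = f"{kind}\x00{child.name}\x00{child_hash}\n"
        digest.update(entry.encode("utf-8"))
    return digest.hexdigest()
```

Calling `merkle_root(Path("path/to/run_directory"))` on two copies of a run directory returns the same digest only if every file name and every file's content match. A single root hash can therefore stand in for the version of an entire run.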

Learning Outcomes:

  • Understand the process for producing a NextGen-based simulation package with the NextGen_data_preprocessor
  • Understand the motivation for a tool that conducts the flow of data from one subtool to the next efficiently, abstracting away the details so that the user can perform quick, informed, and cost-effective NextGen runs
  • Understand the inputs to ngen-datastream and how to run it, as well as the outputs and their value
  • Understand conceptually each step in the datastream and its value
  • Have a use case in mind for ngen-datastream

Prerequisites:

Knowledge:

  • Can build on earlier workshops: hydrofabric/hfsubset, NGIAB, NextGen, python, docker, AWS/Cloud

Software:

Accounts: