Cooperative Institute for Research to Operations in Hydrology

CIROH Training and Developers Conference 2026 Abstract

Authors: Jordan Laser and Zach Wills – Lynker; Arpita Patel, Sai Harsha Vemula, and Quinn Lee Russell – Alabama Water Institute; James Halgren – Brigham Young University 

Title: CIROH Community Forecast Testbed: NextGen Research DataStream

Presentation Type: Poster Presentation

Abstract: The NextGen Research DataStream is an open-source, reproducible hydrologic prediction system designed to readily integrate research advances in hydrologic modeling. The system is built on the NextGen Water Resources Modeling Framework (NextGen), which gives the hydrologic community fine-grained control over the configuration and orchestration of multiple models. NextGen allows both physics-based and AI/ML hydrologic models to be matched with the dominant regional processes. Because the NextGen configurations are public and mutable, community members can contribute their regionally improved model selections and parameterizations to iteratively increase the overall accuracy of national hydrologic predictions. The system is currently deployed on AWS infrastructure with DataStreamCLI as the backend. DataStreamCLI automates collecting and formatting input data for NextGen, orchestrating the NextGen run through NextGen In a Box (NGIAB), and managing outputs. It supports both simple one-off explorations and more in-depth metadata services that track configuration changes across a global network of NextGen runs. These metadata also inform operating costs through compute optimization. DataStreamCLI is designed to scale within cloud-based HPC architecture while remaining deployable on a laptop. Once exploratory simulations begin to indicate a research direction, creating the input files and managing the potentially millions of files associated with each simulation, all of which must be versioned to ensure reproducibility, can become computationally expensive and time-consuming if not done efficiently. The complexity of running the NextGen framework for in-depth analysis motivates a compact tool with the features of DataStreamCLI, whose usefulness is demonstrated in the NextGen Research DataStream.