Gradient-based Method for Automatically Generating Labelled Baseflow Dataset

Authors: Eniola Webster-Esho, T. Prabhakar Clement – The University of Alabama; Xueyi Li, Amin Aghababaei, Gustavious Williams, Norm Jones – Brigham Young University; Ryan van der Heijden, Donna M. Rizzo – University of Vermont

Presentation Type: Poster

Title: Gradient-based Method for Automatically Generating Labelled Baseflow Dataset

Abstract: Baseflow is a key component of streamflow dynamics and is pivotal for sustaining human populations and ecosystems. During drought, groundwater contribution is the only source sustaining streamflow. Precise knowledge of baseflow and its role in streamflow is therefore indispensable for effective water resource planning and management. Processing large baseflow datasets with machine learning algorithms requires large amounts of labeled training data (i.e., known baseflow segments in the streamflow record) for validation and prediction. Manual baseflow separation methods rely on subjective, time-intensive hand calculations that are susceptible to considerable error. Automated methods, by contrast, offer consistency, efficiency, and accuracy through well-defined computational algorithms. In this study, we employ the local minimum method, which detects local minima on the receding limb of the hydrograph using a threshold based on drainage area as a parameter. We demonstrate the approach with daily streamflow records for the Pearl River and the Chickasawhay River, collected from January 2014 to December 2017. After separating the baseflow, we derived the baseflow gradient as the difference between consecutive computed baseflow values at a daily time step. This gradient is low during baseflow periods and high when a precipitation event strongly affects the baseflow. We applied two filtering conditions to process the data: first, a gradient threshold determines whether a segment represents baseflow; second, a flow threshold eliminates data points near high flows, thereby avoiding the erroneous identification of low gradients during high-flow events. We developed Python code to implement this workflow. Adjusting the gradient threshold allowed us to fine-tune the algorithm’s sensitivity to data variations during the baseflow segment identification step. This method mitigates human error and is particularly helpful for handling very large datasets.
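
The two filtering conditions described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. The function name, the argument names, the use of the absolute gradient, and the reading of "near high flows" as "total streamflow above the flow threshold" are all assumptions made for illustration. The baseflow series is assumed to have been separated already (for example, by the local minimum method), and both thresholds are left as user-chosen parameters because the abstract does not give their values.

```python
import numpy as np


def label_baseflow_segments(streamflow, baseflow, gradient_threshold, flow_threshold):
    """Return a boolean mask marking the days labelled as baseflow.

    Hypothetical helper for illustration only. Inputs are equal-length
    daily series; `baseflow` is the already-separated baseflow.
    """
    streamflow = np.asarray(streamflow, dtype=float)
    baseflow = np.asarray(baseflow, dtype=float)
    if streamflow.shape != baseflow.shape or streamflow.ndim != 1:
        raise ValueError("streamflow and baseflow must be 1-D and equal in length")
    if streamflow.size == 0:
        return np.zeros(0, dtype=bool)

    # Daily baseflow gradient: difference between consecutive baseflow values.
    # Prepending the first value keeps the output the same length as the
    # input and gives the first day a gradient of zero.
    gradient = np.diff(baseflow, prepend=baseflow[0])

    # Condition 1 (assumed form): the magnitude of the gradient is small.
    low_gradient = np.abs(gradient) <= gradient_threshold

    # Condition 2 (assumed form): total streamflow is not high, so that
    # low gradients occurring during high-flow events are excluded.
    not_high_flow = streamflow <= flow_threshold

    return low_gradient & not_high_flow
```

Calling the function on a daily record returns one True/False label per day. Raising `gradient_threshold` labels more days as baseflow and lowering it labels fewer, which corresponds to the sensitivity adjustment mentioned in the abstract.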

CIROH Training and Developers Conference 2024 Abstracts