Physics-Guided Recurrent Graph Networks for Predicting Flow and Temperature in River Networks
Xiaowei Jia, Jacob Zwart, Jeffrey Sadler, Alison Appling, Samantha Oliver, Steven Markstrom, Jared Willard, Shaoming Xu, Michael Steinbach, Jordan Read, Vipin Kumar
University of Pittsburgh; U.S. Geological Survey; University of Minnesota
[email protected], {jzwart,jsadler,aappling,soliver,markstro,jread}@usgs.gov, {willa099,xu000114,stei0062,kumar001}@umn.edu

Abstract—This paper proposes a physics-guided machine learning approach that combines advanced machine learning models and physics-based models to improve the prediction of water flow and temperature in river networks. We first build a recurrent graph network model to capture the interactions among multiple segments in the river network. Then we present a pre-training technique which transfers knowledge from physics-based models to initialize the machine learning model and learn the physics of streamflow and thermodynamics. We also propose a new loss function that balances the performance over different river segments. We demonstrate the effectiveness of the proposed method in predicting temperature and streamflow in a subset of the Delaware River Basin. In particular, we show that the proposed method brings a 33%/14% improvement over the state-of-the-art physics-based model and 24%/14% over traditional machine learning models (e.g., the Long-Short Term Memory Neural Network) in temperature/streamflow prediction using very sparse (0.1%) observation data for training. The proposed method has also been shown to produce better performance when generalized to different seasons or river segments with different streamflow ranges.
I. INTRODUCTION
Machine learning (ML) models, which have found immense success in commercial applications, e.g., computer vision and natural language processing, are beginning to play an important role in advancing scientific discovery [1]–[3]. Given their power in automatically learning from observation data, ML models are particularly promising in scientific problems involving complex processes that are not completely understood by our current body of knowledge. However, scientific problems often involve non-stationary relationships among physical variables which can change over space and time. In the absence of adequate information about the physical mechanisms of real-world processes, traditional ML approaches are prone to false discoveries because it is difficult to capture these complex relationships solely from data. Moreover, the data available for many scientific problems is far smaller than what is needed to effectively train advanced ML models.

The focus of this paper is on modeling physical systems that have multiple interacting processes. In particular, we consider the application of predicting flow and temperature in river networks for both observed and unobserved river segments. In this problem, segments in the river network can show different flow and thermodynamic patterns driven by differences in catchment characteristics (e.g., slope, soil characteristics) and meteorological drivers (e.g., temperature, precipitation). These segments also interact with each other through the water advected from upstream to downstream segments. Moreover, there are often only a handful of river segments in a network that are monitored; thus there is limited data to train ML models. Accurate prediction of streamflow and water temperature aids in decision making for resource managers, establishes relationships between ecological outcomes and streamflow or water temperature, and helps predict other biogeochemical or ecological processes.
For example, drinking water reservoir operators in the Delaware River Basin need to supply safe drinking water to New York City while also maintaining sufficient streamflow and cool water temperatures in the river network downstream of the reservoirs [4]. Accurate prediction of streamflow and water temperature helps managers optimize when and how much water to release downstream to maintain the flow and temperature regimes.

In scientific domains, physics-based models are often used to study engineering and environmental systems. Even though these models are based on known physical laws that govern relationships between input and output variables, most physics-based models are necessarily approximations of reality due to incomplete knowledge of certain processes or omission of processes to maintain computational efficiency. In particular, existing physics-based approaches for modeling river networks simulate the internal distribution of target variables (e.g., streamflow and temperature) based on general physical relationships such as energy and mass conservation. However, the model predictions still rely on qualitative parameterizations (approximations) based on soil and surficial geologic classification along with topography, land cover and climate input. Hence, such models can only provide sub-optimal prediction performance. Furthermore, calibration of physics-based models for river networks is often extremely time intensive due to interactions among parameters within segments and across segments, and also requires expert knowledge of the system and model to calibrate successfully. The limitations of physics-based models cut across discipline boundaries and are well known in the scientific community; e.g., see a series of debate papers in hydrology [5]–[7].

Intuitively, we can model each river segment independently by an individual ML model such as a Recurrent Neural Network (RNN).
However, this approach has two major limitations: 1) In a river network, there exist many different types of river segments with diverse catchment characteristics (e.g., slopes, elevation, etc.). Note that most of the segments are poorly observed or completely unobserved, which makes it impossible to build a purely data-driven model for each segment separately. 2) The individual models may ignore the rich spatial and temporal contextual information, e.g., how the streamflows from upstream segments impact the water temperature in downstream segments in the next few days.

The first issue could be partly addressed by pre-training the ML model using simulation data produced by physics-based models, but such pre-trained ML models still need some observations for refinement [8], [9]. In particular, for unobserved river segments, the performance of pre-trained ML models can be no better than physics-based models, which can have rather poor performance. Addressing the second issue will require the development of a sophisticated ML architecture that can leverage latent information that is transferred across river segments.

In this paper, we propose a global model, Physics-Guided Recurrent Graph Networks (PGRGrN), which is applied to all the river segments. The architecture of PGRGrN is based on Recurrent Neural Networks (RNN) and Graph Convolutional Neural Networks (GCN), which explicitly captures the spatial interactions among different river segments as well as their temporal dynamics. Modeling of the spatial and temporal context is critical for the global ML model as it enables learning of different behavior patterns for different river segments even when they have similar input features on certain dates.

Design of such an architecture for this application faces two challenges.
First, existing GCN-based models extract abstract latent variables (i.e., graph embeddings) to propagate over the networks but do not explicitly incorporate the prior physical knowledge about the interactions among different nodes. Such latent variables can become less informative when they are learned from sparser and less representative observation data, which can make the GCN model not generalizable. To address this challenge, we propose to utilize the intermediate variables simulated by the physics-based model to guide the learning process of the graph neural networks. This approach aims to enforce a physical interpretation of the latent variables learned from each river segment by transferring the prior knowledge encoded by the physics-based model to the proposed ML model. Our experimental evaluation shows that this strategy is effective in initializing the ML model, which can then be fine-tuned using observations from the river network.

The second challenge is that target variables can vary drastically across different processes of a complex system. For example, streamflow can vary by several orders of magnitude across segments in a river network. When we train a global ML model over the entire river network, the training process using traditional loss functions (e.g., mean squared loss) tends to optimize the overall performance over training data available for the entire river network, and thus can be dominated by river segments that contribute most to the overall error (e.g., segments with high streamflow). However, it is also important to accurately predict river segments with lower streamflow, as accurate prediction for these segments provides important information regarding the habitat for aquatic life and aquatic biogeochemical cycling. To address this challenge, we design a new loss function to ensure that the global ML model can simultaneously capture the local patterns of all the different segments.
The local patterns of each segment can be extracted using an individual ML model trained only for this segment using simulation data (which is plentiful). Then, during the training of the global ML model, we use a distance-based loss function, the contrastive loss function, to enforce its consistency with the extracted local patterns.

We apply our proposed method to a real-world dataset collected over 36 years from the Delaware River Basin located in the Northeast United States and demonstrate our method's superior predictive performance over existing methods. Moreover, we show that the proposed method produces much better prediction performance using sparse observations and also has better generalizability.

Our contributions can be summarized as follows:
• We introduce a new recurrent graph network architecture to model a river network with interacting river segments.
• We leverage knowledge from physics-based models to guide ML models for extracting latent variables, which helps initialize the model while enforcing consistency with known physical relationships amongst different processes.
• We propose a new contrastive loss function that balances the prediction performance over different river segments.
• We evaluate the utility of our method in the context of an ecologically and societally relevant problem of monitoring river networks.

II. RELATED WORK
Recent works have shown the promise of integrating physics into ML models to improve predictive performance and generalizability in scientific problems. This is commonly conducted in several ways, including developing new model architectures [10], [11], applying additional loss functions [8], [9], and modeling prediction residuals [12], [13]. When applied to systems with interacting processes, ML models are expected to have sufficient capacity to model such interactions. New ML architectures have been designed to enforce known physical relationships among multiple internal processes that jointly convert inputs to outputs [10], [11], [14], thus reducing the space for searching parameters. ML models have also shown great potential in modeling river networks [15], [16]. For example, Moshe et al. [16] propose HydroNets, which combines the information from each river segment and its upstream segments to improve streamflow predictions. It also learns local patterns for each basin by introducing basin-specific model layers in addition to the global model. This method focuses on predicting basins that are well monitored, and it remains limited in generalizing to different scenarios or learning with less data. In contrast, we leverage prior physical knowledge to learn latent variables that make the model more generalizable.

The Graph Convolutional Networks (GCN) model has proven to be effective in automatically extracting latent factors that influence the neighbors in a graph. The use of GCN has also shown improved prediction accuracy in several scientific problems [17]–[19]. However, the information propagated amongst nodes in GCN is essentially an abstract representation learned by end-to-end training. Such abstract representations are not meant to enforce consistency with known physical relationships among different processes, such as in river networks.

Simulation data have been used to assist in training ML models.
Since many ML models require an initial choice of model parameters before training, researchers have explored different ways to physically inform a model's starting state. Poor initialization can cause models to anchor in local minima, which is especially true for deep neural networks. One way to harness physics-based modeling knowledge is to use the physics-based model's simulated data to pre-train the ML model, which also alleviates data paucity issues. Jia et al. extensively discuss this strategy [8]. They pre-train their Physics-Guided Recurrent Neural Network (PGRNN) models for lake temperature modeling on simulated data generated from a physics-based model and fine-tune them with little observed data. They show that pre-training can significantly reduce the training data needed for a quality model. In addition, Read et al. [20] show that such models are able to generalize better to unseen scenarios than pure physics-based models.

III. PROBLEM DEFINITION AND PRELIMINARIES
A. Problem definition
Our objective is to model the dynamics of temperature and streamflow in a set of connected river segments that together form a river network. The connections amongst these river segments can be represented in a graph structure G = {V, E, A}, where V represents the set of river segments and E represents the set of connections amongst river segments. Specifically, we create an edge (i, j) ∈ E if the segment i is anywhere upstream of the segment j. The matrix A represents the adjacency level between each pair of segments, i.e., A_ij = 0 means there is no edge from the segment i to the segment j, and a higher value of A_ij indicates that the segment i is closer to the segment j in terms of river distance. More details of how we generate the adjacency matrix are discussed in Section V-A.

For each river segment i, we have access to its input features at multiple time steps X_i = {x_i^1, x_i^2, ..., x_i^T}. The input features x_i^t are a D-dimensional vector, which include meteorological drivers, geometric parameters of the segments, etc. (more details can be found in Section V-A). We also have a set of observed target variables Y = {y_i^t}, but they are only available for certain time steps t ∈ {1, ..., T} and certain segments i ∈ {1, ..., N}.

B. Physics-based Streamflow and Temperature Model
The Precipitation-Runoff Modeling System (PRMS) [21] with the coupled Stream Network Temperature Model (SNTemp) [22] is a physics-based model that simulates daily streamflow, water temperature, and other variables for river networks. PRMS is a one-dimensional, distributed-parameter modeling system that translates spatially-explicit meteorological information into water information including evaporation, transpiration, runoff, infiltration, groundwater flow, and streamflow. PRMS has been used to simulate catchment hydrologic variables at regional [23] to national scales [24] in support of resource management decisions, among other applications. The SNTemp module for PRMS simulates mean daily stream water temperature for each river segment by solving an energy mass balance model which accounts for the effect of inflows (upstream, groundwater, surface runoff), outflows, and surface heating and cooling on heat transfer in each stream segment. The SNTemp module is driven by the same meteorological drivers used in PRMS and also by the hydrologic information simulated by PRMS (e.g., streamflow, groundwater flow). Calibration of PRMS-SNTemp is extremely time-consuming because it involves a large number of parameters (84 parameters) that interact with each other both within segments and across segments.
C. Recurrent Neural Networks and Long-Short Term Memory
The RNN model has been widely used to model the temporal patterns in sequential data. The RNN model defines transition relationships for the extracted hidden representation through a recurrent cell structure. In this work, we adopt the Long-Short Term Memory (LSTM) cell, which has proven to be effective in capturing long-term dependencies. The LSTM cell combines the input features x^t at each time step and the inherited information from previous time steps. Here we omit the subscript i as we do not target a specific river segment.

Each LSTM cell has a cell state c^t, which serves as a memory and allows preserving information from the past. Specifically, the LSTM first generates a candidate cell state c̄^t by combining x^t and the hidden representation at the previous time step h^{t-1}, as follows:

  c̄^t = tanh(W_hc h^{t-1} + W_xc x^t + b_c).   (1)

Then the LSTM generates a forget gate f^t, an input gate g^t, and an output gate o^t via the sigmoid function σ(·), as follows:

  f^t = σ(W_hf h^{t-1} + W_xf x^t + b_f),
  g^t = σ(W_hg h^{t-1} + W_xg x^t + b_g),   (2)
  o^t = σ(W_ho h^{t-1} + W_xo x^t + b_o).

The forget gate is used to filter the information inherited from c^{t-1}, and the input gate is used to filter the candidate cell state at t. Then we compute the new cell state and the hidden representation as follows:

  c^t = f^t ⊗ c^{t-1} + g^t ⊗ c̄^t,
  h^t = o^t ⊗ tanh(c^t),   (3)

where ⊗ denotes the entry-wise product. According to the above equations, we can observe that the computation of h^t combines the information at the current time step (x^t) and the previous time step (h^{t-1} and c^{t-1}), and thus encodes the temporal patterns learned from data.

IV. METHOD
In this section, we describe the details of the PGRGrN method. We start by introducing the model architecture. Then we discuss a strategy to help enforce physical relationships by leveraging the physical knowledge embedded in physics-based models. Finally, we introduce a contrastive loss function that attempts to ensure that the model performance on individual river segments is not compromised while optimizing the performance on the entire set of segments.

Fig. 1. (a) The RGrN architecture for an example river network with three segments (b). The segments a and b are upstream of the segment i. The grey arrows indicate the modeling components for upstream segments.

A. Recurrent Graph Network
In a river network, most river segments are poorly observed or completely unobserved. To this end, we introduce a global ML model architecture, the Recurrent Graph Network (RGrN), which is trained using data collected from all the river segments. Effective modeling of river segments requires the ability to capture their temporal dynamics and the influence received from upstream segments. Hence, we incorporate the information from both previous time steps and neighbors (i.e., upstream segments) when modeling each segment (Fig. 1). Here we describe the recurrent process of generating the hidden representation h^t from h^{t-1}, and we repeat this process for the entire sequence from t = 2 to T (h^1 is learned from an LSTM model). For each river segment i at time t-1, the model extracts latent variables which contain relevant information to pass to its downstream segments. We refer to these latent variables as transferred variables. For example, the amount of water advected from this segment and its water temperature can directly impact the change of water temperature for its downstream segments. We generate the transferred variables q_i^{t-1} from the hidden representation h_i^{t-1}, as follows:

  q_i^{t-1} = tanh(W_q h_i^{t-1} + b_q),   (4)

where W_q and b_q are model parameters that are used to convert the hidden representation to transferred variables.

After gathering the transferred variables for all the segments, we develop a new recurrent cell structure for each segment i that integrates the transferred variables from its upstream segments into the computation of the cell state c_i^t. This can be expressed as follows:

  c_i^t = f_i^t ⊗ (c_i^{t-1} + Σ_{(j,i)∈E} A_ji q_j^{t-1}) + g_i^t ⊗ c̄_i^t   (5)

We can observe that the forget gate not only filters the previous information from the segment i itself but also from its neighbors (i.e., upstream segments).
Each upstream segment j is weighted by the adjacency level A_ji between j and i. When a river segment has no upstream segments (i.e., headwater), the computation of c_i^t is the same as with the standard LSTM. In Eq. 5, we use q_j^{t-1} from the previous time step because of the time delay in transferring the influence from upstream to downstream segments (the maximum travel time is approximately one day according to PRMS). We also discuss the impact of increasing the time delay in Section V-G.

After obtaining the cell state, we can compute the hidden representation h_i^t by following Eq. 3. Finally, we generate the predicted output from the hidden representation as follows:

  ŷ_i^t = W_y h_i^t + b_y,   (6)

where W_y and b_y are model parameters.

After applying this recurrent process to all the time steps, we define a loss using true observations Y = {y_i^t} that are available at certain time steps and certain segments, as follows:

  L_RGrN = (1/|Y|) Σ_{(i,t) | y_i^t ∈ Y} (y_i^t − ŷ_i^t)^2.   (7)
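As an illustration, the cell-state update of Eq. 5 can be sketched in NumPy as below. This is a minimal sketch, not the authors' implementation; the function and argument names are our own, and the gate activations (f_t, g_t, c̄_t) are assumed to be computed beforehand as in Eqs. 1–2.

```python
import numpy as np

def rgrn_cell_state(f_t, g_t, c_bar_t, c_prev, q_prev, A, i, edges):
    """Cell-state update of Eq. (5) for segment i.

    c_prev, q_prev: arrays of shape (N, H) holding c^{t-1} and the
    transferred variables q^{t-1} for all N segments; A is the N x N
    adjacency matrix; edges lists (j, k) pairs with j upstream of k.
    """
    # Sum the weighted transferred variables from upstream segments.
    upstream = np.zeros_like(c_prev[i])
    for (j, k) in edges:
        if k == i:
            upstream += A[j, i] * q_prev[j]
    # Eq. (5): the forget gate filters both the segment's own past
    # state and the information advected from upstream.
    return f_t * (c_prev[i] + upstream) + g_t * c_bar_t
```

For a headwater segment (no incoming edges) the upstream sum is zero and the update reduces to the standard LSTM cell-state update of Eq. 3.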
The RGrN architecture has the capacity to model the latent information that is transferred across river segments. However, training RGrN directly in an end-to-end fashion can only learn an abstract representation for transferred variables while ignoring their physical interpretation. The transferred variables can become less informative when they are learned from sparser and less representative observation data. To this end, we introduce a new strategy to enforce the prior physical relationships amongst different river segments which are encoded by physics-based models. It helps make the RGrN model more generalizable and also reduces the amount of observation data required to train a high-quality model. This strategy can be applied to a wide range of scientific problems that are modeled as a set of interacting processes.

We use river temperature modeling as an example in this section to better illustrate the proposed strategy. For each river segment in the network, the temperature change is driven by energy exchanges caused by solar radiation, rainfall, evaporation, conductive and convective heat transfer, and net heat advected into river segments (e.g., groundwater flow, upstream flow, downstream flow). These energy fluxes can be summarized into three categories, and the process of temperature change conforms to the following equation:

  ΔTemperature ∝ F_in − F_out + F_up,   (8)

where F_in denotes the incoming energy fluxes from solar radiation, rainfall and other natural sources, F_out denotes the outgoing energy fluxes including long-wave emission, evaporation, conductive and convective heat transfer, and F_up denotes the net heat advected into the river segment from upstream segments. The term F_up can be estimated by a set of intermediate physical variables from upstream segments, which include upstream flow, upstream water temperature, relative humidity, and other physical characteristics.
These intermediate physical variables can be simulated internally by PRMS-SNTemp.

Our goal is to ensure that the transferred variables q_i^t at each segment contain sufficient information to represent these intermediate physical variables so that downstream segments can gather all the information needed for capturing F_up. For each segment i at time t, we first simulate the set of intermediate variables s_i^t by running PRMS-SNTemp. Then we use s_i^t to add supervision on the transferred variables q_i^t such that we can extract s_i^t from q_i^t. More formally, we define a loss function on the transferred variables as follows:

  L_trans = (1/NT) Σ_i Σ_t || s_i^t − (W_s q_i^t + b_s) ||^2,   (9)

where W_s and b_s are model parameters that convert transferred variables to the intermediate physical variables. We call this model Physics-Guided Recurrent Graph Networks (PGRGrN) because we leverage the physical relationships between different river segments.

Since the computation of the loss L_trans does not require observation data, we can use it to pre-train RGrN to enforce physical relationships. This pre-training method not only explicitly enforces the physical relationships among river segments, but also enables full usage of physical intermediates obtained from physics-based models to enhance the representation learned for q_i^t and its previous layer h_i^t. In particular, the intermediate physical variables used in this work include streamflow, stream temperature, relative humidity, cloud cover, groundwater and shallow subsurface flow, and surface runoff.

It is noteworthy that the model extracts the intermediate variables from the transferred variables q_i^t rather than forcing the transferred variables to be exactly the intermediate physical variables.
Also, we optimize the loss L_trans in pre-training rather than using it as a regularizer in supervised learning. In other words, the ML model is guided but not constrained by the physics-based model output, which allows for more flexibility to automatically learn information in q_i^t that is poorly known or not yet discovered while also remaining helpful for modeling the interactions among river segments.

At the same time, we can run PRMS-SNTemp to simulate the final target variables corresponding to y_i^t. Here we use ỹ_i^t to represent the target variables simulated by PRMS-SNTemp. Although the simulated data are not an accurate reflection of the observation data, we can generate adequate simulations on every day and for every segment. The simulation data also follow many general physical relationships used to build the physics-based model. Given that observation data is often scarce, we can use simulated target variables to initialize the model via pre-training. Hence, we define another pre-training loss on target variables as follows:

  L_tar = (1/NT) Σ_i Σ_t (ỹ_i^t − ŷ_i^t)^2.   (10)

Combining Eqs. 9 and 10, we get the final pre-training loss as follows:

  L_pre = L_tar + λ L_trans,   (11)

where λ is a hyper-parameter to balance the two losses. The back-propagation of the two losses is shown in Fig. 2.

Fig. 2. Error back-propagation during the pre-training stage.

In summary, our pre-training strategy helps initialize PGRGrN using both simulated intermediate physical variables and simulated target variables. By enforcing known physical relationships between segments, the model becomes more generalizable. Besides, the ML model pre-trained with adequate simulated ỹ can get much closer to its final optimal solution, and thus requires fewer observation data for fine-tuning. We call this model, which is pre-trained using simulation data and then fine-tuned with true observations, PGRGrN_ptr.

C. Segment-wise contrastive loss
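The combined pre-training objective of Eq. 11 can be sketched as follows. This is a hypothetical NumPy illustration with our own function and argument names, assuming the transferred variables, predictions, and simulated quantities are already computed and stacked into arrays.

```python
import numpy as np

def pretrain_loss(s_sim, q, y_sim, y_hat, W_s, b_s, lam=1.0):
    """Pre-training objective of Eq. (11): L_pre = L_tar + lam * L_trans.

    s_sim:  (N, T, S) intermediate variables simulated by the physics model
    q:      (N, T, H) transferred variables from the network
    y_sim:  (N, T)    simulated target variables
    y_hat:  (N, T)    network predictions
    W_s, b_s: linear readout mapping q to the intermediate variables (Eq. 9)
    """
    NT = q.shape[0] * q.shape[1]
    # Eq. (9): supervise the transferred variables with simulated intermediates.
    l_trans = np.sum((s_sim - (q @ W_s.T + b_s)) ** 2) / NT
    # Eq. (10): supervise the predictions with simulated targets.
    l_tar = np.sum((y_sim - y_hat) ** 2) / NT
    return l_tar + lam * l_trans
```

Note that both terms use only simulated quantities, which is why this loss can be minimized before any observation data is touched.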
The relationship between input features and target variables can be very complex in environmental systems, e.g., slight changes in segment slope and catchment size can drastically alter the streamflow. Traditional loss functions for regression problems, such as mean squared loss, tend to be dominated by river segments with larger errors while degrading the performance on other segments with smaller errors. This issue can be further exacerbated given limited observation data on most river segments, especially low-flow segments. Although improving the segments with smaller errors does not contribute much to reducing the overall error, accurately predicting streamflow at these segments provides important information regarding habitat for aquatic life and biogeochemical cycling.

We introduce a new loss function to balance the model performance over different segments, with the goal that the global ML model trained on all the river segments should also be consistent with the local patterns extracted from each river segment. In particular, we train N individual LSTM models, M_1 to M_N, one for each segment, using the simulation data. Each individual model M_i is trained to predict simulated target variables (i.e., ỹ_i) for a specific segment i. Even though there is a gap between simulation data and true observation data, these individual models have a better chance of capturing the local temporal patterns of each river segment.

When we apply the global PGRGrN model to a specific segment i, the patterns predicted by the PGRGrN model should be similar to the local patterns learned by the individual model M_i. Specifically, we compute the hidden representation h_i^t from PGRGrN at each time t, which encodes the dynamic patterns that directly produce the target variables.
Similarly, we run the individual model M_i to compute its hidden representation h̃_i^t, which encodes the local temporal patterns for this segment. The hidden representation h_i^t from PGRGrN should be close to the corresponding local hidden representation h̃_i^t and different from the local representations of other segments, i.e., h̃_j^t for j ≠ i. More formally, we define a contrastive loss as follows:

  L_ctr = −(1/NT) Σ_i Σ_t log [ exp(h_i^tᵀ W_ctr h̃_i^t) / Σ_j exp(h_i^tᵀ W_ctr h̃_j^t) ],   (12)

where W_ctr contains the model parameters for computing the similarity of hidden representations. To ensure that the hidden representations produced by PGRGrN (h) and by the individual models (h̃) are comparable, we use shared parameters {W_y, b_y} in the last layer (Eq. 6) for the individual models and the global PGRGrN model, and they are fitted when training the individual models.

By combining the contrastive loss and the loss of RGrN (Eq. 7), we get the fine-tuning loss as follows:

  L_finetune = L_RGrN + γ L_ctr,   (13)

where γ is a hyper-parameter to balance the supervised loss of PGRGrN and the contrastive loss.

The proposed contrastive loss provides an alternative way in which different segments are comparable with each other. Traditional loss functions do not perform well for every segment because they are defined on target variables, which may vary drastically across different segments. Instead, the contrastive loss matches temporal patterns encoded in the space of hidden representations, which alleviates this issue.

V. EXPERIMENTAL RESULTS
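The contrastive loss of Eq. 12 at a single time step can be sketched as a softmax over bilinear similarity scores. This is an illustrative sketch with hypothetical names, not the authors' code; the hidden states from the global and local models are assumed to be stacked row-wise.

```python
import numpy as np

def contrastive_loss(h, h_local, W_ctr):
    """Segment-wise contrastive loss of Eq. (12) at a single time step.

    h:       (N, H) hidden states from the global model, one per segment
    h_local: (N, H) hidden states from the per-segment local models
    W_ctr:   (H, H) bilinear similarity parameters
    """
    # Bilinear similarity scores: scores[i, j] = h_i^T W_ctr h_local_j.
    scores = h @ W_ctr @ h_local.T
    # Log-softmax over candidate local models; the matching pair (i, i)
    # should score highest.
    log_probs = scores - np.log(np.sum(np.exp(scores), axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```

Minimizing this loss pushes each global hidden state toward its own segment's local representation and away from those of the other segments.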
We evaluate the proposed method for predicting stream temperature and streamflow using real-world data collected from the Delaware River Basin, which is an ecologically diverse region and a societally important watershed along the east coast of the United States, as it provides drinking water to over 15 million people [25]. We first describe our dataset and baselines. Then we discuss the results on predictive performance using sparse data, the effectiveness of pre-training, the spatial distribution of errors, and model generalization.
A. Dataset and baselines
The dataset is pulled from U.S. Geological Survey’s Na-tional Water Information System [26] and the Water QualityPortal [27], the largest standardized water quality data set forinland and coastal waterbodies [27]. Observations at a specificlatitude and longitude were matched to river segments thatvary in length from 48 to 23,120 meters. The river segmentswere defined by the national geospatial fabric used for theNational Hydrologic Model as described by Regan et al. [24],and the river segments are split up to have roughly a one daywater travel time. We match observations to river segments bysnapping observations to the nearest stream segment withina tolerance of 250 meters. Observations farther than 5,000m along the river channel to the outlet of a segment wereomitted from our dataset. Segments with multiple observationsites were aggregated to a single mean daily streamflow orwater temperature value.We study a subset of the Delaware River Basin with 42river segments that feed into the mainstream Delaware Riverat Wilmington, DE. We use input features at the daily scalefrom Oct 01, 1980 to Sep 30, 2016 (13,149 dates). Theinput features have 10 dimensions which include daily averageprecipitation, daily average air temperature, date of the year,solar radiation, shade fraction, potential evapotranspirationand the geometric features of each segment (e.g., elevation,length, slope and width). Air temperature and precipitationvalues were derived from the Daymet gridded dataset. Otherinput features (e.g., shade fraction, solar radiation, potentialevapotranspiration) are difficult to measure frequently, andwe use values produced by the PRMS-SNTemp model asits internal variables. Water temperature observations wereavailable for 32 segments but the temperature was observedonly on certain dates. The number of temperature observationsavailable for each segment ranges from 1 to 9,810 with a totalof 51,103 observations across all dates and segments. 
Streamflow observations were available for 18 segments. The number of streamflow observations available for each segment ranges from 4,877 to 13,149, with a total of 206,920 observations across all dates and segments.

We generate the adjacency matrix A based on the river distance between each pair of river segment outlets, represented as dist(i, j). We standardize the stream distance and then compute the affinity level as A_ij = 1/(1 + exp(dist(i, j))) for each edge (i, j) ∈ E.

We compare model performance to multiple baselines, including the physics-based PRMS-SNTemp model, artificial neural networks (ANN), an RNN with the LSTM cell, and the state-of-the-art PGRNN method [8], which uses simulation data to pre-train an LSTM model and then fine-tunes it with true observation data (denoted RNN_ptr). Since a region-specific calibration of PRMS-SNTemp is extremely time-consuming, a version with default parameter values is widely used in the hydrologic domain [28]. We provide a comparison with this version, referred to as the PRMS-SNTemp model. We evaluate three variants of the proposed method: RGrN (trained to minimize L_RGrN), PGRGrN_ptr (pre-training using the strategy in Section IV-B and fine-tuning to minimize L_RGrN, Eq. 7), and PGRGrN_ptr,ctr (pre-training using the strategy in Section IV-B and fine-tuning to minimize L_finetune, Eq. 13). All the ML models are trained on and applied to all the river segments (i.e., all models are global). In the following experiments, we train each ML model using data from the first 24 years (Oct 01, 1980 to Sep 30, 2004) and then test on the next 12 years (Oct 01, 2004 to Sep 30, 2016). The hidden representation in these ML models has 20 dimensions (the same for q_i^t). We set the learning rate to 0.0005 and update the model for 200 epochs for modeling water temperature and 300 epochs for modeling streamflow.

B. Overall prediction performance
We report the testing performance of different methods for temperature prediction and streamflow prediction in Table I and Table II, respectively. We also test the capacity of each model to learn from less training data by randomly selecting 0.1% and 2% of the labeled data from the first 24 years for training. For RNN_ptr, PGRGrN_ptr, and PGRGrN_ptr,ctr, we assume access to simulation data on every single date from Oct 01, 1980 to Sep 30, 2016, because such data can be generated by simply running the PRMS-SNTemp model on input drivers. We repeat each experiment five times with random model initialization and random selection of the sparser data (0.1%, 2%) and report the mean and standard deviation of the root mean squared error (RMSE).

We observe that the proposed method outperforms the baselines by a considerable margin (Tables I and II). The improvement from ANN to RNN shows that the recurrent component is helpful for capturing temporal patterns. RGrN performs better than RNN because it utilizes upstream-downstream dependencies, which are critical for an accurate accounting of temperature and streamflow.

We also observe that PGRGrN_ptr has much better performance than RGrN using just 0.1% or 2% of the data. This is because we leverage the physical knowledge to add supervision on multiple model components (i.e., target variables and transferred variables), and thus the model can learn representative latent variables without risking overfitting to a small amount of observations. PGRGrN_ptr,ctr further improves the performance by adjusting the pre-trained model to conform to the local patterns of each segment. The standard pre-training method is also helpful for the RNN model, as we can see from the improvement from RNN to RNN_ptr using simulation data.

In Fig. 3, we show several examples of predictions made by PRMS-SNTemp (SIM), RNN, and the proposed model. In temperature prediction (Figs. 3 (a) and (b)), PRMS-SNTemp captures the overall dynamic patterns of temperature change but always has a bias relative to true observations. RNN gets closer to true observations but does not perform as well as PGRGrN_ptr,ctr in capturing fluctuations of temperature changes.
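The sparse-label experiments above can be sketched with two small helpers; the function names and the NaN-masking convention for missing observations are ours, not from the paper.

```python
import numpy as np

def masked_rmse(pred, obs):
    """RMSE over only the (segment, date) pairs that have observations."""
    mask = ~np.isnan(obs)
    return float(np.sqrt(np.mean((pred[mask] - obs[mask]) ** 2)))

def subsample_labels(obs, frac, rng):
    """Keep a random fraction of the available labels (e.g., 0.1% or 2%)."""
    out = np.full_like(obs, np.nan)
    idx = np.argwhere(~np.isnan(obs))  # positions that have observations
    keep = rng.choice(len(idx), size=max(1, int(frac * len(idx))),
                      replace=False)
    for i, j in idx[keep]:
        out[i, j] = obs[i, j]
    return out
```

Repeating the subsampling with different random generators mirrors the five repeated runs behind the reported mean ± standard deviation.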
TABLE I. Root mean squared error (± standard deviation) for temperature modeling using 0.1% and 2% training labels. Rows in grey represent methods using simulation data. Here our method is compared with artificial neural networks (ANN), recurrent neural networks (RNN), and the RNN pre-trained using simulation data and fine-tuned using observations (RNN_ptr).

TABLE II. Prediction RMSE for streamflow modeling using 0.1% and 2% training labels. Here our method is compared with artificial neural networks (ANN), recurrent neural networks (RNN), and the RNN pre-trained using simulation data and fine-tuned using observations (RNN_ptr).
We can see similar results for predicting streamflow in segments with high flows (Fig. 3 (c)). Here PRMS-SNTemp produces a large bias and also a slow response. PGRGrN_ptr,ctr performs much better than both PRMS-SNTemp and RNN. However, for segments with low streamflows (Fig. 3 (d)), PRMS-SNTemp matches observations better than the ML models. This is because ML models optimize the overall performance, while low-flow stream segments (mostly headwaters) are a minority in the entire river network and contribute less to the loss function. In Fig. 3 (d), we show the results produced by variants of our method to study how the use of the contrastive loss and pre-training alters the predictions. We can observe that the predictions made by RGrN are almost constant over time, while PGRGrN_ptr can better capture the dynamics through pre-training but has an even larger gap with true observations. By adopting the contrastive loss, PGRGrN_ptr,ctr effectively reduces the bias on this low-flow segment.

To better understand how the performance varies across different types of river segments, we show the streamflow prediction errors for four types of segments, low (< 0.5 m³/s), medium-low (0.5-2 m³/s), medium-high (2-5 m³/s), and high (> 5 m³/s), in Fig. 4. RGrN and PGRGrN_ptr generally perform better than the other methods in medium-high and high-flow river segments but perform much worse in low-flow segments. In particular, we can see that RGrN has much larger errors than RNN over low-flow river segments. This is because neighboring river segments tend to have similar embeddings after graph convolution, and thus the training of RGrN pays even less attention to low-flow segments given that there are only a few low-flow segments in the river network. As shown in Fig. 4 (a), this issue is partly addressed by using the contrastive loss. An alternative solution to this issue is to intelligently select the most suitable model (e.g., PRMS-SNTemp or ML models) for different types of river segments, which we suggest as future work.

Fig. 3. (a)(b) Examples of temperature predictions by different methods. (c) An example of streamflow predictions in a high-flow river segment. (d) An example of streamflow predictions by variants of the proposed method in a low-flow river segment. Here SIM represents the simulated data produced by the PRMS-SNTemp model and OBS represents the true observation data. Here we compare the performance of recurrent neural networks (RNN), the proposed recurrent graph neural network (RGrN), the RGrN model using the pre-training strategy (PGRGrN_ptr), and the RGrN model using both pre-training and the contrastive loss (PGRGrN_ptr,ctr).

Fig. 4. Distribution of prediction errors in (a) low and medium-low segments, and (b) medium-high and high-flow segments.

C. Assessing performance on unobserved segments
Here we aim to test the performance of the models on segments that have no observation data (Tables III and IV). Such segments commonly exist in a real-world basin system. Seg A to Seg E are five river segments which have sufficient observation data for both stream temperature and streamflow. Each row shows the results for an individual experiment where we intentionally remove the temperature or streamflow observations for a specific segment during the training period (Oct 01, 1980 to Sep 30, 2004). Then we report the prediction performance of RNN, RGrN, and PGRGrN_ptr,ctr only on this segment during the testing period (Oct 01, 2004 to Sep 30, 2016).
TABLE III. RMSE of temperature prediction on individual segments after removing training observation data. Here we compare the performance of recurrent neural networks (RNN), the proposed recurrent graph neural network (RGrN), and the RGrN model using both pre-training and the contrastive loss (PGRGrN_ptr,ctr).
We can observe larger errors produced by all three models after we remove the training data for a segment. This is expected because segment-specific observation data has not been used to refine the model. However, we observe that the drop in performance of PGRGrN_ptr,ctr is consistently, and often significantly, smaller than that of the RNN model.

We notice that RGrN may not produce better streamflow predictions than RNN for every segment. For example, Seg A is a headwater and has no upstream river segments. Hence, the graph convolution is not helpful for modeling Seg A. Besides, the information propagated from neighbors may not always be helpful if neighboring segments have very different streamflow patterns. A potential area of future research is to develop new graph operators so that each segment can learn only from the neighbors that are most relevant.

Fig. 5. The predictions of the pre-trained (PGRGrN_pretrain) and fine-tuned (PGRGrN_ptr) models for (a) temperature and (b) streamflow, using 2% training data. Here we compare the performance of the RGrN model that is only pre-trained using simulation data (PGRGrN_pretrain) and the RGrN model that is pre-trained and then fine-tuned with observation data (PGRGrN_ptr).

D. Effectiveness of pre-training
In Fig. 5, we randomly select two example segments to show how predictions change from the pre-trained model (PGRGrN_pretrain) to the fine-tuned model (PGRGrN_ptr) using 2% training data.
TABLE IV. RMSE of streamflow prediction on individual segments after removing training observation data. Here we compare the performance of recurrent neural networks (RNN), the proposed recurrent graph neural network (RGrN), and the RGrN model using both pre-training and the contrastive loss (PGRGrN_ptr,ctr).
Here the model PGRGrN_ptr is the same as the PGRGrN_ptr model used in the previous results but is fine-tuned using only 2% of the observations. In contrast, PGRGrN_pretrain is only pre-trained using the strategy proposed in Section IV-B, without using observations for fine-tuning. Fig. 5 (a) shows that the pre-trained model matches the simulated temperatures very well and thus can capture general temperature patterns even without using observation data. There is still a gap between true observations and PGRGrN_pretrain, since PGRGrN_pretrain emulates PRMS-SNTemp, which has an inherent bias due to an incomplete representation of physics. Nevertheless, after learning general patterns from simulation data, the model can be fine-tuned to match true observations using much less training data. This can be verified as we show that PGRGrN_ptr fine-tuned with 2% of the data closes the gap between PGRGrN_pretrain and true observations. Similarly, Fig. 5 (b) shows that pre-training helps capture general streamflow patterns and fine-tuning effectively fixes the bias and the slow response of PRMS-SNTemp.
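The pre-train-then-fine-tune workflow just described can be sketched with a toy linear model standing in for the RGrN; all names are ours, and plain MSE stands in for the paper's loss functions.

```python
import numpy as np

def fit_mse(w, X, y, lr=0.05, epochs=500):
    """Plain gradient descent on the mean-squared-error loss."""
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)
        w = w - lr * grad
    return w

def pretrain_then_finetune(X_sim, y_sim, X_obs, y_obs, dim):
    """Stage 1: learn general dynamics from plentiful simulation output.
    Stage 2: fine-tune the same weights on sparse true observations."""
    w = fit_mse(np.zeros(dim), X_sim, y_sim)      # pre-training on simulations
    w = fit_mse(w, X_obs, y_obs, epochs=100)      # fine-tuning on observations
    return w
```

Even when the "simulator" is biased, starting fine-tuning from the pre-trained weights lets a small observation set correct the bias, which is the effect Fig. 5 illustrates.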
E. Spatial distribution of errors
In Fig. 6, we show the distribution across segments of the improvements of ML temperature and streamflow predictions over the physics-based PRMS-SNTemp simulations. In Fig. 6 (a), we can observe that RGrN and PGRGrN_ptr,ctr produce smaller temperature errors than RNN in many segments. We find one segment (in the dashed circle) where RNN produces worse predictions than PRMS-SNTemp but RGrN greatly improves the prediction. This is the only segment in the dataset for which we have no training data but sufficient testing data. The reason RGrN can produce better predictions for this unlabeled segment is that it leverages the dependencies with other segments to learn the temperature patterns even without training data from this specific segment.

For streamflow prediction (Fig. 6 (b)), it can be seen that RGrN and PGRGrN_ptr,ctr have lower RMSE in several high-flow segments (e.g., the segments in red dashed circles). However, RGrN performs worse than RNN on headwater segments (in black dashed circles), and these segments have lower streamflow. Compared with RGrN, PGRGrN_ptr,ctr alleviates this issue and produces smaller errors in these low-flow segments by using the contrastive loss.
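A map like Fig. 6 rests on a per-segment error comparison, which can be sketched as follows; this is an illustrative helper with our own naming, where negative values mean the ML model beats the physics-based simulation.

```python
import numpy as np

def rmse_improvement(pred_ml, pred_sim, obs):
    """Per-segment RMSE(ML) - RMSE(physics), skipping missing observations.

    All inputs are (n_segments, n_dates) arrays; obs uses NaN for missing.
    """
    diffs = []
    for s in range(obs.shape[0]):
        mask = ~np.isnan(obs[s])
        if not mask.any():                 # no test data for this segment
            diffs.append(np.nan)
            continue
        rmse_ml = np.sqrt(np.mean((pred_ml[s, mask] - obs[s, mask]) ** 2))
        rmse_sim = np.sqrt(np.mean((pred_sim[s, mask] - obs[s, mask]) ** 2))
        diffs.append(rmse_ml - rmse_sim)
    return np.array(diffs)
```

Segments with too few test observations (such as the single-observation segments excluded from Fig. 6) would be filtered before plotting.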
F. Generalization test
Generalizability is important for scientific problems because most observation data can be collected only in certain periods or at locations where it is easier to deploy sensors. Hence, we test model generalizability for both streamflow and stream temperature prediction. In particular, we test the model performance on streamflow prediction using training data only from segments with higher average streamflows in the first 24 years (Oct 1980 to Sep 2004). Then we report the RMSE over the next 12 years for segments with lower average streamflows in Table V. We also include the testing performance of the model trained using all the observations from the first 24 years as a baseline in the second column of Table V. Similarly, it is challenging for traditional ML models to generalize to predicting temperatures in a season that was not included in the training data. We train each model using data only from colder seasons (spring, fall, and winter) in the first 24 years and then test on summers in the next 12 years, as shown in Table VI.

Note that PGRGrN_ptr,ctr always performs better than the other methods because incorporating physical knowledge and minimizing the contrastive loss forces the model to learn generalizable patterns for each segment. It can be seen that ANN has large errors (especially in Table V) when applied to scenarios that are very different from the training data. This is because ANN focuses only on the mapping from input features to target variables, while physical phenomena are often driven by spatial and temporal processes. Such a gap makes ANN prone to overfitting the training data. In contrast, RGrN performs better than ANN and RNN in temperature prediction because it leverages dependencies over both space and time, and thus has a better chance of learning generalizable patterns. However, the performance of RGrN becomes worse when predicting segments with lower streamflows. The graph convolution makes the extracted hidden representation (h_i^t) more similar for neighboring segments, and thus RGrN simulates streamflows with dynamic patterns similar to those of high-flow segments when it is trained using observation data from high-flow segments only.

G. Sensitivity tests
Here we test the sensitivity of the model to different hyper-parameter settings. We also test the performance of the model with different time delays (Eq. 5).

In Figs. 7 (a) and (b), we show the prediction performance using 2% of the data with different values of λ (Eq. 11) and γ (Eq. 13). We can see that the model is relatively robust to different values of the hyper-parameters, but there is a slight increase in RMSE if we set γ and λ to be either too small or too large. With a small γ, the model puts less weight on the contrastive loss and thus produces worse performance using limited (2%) observation data. With a large γ value, we put a higher weight on the contrastive loss, but the model is more likely to be impacted by the bias in the simulation data. On the other hand, a smaller value of λ makes the model ignorant of the supervision on transferred variables, while a much larger value of λ results in less focus on the target variables. It is worth mentioning that our adopted hyper-parameters may not be optimal on the testing set because they are tuned using the simulation data.

In Figs. 7 (c) and (d), we use different lengths of time delay, from one time step to five time steps, in Eq. 5. The time delay indicates how many days prior the information from upstream segments influences the downstream segments. We can see a general increase in RMSE when we set a larger time delay.

Fig. 6. Prediction errors across different segments for (a) temperature prediction and (b) streamflow prediction. Here we show the error as the RMSE of each method minus the RMSE of the PRMS-SNTemp simulation data to get a better contrast. Darker blue indicates better performance of the ML model over the physics-based model. In (a), we do not discuss the red-colored segment at the bottom (predicted poorly by all three methods) and the headwater (75.7°W, 39.95°N) predicted poorly by RNN because they have just one test observation.

TABLE V. Streamflow RMSE on segments with lower average streamflows from Oct 2004 to Sep 2016. Each model is trained using observation data from segments with higher average streamflows, or using all the observation data, from Oct 1980 to Sep 2004. Here our method is compared against artificial neural networks (ANN), recurrent neural networks (RNN), and the RNN that is pre-trained using simulation data and then fine-tuned using observation data (RNN_ptr).

TABLE VI. Temperature RMSE in summers from Oct 2004 to Sep 2016. Each model is trained using observation data from colder seasons, or using all the observation data, from Oct 1980 to Sep 2004. Here our method is compared against artificial neural networks (ANN), recurrent neural networks (RNN), and the RNN that is pre-trained using simulation data and then fine-tuned using observation data (RNN_ptr).

Fig. 7. Performance using different values of (a) γ and (b) λ. Performance with different time delays for (c) temperature and (d) streamflow prediction. In (c) and (d) we show the performance of the proposed recurrent graph neural network (RGrN) and the RGrN model using both pre-training and the contrastive loss (PGRGrN_ptr,ctr).

VI. CONCLUSION
In this paper, we propose a novel method, PGRGrN, for modeling interacting segments in river networks. We leverage the prior physical knowledge about segment-to-segment interactions embedded in physics-based models to enhance the learning of latent representations in the proposed ML model. Moreover, we improve the loss function to optimize both the overall performance over the river network and the local performance on each individual river segment. We have demonstrated the superiority of the proposed method in handling the scarcity of labeled data and in generalizing to unseen scenarios.

In addition to modeling variables in river networks, the proposed method can be adjusted to model other complex systems that involve interacting processes. For example, this method could potentially be used in materials discovery, biological research, and quantum chemistry to capture interactions between different atoms or molecules.

While our method performs much better than existing models, it remains limited in precisely predicting special segments (e.g., unobserved segments and segments with extremely low streamflows). To advance understanding, future ML modeling efforts may consider the uncertainty of the global ML model and determine whether ML should be used to replace physics-based models in different situations.

VII. ACKNOWLEDGMENTS
Any use of trade, firm, or product names is for descriptive purposes only and does not imply endorsement by the U.S. Government.

REFERENCES

[1] D. Graham-Rowe, D. Goldston, C. Doctorow, M. Waldrop, C. Lynch, F. Frankel, R. Reid, S. Nelson, D. Howe, S. Rhee et al., "Big data: science in the petabyte era," Nature, vol. 455, no. 7209, pp. 8–9, 2008.
[2] T. Jonathan, A. Gerald, B. Sandrine et al., "Special online collection: dealing with data," Science, vol. 331, no. 6018, pp. 639–806, 2011.
[3] T. J. Sejnowski et al., "Putting big data to good use in neuroscience," Nature Neuroscience, vol. 17, no. 11, p. 1440, 2014.
[4] A. Ravindranath, N. Devineni, and P. Kolesar, "An environmental perspective on the water management policies of the upper Delaware River basin," Water Policy, vol. 18, no. 6, pp. 1399–1419, 2016.
[5] U. Lall, "Debates: the future of hydrological sciences: A (common) path forward? One water. One world. Many climes. Many souls," Water Resources Research, vol. 50, no. 6, pp. 5335–5341, 2014.
[6] H. V. Gupta and G. S. Nearing, "Debates: the future of hydrological sciences: A (common) path forward? Using models and data to learn: A systems theoretic perspective on the future of hydrological science," Water Resources Research, 2014.
[7] J. J. McDonnell and K. Beven, "Debates: the future of hydrological sciences: A (common) path forward? A call to action aimed at understanding velocities, celerities and residence time distributions of the headwater hydrograph," Water Resources Research, vol. 50, no. 6, pp. 5342–5350, 2014.
[8] X. Jia, J. Willard, A. Karpatne, J. Read, J. Zwart, M. Steinbach, and V. Kumar, "Physics guided RNNs for modeling dynamical systems: A case study in simulating lake temperature profiles," in Proceedings of the 2019 SIAM International Conference on Data Mining. SIAM, 2019, pp. 558–566.
[9] J. S. Read, X. Jia, J. Willard, A. P. Appling, J. A. Zwart, S. K. Oliver, A. Karpatne, G. J. Hansen, P. C. Hanson, W. Watkins et al., "Process-guided deep learning predictions of lake water temperature," Water Resources Research, 2019.
[10] B. Anderson, T. S. Hy, and R. Kondor, "Cormorant: Covariant molecular neural networks," in Advances in Neural Information Processing Systems, 2019, pp. 14510–14519.
[11] N. Muralidhar, M. R. Islam, M. Marwah, A. Karpatne, and N. Ramakrishnan, "Incorporating prior domain knowledge into deep neural networks," in . IEEE, 2018, pp. 36–45.
[12] J.-X. Wang, J.-L. Wu, and H. Xiao, "Physics-informed machine learning approach for reconstructing Reynolds stress modeling discrepancies based on DNS data," Physical Review Fluids, 2017.
[13] D. Liu and Y. Wang, "Multi-fidelity physics-constrained neural network and its application in materials modeling," Journal of Mechanical Design, vol. 141, no. 12, 2019.
[14] J. Park and J. Park, "Physics-induced graph neural network: An application to wind-farm power estimation," Energy, 2019.
[15] F. Kratzert, D. Klotz, M. Herrnegger, A. K. Sampson, S. Hochreiter, and G. S. Nearing, "Toward improved predictions in ungauged basins: Exploiting the power of machine learning," Water Resources Research, 2019.
[16] Z. Moshe, A. F. Metzger, F. Kratzert, G. Elidan, S. Nevo, and R. El-Yaniv, "HydroNets: Leveraging river structure for hydrologic modeling," 2020.
[17] Y. Qi, Q. Li, H. Karimian, and D. Liu, "A hybrid model for spatiotemporal forecasting of PM2.5 based on graph convolutional neural network and long short-term memory," Science of the Total Environment, 2019.
[18] T. Xie and J. C. Grossman, "Crystal graph convolutional neural networks for an accurate and interpretable prediction of material properties," Physical Review Letters, vol. 120, no. 14, p. 145301, 2018.
[19] D. Zhu et al., "Understanding place characteristics in geographic contexts through graph convolutional neural networks," Annals of the American Association of Geographers, 2020.
[20] J. S. Read et al., "Process-guided deep learning predictions of lake water temperature," Water Resources Research, 2019.
[21] S. L. Markstrom et al., "PRMS-IV, the precipitation-runoff modeling system, version 4," USGS Techniques and Methods, no. 6-B7, 2015.
[22] M. J. Sanders, S. L. Markstrom, R. S. Regan, and R. D. Atkinson, "Documentation of a daily mean stream temperature module: an enhancement to the precipitation-runoff modeling system," U.S. Geological Survey, Tech. Rep., 2017.
[23] J. H. LaFontaine et al., "Application of the precipitation-runoff modeling system (PRMS) in the Apalachicola-Chattahoochee-Flint river basin in the southeastern United States," USGS Scientific Investigations Report, 2013.
[24] R. S. Regan, S. L. Markstrom, L. E. Hay, R. J. Viger, P. A. Norton, J. M. Driscoll, and J. H. LaFontaine, "Description of the national hydrologic model for use with the precipitation-runoff modeling system (PRMS)," U.S. Geological Survey, Tech. Rep., 2018.
[25] T. N. Williamson, J. G. Lant, P. Claggett, E. A. Nystrom, P. C. Milly, H. L. Nelson, S. A. Hoffman, S. J. Colarullo, and J. M. Fischer, "Summary of hydrologic modeling for the Delaware River basin using the water availability tool for environmental resources (WATER)," U.S. Geological Survey, Tech. Rep., 2015.
[26] U.S. Geological Survey, "National water information system data available on the world wide web (USGS water data for the nation)," 2016.
[27] E. K. Read et al., "Water quality data for national-scale aquatic research: The water quality portal," Water Resources Research, 2017.
[28] R. S. Regan et al.