Evaluation of the Delta Simulation Model-2 in Computing Tidally Driven Flows in the Sacramento–San Joaquin Delta

Author(s): Sridharan, Vamsi K.; Monismith, Stephen G.; Fringer, Oliver B.; Fong, Derek A. | Abstract: We investigate the fidelity of the Delta Simulation Model-2 (DSM2), a one-dimensional branched network hydrodynamics solver, which is used to model water quality and ecology in the Sacramento–San Joaquin Delta estuary. We find that while DSM2 reproduces the total flows well, it does not accurately represent the harmonic components of the tides and tidal modulation of subtidal flow. The inaccurate representation of tidal dynamics affects prediction of subtidal flows, flow splits at key junctions, and salinity. These deviations are the result of coarse spatial and temporal representation of tides as well as unrepresented estuarine physical processes. We propose and evaluate two types of schemes intended to improve fidelity: modifying the model domain and specifying fine grid and boundary conditions, and incorporating and parameterizing more complex physical processes into the 1-D model. We also develop a comprehensive protocol to evaluate the model in which we assess the fidelity of model results. In this protocol, we also include a decomposition of the model error into a systematic component because of model representation, and an unsystematic component, which includes errors from both unmodeled physical processes and data precision. Our analysis reveals that these recommendations would be effective provided they can be incorporated with model recalibration. Both our proposed schemes and the model evaluation process will be useful in analyzing models of networked surface water systems such as the Delta in which the distribution of observations is spatially inhomogeneous.


C1. TIME PERIOD USED TO EVALUATE MODEL PERFORMANCE
CDWR recommended that the summer of 2015 should be used to evaluate the model performance only if challenging circumstances are being highlighted. During 2015, voluntary water withdrawal limitations were in effect and an emergency drought barrier was installed in the Delta that significantly altered tidal behavior in the central Delta. Since the model setup we received from CDWR accounts for the barrier, we did not feel it was necessary to explicitly discuss these special circumstances. However, in the interest of completeness, we acknowledge that the barrier placement and reduced exports were accounted for in the model setup and that the reduced exports add to uncertainty in the modeling of channel depletions and net flow. However, we are justified in evaluating the model performance during this period for two reasons: i) The extremely low-flow period is a stress-test of model ability to accurately represent the tides.
ii) The altered physical configuration of the Delta tests the ability of the model to reproduce tidal phenomena in a regime that differs from the one in which it was calibrated. The fact that the model needs to be recalibrated when a system change occurs reveals that the model is more phenomenological than mechanistic. We discuss the numerical considerations in DSM2 in Section 5.

C2. MODEL REPRESENTATION
We had incorrectly reported that the bottom friction term in DSM2 is linearized. The correct, nonlinear bottom friction term in the Saint-Vénant equation should be

C3. BOUNDARY CONDITION RESOLUTION FOR THE STANDARD MODEL SETUP
We had incorrectly reported that the standard configuration in which the DSM2 model is run with a 15-minute time-step size is forced at Martinez with hourly tidal stage from the California Data Exchange Center (CDEC). Instead, the model in its standard configuration is forced with CDWR data over 15-minute intervals at Martinez. For phasesensitive analysis, CDWR has started recently to recommend that this be the data of record at this station. This data has been recently made publicly available through a data liaison for the Division of Environmental Services (DES) listed on the CDEC web page. In Table 2 in the paper, the boundary condition time-step size should be 15 minutes for the G1T15B1 model configuration. Correspondingly, the resolution condition should be R = 1. In Section 5.1 on page 20 of the original manuscript, "A possible explanation is that model configurations generally produced the best overall results when R was close to 1." is not the likely explanation to justify the lower aggregate error with the G1T60B1 model configuration. Instead, this is better explained by the fact that the model calibration with a very diffusive value of θ behaves better, as it is more stable, with the larger Courant number condition for the configuration that employs a time-step size of one hour (See Section C5 for additional details).
One of our analyses specifically compared NOAA and CDWR stations at the Martinez Amorco pier. We have revisited our analysis using the QA/QC'd data supplied by CDWR Delta Modeling Section. This data is the same as the DES data cited above, but has received a small amount of screening that affects less than 1% of data points. Comparing data from the two stations shows that the two generally agree reasonably well. Therefore, we acknowledge that the difference between the frequency content in the NOAA data and the CDWR data is smaller than we originally reported, and that much of this difference is contained in the higher than semi-diurnal frequency bands.
Since the model is forced with 15-minute rather than hourly CDEC data at Martinez, the spectrogram of the 15-minute CDEC data (which has been quality controlled by CDWR), as opposed to hourly data as in the original paper is shown in Figure C1.
This figure replaces Figure 5 in the paper. There is still an obvious lack of tidal energy in the CDEC data when compared to the NOAA data that is reflected by a range of 0.1 to 0.7 feet in water level, particularly at frequencies higher than the semidiurnal band. The discrepancy is well above the instrument error and cannot be explained by relative instrument positioning between the NOAA buoy and CDEC data. This indicates that a time-step size of 15 minutes, both in the forcing and in the model, may not adequately resolve higher-frequency mechanisms that drive tidal residual circulation and tidal dispersion.
iii AUGUST 2018 https://doi.org/10.15447/sfews.2018v16iss2art6 Therefore, we recommend that a smaller time-step size be used and that the NOAA data be used to force DSM2's tidal boundary when the model domain ends at Martinez.

C4.1 Daylight Savings in the CDEC Data
The CDEC stage boundary conditions supplied by CDWR with the DSM2 setup files, as well as the RTC buoy and Amorco Pier NOAA buoy data are all corrected for daylight savings. Therefore, the DSM2 model output is also corrected for daylight savings. However, the data obtained directly from the CDEC website is not corrected for daylight savings. This has been fixed, and new results are presented in Section C4. 4. We note that while there are caveats in the CDEC website as to the preliminary nature of the data, there is no documentation indicating daylight savings time is not correctly accounted for in the CDEC data labeled as Pacific Standard Time.

C4.2 Quality Control of Datasets
Based on conversations with Jon Burau of the USGS and Deanna Sereno of the Contra Costa Water District, we discovered that the CDEC data is not quality controlled, i.e., it includes artefacts due to numerical operations on the raw data collected from instruments (discussed below) and does not include censoring of data beyond instrument ranges. Therefore, although it is generally deemed unsuitable for analyzing tidal processes, we performed a very rigorous cleanup of the dataset and subjected our model results to the same filters. Hence, barring the timestamp inconsistency (see Section C4.3), any other data quality issues should not appear in our original analysis. However, to avoid any further confusion, in this corrigendum, we decided to use the original quality controlled data from the USGS (NWIS, 2008) at the stations reported in Table C1.
Stages in the USGS datasets are reported as height above the gauge. At the stations where we do not report the stage, the height of the gauge above the NAVD88 reference datum was unavailable. We estimate the water column depth from the USGS stage data with ft. Figure 5] Spectrograms of the boundary condition time series at Martinez. Top panel: 15-minute CDEC data, and bottom panel: 6-minute NOAA buoy data. The energy content of various tidal harmonics (equivalent to the contribution of that harmonic to the tide height) is better resolved with the NOAA data set than with the CDEC data set.

Figure C1 [Replaces
iv VOLUME 16, ISSUE 2,ARTICLE 6 (2) When the gauge depth or gauge height above the datum were reported relative to the NGVD29 datum, we corrected the datum shift using the NOAA orthometric height correction (NGS, 2004). Because of variations in bed depths across the cross-section, this matching of local instrument depths to DSM2 depths provides only a rough placement of the station, accurate to perhaps +/-4 feet, at locations where the USGS did not directly measure the gage depth but used topographic maps. However, at other stations, the accuracy is +/ 0.02 feet (personal communication with USGS). This, the uncertainty is of secondary importance when comparing tidal amplitudes.
We have learned that the 15-minute flow data reported by the USGS utilizes the product of a rating curve for the average water velocity as a function of the instrument velocity, and another rating curve for the cross-sectional area as a function of the stage. These rating curves are often quadratic or cubic polynomials and, as a result, are likely to introduce artificial oscillations in flow at the frequencies of the tides and overtides being estimated. This can further complicate the analysis, particularly for nonlinear processes including overtides and residual flow where the signal to noise ratio can be low. We note that this information on how the flow data are reported is not documented clearly, and that model evaluations can only be performed up to the quality of the source data. We also note that we have been careful in the original paper to treat noise in the data conservatively (see Section 4.3

in Appendix A).
We urge CDWR to make such information explicitly available to model users.

C4.3 Timestamp Discrepancy in our Analysis
We noticed a timestamp discrepancy in our analysis that was referencing observations from different periods of time than the data to which it was compared. This was causing a significant discrepancy in the paper. For example, in Figure 15 in the paper, we had indicated that in the M 2 component of flow, there was a phase lag between the model and the data at Rio Vista of about 150° corresponding to a time lag of about 4-8 hours. This indicates almost a phase inversion between the model prediction and observations of the M 2 component of the tide. This would also have been evident in the total signal.
In this corrigendum, we correct both the daylight savings and our timestamp assignment error and report the comparisons for the USGS stations in Table C1. While the model fidelity is much better than we had originally reported, there is still a significant error in the amplitudes of the tides and a phase error that increases landward (for instance, the error in both water level and flow increases landward along the Sacramento River in the tidal bands in Tables C2 and C3). We explore the possible reasons for the remaining errors and provide potential solutions in Section C5 below.

C4.4 Revised model evaluation results
Corrections related to the daylight savings inconsistency and timestamp error resulted in changes to the model comparisons in Figures 14 and  15 in the paper and A.6.9 and A.6.10 in Appendix A.
We adjust the values reported in these figures in the following tables. We note that we do not expect the summary statistics of the total signals over the entire seven year period we investigated or the subtidal quantities such as the mean flow, Stokes' drift and spring-neap oscillations to deviate from the values we presented in the original paper.
From Tables C2 and C3, we see that the phase error in DSM2 is not as large as we reported in the original paper. However, there is a trend of increasing error in the various tidal constituents as the tide propagates landward. With these new data, there are no significant differences in the errors among the various model configurations, although configurations G1T60B1 and G1T15BB produced the least error over the maximum number of comparisons. These issues are endemic to the model construction, as we discuss in Section C5.

C5. DSM2 ERROR ANALYSIS AND RECOMMENDATIONS
Linear stability theory dictates that the FourPt scheme used in DSM2 is unconditionally stable when the implicitness parameter satisfies θ ≥ 0.5. Ideally, one would employ a value of θ = 0.5 since this ensures no numerical damping and leads to secondorder accuracy in time. However, unless they are damped with θ > 0. 5 (DeLong et al., 1997), nonlinear effects lead to high-frequency, spurious oscillations due to the inherent dispersive error of the FourPt scheme. The dispersive error leading to the spurious oscillations is a function of the Courant number where Δx is the grid resolution, Δt is the time-step size, and c 0 is the shallow-water wave speed, and the wave resolution factor, WRF = kΔx (4) where k is the wavenumber of the process of interest. The Courant number can be thought of as a ratio of the time-step size, Δt, to the shortest time scale on the grid, Δx/c 0 , while WRF is the analog to the Courant number in space, since it is a measure of the ratio of the grid resolution, Δx, to the length scale of the process of interest, k -1 . Interestingly, the ratio of the time-step size to the time scale of the process of interest, ω 0 -1 , is given by ω 0 Δt = C(kΔx) after noting that the dispersion relation implies ω 0 = c 0 k. Therefore, resolution in time depends both on a small Courant number and WRF. In DSM2, a typical Courant number is C = 6 based on a shallow-water wave speed of c 0 ≈ 10 ms -1 (assuming a depth of 10 m), and time-step size of Δt = 15 minutes = 900 s, and a grid resolution of Δx ≈ 1500 m. Depending on the process (discussed below), WRF can range between a small number and the Nyquist limit (the resolution limit of at least two grid cells spanning each wave) or maximum value of WRF = O(1). Despite relatively large values of the Courant number and WRF, which limit accuracy and stability on reach-length waves between junctions, DSM2 is reported to be stable with values of θ= 0.50625 (based on conversations with Drs. Prabhjot Sandhu and Eli Ateljevich). However, there are potential scenarios associated with floods or sudden discontinuities due to manmade structures where instabilities can arise with such low values of θ. As a result, use of large values of the Courant number and WRF lead to spurious oscillations that must be damped with a relatively large value of θ = 0.6 as suggested in DSM2. While stable, this large value leads to excessive damping of the tides as exhibited by the increased error in the tidal constituents moving landward, as indicated in Tables C2 and C3. The need to damp the FourPt scheme with a large value of θ is not surprising given that the unsteadiness associated with the tides is not accurately resolved when the Courant number and WRF are of O(1).
The potential effect of WRF on the dispersive error in DSM2 can be quantified if we consider a typical M 4 tide with period T = 6.21 hours propagating with wave speed c 0 = 10 ms -1 . This tide has a wavelength λ = c 0 T = 224 km and a wavenumber k = 2π/λ = 2.8 × 10 -5 m -1 , giving WRF = 0.04  (assuming Δx =1500 m). Such a small WRF implies that this wave is accurately resolved by DSM2, as it should given that its dynamics are linear and it is long relative to the grid spacing. However, as such waves propagate through the Delta they encounter multiple junctions with length scales of O(10 km), producing waves that are created and destroyed between these junctions, as described by Schwartz (2015) and sketched in Figure C2. These waves have wavelengths of O(10 km), giving WRF ≈ 1, which is close to the Nyquist WRF. The result is that these junction-scale waves are not being resolved in DSM2 and are likely one of the main reasons the model requires a large value of θ to be stabilized. We note that weakly nonlinear effects lead to higher harmonics and short wavelength waves throughout the Delta, and these waves are likely to be propagated with less accuracy than the main tidal constituents. Therefore, we should not expect processes that depend on weakly nonlinear effects, such as tidal residual flows, to be as accurately resolved in DSM2.
Given that errors in DSM2 depend on both the Courant number and WRF, one might expect the results to improve with a decreased time-step size on the same grid. However, tests show no significant xi AUGUST 2018 https://doi.org/10.15447/sfews.2018v16iss2art6 differences in model results for the phase and amplitude of the M 2 and S 2 constituents with time step sizes of 60 min, 15 min, and 5 min. This suggests that errors associated with the larger reach-scale WRF and θ = 0.6, or associated with other more routine model accuracy limiters (bathymetry and calibration) dominate the solution, preventing it from improving with decreased time-step size. It is important to note that, with decreased time-step size, the temporal resolution of boundary forcing should also increase. One alternative is to use 6-minute NOAA Amorco Pier buoy data.
It would be relatively straightforward to reduce the errors associated with computing the tides in DSM2 simply because these errors depend so heavily on the Courant number and WRF and not on any inherent deficiency associated with DSM2. On the contrary, the FourPt scheme and the other numerical methods in DSM2 are well suited to simulate tidal flows in the Delta. We suggest running DSM2 with a resolution of ~200 m, giving an average junction-scale WRF = O(0.1). The corresponding time step needed to ensure C = O(0.1)would require Δt ≈ 20 s. Incorporating higher resolution would have the additional advantage of necessitating updated, higher-resolution bathymetry. We note that this time and space refinement together represent an increase in the computational burden for DSM2 of roughly two orders of magnitude (45 times as many time steps and 8 times as many grid cells) which may limit its usability on small personal computers. However, DSM2 with this resolution would still represent a significant performance gain over multidimensional models on desktop or parallel computers. Finally, data sources with higher temporal resolution would be needed at the boundaries. Ultimately, higher resolution and a smaller Courant number would then enable use of a smaller implicitness parameter, thus preventing the FourPt scheme from overdamping the tides.

Figure C2
Schematic of the formation of junction-trapped waves due to the different impedances between different reaches of an estuary. The resulting reflected junction-tapped waves typically have wavelengths equal to twice the junction separation and identical phase as the incident wave (Schwartz, 2015).

ABSTRACT
We investigate the fidelity of the Delta Simulation Model-2 (DSM2), a one-dimensional branched network hydrodynamics solver, which is used to model water quality and ecology in the Sacramento-San Joaquin Delta estuary. We find that while DSM2 reproduces the total flows well, it does not accurately represent the harmonic components of the tides and tidal modulation of subtidal flow. The inaccurate representation of tidal dynamics affects prediction of subtidal flows, flow splits at key junctions, and salinity. These deviations are the result of coarse spatial and temporal representation of tides as well as unrepresented estuarine physical processes. We propose and evaluate two types of schemes intended to improve fidelity: modifying the model domain and specifying fine grid and boundary conditions, and incorporating and parameterizing more complex physical processes into the 1-D model. We also develop a comprehensive protocol to evaluate the model in which we assess the fidelity of model results. In this protocol, we also include a decomposition of the model error into a systematic component because of model representation, and an unsystematic component, which includes errors from both unmodeled physical processes and data precision. Our analysis reveals that these recommendations would be effective provided they can be incorporated with model recalibration. Both our proposed schemes and the model evaluation process will be useful in analyzing models of networked surface water systems such as the Delta in which the distribution of observations is spatially inhomogeneous.

KEY WORDS
Tides, estuarine dynamics, shallow water wave equations, numerical modeling, Delta Simulation Model II, DSM2, error analysis, model performance evaluation, San Francisco Bay-Delta.

INTRODUCTION
The success of water management and ecosystem restoration efforts in the Sacramento-San Joaquin Delta (hereafter, the Delta) depend on correctly understanding the system-scale scalar transport and fate dynamics (Cloern 2007). Often, this understanding is gained through the use of numerical models. The California Department of Water Resources' (CDWR) one-dimensional (1-D) Delta Simulation Model II (DSM2)-the primary hydrodynamic model used for planning and management purposes in the Delta (e.g, Liang et al. 2011;Rose et al. 2013;Tu 2012;Smith 2014)predominantly reproduces only instantaneous total flows and water levels with high accuracy (see Sridharan 2015). However, it does not represent four aspects of the tidal dynamics of the system well (see the Appendix A for a description of the terms used herein): (1) the subtidal or riverine component of the flow; (2) astronomical tidal harmonic components (such as K 1 , M 2 , etc.) as well as overtides (e.g, M 4 ) and compound tides (e.g, MK 3 ) (Nidzieko 2010); (3) modulation of the subtidal flows by tides (MacCready 1999), and (4) variability in the water column depths and flows over the fortnightly spring-neap lunar cycle (Monismith 2016). (Hereafter, we refer to (1)-(4) collectively as tidal effects). Inaccurate representation of the tidal effects affects the fidelity of the model solution. This limitation in representing the tidal effects, previously unreported in both the DSM2 methods literature (e.g, CH2MHill 2009), as well as the applications literature (e.g., Kimmerer and Nobriga 2008;Perry et al. 2015), may have significant implications for water management in the Delta.
In this paper, we evaluate DSM2's hydrodynamic and water quality solvers, so that appropriate inferences may be drawn from its results. Specifically, we investigated its reproduction of (1) instantaneous hydrodynamic and water quality fields, (2) tidal effects, and (3) key regulatory flow parameters. The latter is important because Delta water managers routinely use DSM2 as a decision support tool (Fleenor and Bombardelli 2013).
To perform this investigation, we developed a comprehensive model-evaluation protocol in which we incorporated (1) robust aggregate metrics of overall performance; (2) a decomposition of the spatially localized Root Mean Squared (RMS) errors between model results and observations into systematic or model representation-induced components (biases), and unsystematic (rather than non-systematic or purely random noise [Taylor 1997]) or unmodeled physical processes-induced components (biases) and data precision errors (random noise) (Willmott 1981); and (3) succinct target diagrams (Jolliff et al. 2009) and field difference maps between observed and modeled quantities that describe spatial and temporal trends.
Given the continued utility of simple 1-D models (e.g., Savenije 2005;Savenije et al. 2008), maximizing the likely accuracy of models such as DSM2 is worthwhile. To this end, we proposed a suite of improvements based on a review of the literature on more complex as well as better-performing models, and evaluated these improvements using the protocol we described above. The first set of these recommendations improve model fidelity; the second set are useful for informing management actions. These recommendations, which we implemented without recalibrating the model, significantly improved the model's error response characteristics. This indicates that with proper recalibration, these recommendations would improve the fidelity of the model, as well. We briefly describe these recommendations below.
First, it is good modeling practice to meet a Courant-Freidrichs-Lewy (CFL) condition of about 1 (Ferziger and Peric´ 1997), and to match the temporal resolution of the tidal boundary conditions and the model time-step size (hereafter, the resolution criterion, R) (Sobey 2001;Fringer et al. 2006). For information propagating at speed u in a time-step of size Δt through a grid of resolution Δx from a boundary condition with resolution Δt BC , these conditions are In the Delta, as a typical tidal wave propagates at a speed of c gH 10ms 1 (for a typical water column depth of 10 m; e.g., see Kundu and Cohen 2002), and in 15 minutes, the typical time-step size https://doi.org/10.15447/sfews.2018v16iss2art6 in DSM2, it moves about 10 km inland, a significant portion of the Delta. Therefore, a smaller model timestep size is required to resolve the tide adequately.
Second, the addition of an idealized representation of San Francisco Bay (hereafter, the Bay) to allow tidal information to radiate past Martinez is advisable in systems such as the Delta (Sobey 2001). An idealized Bay would also be expected to allow the model to respond to spring-neap oscillations in the sea-level of the coastal ocean more accurately, particularly when the model is being used as a forecasting tool. We also propose some ad hoc parameterizations to represent unmodeled physical processes: viz, flows in shoalchannel systems and intertidal mudflats, and windand density-driven flows. Finally, we also propose simple modifications in the numerical approximations in the model. These recommendations, as well as the model evaluation protocol, would be useful for evaluating and improving 1-D models in general.
This paper is organized as follows: in Section 2, we describe the Delta and its hydrology; in Section 3, we provide a brief overview of estuarine hydrodynamics; in Section 4, we describe the DSM2 model and outline the scope of our evaluation of the model; in Sections 5 through 7, we present the results, synthesize them, and report findings that apply to future applications of this model as well as to other 1-D shallow-water models.

SACRAMENTO-SAN JOAQUIN DELTA
The Delta is a critical component of water management for agriculture, fisheries, ecosystems, transport, industry, and recreation in California. Historic land use patterns and water operations have resulted in the ongoing advection of sediment and dissolution of various compounds in the Delta (Barnard et al. 2013). It features diverse ecosystems (Lucas et al. 2002) that are home to several native as well as invasive species (Feyrer and Healey 2003). More urgently, aging infrastructure in the Central Valley (Mount and Twiss 2005) and rising sea levels resulting from climate change (Barnard et al. 2013) are serious concerns. The combination of these hydrological, ecological, and social factors has made this one of the most extensively studied surface water systems in the world.

Delta Geography and Morphology
The Delta is an inverted fan estuary at the confluence of the predominantly rain-fed Sacramento River and the snowmelt-fed San Joaquin River, and smaller rivers such as the Mokelumne in the east (Fleenor et al. 2010) (Figure 1). The Delta converges at Martinez in the Carquinez Strait, which links to the Bay and then the Pacific Ocean via the mouth of the estuary at the Golden Gate. The average daily river flow ranges from about 150-2,200 m 3 s -1 . While flows in both the Sacramento and San Joaquin rivers are heavily regulated, the Sacramento River contributes almost 90% of the inflow to the Delta (Kimmerer 2004). The tides are semi-diurnal in nature (Monsen 2000) with a net tidal flow at Martinez of about 5,000-13,000 m 3 s -1 and a mean tidal range of 1.8 m (Kimmerer 2004). Both the tidal signal and subtidal transport attenuate rapidly eastward into the Delta (Monsen 2000;Sridharan 2015).
The Delta is morphologically complex, with numerous shallow and wide channels ) and leveed and submerged islands (Monsen 2000), and confluences at its oceanward end into a shallow embayment (Suisun Bay), which is bordered by an intertidal brackish wetland (Suisun Marsh). In the northwest, the Yolo Bypass floodplain is maintained to divert water from the Sacramento River when it floods and overtops a series of weirs upstream of the Delta (Sommer et al. 2011). The Sacramento River bifurcates into its mainstem, and three sloughs-Sutter, Steamboat, and Georgiana-in the North Delta. The San Joaquin River bifurcates into its mainstem and the Old and Middle rivers (OMR) in the South Delta ( Figure 1). The Old and Middle rivers are linked by a series of canals that divert water to the pumps ( Figure 1). This region is known as the OMR corridor (Fleenor et al. 2010).
Flow, sediment, salt, and biota are channeled through several critical flow-course junctions. Various species of endangered fish migrate through these junctions via different routes; rigorous quantitative evidence is available through numerous studies of the migration patterns of Chinook salmon (e.g., Perry et al. 2010;Newman and Brandes 2010;Buchanan et al. 2013;Perry et al. 2013). These include the bifurcation of the Old River from the San Joaquin River near Mossdale (Brandes and McLain 2000), 4 VOLUME 16, ISSUE 2,ARTICLE 6 and the bifurcations of the Sutter, Steamboat, and Georgiana sloughs from the Sacramento River (Perry et al. 2013). Moreover, the phasing of the tides between the Sacramento River and the San Joaquin River at Threemile Slough has implications for the local transfer of sediments between the two rivers (Wright and Schoellhammer 2005), as well as systemscale scalar transport (Sridharan 2015). This relative phasing of the tides also governs the tide-phasedependent migration patterns of the endangered Delta Smelt (Bennett and Burau 2015).
Water is managed within the Delta through (1) the Delta Cross Channel (DCC), which is opened to supply freshwater from the Sacramento River to the South Delta; (2) the Central Valley Project (CVP) and State Water Project (SWP) export pumps in the South Delta, and, at much lower levels, the Contra Costa Water District (CCWD c2018) and North Bay Aqueduct (SCWA c2018) exports, which supply drinking water to parts of central and northern California; and (3) a gate system, which is operated to prevent salinity intrusion into Suisun Marsh (Monsen 2000) ( Figure 2). Several temporary barriers are also erected at various locations in the Delta to aid salmon passage (at the head of Old River) and to maintain water levels to facilitate agricultural diversions (Monsen 2000;Kimmerer 2004;Monsen et al. 2007;Kimmerer 2008) (Figure 2). In addition, there are nearly 2,000 uncharted private water withdrawals, which are not well quantified ( Figure 2). These operations significantly alter the flow patterns in the Delta.

Delta Hydrology, Water Quality, and Operations
In this study, we simulated the hydrodynamics and water quality in the Delta between October 1, 2008 and October 1, 2015: a period spanning the relatively wet years of 2010 and 2011, and the drought years of 2013 to 2015 (CDWR c2017) ( Figure 3). During this period, high-resolution tidal water-level data was available at Martinez. In addition, the calibration window of the most recent version of DSM2 overlaps this period (CDWR c2014). We chose the duration of the data sets so as to include seasonal variations between the low inflow summer and autumn months and the high inflow winter months, as well as interannual variability in the hydrology.
In these 7 years, the flow in the Sacramento River compared to that in the San Joaquin River was nearly three to seven times higher in the wet years, and nearly ten to fifteen times higher in the critically dry years. The total annual inflow ranged between 6.7 and 29.6 million acre feet (maf) with periodic high-flow events in the winter months. The flow was minimal in Yolo Bypass except in April 2011 (CDEC c2016) ( Figure 3A).
The mean water level at Martinez above the North American Vertical Datum of 1988 (NAVD88; see Figure 2 The DSM2 grid. For clarity, in this map we show 1-D DSM2 channels that follow the true centerlines of the flow courses. We indicate the Carquinez Strait extension by a heavy line. The actual channels in DSM2 are straight links between nodes with a sinuosity factor to account for the true lengths of the flow courses (BDO c2002). The agricultural diversions and return flows into the system are distributed over all the DSM2 nodes.
6 VOLUME 16, ISSUE 2, ARTICLE 6 NGS c2004) was 0.33 m, and the daily tidal range fluctuated between 1 m and 2.3 m (CDEC c2016) ( Figure 3B). Both the mean water level and tidal range had relatively stationary statistics during this period.
The pumps exported about 20-35% of the total inflow during this period, with typically higher exports in the wet years than in the dry years. Export flows generally peaked during summer to late autumn (CDEC c2016) ( Figure 3C). The export flows strongly influenced the net subtidal flow in the OMR corridor-known as the OMR flow-and even reversed the prevailing flow direction when they exceed the river flow (Kimmerer 2011).
To aid salmon passage during this period, the DCC was typically closed intermittently in the summers and late autumns, and every December between 2009 and 2015 (USBR c2018). The salmon passage barrier In (A) and (C), we demarcate water years and total annual inflow or exports as critical, dry, below normal, above normal, and wet (dark to light). For clarity, in (D) we slightly shifted each operation policy vertically. In (E), we indicate the electrical conductivities in the Sacramento and San Joaquin rivers on the left axis, and the electrical conductivity at Martinez on the right axis. In (G), we indicate NDOI on the right axis, and X 2 on the left axis. We show NDOI ˜ and NDOI in 2014 with the moon phases, i.e, full moon (filled circle), waning (left filled circle), new moon (circle), or waxing (right filled circle) in the inset. https://doi.org/10.15447/sfews.2018v16iss2art6 at the head of Old River was typically closed in the spring and autumn of each year, except in 2009, 2010, and 2013, when experiments were conducted with non-physical barriers such as strobe lights, noise-makers, and bubble curtains to discourage salmon in the San Joaquin River from swimming into the Old River (BDO c2018). Temporary barriers to aid agricultural water withdrawals were typically installed in the summers and late autumns every year (BDO c2018). The salinity control gates in Montezuma Slough were typically operated in autumn and winter every year (CDWR c2018) ( Figure 3D).
The mean daily electrical conductivity in the South Delta was about 555 μScm −1 , and peaked periodically to about 1,300 μScm −1 between late autumn and spring each year during this period. In the North Delta, it was lower (165 μScm −1 ) and less variable. The oceanic electrical conductivity at Martinez ranged between 1,150 μScm −1 and 32,000 μScm −1 with a mean of 20,700 μScm −1 . There was a significant dip in Delta salinity coincidental with the extreme high flow event in February and March of 2011 (CDEC c2016) ( Figure 3E).
The east-west wind patterns at Pittsburg ( Figure 1) are representative of atmospheric conditions in the whole Delta (Monismith 2016). Generally, easterly winds prevailed in the winter months, and westerly winds prevailed the rest of the year. On average, the easterlies were about one-fifth the strength of the westerlies. These westerlies typically sustained for longer than the easterlies ( Figure 3F).
The total observed and predicted subtidal outflow from the Delta, using the CDWR's water balance analysis, DAYFLOW (CDWR c2016), is called the Net Delta Outflow Index (NDOI). This water account uses the gauged flow in rivers upstream of the Delta, estimates of minor ungauged flows, flows at the pumping facilities in the southern Delta, and within-Delta precipitation measurements and consumptive use estimates to arrive at the at Chipps Island (Monismith et al. 2002). The NDOI closely follows trends in Delta inflow ( Figure 3G). X 2 , the distance from the ocean to the location where the salinity is 2 psu, is a measure of salinity intrusion, and varies inversely to the NDOI between downstream of Carquinez Strait (<60 km), and upstream of Rio Vista (>90 km) ( Figure 3G).

HYDRODYNAMIC MODELING IN THE DELTA
Estuaries are surface water systems in which complex physical mechanisms interact with one another.
In this section, we briefly survey the literature on the challenges of modeling a system such as the Delta and the typical modeling effort expended. We particularly focus on processes that can potentially be incorporated into DSM2. Kimmerer (2004) provides a more thorough review of the hydrodynamic processes occurring in the Bay-Delta system.

Physical Processes Involving Tides in Estuaries
Tidal effects in estuaries must be modeled accurately to reproduce the propagation (e.g., Buschman et al. 2010) and dissipation/intensification (e.g., Zhong and Li 2006) of the harmonic components of the tides, and to recover the spatial patterns of the tidal energy dynamics (e.g., Kowalik and Proshutinsky 1993). These patterns affect various transport and mixing processes such as tidal excursion (Fischer et al. 1979), tidal pumping (Becherer et al. 2016), trapping (Smith and Stoner 1993;Sassi et al. 2011), and the tidal random walk (Ridderinkhof and Zimmerman 1992) (see Appendix A for a review of these processes).
In the Delta, in addition to the transport and mixing processes described in Appendix A, subtidal winddriven flow and gravitation circulation (salinity gradient-driven flow), which adjust the river flow (henceforth referred to as subtidal adjustment flows), are also important (see Appendix A).
The water surface set-up caused by wind shear in the western Delta is important in the fortnightly oscillatory exchange flow in the Delta (Monismith 2016). The spring-neap water level change within the Delta, in part, drives this flow (the moon-phases in the inset in Figure 3G indicate that spring tides [full and new moons] correspond roughly to the filling of the Delta while the neap tides [quarter moons] correspond roughly to its emptying). In addition, the water surface set-up by the wind (Monismith 2016), and the response of the water surface to the coastal ocean also, in part, drive this flow. We approximate 8 VOLUME 16, ISSUE 2,ARTICLE 6 this as a daily exchange flow based on Monismith's (2016) analysis (see page 3 of Appendix A) as, Froude number, which is the ratio of the turbulent kinetic energy supplied by the wind and the potential energy of the wind-driven surface set-up, computed at times i + 1 and i − 1, in which <H> is the subtidal water column depth of about 12 m (Monismith 2016), ρ 0 =1,000 kgm −3 and ρ A = 1 kgm −3 are, respectively, the background density of water and air, C D,W ≈ 1.3 × 10 −3 is the drag coefficient at the airwater interface (Fischer et al. 1979), U W is the instantaneous wind speed 10 m above the water surface at Pittsburg (Figure 1), A SB ≈ 2 × 10 8 m 2 and L SB ≈ 20 km are, respectively, the surface area and length of Suisun Bay (Monismith 2016), and is the number of time-steps of duration Δt in 1 day. Written in this form, Equation 2 simply states that the wind induces an exchange flow, which is just the volume of water dragged by the wind with a body force. This force is amplified because virtually all the work done by the wind is used to move the water rather than to mix the air into the water. After substitution of the numerical values, Equation 2 becomes in which the coefficient has units of m 3 . With the time-series of U W at Pittsburg, we can estimate this flow.
The net subtidal flow out of the Delta induced by gravitational circulation is of regulatory interest, because an estimate of this flow may prove useful in controlling X 2 (Monismith et al. 2002). Moreover, although Monismith (2016) showed that the effect of gravitational circulation on the subtidal adjustment flow was small, his analysis was conducted for relatively dry months during which the Delta could be reasonably expected to be well-mixed (Geyer and MacCready 2014; s to Equation A-2.7 on page 4 of Appendix A). In wet months, however, the Delta is likely to encounter strong periodic stratification (see to Equation A-2.8 on page 4 of Appendix A). A simple, but effective estimate for the salinity-induced subtidal flow is the fraction of the subtidal flow that a transient response in the longitudinal salinity field induces, i.e., The caissons (crossed circles) close automatically when the water column depth in the shoals falls below a nominal threshold, which simulates wetting and drying.

AUGUST 2018
https://doi.org/10.15447/sfews.2018v16iss2art6 (4) where Q S is the gravitational circulation-induced flow, t is the current day, and we have applied the expressions for Q(t) and Q R (t) by rearranging the steady-state version of the auto-regressive X 2 model [Equation 9 in Monismith et al. (2002)]. We argue here that the gravitational circulationinduced flow would account for the difference between the flow that changes the salinity gradient, Q(t), and the mean river flow, Q R (t). In Equation 4, we can obtain the daily value of X 2 from the DAYFLOW program based on the autoregressive lag model of Jassby et al. (1995), a model similar to that of Monismith et al. (2002).
In addition to these flows, flood and ebb variations in water column depth and velocity produce a net subtidal landward flux of water known as Stokes' drift (Longuet-Higgins 1969). Channel junctions may asymmetrically distribute the flow from Stokes' drift. Stokes' drift also modifies the river flow in open channels and flow splits at junctions (Sassi et al. 2011). These effects cause tide-induced variation in the subtidal flow between the ebb and flood phases of the tide. With the assumption that the channel width is constant, we can write the Stokes' drift as (Monsen 2000;Sridharan 2015) (see "Derivations of Subtidal Adjustment Process Expressions" on page 3 in Appendix A), where, Q SD is the compensatory oceanward subtidal flow from Stokes' drift transport, is the cross-sectional area for a water surface fluctuation η′ about its tidally averaged value, W is the width of the channel, u′ η is the subtidal average of the tidal velocity at the water surface, ⟨ -U⟩ is the tidally-and cross-sectionally-averaged streamwise velocity, and -U′ is the tidal fluctuation about ⟨ -U⟩.
The non-linear interaction between tidal straining (or the difference in vertical shear between the flood and ebb phases of the tide) and subtidal adjustment causes periodic stratification and de-stratification, which can change the turbulent cross-sectional mixing between the ebb and flood phases of the tide by several orders of magnitude (Simpson et al. 1990). We discuss the case of the partially-mixed estuary, because it involves the most complicated interaction of physical processes, and also because a well-mixed estuary is subsumed within it, while a stratified salt-wedge estuary suppresses turbulent mixing. In Figures 4A and 4B, fresh river water and ocean salinity interact to establish a prevailing longitudinal salinity gradient. In Figure 4A, during the flood phase of the tide, saline ocean water flows through the channel much faster than on the shoals, and establishes a lateral salinity gradient and a flow phase difference. The flow of fresh river water over saline ocean water establishes a vertical shear. The lateral density gradient between the channel and the shoals drives a secondary flow toward the shoals at the bottom, which is balanced by a return flow of freshwater from the shoals into the channel near the water surface (Ralston and Stacey 2005).
In Figure 4B, during the ebb phase of the tide, the water column becomes strongly stratified and lateral mixing is suppressed, and water flows out of the system with an almost uniform vertical velocity profile and high frictional retardation in the shoals (Ralston and Stacey 2005). Moreover, if the ebb tide is sufficiently energetic, water surface drawdown can cause the shoals to dry during the ebb phase (Gourgue et al. 2009).
Modeling the tidal dynamics accurately is a precursor to reproducing the transport and mixing mechanisms discussed above and in Appendix A (e.g., Geyer et al. 2000), and the movement and fate of physical and biological scalars such as sediments and fish (e.g., Smith and Stoner 1993). We discuss the literature regarding various modeling approaches to resolve the tidal effects in estuaries below.
Predominantly, models such as DSM2 solve the crosssectionally-averaged Saint-Vénant equations (or shallow-water wave equations), to estimate the flows and water column depths (e.g.,  in estuarine systems (e.g., MIKE11 [Patro et al. 2009] or HECRAS [Brunner 1995]), before coupling the hydrodynamic results to other transport models (e.g., Mike Eco lab [Ma et al. 2011]  One-dimensional models, in general, accurately capture the streamwise propagation of the characteristics of tidal waves, i.e., the landward and oceanward propagation of information resulting from a disturbance in the water level (Kundu and Cohen 2002). However, although 1-D models can resolve the cross-sectionally-averaged tidal pumping, they cannot resolve the wind-driven circulation (Smith 1987) and gravitational circulation.
Another class of 1-D models resolves the vertical dimension while averaging over the lateral and streamwise directions (e.g., MacCready 2004; Ralston et al. 2008). This is only applicable near the mouths of estuaries where gravitational circulation is the dominant hydrodynamic process. We have only included references to such models here for completeness.
The attractiveness of 1-D models is that they have significantly shorter run times than do multidimensional models, and so entire-system scale simulations over very long periods (multi-year) are possible (e.g., Sridharan et al. 2018). However, these models sacrifice hydrodynamic fidelity for speed (Fleenor et al. 2010). This trade-off is acceptable so long as either the critical physical processes being studied are incorporated (e.g., the dead zone transport and mixing model described in Atkinson and Davis [2000]), or if the results are interpreted within the limitations of the model. Two-dimensional and 3-D models are the preferred choice when more complex sub-daily physical processes have to be represented.

Two-and Three-Dimensional Modeling
When more complex physical processes (see Section 3.1 and Appendix A) have to be represented, or the spatial variability in hydrodynamic fields is to be resolved (see the guide to model selection by Robson et al. 2008), 2-D and 3-D Saint-Vénant equation solvers are invoked. These are used primarily as research tools, or as high-resolution numerical laboratories (e.g., Nam 2018, unreferenced, see "Notes"), because their deployment for long-period studies is extremely resource-intensive.
Efforts to resolve the tidal effects using Saint-Vénant equation solvers have focused on either 3-D finite element (e.g., Dawson et al. 2006) or finite difference/ volume models (e.g., Louaked and Hanich 1998).
Of these, finite element models are theoretically superior in resolving different harmonic components for variable bathymetries and arbitrary domain shapes (Grotkop 1973). Because of the conflicting requirements of modeling different harmonic components, they are often modeled one by one (e.g., Walters et al. 2013). However, the interactions between the components cannot be captured with this approach; therefore, such results would not be very useful from a management perspective. Alternately, accurate resolution of the total tide involves additional effort such as solving the wave continuity equation (Kolar et al. 1994).
Successfully modeling tidal effects requires accurate numerical solution techniques and robust calibration processes in 3-D models. Typically, previous studies have utilized high-resolution spatial advection schemes such as total variance diminishing (e.g., Louaked and Hanich 1998), and more sophisticated spatial and temporal interpolations that are consistent with the advection scheme used (e.g., Wolfram and Fringer 2013 As a comparatively inexpensive alternative to 3-D models, planar 2-D models such as HECRAS 5.0 (Brunner 2016), RMA, RMA2 (RMA 2013), TELEMAC-2D (Galland et al. 1991), FaSTMECH (Nelson et al. 2003), and SRH-2D (Lai 2009) are often used to model estuarine hydrodynamics. Such models rely on the relative unimportance of vertical mixing of momentum compared with its horizontal balance. The philosophy underlying these models is that the geographic resolution of hydraulic features, rather than the representation of complex physical processes, is sufficient to explain most of the observed spatial and temporal variations in the hydrodynamic quantities. These models, consequently, do not fully represent the effects of baroclinic circulations, such as shoalchannel lateral exchange flows in which the isobars and isopycnals are not aligned (MacCready and Geyer 2010). They instead focus on solving the 2-D Saint-Vénant equations with finely-resolved domain bathymetry and bottom friction, finite element (e.g., TELEMAC-2D, FaSTMECH, and RMA2), finite difference (SRH-2D), or finite volume (e.g., HECRAS 5.0) approaches.
In the Delta, apart from DSM2, both 2-D and 3-D modeling approaches are routinely adopted. Twodimensional representations include RMA and RMA2 (RMA 2013). Three-dimensional representations include the TRIM structured grid (Gross et al. 1999) and

MATERIALS AND METHODS
We

Model
DSM2-Hydro, described in detail by Anderson and Mierzwa (2002) (see also Sridharan 2015) solves the cross-sectionally-averaged Saint-Vénant equations (Sobey 2001) on a network of 1-D channels, each with cross-sectional area A, hydraulic radius R H , quadratic bottom friction represented by a Manning's n and conveyance K, flow Q, water level η, and lateral seepage and drainage flows q, as (Kimmerer et al. 2008) Equation 6 is solved on the FourPt stencil with two adjacent spatial nodes and two time-steps ). The numerical scheme allows a range of settings from first-order upwinding to second-order Simulations are performed centered in space and with slightly more diffusive time-stepping than Crank-Nicolson for stability (Anderson and Mierzwa 2002).
In the DSM2 grid, the waterways of the Delta are represented as 1-D channels, and grid cell ends as well as river junctions as nodes ( Figure 2). The grid also includes water management structures (Table 1) (Anderson and Mierzwa 2002). Cross-sections at each node are represented by subgrid bathymetrycoarse grid cells that have the same cross-sectional area and hydraulic properties as the fine-scale bathymetry (Casulli 2009). These are implemented as elevation-width-wetted perimeter-area look-up tables (Anderson and Mierzwa 2002). Flooded islands are treated as continuously stirred tank reactors or "reservoirs." Boundary flows are generated using DAYFLOW (Anderson and Mierzwa 2002), and CDWR supplies time-series of gate operations (CDWR c2014). Lateral seepage and drainage flows at several nodes (Kadir 2006) are also incorporated in DSM2 as mass balances on a monthly time-scale.
DSM2-Qual (hereafter, Qual), described in detail by Liu and Ateljevich (2011), is typically used to model electrical conductivity as a proxy for salinity in the Delta. The ionic composition of water changes from sulfates of leached metals from soils caused by agricultural runoff in the south and central Delta, to light metal halides and bicarbonates in the North Delta, to chlorides and bromides from oceanic influence in the western Delta (Harader et al. 2007). As a result, salinity itself is a non-conservative tracer in the Delta, which makes its direct computation challenging.
Qual extends the Branched Lagrangian Transport Model (BLTM) of Jobson and Schoellhammer (1993), a 1-D finite volume water quality model in which a parcel of water with a specified concentration of a water quality constituent enters a channel and occupies a reach of length Q A t (set by the time-step size, flow rate, and channel cross-section area). This parcel is then followed over time as the constituent within it is mixed with its upstream and downstream neighboring parcels. The extension in Qual involves reformulating the original discretization scheme for the flux of constituents between parcels. The original scheme specified flux in terms of the concentration difference between neighboring parcels, which introduced spurious numerical diffusion in long parcels and reduced mixing in short parcels. The new discretization scheme specifies flux in terms of the gradients of the constituents, thereby altogether eliminating the dependence of mixing on the parcel length. The advection-dispersion equation being solved is (Liu and Ateljevich 2011) in which l 1 -called the "dispersion coefficient" for a given channel in DSM2-is actually the integral mixing length scale (Fischer et al. 1979), i.e., based on Fischer's approximation for the shear flow dispersion coefficient and Manning's formula for the mean river velocity (Liu and Ateljevich 2011).
A third model, DSM2-PTM, included in the DSM2 suite, is intended to stochastically simulate the movement of passive particles and aquatic organisms such as fish, fish eggs, and larvae within the Delta (Hutton 1995 We note that as we prepare this manuscript, CDWR is extending DSM2 to include the General Transport Model (GTM). CDWR proposes to incorporate realtime coupling between the momentum and transport equations in the GTM (Hsu et al. 2016). This has the potential to incorporate cross-sectionally-averaged effects of subtidal adjustment into the hydrodynamic solution as well. We also note that CDWR is currently improving its model of lateral seepage and drainage flows through the Delta Channel Depletion Model (DCDM) to include hyporheic flows with electrical conductivities that are associated with agricultural runoff, as well as to provide daily estimates of lateral sources and sinks into DSM2's models (Liang 2018, unreferenced, see "Notes"). The implementation of this model is also expected to improve DSM2's fidelity.

Calibration and Validation Status
DSM2 has been extensively calibrated and tested over 2 decades (CDWR c2011). It has been tested with a suite of simple conceptual set-ups (Zhou 2013) for model consistency. In these tests, Hydro and Qual have recovered expected solutions.
DSM2 has also undergone regular recalibration to reflect the current hydrologic and geomorphological state of the Delta (CDWR c2011). These efforts have typically involved (1)  In version 8.1.2, the formulation of the subgrid hydraulic look-up table and Hydro's standard usage of the spatial quadrature were modified to avoid spurious water level changes resulting from the incorporation of local bathymetric features into the channel bottom slopes . In addition, the definition of l I was updated in Qual (Liu and Ateljevich 2011). These model changes necessitated the most recent recalibration effort, which was undertaken between 2009 and 2011   CH2M Hill 2009), and rescaled the mixing length scale for all channels by a factor of 1425 . We briefly discuss the validation results in CDWR (c2014) subsequently.
Generally, DSM2 predicted water levels very accurately, with R 2 s in excess of 0.95 and phase lags between the model results and the observations of less than 20 minutes almost everywhere in the Delta. The longest phase lags of about 1 hour occurred in Sutter and Steamboat sloughs. The primary shortcoming was that the low-low water levels throughout the Delta were under-predicted.
In the case of flows, the model's primary shortcoming was predicting smaller peak floods. In the San Joaquin River and the south and central Delta, R 2 s typically worsened to between 0.3 and 0.9, and phase lags increased to between 20 minutes and 80 minutes. The primary shortcomings in these locations were under-prediction of peak floods, amplitude damping, and incorrect phase predictions.
In general, DSM2 predicted electrical conductivities reasonably well, with typical R 2 s of about 0.8 to 0.9. The primary mismatch was that the model missed high-salinity events associated with low river inflows.
CDWR achieved an optimal trade-off among accuracy, storage requirements, and run time for Hydro with the standard grid with a time-step of 15 minutes . The difference in accuracy between model runs with time-steps of 15 minutes and 5 minutes was typically about 1-2% . Similarly, for Qual, the difference in overall accuracy between model runs with timesteps of 15 minutes, 5 minutes, 3 minutes, and 1 minute was typically less than 1% (Liu and Sandhu 2014). For these reasons, the standard time-step size prescribed for Qual is also 15 minutes (Liu and Sandhu 2014).

Description of Experiments
We implemented several proposals both to improve model representation, and to incorporate some of the unmodeled physical processes. In the former category, we increased the spatial resolution of the model grid, increased the temporal resolution of the tidal boundary conditions, and reduced the time-step size so as to adhere to CFL and R values of about 1. It is also important to extend the model domain to allow for the radiation of the oceanwardpropagating tidal wave characteristic, as well as to improve the modeled water level response to oceanic water level variability. However, because complex 2-D circulation processes occur in the Bay, it is very difficult to incorporate an extension up to the Golden Gate ( Figure 1) without extensive calibration. We therefore extended the domain to include Carquinez Strait (thick line in Figure 2). In the latter category, we parameterized shoal-channel interactions and wetting and drying processes to capture important unmodeled physical processes. We describe these configurations below.
To evaluate the model's performance for typically used configurations and to evaluate our proposals, we ran Hydro with multiple time-step sizes and grid resolutions, and with tidal boundary conditions of varying temporal resolution (Table 2) (V.K. Sridharan, unpublished data, see "Notes"). We performed all the analysis using MATLAB (The MathWorks, Inc. 2016).
We first represented the typical configurations with which DSM2 is used in three runs (model configurations G1T*B1). We forced these runs with the California Data Exchange Center (CDEC) data at Martinez (CDEC c2016). Of these, we only ran the G1T15B1 configuration over the water years between 2009 and 2015 to analyze DSM2's response to interannual hydrologic variation. We ran the two other configurations for the water years between 2013 and 2015 to compare the results with those obtained using fine-resolution boundary conditions. In setting the duration of these runs, we were restricted by the availability of high-resolution data at Martinez (see "Data Processing" later in the text). In the next three runs (model configurations G*T5B*), we evaluated the model for several grid sizes and boundary condition resolutions. We forced DSM2 with a fine temporal resolution tidal boundary using National Buoy Data Center's Amorco Pier tide gauge buoy data at Martinez, available from the National Oceanic and Atmospheric Administration-Tides and Currents (NOAA-TC) database (NOAA-TC c2013). The buoy data resolved the energy transfers to the higher than semi-diurnal frequencies significantly better than the CDEC data ( Figure 5). We ran these configurations for the water years between 2013 and 2015 as well.
We varied the grid resolution from the standard DSM2 grid by adding one new piecewise trapezoidal cross-section at the midpoint between existing crosssections in each channel (Figure 6), provided the new cross-sections were at least 762 m (2,500 ft) away from either existing cross-section (see "Prescriptions for Model Recommendations" on page 5 of Appendix A). Using these new cross-sections, we generated a grid that was twice as fine as the original grid. We interpolated the hydraulic properties of The energy content of various tidal harmonics (equivalent to the contribution of that harmonic to the tide height) is much better resolved in the NOAA data set than in the CDEC data set. In each plot, the arrows indicate the prevailing direction of energy transfer between the tidal harmonics. In each case, we generated the spectrogram with 64 frequencies and a 15-day moving Hamming window (parameters α = 0.54, β = 0.46) with a 50-hour overlap.  the new sections linearly from the existing sections, proceeding from the highest elevation downward, with the number of elevations in the new crosssection equal to the number of elevations in its downstream cross-section ( Figure 6). This method of refining the grid is the best practice possible, because DSM2 has been calibrated such that model cross-sections often do not match true bathymetry (Smith and Enright 1995). Furthermore,  applied a similar interpolation scheme to insert new computational sections within Hydro, so that the method adopted here of resolving the grid is consistent. In addition to these runs, we performed several simulations-described subsequently-to explore some of our proposed model improvements (see Section 6.2).
We also performed two runs in which we extended the grid to include the Carquinez Strait (Figure 7; see "Prescriptions for Model Recommendations" on page 5 of Appendix A) as a surrogate for San Francisco Bay (model configuration G*T*BB). For these runs, we used the Romburg Tiburon Center (RTC) buoy data in Carquinez Strait (RTC c2017) to force the model in the water years between 2010 and 2015. In these runs, we did not rigorously calibrate the hydraulic parameters of the channel that represented Carquinez Strait; such an effort was well beyond the scope of this work. However, we did account implicitly in a spatially-averaged sense for the natural sill in Carquinez Strait; the cross-section c− c′ is 4.87 m deeper than the cross-section b − b′ in Figure 7, though the actual elevation differential at this sill is about 7 m (Schoellhamer 2001). We also did not run Qual with these configurations, because we did not expect to get useful water quality results with an uncalibrated model. Here, we demonstrate only the value of the approach. We then performed two "toy" runs of 3 months' duration in which we simulated an artificial M 2 tide forcing a shoal-channel system (model configurations GtT*BtSC*). The model grid for these runs was a simplified representation of the TRIM-3D circulation model of the Central Bay in Ralston and Stacey (2005) comprising a deep main channel and two connecting, wide, shallow side channels (Figures 4C and 4D; see "Prescriptions for Model Recommendations" on page 5 in Appendix A). This approach has been adopted earlier in HEC-RAS by prescribing wide shoal channels with large Manning's n ∼ 0 [(10 -100) × n Main channel ] corresponding to a floodplain parallel to the main channel (Brunner 2003). This implementation also necessitated incorporating a wetting and drying scheme for the shoals. A wetting and drying scheme in the Suisun Bay region of the grid would allow for better representation of the retardation of tides by friction. In reaches of a channel where the crosssectional area fluctuates markedly, DSM2 rather unrealistically restricts these fluctuations. Instead, wetting and drying can add realism here, too. However, while a wetting and drying scheme such as the one adopted by Chua and Fringer (2011) would require a significant rewrite of the code, here we proposed a very simple alternative (see "Prescriptions for Model Recommendations" on page 5 in Appendix A). We operated a series of numerical caissons, or buoyant radial gates, in the shoals, which closed off the flow when the water column depth fell below a 3-cm threshold ( Figure 4C).
In each run, we began the simulation on June 1 at midnight, to allow the model to spin-up over a period of 4 months. We also note that because of a restriction on the minimum length of a parcel of water entering a channel in DSM2 (Liu and Ateljevich 2011), the maximum time-step size Qual could be run with is 15 minutes. Therefore, we ran Qual with a time-step size of max (Δt Hydro ,15 minutes).

Data Processing
To force the model, as well as to assess its fidelity, we obtained hourly flow, water level, and electrical conductivity data at different locations within the Delta and Carquinez Strait from the CDEC website (CDEC c2016). We compared model results with field observations of flow at 46 locations (and one aggregate subtidal flow record called "OMR"); water levels at 60 locations; and electrical conductivity at 77 locations ( Figure 1, Table 3, and Tables A-4.1 through A-4.12 in Appendix A). We used the MATLAB-based KATANA data clean-up toolbox (Sridharan 2018, unreferenced, see "Notes") to quality control the data (see Appendix A). Because we wanted to investigate the effect of the temporal resolution of the tidal boundary condition on fidelity in many of these runs, we utilized both the coarse-and fine-resolution data sets described previously. In all cases, we used 15-minute tidal water levels interpolated with a cubic spline from the original data. Although the CDEC and RTC data sets were available for all water years between 2009 and 2015, the NOAA-TC data was available only from May 1, 2013.
To align the NOAA-TC data set with the CDEC data set, we advanced the NOAA-TC data by 0.086 hours and removed the difference in mean water levels between the CDEC data and the NOAA-TC data from the NOAA-TC data. To consistently align the RTC data set with the CDEC data set, we advanced the RTC data by 0.46 hours and removed the height of the tide gauge above the NAVD88 datum from the RTC buoy data.
To estimate the wind-driven circulation, we utilized quality-controlled wind speed and direction data available from between June 15, 2011 and October 1, 2015 at Pittsburg from the San Francisco Physical Oceanographic Real-Time System (NOAA c2013) at station 9415115 following Monismith (2016). To estimate the gravitational circulation, we obtained DAYFLOW estimates of the NDOI and X2 location (CDWR c2016).

Performance Analysis
We subsequently present an analysis of the following model results: (1) the instantaneous water column depths and flows and the daily averaged electrical conductivities; (2) the tidal effects including the subtidal flows, the harmonic components of the tidal water levels and flows, and the modulations of the subtidal flows by the tidal harmonics; and (3) a comparison of regulatory flow parameters such as the DSM2 estimate of NDOI with the DAYFLOW estimate modified by the subtidal adjustment, (see Appendix A), and the subtidal OMR flow, where Q OH4 and Q MDM are, respectively, the flows in the Old River at Highway 4 and in the Middle River. We analyze these results for all but the shoal-channel model configurations in Section 5.
The results of the shoal-channel process models are primarily proofs-of-concepts, and we discuss these in Section 6.
To study the tidal effects, we performed Short Term Harmonic Analysis (STHA) on the observations and model results using the package T_TIDE (Pawlowicz et al. 2002). STHA (Kukulka and Jay 2003) can account for the non-stationary propagation of tides in rivers that result from variable freshwater flows (Matte et al. 2013). Here, we partitioned the data sets into 3-month-long boxcar windows and performed STHA on the smaller time-series as recommended by Guo et al. (2015). We only analyzed a subset of the principal components of the tides in north San Francisco Bay with periods shorter than 1 fortnight, which were decomposed by T_TIDE with a signal to noise ratio ≥2 (Pawlowicz et al. 2002). Of these, we only report the K 1 (23.9-h period), M 2 (12.4-h period), MK 3 (8.2-h period), and M 4 (6.2-h period) components, because the other components behave similarly. We estimated the subtidal hydrodynamic quantities by low-pass filtering the total quantities with a 4 th order Butterworth filter with a cut-off period of 50 hours (using zero-phase filtering with MATLAB's filtfilt tool). We also report the mean difference in tidal range between the fortnightly spring and neap tides (14.7-day period) by computing the mean of the difference between the peaks and troughs of the range of the envelope of the tidal signal.
Quantifying model accuracy with standard aggregate metrics is challenging, particularly when observations of different quantities are distributed non-uniformly in space (Ganju et al. 2016). As an alternative, we developed two simple aggregate metrics based on the cumulative RMS error (Chua and Fringer 2011) in modeling flow and water level at multiple locations ( Figure 8A): the aggregate model error, ε Σ , and the aggregate skill score, MSS Σ as (see Appendix A), where, for a model configuration, p , we weight the contributions to the cumulative squared error, ε 2 φ,p , in φ by a location, i, by the reciprocal of the number of locations at which observations are correlated with those at this location, N φ,i , and then sum these over all locations where such measurements of φ are made. We finally divide the overall weighted RMS error by the overall weighted RMS value of the quantity φ, and aggregate the RMS errors over both the water column depth and flow. We compute the MSS Σ,p in a similar manner, but here we divide the overall weighted mean squared error by the overall weighted mean squared potential error, PE φ,p , and equipartition the skill between the water column depth and the flow. For simplicity, we chose N φ,i simply as the number of stations in each region specified in Table 3. An alternative for a more rigorous specification of N φ,i is to weigh the contribution of each location by the number of locations within the range of the semivariogram for the river-course within which that location is found (Curran 1988).
While such aggregate metrics provide an overview of modeling accuracy, analysis of the local RMS error at each location can reveal insights about the model's strengths and weaknesses, as well as the limits of possible improvements to the model. The cumulative RMS error at a location is where φ M and φ O are, respectively, the modeled and observed quantities at time i.
A close to ideal model would capture all the important physical processes, and hence ε φ should arise largely from the observation precision. For such a model, ε φ in the water column depth and flow should be independently and randomly distributed. To understand the nature of the error for different model configurations in greater detail, we decomposed ε φ in Equation 13 at each location into a systematic component and an unsystematic component (see Section 1 and "Prescriptions for Comparison with Data and Inter-Model Comparison" on page 16 of Appendix A) (Willmott 1981).
We also utilized various metrics to quantify the quality of the spatial and temporal hydrodynamic landscapes predicted by the different model configurations. These include the coefficient of determination, R 2 , difference maps of the hydrodynamic and water-quality fields between the models and the observations, and succinct target diagrams (Jolliff et al. 2009). We used this multi- pronged approach to analyze different aspects of the model results.
The R 2 of the observations and the model results shifted appropriately by the lead or lag between the two data sets (RMA 2005) measures the similarity in the means and variances of the two data sets. Target diagrams provide a comparison between two related metrics (Jolliff et al. 2009). We applied target diagrams to represent the unbiased RMS difference (UBRMSD) and the bias (b) (Jolliff et al. 2009), between the observed and modeled instantaneous flow. The UBRMSD and b (Equation A-5.9 in Appendix A) are, respectively, the variances of the modeled and observed time-series and the biasedness of the model predictions (MacWilliams et al. 2015).
In these diagrams, we represent on the abscissa and on the ordinate, and include circles of increasing radii centered at the origin that indicate poorer model fits (MacWilliams et al. 2015). In addition to these diagrams, the target diagrams we developed (see "Prescriptions for Comparison with Data and Inter-Model Comparison" on page 16 in Appendix A) provide a comparison between the amplitudes and phase differences of the modeled (respectively, ζ M and ψ M ) and observed (respectively, ζ O and ψ O ) harmonic components (MacWilliams et al. 2015).
In these diagrams, we represent the fraction of amplitude difference of the observed amplitude, Because model accuracy can only be evaluated up to the precision in the data, we ensured that all model results that fell within the observational precision (see "Data Sources and Quality Control" on page 8 in Appendix A) were deemed to have accurately represented the true hydrodynamics. We discuss the results of our analyses below.

RESULTS
In this section, we first report on the overall fidelity of DSM2, and then its prediction of the subtidal hydrodynamics and tidal effects. Last, we analyze its prediction of the Delta's key regulatory parameters. We present only the results of the G1T60B1, G1T15B1, G2T5B2, and G2T5BB model configurations here and in Section 6. We present comprehensive results in "Comprehensive Model Evaluation" on page 18 in Appendix A. Since there is no stochasticity in the DSM2 results beyond machine precision (see Sridharan 2015), we cannot discuss the statistical significance of our results. To eliminate any skew because of the hydrologic conditions of the Delta, we used only the model results in the water years between 2013 and 2015 when reporting overall performance for all model configurations.

Overall Model Fidelity
The overall fidelity was best in model runs with the standard configurations (G1T60B1 and G1T15B1). The aggregate error was smallest with the G1T60B1 configuration; the aggregate skill was highest with the G1T15B1 configuration ( Figure 8A). However, the results with the other model configurations without domain extension did not significantly differ from these two configurations. Model results with CDWR's recommended configuration, G1T15B1, had an aggregate error that was about 20% larger than those with the G1T60B1 configuration. A possible explanation is that model configurations generally produced the best overall results when R was close to 1. The aggregate skill typically improved by about 1% when the CFL condition approached 1, compared to larger CFL condition values ( Figure 8B). Although it is generally preferable to have both the CFL condition and R be O(1), the R criterion is the more important of the two. The model configurations with the extension to Carquinez Strait (G*T*BB) performed worst ( Figure 8A); more rigorous calibration is required in these cases. However, the extension did reduce the systematic error in the model from the reflection of oceanward wave characteristic. For all model configurations, at different locations within the Delta, we found correlations between the cumulative RMS errors in flow and water column depth that increased landward (Figures 9A-C and A-6.1). Through error decomposition analysis (Willmott 1981), we found that the systematic error in the flow was correlated inversely with that in the stage, and that this correlation decreased (regression lines in Figures 9D-E) landward (lighter colored points in Figure 9D-E) for all model configurations except the ones with the domain extensions ( Figure 9F). In these latter configurations, decreasing the energy of the spuriously reflected oceanward wave characteristic de-correlated these errors ( Figures 9F and A-6.1). As expected for a 1-D model, the unsystematic error ( Figure 9G-I) was comparable in magnitude to the systematic error ( Figures 9D-F and A-6.1). We discuss the implications of these issues in Sections 5.2 to 5.4 below.

Predictions of Instantaneous Hydrodynamics
To evaluate DSM2's results during various hydrologic and water operations scenarios, we analyzed model results in two scenarios: a wet period with high freshwater inflow (winter of [2014][2015], and a dry period with low freshwater inflow (summer of 2015). For clear visualization, in the scatter plots in Figures 10-12, we equipartitioned the data versus model results space into 100 squares, and plotted boxes colored by the mean instantaneous inflows into the system (light to dark representing low to high flows) in all the records of the time-series that occurred within each square, and sized by the number of records occurring within each square. The model typically achieved higher fidelity when the freshwater inflow was high than when it was low (darker points versus lighter points in the scatter plots in Figures  10 and 11), because of the lower importance of the more complex tidal signal relative to the comparably steady freshwater signal. In addition, as we expected, the model worked best when R ≈ 1.
DSM2 predicted the instantaneous water column depths and flows at various locations in the Delta reasonably accurately (R 2 ≈ 0.9), with the coarsest grid, time-step size, and water level boundary condition ( Figures 10, 11, A-6.5, and A-6.6). It performed very similarly with other model configurations as well, with a ~2% reduction in R 2 values between the observed and modeled water levels and flows in all configurations. This indicates that the subgrid cross-sectional geometries and Manning's n values in DSM2 have been very well calibrated for the standard configuration, and that any significant improvement would require finer streamwise resolution of the system hydraulics with significant recalibration.
DSM2 predicted daily averaged electrical conductivity with lower accuracy ( 0.3-0.5) than the water column depths and flows, particularly when the freshwater inflow was very small (Figures 12 and A-6.7). However, electrical conductivity measurements in the central Delta can themselves be quite error-prone, so a comparison between observed and modeled conductivity is limited by data uncertainty as well  (2018 in-person conversation between E. Ateljevich and V.K. Sridharan, unreferenced, see "Notes"). The daily averaged electrical conductivity fields in four different hydrologic and water operation scenarios-viz, spring 2011 with high freshwater inflow and low export:inflow ratio; summer 2011 with high freshwater inflow and high export:inflow ratio; winter 2015 with low freshwater inflow and high export:inflow ratio; and summer 2015 with low freshwater inflow and low export:inflow ratio ( Figure 3)-indicate that the Delta was relatively fresh during high inflow periods ( Figure 13A), and that the western Delta became more brackish during low inflow periods (Figures 13B-D). The entrainment of saline water by the export pumps increased electrical conductivity in the OMR corridor and South Delta as well during low inflow periods ( Figures 13C-D). All the model configurations produced almost identical daily averaged electrical conductivity fields ( Figure A-6.8). The deviation between the observed and modeled electrical conductivity field was nearly 20-50% during the low inflow periods, and this difference occurred predominantly in the West and South Delta regions ( Figures 13B-D).
Fidelity to hydrodynamics was generally better in the western Delta close to the water level boundary where the tidal signal dominates the total water level and flow (e.g., Antioch), or in the north and South Delta where the tidal signal attenuates (e.g., Freeport). Fidelity was low where there was strong coupling between the mean river flow and tidal signal, such as at Rio Vista, or where the export flows interacted with the tidal signal, such as in the OMR corridor (Figures 10, 11, and A-6.5 and A-6.6). As the tide propagated inland and dissipated, the lag between the modeled and observed water levels also increased from about 40 minutes to about 50 minutes in the Sacramento River, and from about 30 minutes to almost 90 minutes in the San Joaquin River (Figures 10 and A-6.5). These trends in the solution accuracy and lag resulted from the increasing numerical dispersion away from the tidal boundary condition (Sobey 2001). A slightly different trend existed in the prediction of flow, in which the 24 VOLUME 16, ISSUE 2,ARTICLE 6 lag between modeled and observed flows peaked away from the boundaries (Figures 11 and A-6.6). In the flow predictions, significant hysteresis occurred due to the poor prediction of the modulation of tidal amplitude by the river flow, which caused over-and under-predictions of the tidal signal in the interior Delta locations (represented by the elliptical annuli, or more scattered ellipses, respectively, in the correlation plots in Figures 11 and A-6.6). As we demonstrate in Section 5.3, this representation of tidal water level and flow with low fidelity is a direct consequence of the model's inaccurate representation of tidal effects.

Predictions of Tidal Effects
The primary purpose of DSM2 is to reproduce subtidal flow fields accurately to model the hydrodynamic transport of scalars and aquatic organisms of interest (Hutton 1995). As we discussed in Section 3.1 and in Appendix A, tidal effects are crucial to subtidal flow. The decomposition of total instantaneous flow and water levels into their tidal and subtidal components comprises tidal effects. We discuss (1) harmonic components and the springneap tidal range in water levels (Figures 14 and A-6.9) and flows (Figures 15 and A-6.10), and (2) Figure 13 Observed daily averaged electrical conductivity and difference between modeled and observed electrical conductivities for a representative day in various hydrologic and water operations scenarios: (A) wet year with a small export flow to inflow ratio, (B) wet year with a large export flow to inflow ratio, (C) critically dry year with a typical export flow to inflow ratio, and (D) critically dry year with a large export flow to inflow ratio. In the maps on the left, we have bilinearly interpolated the electrical conductivity measured at different locations over the entire domain. In the maps on the right, we have subtracted the bilinearly interpolated observations from the bilinearly interpolated modeled electrical conductivities. In each figure, arrows on the Sacramento River and near the pumps indicate the average inflow and exports on that day. In the maps on the right, we indicate modeled electrical conductivities greater than those observed by lighter colors, and modeled electrical conductivities lower than those observed by darker colors. In all the plots, we represent the model configuration, G1T15B1, with best overall predictive skill.
We report on the subtidal flow field in the four hydrologic and water operation scenarios as we did for the electrical conductivity field in Section 5.2, as well as an additional period in winter 2014, which had low freshwater inflow and a typical export:inflow ratio ( Figure 3). We report only the model's performance in predicting tidal components of hydrodynamic quantities in the summer of 2015, because the low inflow conditions represent the most stressful period for such models (e.g., Warner et al. 2005).
To evaluate tidal components of hydrodynamic quantities, we define amplitude interchangeably, depending on the context as that of either the water level or the flow. We also report the Greenwich phase of either quantity in degrees, to represent the effect of frictional retardation on wave propagation. For instance, an M 2 tide with a phase difference of 180 o between two points indicates a 6-hour lag.

Tidal Water Levels
Although DSM2 qualitatively represented general tidal oscillatory patterns, there were inconsistencies in predictions at various locations within the Delta.
The observed dominant diurnal (K 1 ) and semidiurnal (M 2 ) bands of astronomical components of water levels decayed rapidly landward ( Figure 14A and A-6.9), while the overtides and compound tides spawned near the mouth of the Delta and intensified within the Delta ( Figure 14B and A-6.9). These overtides and compound tides also intensified toward the eastern extent of the central Delta, but attenuated in the South Delta ( Figure 14B and A-6.9). Moreover, propagation of the overtides and compound tides through the system became very complex ( Figure 14B and A-6.9). 26 VOLUME 16, ISSUE 2, ARTICLE 6 The diurnal (semi-diurnal) components were retarded by almost 12 hours (6 hours) to 24 hours (12 hours) as the tides propagated into the North Delta, though they propagated without significant retardation in the central and South Delta ( Figure A-6.9). This was a direct consequence of the higher inflow in the Sacramento River, and also because of lower friction in the deeper, dredged San Joaquin River. The higher-frequency, lower-wavenumber overtides also propagated faster into the Delta, because their phase velocities ( c k = ω where ω is the frequency of the constituent, and k is its wavenumber) are much higher than the phase velocities of the astronomical components (see Figure A-6.9). The G1T15B1 model configuration was able to reproduce propagation patterns of the tidal harmonic components with highest fidelity (Figure 14 and A-6.9). However, the configurations with the Carquinez Strait extension were also able to match the observed patterns in the K 1 component reasonably well. All model configurations underpredicted the attenuation and retardation of the astronomical components from the tidal boundary into the system, and over-predicted the intensification and phase speed of the overtides and compound tides. This indicates a systematic bias toward under-damped wave motions.
The spring-neap oscillations in the tidal envelope were typically 0.3 to 0.5 m in the South Delta, while they were negligible in the North Delta ( Figure 14C and Figure A-6.9). Only the G1T15B1, G1T15BB, and G2T5BB configurations at least qualitatively recovered this pattern ( Figure A-6.10). In the top row, arrow sizes correspond to observed amplitudes, and we indicate the amplitudes and Greenwich phases (within parentheses). In the bottom row, arrow sizes correspond to modeled amplitudes, and values indicate percentage differences between the modeled and observed amplitudes, and absolute difference in the Greenwich phase (within parentheses). We have sized the arrows on a logistic scale, and because the Stokes' drift is very small compared to the tidal harmonic amplitudes it is on a different scale. We represent the model configuration, G1T15B1, with best overall predictive skill.

Tidal Flows
The amplitudes of the K 1 and M 2 components, respectively, decayed inland from about 1,000 m 3 s −1 and 3,000 m 3 s −1 to smaller than 100 m 3 s −1 in the Sacramento and San Joaquin rivers ( Figure 15A). The overtides and compound tides, which had about one-tenth the amplitude of the astronomical components, also decayed inland ( Figure 15B). There was an approximately 15-to 45-minute phase lag in the Sacramento River compared to the San Joaquin River near Threemile Slough in all tidal components. The tidal range of the spring-neap variation in flow was comparable in magnitude to the K 1 amplitude ( Figures 15C and A-6.10). Although all the model configurations produced qualitatively identical results, the configurations with the domain extension predicted more accurate amplitudes and phasing of the astronomical tides than the other configurations in the western and central Delta (Figures 15 and A-6.10). In all the model configurations, the deviation in predicting the tidal harmonics and spring-neap variations increased inland ( Figures 15C and A-6.10). The component flow estimates were the worst in the South Delta and in the Yolo Bypass, where the river flow influence is weak compared to the North Delta (Figures 15 and A-6.10). We could not draw any conclusions about the differences in predictive power between astronomical components and overtides and compound tides.
At critical junctions such as the DCC, the Head of Old River, and Threemile Slough, none of the model configurations predicted tidal flows accurately. This could have potential implications for predictions of transport and mixing, particularly by the PTM module. At the DCC, there was about 20-50% error in prediction of the amplitude of the M 2 component, and about 8-12 hours' lag in the prediction of its tidal phase ( Figures 15B and A-6.10). Although the model predicted tidal flows in Threemile Slough with much higher fidelity (typically lower than 20% error), the phasing of flows in the Sacramento and San Joaquin rivers was often incorrect by 6-8 hours ( Figures 15B and A-6.10). There was between 100% and 450% error in the prediction of the amplitude of the M 2 component, depending on the model configuration at the Head of Old River, and phase leads or lags of 4-10 hours occurred between the model results and observations (Figures 15B and A-6.10).

Subtidal Flows
The subtidal flow in the Sacramento and San Joaquin rivers was typically high in the high inflow periods, as expected ( Figures 16A and A-6.11). There was a small, O (10 m 3 s −1 ), subtidal oceanward flow in the OMR corridor in the high-inflow period, which reversed during the other periods ( Figure 16). The Stokes' drift flow was only about one-hundredth to one-tenth of the subtidal flow everywhere, and decayed rapidly in the landward direction ( Figures 15D, 16, A-6.10, and A-6.11). Qualitatively, all the model configurations recovered these patterns. However, when the subtidal flows were small, the errors in predicting both the subtidal flows and the Stokes' drift were very large, sometimes exceeding a factor of ten. This trend was exacerbated in regions near the Stockton and Sacramento River deep water shipping channels, because of sudden bathymetric changes ( Figures 15D and 16D).
In general, the model reproduced subtidal flow patterns, except in locations such as Jersey Point, Rio Vista, Garwood Bride, and in the OMR corridor. In these locations, the modification of the subtidal flow by tidal effects was strong, with significant deviation between modeled and observed subtidal flows ( Figure 17). The fidelity at Garwood Bridge was particularly low because of a sudden rise in the channel bottom between the Stockton shipping channel and the San Joaquin River. This discontinuity, of about 9 m, caused almost a hundred-fold increase in frictional retardation of the propagating tidal wave, and reflected a significant portion of the energy contained on the progressive tidal wave into the shipping channel. These trends are consistent with the difference maps in Figures 16  and A-6.11. The poor predictions of subtidal flow near junctions with complicated tidal phasing such as Threemile Slough, the OMR corridor and near the DCC indicate that the likely source of the errors at such locations is the unmodeled complexity involved in tidal modification of subtidal flow from tidal asymmetries between ebb and flood dominant channels. This inference is supported by the 28 VOLUME 16, ISSUE 2, ARTICLE 6

Figure 16
Observed average subtidal flow and difference between modeled and observed subtidal flow in various hydrologic scenarios: (A) spring in a wet year, (B) summer in a wet year, (C) winter in a critically dry year, and (D) summer in a critically dry year. In each season, we indicate the average daily inflow and the ratio of total export flows to total inflow. In the top row, we indicate the observed flows by numbers and arrows. In the bottom row, arrows sizes correspond to modeled flows, and values indicate percentage differences between the modeled and observed flows. We have sized the arrows on a logistic scale. We represent the model configuration, G1T15B1, with best overall predictive skill. importance of the Stokes' drift in this region, which was 30% of subtidal flow ( Figures 15D and 16D).
These results indicate that DSM2 is reasonably well suited for predicting the subtidal flow, provided that the complex bathymetry at important junctions such as Threemile Slough, Georgiana Slough, Stockton shipping channel, and junctions in the OMR corridor are well resolved.

Target Diagrams
We display target diagrams of model performance for instantaneous quantities (Figures 18 and A -6.2 in Appendix A) and tidal and subtidal quantities (Figures 19 and A-6.3). In these diagrams, the points displayed indicate the performance in the phase space of either the RMS error and bias for instantaneous quantities, or the amplitude and phase errors for tidal quantities during various 3-month periods.
In Figures 18 and A-6.2, the model did not predict instantaneous water column depth and flow predictions well in the central Delta and OMR corridor. In the western Delta, the model predicted these quantities reasonably well. The model generally predicted electrical conductivity better in the rivers than in the brackish parts of the Delta. Moreover, points corresponding to periods of high freshwater inflow were generally associated with lower prediction errors. Based on these diagrams, the G1T15B1 model performed best, with almost 75% of the water column depths, 70% of the flows, and 45% of the electrical conductivities falling with the In each plot, we represent the unbiased RMS difference on the abscissa and the bias on the ordinate axes, colors lighter landward, and points bigger with increasing total inflow into the system. We also indicate circles of dimensionless radii 1, 2, and 4. The numbers indicate the percent of points that fall within the radius-1 circle. Note that the G1T15B1 and G2T5BB model configurations are favorably skewed by the wet years. For better readability, we have jittered points with model predictions within the data precision inside the radius-1 circle. In each plot, we represent the unbiased RMS difference or error in amplitudes on the abscissa and the bias or error in phases on the ordinate axes, colors lighter landward, and points bigger with increasing total inflow into the system. We also indicate circles of dimensionless radii 1, 2, and 4. The numbers indicate the percent of points that fall within the radius-1 circle. Note that the G1T15B1 and G2T5BB model configurations are favorably skewed by the wet years. For better readability, we have jittered points with model predictions within the data precision inside the radius-1 circle.
30 VOLUME 16, ISSUE 2,ARTICLE 6 radius-1 circle. The models with the Carquinez Strait extension performed comparably with the other configurations.
In Figures 19 and A-6.3, the model generally predicted tidal and subtidal components of water column depth well in the western Delta, but not in the riverine regions. In addition, increasing freshwater flow resulted in better predictions of tidal and subtidal components of flow. Model configurations with the domain extension did not perform as poorly as we expected; a well-calibrated domain extension would, therefore, likely be valuable.
We show the trends in prediction errors with freshwater inflows into the system through LOWESS regression fits (Cleveland 1979) in Figure 20 using the MATLAB toolbox lowess (Burke 2012). In these fits, we deemed the fits to be statistically significant when the bootstrapped 95% credible intervals of the LOWESS estimate did not include the grand mean of the errors. The errors in predictions of water column depth and tidal and subtidal flows decreased sharply for all model configurations with increasing freshwater inflow into the Delta until about 50-100 m 3 s -1 and then more gradually ( Figures 20A-20H). The increase in error in flow predictions at very high inflows for model configuration G2T5B2 ( Figure 20B) is an artifact of the small number of records during these conditions. Although there was a statistically significant decrease in errors in the prediction of daily averaged electrical conductivities with increasing flow, this relationship was much weaker than for water column depth and flow. We show these general trends for other model configurations as well in Figure A

Estimates of Regulatory Parameters
DSM2 is being used as a management aid for water operations in the Delta (Nam 2008, unreferenced, see "Notes";Cavallo et al. 2015). Therefore, its ability to estimate regulatory parameters is of vital importance. Two important regulatory parameters that characterize the ecological health of the Delta and export pumping operations are, respectively, the NDOI, which measures the total outflow of the Delta (Monismith et al. 2002), and subtidal OMR flow, which measures the flow entrained toward the pumps (Kimmerer 2008). In each plot, we indicate R 2 , RMS error, and bias for each model configuration. The prediction of regulatory parameters is generally poor, and significantly worse during periods with low inflow. The configuration with the Carquinez Strait domain extension and fine grid perform best in recovering the NDOI as well as OMR flows. In each figure, we indicate with a thick line the bestperforming model configuration. Some lines are not visible because these model configurations produce results that are indistinguishable. 32 VOLUME 16, ISSUE 2,ARTICLE 6 In the computation of the NDOI, Q W typically contributed 5-20% of the flow, and contributed ±1-10% of Q W ( Figure 3G). During high inflows, all the model configurations reasonably predicted the trends in the NDOI ( Figure 21A). However, in the period investigated, there was an almost 35% underprediction of the NDOI, and an error of about 35% in model results. Of all model configurations, the ones with the domain extension performed best. On the other hand, with decreasing freshwater inflow, the predictive power of the NDOI was significantly reduced ( Figure 21B). During this period, none of the model configurations could predict the fortnightly filling and emptying of the Delta. However, given the poor precision in the NDOI during low outflows (Monismith 2016), it is unreasonable to expect good predictive power during such periods. In the wet period, the model configuration with fine bathymetry and R ≈ 1 (G2T5B2) predicted the flow reversal in subtidal OMR flow with slightly lower bias (27.4%) and marginally higher R 2 (0.074) than other configurations, indicating that an improvement in bathymetric resolution with smaller time-step sizes in the central Delta could potentially improve model performance ( Figure 21C). The error in model prediction and bias were, again, about 30%. Prediction of subtidal OMR flow improved slightly during low inflows, because of the relatively easyto-simulate plug flow-like influence of the exports ( Figure 21D).
Although DSM2 was able to predict the temporal patterns of key regulatory parameters during high inflow periods, its predictive power diminished during low inflow periods, partly from the unrepresented complex tidal physical processes that dominate the flow, and partly from poor precision in the data itself. However, the model did not accurately predict actual magnitudes of regulatory flows in any of the periods.

DISCUSSION
DSM2 performs well in predicting instantaneous hydrodynamic quantitates such as total water column depth, flow, and daily averaged electrical conductivity. However, predictions of tidal effects, subtidal flows, and regulatory flow parameters are less accurate because of various sources of error. In particular, flow splits and phasing of tides at critical junctions such as the DCC, the Head of Old River, and Threemile Slough are not predicted with high accuracy. When coupled with water quality and movement models such as Qual or PTM, these errors in predicting flow splits at important junctions can affect estimates of the routing proportions of scalars through these junctions.
We recommended and implemented several changes to model configurations. While none of these changes produced significantly different results overall, they nonetheless fundamentally changed the nature of the errors produced in the model. With rigorous calibration, the domain extension and finer grid and boundary condition resolution can enhance the performance of DSM2. We subsequently discuss the various sources of error in DSM2 and the performance of our proposals in Sections 6.1 and 6.2. In Section 6.2, we also discuss implementation of parameterizations of more complex physical processes than are currently represented in DSM2.

Errors in DSM2
Errors routinely appear in models because of (1) unresolved physical processes, (2) model representation, (3) numerical solution method, and (4) data availability and precision (Willmott 1981). The last source of unsystematic error can be very large, particularly during periods of very low freshwater inflow. While errors in the data cannot be addressed by a model, the first three sources of error must be addressed while balancing logistical issues such as computational speed overheads.
Of the instantaneous, subtidal, tidal harmonic, and subtidal adjustment quantities that we investigated in Section 5, the largest source of error [O(100%)] was in the subtidal and astronomical tidal harmonic component evaluations. Overtides and compound tides, and the subtidal adjustments to the NDOI, contributed about 10% of the error in computations. Even though the flows associated with subtidal adjustment are generally expected to be small in comparison with the tidal signal, they can sometimes become comparable in magnitude or even saturate the subtidal signal, particularly in dry periods ( Figure 3G)  at all times. The Stokes' drift, because of its very small magnitude in comparison with subtidal and tidal flows, contributed only 1% toward the total error (Figures 14-16). This error distribution, along with the comparable magnitude of systematic and unsystematic errors (Figure 9), indicates that unmodeled physical processes are the dominant sources of error in DSM2.
Unresolved processes cause certain unavoidable errors in the model solution. For instance, during hind-cast simulations in which models such as DSM2 are driven by data, the models are, in effect, solving the incompressible 1-D momentum balance with boundary conditions that include the effects of subtidal adjustment as well. Hence, results from such models indirectly include subtidal adjustment flows as well. However, when used to forecast hydrodynamic conditions in estuaries with astronomical tides, such models will be unable to recover subtidal adjustments to the flow. This is not a model drawback per se, but rather a consequence of the nature of the calibration data sets. In addition, the subtle feedback between under-representation of complex estuarine tidal processes and calibration of the model to data that includes these processes (see Section 3.2) is a significant source of unavoidable unsystematic error. DSM2 has been calibrated with a Manning's n in each channel, and also by varying the channel cross-sectional geometries to best fit the observations (Nader-Tehrani 2001). In constraining the incomplete model to represent the complete set of physical processes in a calibration data set, various model elements such as the subgrid cross-sections and channel hydraulics have to become tuning parameters, and do not necessarily always reflect reality. These issues make diagnosing model errors and recommending improvements challenging. Errors resulting from unresolved physical processes cannot be minimized, therefore, without including some of these processes in the model.
On the other hand, certain processes cannot be resolved by a 1-D model at all. When freshwater inflows are low, tidal processes outlined in Section 3.1 and in Appendix A become very important. In such circumstances, unmodeled tidal processes contribute significantly to model error (compare the middle panels to the left panels in Figures 10-12). Another example is the estimation of salt flux in partially-mixed estuaries, which is complicated by the periodic nature of the stratification of such estuaries (see Section 3.1;Figures 4A and 4B;Simpson et al. 1990). Since DSM2 does not resolve the vertical structure of water density, it will not be able to resolve tidal variability in salt flux, but rather only its subtidal average. Moreover, it will be unable to resolve the component of the lateral salinity gradient between the shoals and channels that is attributable to the periodic stratification and suppression of vertical mixing.
Model representation -i.e., the specification of boundary conditions, grid size, and domain extent -is another source of error. Models such as DSM2 do not represent even the included physical processes as accurately as possible because of practical considerations such as manageable grids and domain extents. This drawback results in systematic errors that we can correct only by changing the model configuration. For instance, in DSM2, the abrupt end of the model grid at Martinez introduces a spurious reflection of the oceanward wave characteristic. This causes the systematic error in flow to increase here and, consequently, decay inland. But since the water column depth at Martinez is specified, the systematic error in water column depth is small here and increases landward. This results in an inverse correlation between systematic errors in the water column depth and the flow (Figure 9). On the other hand, systematic errors in hydrodynamic quantities should become uncorrelated if oceanward-propagating wave characteristics are allowed to radiate freely out of the domain ( Figure 9F). DSM2 is also unable to correctly resolve subtidal flows and tidal flow splits in regions with sudden bathymetric changes. In addition to the sudden bathymetric change at Stockton highlighted in Section 5.3.3, the natural sill in Carquinez Strait ( Figures 7A-C) also decreases the tidal energy that propagates landward into the system compared to what the model predicts. Although bathymetric complexity is challenging for any model, 2-D and 3-D models can incorporate sudden shifts by resolving such features and by incorporating terrainfollowing coordinate systems, respectively (e.g., Fringer et al. 2006). In 1-D models, although we can make computational grids arbitrarily small, to avoid large errors in solution propagation, we must also accordingly reduce time-step size (Ferziger and Periç 1997). In DSM2, increasing the grid resolution involves a significant recalibration effort, which is challenging. In DSM2, model representation errors are thus comparable in magnitude with unsystematic errors (Figures 9D-F and 9G-I).
A third source of error is the numerical solution scheme, which includes several linearizing approximations for the sake of practical considerations. These include simplifications of the friction term which are common in discretizations of the Saint-Vénant and shallow water equations in Equation 6 and allowances for smoothness of the temporal solution. Such approximations cause under-damping of tides and inaccuracies in the prediction of the water column depths. For example, in Figure 11, in the interior Delta locations the model over-predicts the tidal signal during the ebb phase of the tide (because of the constructive interaction of the reduced effect of bottom friction on both river and tidal flows) and under-predicts the tidal signal during the flood phase (because of the destructive interaction of reduced bottom friction effect on both river and tidal flows). Such under-damping occurs from various approximations. First, the shallow-water wave assumption causes the wave speed and water surface elevation of the tidal component of flow to be over-predicted (Chen 2005). Second, the semiimplicit Crank-Nicolson scheme damps harmonic components slowly (Ehle 1969), which results in stronger wave reflections landward in the model results. Accordingly, in the model results, the tidal wave at Martinez propagates almost as a standing wave inland past Chipps Island, rather than as a progressive wave. Third, the damping by friction is artificially small because of the large depths of the channels used to avoid the intermittent wetting and drying of the intertidal mudflats in Suisun Bay. Fourth, the stability of the model solution constrains the cross-sections to vary gradually, often with unrealistically steep side slopes (Smith and Enright 1995). This also decreases the frictional retardation of the tides and river flow. Fifth, the model ignores temporal variability in cross-sectional mixing caused by the subtidal adjustment component of flow in the western Delta. As a consequence, the model cannot reproduce the energetic flows and lower water surface elevations during the ebb phase of the tide, and the reduced mixing and increased water surface elevations during the flood phase of the tide (Simpson et al. 1990). In addition to these approximations, flows and water surface elevations are under-predicted as a consequence of the constant Manning's n values for each channel reach. Moreover, Hydro takes longer to respond to flow reversals. In reality, Manning's n does change with tide phase in the Delta, particularly in complex regions such as Threemile Slough ). Last, any model is only as effective as the precision in the data with which it is calibrated and validated. The uncharted water lateral withdrawals and agricultural runoffs together can account for about 10% of the total outflow from the Delta (Sridharan 2015). The unsystematic error from these data sources is large because of the unreliable information about these withdrawals on monthly time-scales. The DCDM is specifically being developed to address uncertainties in this data set. An example of significant uncertainty in the subtidal flow is in the NDOI, in which the strength of the tidal signal saturates the flow measurements because of the large salinity gradients in the western Delta (MacCready and Geyer 2010; Geyer and MacCready 2014). The precision in the NDOI estimates can be as large as 100% during low-outflow periods (Monismith 2016). These errors in the data cannot be attributed to model performance.
We discuss various ad hoc recommendations to address some of these issues below.

Recommendations to Improve DSM2 Performance
It may be possible to correct the errors listed in Section 6.1 by using sophisticated approaches like finite elements (Kolar et al. 1994;Dawson et al. 2006), or weighted stencils (Grotkop 1973;Dawson et al. 2006), accurate spatial discretization and temporal advection schemes (Garcia and Kahawita 1986;Louaked and Hanich 1998;Chua and Fringer 2011), and by modeling the different harmonic components explicitly (Levesque et al. 1979;Walters et al. 2013, Rayson et al. 2015. Such methods however, require a complete reworking of the underlying model. Alternately, some of the schemes we outlined in Sections 1 and 4.3 could be adopted with recalibration to improve fidelity without greatly sacrificing speed and simplicity. For best results, temporal resolution of the tidal boundary condition and time-step size must be such that both the CFL condition and R should be O(1) (Sobey 2001). This would minimize error propagation into the domain from poorly-resolved boundary conditions. We could also extend the oceanward end of the domain to avoid the reflection of oceanward-propagating wave characteristics back into the domain at Martinez, and to allow for the water surface in the Delta to co-oscillate with long-term oscillations in the coastal ocean. In principle, the domain extension is equivalent to incorporating a "sponge layer" beyond the western end of the domain where the oceanward characteristics of waves are damped (Israeli 1981). Simulation quality has been improved significantly in the past with such domain extensions (e.g., Levesque et al. 1979;Sobey 2001;Dawson et al. 2006). In particular, Chua and Fringer (2011) and Holleman and Stacey (2014) used idealized versions of the Sacramento and San Joaquin rivers when they modeled the San Francisco Bay-Delta system. CDWR itself attempted to include an idealized version of the Bay, in which they treated the north Bay as a series of channels, the central Bay as a set of interconnected channels oriented to represent the observed circulation patterns and littoral reservoirs, and the south Bay as a reservoir (Ferreira and Sandhu 2016). However, that attempt was only marginally successful in reproducing the electrical conductivity at Martinez, and the extended grid is still being refined (Sandhu 2018, unreferenced, see "Notes").
Here, we added a representation of Carquinez Strait to the DSM2 grid, so that the 2-D and 3-D hydrodynamic complexities of the Bay could be circumvented while the oceanward characteristic of the tidal wave, and other evanescent overtidal and compound tidal waves, were still allowed to radiate for a few kilometers past Martinez. Even this small extension produced promising results. The behavior of the systematic error ( Figure 9F versus Figures 9D-E) indicates that the domain extension allows tidal energy to radiate oceanward. The domain extension also resulted in minor improvements in the prediction of tidal and subtidal components of flow in the western Delta (Figures 15 and 16). In addition, the finely-resolved grid produced marginal improvements in the prediction of subtidal OMR flows ( Figure 21). These outcomes suggest that these recommendations would likely improve model fidelity significantly if they were implemented with a thorough recalibration.
Subtidal adjustment flows and estuarine circulation processes cannot be represented in DSM2 and have to be parameterized. In particular, shoal-channel interactions and the wetting and drying of mudflats can be parameterized straightforwardly in DSM2. Shoal-channel systems and intertidal mudflats are currently represented as single channels with subgrid cross-sections that change rapidly with increasing elevation in DSM2. Such a representation (1) does not allow shoal-channel salinity differences and exchanges to be well represented, and (2) can result in instabilities that arise in the solution at crosssections that have embayments or shoals because of a sudden decrease in conveyance with increasing water column depth.
Our idealized shoal-channel representation resulted in very low flow through the shoal while the tidal energy dynamics of the main channel and shoal were preserved (see "Prescriptions for Model Recommendations" on page 5 in Appendix A). It was very effective in qualitatively recovering the salinity structure during the flood and ebb phases of the tide at a station 7,500 m from the landward end of the domain and 2,500 m from the ocean end ( Figure 22A). First, flow in the shoals was significantly slower than flow in the channel and was lagged by about 10 hours. Second, typical floodphase salinities in the channel and shoals of about 14 psu and 7.5 psu, respectively, and typical ebbphase salinities in the channel and shoals of about 8.5 psu and 7 psu, respectively, compared qualitatively with the cross-sectional distribution of salinity shown in flood and ebb snapshots in Figures 7 and  8 in Ralston and Stacey (2005). The asymmetry in salinity between the ebb and flood phases was due to the relatively small longitudinal dispersion; saline pulses effectively moved past the observation point without mixing relatively quickly during the flood phase. As salinity mixed over the tidal cycle, ebb salinity was lower over a longer duration. Quantitative discrepancies between our results and those of Ralston and Stacey (2005) likely result from minor differences in implementation of the landward boundary, as well as fundamental differences between a full 3-D circulation model and a 1-D model. In addition, we were also able to recover the intermittent drying of the shoals ( Figure 22B). Because intertidal mudflats in the Delta are brackish, the small quantity of salt in the 3 cm of water left in the shoals during each ebb period is acceptable (Uncles and Peterson 1996). A parameterization like this one could stabilize the solution when conveyance increases suddenly with elevation (see "Derivations of Subtidal Adjustment Process Expressions" on page 3 Appendix A). More important, accurately representing the dynamics of shoal-channel systems is crucial in informing ecosystem restoration efforts that are often focused on the littoral regions around tidal channels and marshes (Luoma et al. 2015).
We subsequently describe certain recommendations we did not evaluate, because their implementation would require a significant recoding of the model and recalibration (and such an effort was beyond the scope of this work), but which we believe are nonetheless useful additions.
Of the various effects of wind-driven flows (see Section 3.1), only the net exchange flow from the water surface set-up can be incorporated into a 1-D model. Based on Monismith's (2016) analysis, we can approximate this flow using Equation 3. However, the relationship in Equation A-2.2 (see "Derivations of Subtidal Adjustment Process Expressions" on page 3 in Appendix A) between the water surface set-up and the wind shear at Pittsburg was derived only for the summer of 2015 (Monismith 2016). This is likely to dynamically change with changing hydrology and atmospheric conditions. Having noted this caveat, we could proceed by specifying this as a uniform boundary flow at Martinez given a forecast of U W .
A model such as DSM2 cannot resolve density driven currents due to the lack of coupling between the Saint-Vénant and the transport equations. A simple fix to incorporate density-driven currents could be to normalize Q S from Equation 4 by the flow in a particular channel as where, Q i is the flow at a time-step in the i th channel. Q S,i could then be added to the flow computed in each channel. This type of fix only adds an average gravitational circulation component of opposite sign to both tidal flood and ebb phase flows, and does not resolve the spatial and temporal structure of the circulation. However, as shown in Figure 22, such a correction produces somewhat reasonable results. When used to forecast the hydrodynamics of the Delta with predicted tidal water level and flow boundary conditions, parameterized boundary conditions according to Equations 4 and 14 due to the subtidal adjustment could be specified explicitly in DSM2 to improve fidelity.
Although the numerical solution scheme in DSM2 is as sophisticated as possible for the FourPt stencil , some of the linearizations in the model could be adjusted with some effort. Incorporating a more accurate friction term involves changes to the FourPt scheme and invoking higherorder terms that are different for different harmonic components (e.g., see Sinha and Pingree [1997]). Alternately, the Manning's n values for each channel could be prescribed to vary temporally using a lookup table for each channel that could be calibrated as described below to the dominant M 2 tide. Equating the conveyance defined as (Jarrett 1984) where r is the frictional retardation and C D is the drag coefficient, we can write , n could be averaged over multiple tidal cycles to give its ebb-and flood-phase values. Freidrichs and Aubrey (1994) showed that the drag coefficient varies by at least an order of magnitude over the tidal cycle, just as Fong et al. (2009) observed in the Delta. So this factor could be important in resolving harmonic components more accurately.
We have shown that, of the three types of controllable errors in a model such as DSM2, it is relatively straightforward to minimize errors from model representation, and that with creative employment of approximations from the literature, it is possible to incorporate key 3-D physical processes into the modeling framework itself. Reducing errors from numerical approximations requires additional coding effort. Although our recommendations did not significantly improve model results in their current form, we expect that incorporating them with proper calibration would prove useful both in improving model accuracy and enhancing its usefulness as a decision support tool.

CONCLUSIONS
In this paper, we evaluated the performance of DSM2, which is used to model flows in the Delta.
We note that the model performs well for the type of flow it has been calibrated for: total flows and water column depths. We also note its predictions of tidal flows could be improved so that it can become a more reliable decision support tool. To evaluate the model's fidelity, we developed a comprehensive model-evaluation protocol with which we assessed aggregate model performance for different model configurations, analyzed various error sources, and evaluated predictions of different components of hydrodynamic quantities computed by the model. We recommended and evaluated several simple ad hoc schemes to improve model performance. Although these recommendations require thorough calibration to be useful, they nonetheless influenced systematic errors in the model results in an encouraging manner. It is a testament to the exhaustive and painstaking calibration that has been performed on the model ( CH2M Hill 2009;CDWR c2011;Liu and Ateljevich 2011;) that its standard configurations produce the best overall results.
38 VOLUME 16, ISSUE 2, ARTICLE 6 DSM2 reproduces total tidal flows and water column depths accurately, and daily averaged electrical conductivities reasonably well, particularly when freshwater inflows are high. It is commendable that a 1-D model of such a complex system is able to reproduce general spatio-temporal patterns in subtidal hydrodynamics, the non-stationary nature of the tides, and the generation and propagation of overtides and compound tides with the fidelity we observed. We note that, in the Delta, DSM2 is likely to perform comparably with models such as SUNTANS without subgrid bathymetry (Chua and Fringer 2011), SCHISM (Chao et al. 2017), and DELFT-3D (Martyr-Koller et al. 2017) in resolving 1-D flow dynamics. In regions where the physical processes not captured by DSM2 become important, only 3-D models such as UnTRIM that include subgrid resolution (MacWilliams et al. 2015) are likely to outperform DSM2. This is attributable to the extensive calibration of DSM2, and demonstrates that simple models such as DSM2 still have an important role to play in our understanding of the general hydrologic patterns of engineered surface water systems. Except in locations where bathymetric changes are sudden, the model also reproduces subtidal flow components such as river flow, tidal modulation of the river flow, and Stokes' drift well. DSM2 is able to reproduce the NDOI with reasonable accuracy, and hence its extensive use to inform water management -at least for outflow allocations -is well justified.
On the other hand, DSM2 is unable to reproduce the harmonic components of tide and subtidal flows modified by tidal signal accurately in regions where bathymetry is complex, tidal and river flow influences are comparable, and flow pathways are complex because of channel junctions. As we discussed in Sections 6.1 and 6.2, for fidelity to improve significantly, incorporating a finer grid would require an additional calibration effort. In this sense, DSM2 is closer in spirit to a physical model than a numerical one.
DMS2 is only reasonably well suited for the management purposes for which it is most commonly used-i.e., modeling daily flows to inform water quality and pumping operation-related actions. This is because, although it is able to reproduce the NDOI reasonably well, it is not able to reproduce OMR flows well. In addition, as Qual has been calibrated to daily averaged electrical conductivities, and because gravitational circulation processes cannot be incorporated into DSM2, we cannot represent dynamically changing salinity fields at sub-daily timescales, particularly in the tidal reaches of the Delta where stratification occurs. Although we did not report on PTM in this study, its most significant shortcoming is in its representation of channel junctions as fully randomizing mixers, which disrupts the chaotic nature of particle movements across several junctions (Sridharan 2015;Sridharan et al. 2018).
Reasons for the inaccuracies in DSM2 primarily include: (1) unmodeled physical processes, which are particularly exacerbated when freshwater inflows are low and tidal dynamics are important, (2) mismatched and coarse spatial resolution of the grid and temporal resolution of the tidal boundary condition, and misspecification of the model domain, which smooth the propagation of tidal information through the domain and cause systematic errors to propagate through the domain, (3) approximations in the numerical implementation of the model that under-damp tidal movements, and (4) imprecise and incomplete data that is used to calibrate and validate the model.
Rather than totally redesign the model, we suggested several ad hoc schemes that could augment it. The most straightforward strategy is to improve temporal resolution of boundary conditions and include subgrid bathymetry, and interpolate this to a spatial resolution that supports a CFL condition and resolution criterion of about 1. A domain extension to allow free propagation of the oceanward tidal wave characteristic, and utilizing a conveyance method that does not produce negative conveyance when wetting and drying occurs, are also promising improvements. However, even these simple improvements require significant re-calibration efforts.
Although reducing unsystematic model errors may be impossible beyond a certain extent in DSM2, it may be possible to employ machine learning techniques to identify spatial and temporal patterns in errors such as those shown in Figures 10-17 and model them non-parametrically. These modeled errors could then be removed from the Hydro solution to produce a corrected solution at each time-step. The schemes suggested in Section 6.2 and above are equally 39 applicable to similar 1-D models as well, and in some cases may be applicable to 2-D models too.
When deploying the improvements suggested in this study, three issues must be considered. First, runtime increases almost 10-fold when the time step-size is reduced from 60 minutes to 5 minute. Second, improving the spatial resolution of the grid must be concordant with restricting the CFL condition and resolution criterion. Third, the availability and quality of observations limits our ability to validate the model.
Of these, data availability may be the most challenging open-ended problem in the Delta. For instance, the coarseness of uncharted water withdrawals is not a model drawback but reflects data limitations. In cases like this, it may not even be physically or legally viable to collect more accurate data (Monismith 2016). There is also significant uncertainty in the measurement of Delta outflow because the strength of the tidal signal saturates flow measurements in the western Delta. Spatial heterogeneity in data availability can also introduce spatial patterns in fidelity. For example, although numerous stations throughout the Delta resolve electrical conductivity spatially, only five out of twelve stations measure water level and flow in the western Delta, and those too, only in Suisun Marsh. Similarly, of the 29 monitoring stations in the OMR corridor, only 18 measure water level, and that too, sporadically in time ( Figure 1). Electrical conductivity measurements in the central Delta are often imprecise as well (2018 in-person conversation between E. Ateljevich and V.K. Sridharan, unreferenced, see "Notes"). Modelers must be content in the knowledge that with improving field technologies and accumulated experience over time come improved models.
The model-evaluation protocol developed herein may be used as a basis for performing similar analyses for other hydrological regimes. The complicated flowcourse topology in the Delta-with multiple branched interconnected channels-makes traditional aggregate performance-evaluation techniques challenging to apply. For example, Ganju et al. (2016) list two combined skill-assessment techniques developed by other authors; however, these metrics required that the data sets used to compare the model against were qualitatively similar and spread uniformly in space. Ideally, if both flow and water level observations are available at identical locations, total observed and predicted hydraulic heads can be compared. Unfortunately, this too is not the case in the Delta. To overcome these difficulties, we developed aggregate error and skill metrics unique to such distributed flow systems. In addition, although the target diagrams MacWilliams et al. (2015) uses are adequate to evaluate instantaneous flows, water column depths and salinities, we developed new target diagrams to evaluate the tidal components of hydrodynamic quantities.
We have comprehensively evaluated the DSM2 model, and investigated the reasons for its behavior. We have also outlined and tested several ad hoc recommendations based on experiences with similar models reported in the literature. We believe that with comprehensive recalibration, the implementation of these recommendations in DSM2 is straightforward, and that they would add immense value to the modeling of -and management efforts in -systems such as the Delta.