Evaluation of Long-Term Mark-Recapture Data for Estimating Abundance of Juvenile Fall-Run Chinook Salmon on the Stanislaus River from 1996 to 2017

Conservation and management of culturally and economically important species rely on monitoring programs to provide accurate and robust estimates of population size. Rotary screw traps (RSTs) are often used to monitor populations of anadromous fish, including fall-run Chinook Salmon (Oncorhynchus tshawytscha) in California’s Central Valley. Abundance estimates from RST data depend on estimating a trap's efficiency via mark-recapture releases. Because efficiency estimates are highly variable and influenced by many factors, abundance estimates can be highly uncertain. An additional complication is the multiple accepted methods for how to apply a limited number of trap efficiency estimates, each from discrete time-periods, to a population’s downstream migration, which can span months. Yet, few studies have evaluated these different methods, particularly with long-term monitoring programs. We used 21 years of mark-recapture data and RST catch of juvenile fall-run Chinook Salmon on the Stanislaus River, California, to investigate factors associated with trap efficiency variability across years and mark-recapture releases. We compared annual abundance estimates across five methods that differed in treatment of trap efficiency (stratified versus modeled) and statistical approach (frequentist versus Bayesian) to assess the variability of estimates across methods, and to evaluate whether method affected trends in estimated abundance. Consistent with short-term studies, we observed negative associations between estimated trap efficiency and river discharge as well as fish size. Abundance estimates were robust across all methods, frequently having overlapping confidence intervals. Abundance trends, for the number of increases and decreases from year to year, did not differ across methods. Estimated juvenile abundances were significantly related to adult escapement counts, and the relationship did not depend on estimation method. 
Understanding the sources of uncertainty related to abundance estimates is necessary to ensure that high-quality estimates are used in life cycle and stock-recruitment modeling.

INTRODUCTION
Conservation and management of anadromous fish species is challenged by their complex migratory life history (Merz et al. 2013). Specific habitat requirements differ between life stages, migrations can occur over broad spatial scales, and different sampling methods are necessary to accurately quantify abundance of different life stages. Rotary screw traps are a tool often used to monitor juvenile anadromous fish species and assess the effects of management strategies. In regulated rivers, estimating abundance of downstream migrants is especially useful for monitoring the effects of river management (e.g., diversion or discharge regulation; Sykes et al. 2009) and watershed restoration programs (Merz et al. 2013). Quantifying population size is fundamental to determining whether species management and production goals are being reached. Stock abundance of anadromous species can be estimated from rotary screw trap (RST) catch using straightforward mark-recapture methods. In essence, a sample of fish is marked and released upstream of the trap (reviewed by Volkhardt et al. 2007), and the proportion of marked fish recaptured after the release (also called trap efficiency or capture probability) is used to expand the number of unmarked fish captured in the RST to an abundance of fish that pass the trap.
Mark-recapture techniques are used to estimate the abundance of migrants for a period of time using catch (n) from the trap and an estimate of trap efficiency (e). One of the simplest and most frequently used methods is based on the Lincoln-Petersen estimator. Trap efficiency over a discrete time period (i) is the proportion of marked fish ($m_i$) recaptured out of the total number of marked fish released during a trap efficiency release ($M_i$), $e_i = m_i / M_i$. This value is then used to adjust the number of fish captured in the trap for the same discrete period to obtain an abundance estimate, $N_i = n_i / e_i$, where $N_i$ and $n_i$ are the estimated number of downstream migrants and the number of unmarked fish captured, respectively (Volkhardt et al. 2007). However, the Lincoln-Petersen estimator makes some simplifying assumptions, including: (1) a closed population during the mark-recapture trial period, (2) all fish (marked and unmarked) have equal capture probabilities, (3) marking does not affect catchability, (4) fish do not lose their marks, and (5) all recovered marks are reported. Violations of these assumptions will result in biased estimates (Seber 2002).
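As a minimal sketch of the Lincoln-Petersen calculation described above, the following Python snippet computes trap efficiency for one release and expands the unmarked catch for the same period to an abundance estimate. The function names and the numbers are hypothetical and chosen only for illustration; they are not data from this study.

```python
def trap_efficiency(recaptured, released):
    """Trap efficiency for one release: e_i = m_i / M_i."""
    if released <= 0:
        raise ValueError("released must be positive")
    if not 0 <= recaptured <= released:
        raise ValueError("recaptured must lie between 0 and released")
    return recaptured / released


def petersen_abundance(catch, efficiency):
    """Expand unmarked catch to abundance: N_i = n_i / e_i."""
    if efficiency <= 0:
        raise ValueError("efficiency must be positive (no recaptures?)")
    return catch / efficiency


# Hypothetical release: 400 marked fish released, 50 recaptured.
e_i = trap_efficiency(recaptured=50, released=400)
# Hypothetical unmarked catch of 250 fish over the same period.
N_i = petersen_abundance(catch=250, efficiency=e_i)
print(e_i, N_i)
```

The explicit error for zero efficiency reflects the practical problem noted in the text: a release with no recaptures gives no usable expansion factor under the unmodified estimator.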
Because of the dynamic nature of streams and rivers, efficiency for a given trap is rarely constant during its entire time of operation, which is typically the duration of a migration season (often >100 days), and is influenced by exogenous and endogenous factors (Roper and Scarnecchia 1996). Discharge is one of the primary factors associated with fish migration, and can strongly affect trapping efficiency (Volkhardt et al. 2007). Visibility and noise of the trap can deter fish and reduce trap efficiency as well, particularly for species that exhibit avoidance behaviors. Time of day (e.g., Tattam et al. 2013), location where marked fish are released, and even the trap's position in the channel will influence the number of recaptured individuals. Salmonid downstream migrations, and presumably trap efficiency, also respond to moon phase, as this affects visibility at night (Youngson et al. 1983). Trap efficiency will also vary by fish size; larger fish with better sight and swimming abilities are better at avoiding traps than small fish. Thus, it is not unusual for efficiency to change seasonally with the size of downstream-migrating individuals. Trap efficiency can be influenced by individual capture history (Tattam et al. 2013) and can differ between hatchery- and wild-origin fish (Roper and Scarnecchia 1996). Numerous sources of uncertainty complicate the ability to detect abundance trends as well as investigate the factors that influence trap efficiency. Because uncertainty in trap efficiency is propagated to abundance estimates, which can also exhibit high uncertainty, it is important to minimize sources of error when estimating trap efficiency.
Study designs can be modified and statistical approaches implemented to account for variation in trap efficiency. A frequently used study design is to stratify trap efficiency releases by day, week, or longer time-periods. Darroch (1961) developed a maximum likelihood model for the time-stratified Lincoln-Petersen estimator using two traps (one to catch and mark individuals and a second to estimate abundance). Similar models were developed for single-trap study designs, when fish are transported and released upstream of the trap in which they were initially captured (MacDonald and Smith 1980; Rawson 1984). For a time-stratified Lincoln-Petersen model, trap efficiency releases are performed within discrete time strata, and the efficiency from the release is used to adjust capture data from the same time-period (Carlson et al. 1998). These designs require that release periods be paired with a single capture period. However, logistical constraints might prevent a trap efficiency release from occurring, or changing environmental conditions within a time stratum could result in few to no recaptures and cause imprecise recapture rates. Another constraint could be that not enough fish are captured in the trap to perform a trap efficiency release. In these cases, either fish captured during the period are ignored, the most recent efficiencies are carried forward until the next release is performed (e.g., Bilski et al. 2011), or daily catches are grouped into periods of relatively similar environmental conditions (e.g., Steinhorst et al. 2004).
Statistical approaches have been used to estimate trap efficiency for catch data during periods when fish were not released. The simplest approach groups trap efficiency releases by an independent variable or condition class (e.g., life stage or time of day). The number of fish captured at a given level of the condition class is adjusted using the trap efficiency estimate for that specific range of the variable or condition class (e.g., Steinhorst et al. 2004). Criteria for pooling strata can be somewhat subjective, and rely on testing for differences in trap efficiency between strata (Schwarz and Taylor 1998; Bjorkstedt 2000). Another limitation is that these approaches do not allow trap efficiency to be modeled as a function of covariates. To provide daily abundance estimates, Schwarz and Dempson (1994) developed a maximum likelihood formulation to model efficiency as a function of external factors. An alternative approach involves modeling trap efficiency as a function of a priori chosen environmental variables using data from multiple release groups and run over a range of conditions for the variables of interest. After ensuring that the model adequately represents the data, it is used to predict trap efficiency for a given day (or other relevant time-period) and applied to daily capture data (e.g., Montgomery et al. 2007). Because the outcome of a trap efficiency release can be considered an independent realization of $M_i$ Bernoulli trials with probability $e_i$, it is most often modeled as a random variable assumed to follow a binomial distribution. Therefore, statistical approaches that have been used to model trap efficiency include Generalized Linear Models (GLM or logistic regression) or Generalized Additive Models (GAM; both approaches reviewed by Cheng and Gallinat 2004).
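The binomial formulation above can be illustrated with a minimal sketch: a logistic regression of recaptures on a single covariate, fitted by Newton-Raphson on the binomial log-likelihood using only the Python standard library. This is an assumption-laden illustration, not any cited author's implementation; the release data, the choice of log discharge as the only covariate, and all function names are hypothetical.

```python
import math


def inv_logit(value):
    """Map a linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def fit_binomial_logit(x, recaptured, released, max_iter=50, tol=1e-10):
    """Fit logit(e_i) = b0 + b1 * x_i by Newton-Raphson.

    Each release i contributes m_i recaptures out of M_i Bernoulli trials.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, m, M in zip(x, recaptured, released):
            p = inv_logit(b0 + b1 * xi)
            resid = m - M * p          # score contribution
            w = M * p * (1.0 - p)      # binomial variance weight
            g0 += resid
            g1 += resid * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    return b0, b1


def predict_efficiency(b0, b1, x_new):
    """Predicted trap efficiency at a new covariate value."""
    return inv_logit(b0 + b1 * x_new)


# Hypothetical releases: covariate is ln(discharge); 500 fish per release.
log_discharge = [2.0, 2.5, 3.0, 3.5, 4.0]
released = [500, 500, 500, 500, 500]
recaptured = [100, 80, 55, 40, 25]

b0, b1 = fit_binomial_logit(log_discharge, recaptured, released)
print(b0, b1, predict_efficiency(b0, b1, 3.2))
```

Once fitted, the predicted efficiency for a day without a release can be used to expand that day's catch, which is the sense in which modeled efficiency replaces stratum-specific efficiency.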
Finally, Bayesian statistics have been incorporated into time-stratified Lincoln-Petersen models because the Bayesian framework's flexibility can simultaneously model trap efficiency and abundance (Mäntyniemi and Romakkeniemi 2002; Bonner and Schwarz 2011). Bayesian approaches have the added benefit of being able to account for biological realism (e.g., migratory schooling behavior or autocorrelation in time-stratified trap efficiency), but at the expense of increased model complexity and technical knowledge required to implement.
Although the total abundance estimate is often the metric of interest, it is also important to provide a level of uncertainty for the estimate. Uncertainty in trap efficiency and abundance estimates has been quantified using parametric confidence intervals, which assume a normal distribution of error (Bilski et al. 2011; Steinhorst et al. 2004), and by using nonparametric bootstrap methodology (Thedinga et al. 1994; Steinhorst et al. 2004). Bayesian approaches use Markov Chain Monte Carlo (MCMC) procedures to sample from posterior distributions of parameters. Uncertainty about parameter estimates is given by 95% credible intervals of the posterior distribution.
In California's Central Valley, RSTs are used on nearly every major tributary to the Sacramento and San Joaquin rivers to monitor seaward migration of Chinook Salmon (Oncorhynchus tshawytscha) and Steelhead (Oncorhynchus mykiss). Data from these monitoring programs are used to estimate abundance, characterize spatio-temporal aspects of juvenile downstream migration, and evaluate restoration and water export activities. However, using these data to compare abundance estimates within and among tributaries is challenging, because different resource agencies or consultants manage monitoring programs independently of one another, and programs use different analysis methods (Central Valley Salmon and Steelhead Monitoring Programs, 2007 summary report by the Interagency Ecological Program, https://nrm.dfg.ca.gov/FileHandler.ashx?DocumentID=3491&inline). For any given monitoring program, extrinsic factors and logistical challenges will inevitably change monitoring protocols over time. Furthermore, advances in statistical approaches and study designs may require monitoring protocols to adapt to new techniques. How underlying annual methodological and environmental variation influences the ability to compare abundance estimates across programs or detect meaningful trends for a specific program remains under-studied, particularly with respect to how trap efficiency is treated.
VOLUME 17, ISSUE 1, ARTICLE 4
On the Stanislaus River, the downstream migration of juvenile salmon has been monitored using an RST for more than 2 decades. This long-term data set, encompassing a range of environmental conditions, presented a unique opportunity to investigate how different trap efficiency estimation methods affect abundance estimates and estimate uncertainty. As part of an internal review of the program's analyses and methods, our first objective was to compile trap efficiency release data from 1996 to 2017 to investigate the exogenous and endogenous factors associated with trap efficiency estimates. Our second objective was to compare annual abundance estimates derived using different methodological approaches. The approaches used here differed in how strata were pooled (temporally versus homogeneous external conditions) and in methodological approach (frequentist versus Bayesian). Finally, to assess the biological relevance of this program's juvenile abundance estimates for life-cycle and stock-recruitment models, we evaluated two things: (1) the association between abundance estimates and adult escapement counts from a Riverwatcher fish-counting device (VAKI Aquaculture Systems LTD, Iceland), to verify that the number of adult spawners could be used to predict juvenile abundance estimates, and (2) whether the method used to estimate juvenile abundance influenced this relationship.

Study Site
The Stanislaus River is one of three principal tributaries to the San Joaquin River in California's Central Valley (Figure 1) and it harbors a population of fall-run Chinook Salmon among other native and introduced fish species. Arising in the central Sierra Nevada mountain range, the Stanislaus River has a catchment area of approximately 2,700 km² and flows west to its confluence with the San Joaquin River. The region has a Mediterranean climate and receives 90% of annual precipitation between November and April. Before the development of the watershed by numerous impoundments, the hydrograph was characterized by low-magnitude rainfall pulses throughout the fall and winter, followed by a large snowmelt-driven pulse in the spring and early summer. Watershed development has reduced and stabilized river discharge (Kondolf and Batalla 2005), resulting in entrenched reaches, vegetation encroachment, and coarsening of substrate (Brown and Bauer 2009). Habitat modifications and an altered hydrological regime have reduced available spawning habitat for fall-run Chinook by approximately 53% (Yoshiyama et al. 2001). Spawning migrations are completely blocked by Goodwin Dam, located 94 river kilometers (rkm) upstream of the confluence with the San Joaquin River. The majority of Chinook spawning occurs in the 30-km reach below the impoundment, but some spawning has been observed further downstream (unpublished spawning survey data). A monitoring program, performed on behalf of three irrigation districts, was implemented in 1996 to track the annual downstream migration of fall-run Chinook juveniles.

Sampling Gear
We deployed a 2.4-m-diameter (8-ft) RST (E.G. Solutions, Eugene, Oregon) annually in the Stanislaus River to monitor downstream migration of juvenile fall-run Chinook Salmon. The trap was located at rkm 64.3, approximately 5 km west of Oakdale, California (Figure 1). This trap location was chosen based on optimal water velocities for operation (optimal operation was defined as a minimum of two revolutions per minute according to the Comprehensive Assessment and Monitoring Program [USFWS 2008]), and because the site is downstream of the majority of Chinook spawning and rearing habitat (Demko and Cramer 1996). The trap consists of a funnel-shaped core suspended between two pontoons. Because the trap relies on water flowing through it to rotate, it was positioned in the current so that water could enter the funnel mouth. As water enters the trap, it strikes an internal screw core, causing the funnel to rotate. Fish inside the rotating trap become entrained in pockets of water that are forced rearward into a live box, where they are held until they are processed by technicians. The trap and pontoons were held in position using steel cables anchored to the north bank or an overhead cable system, depending on river discharge conditions.

MARCH 2019
https://doi.org/10.15447/sfews.2019v17iss1art4

Daily Trap Monitoring
From 1996 through 2017, the Oakdale trap was typically operated between January and July, but in some years operation started as early as October 6 (2011) or as late as February 2 (1996). The median start date across all years was January 3. No trapping was performed in 1997 because of high flows. Within a trapping season, the trap was operated continuously (24 hr d⁻¹, 7 d wk⁻¹), with exceptions, until the permitted termination date of July 15, or until average daily water temperature exceeded 21°C. Trap operation was also terminated after consecutive days of low or zero catch, indicating the end of the migration period. The exceptions were that traps were not operated on days when elevated discharge caused unsafe conditions or excess debris to enter the trap. Owing to public safety concerns, the trap was also not operated during times of heavy recreational river traffic (e.g., Memorial Day weekend). The trap was monitored daily throughout the sampling period. Each morning, technicians would remove the contents of the live box, identify and enumerate all fish, and note any recaptured marked fish. Technicians checked traps more frequently as conditions required, especially during periods with high catch or large amounts of debris. Each day, technicians measured (fork length [FL] in millimeters) and recorded up to 50 salmon, depending on the number of fish in the trap. Technicians anesthetized all fish with tricaine methanesulfonate or Alka-Seltzer® before handling them. Because the sample design used a single trap to catch fish for trap efficiency releases and to estimate abundance, we defined daily catch ($n_i$) as the number of unmarked salmon caught in the trap. After enumeration of captured fish, technicians cleaned the traps to prevent accumulation of debris that might impair trap rotation or cause fish mortality within the live box.

Trap Efficiency Releases
Each year, multiple mark-recapture releases were performed to estimate trap efficiency (e). Naturally produced juveniles captured in the trap were the primary source used to conduct tests. Chinook Salmon obtained from Merced River Hatchery (operated by the California Department of Fish and Wildlife) were used for the release group to supplement release numbers during years of low natural production or during periods of low trap efficiency that required more individuals in the release group. Fish were transported to the release location (approximately 400 meters upstream of the trap) in either 5-gallon buckets or 20-gallon insulated coolers, depending on the number of fish.
Naturally produced and hatchery juvenile Chinook were marked onshore adjacent to the release location. All fish were anesthetized before the mark was applied. A photonic marking gun (Med-E-Jet, model S-3M, Olmsted Falls, Ohio) was used to inject an orange or pink photonic dye (DayGlo Color Corporation, Cleveland, Ohio) into the caudal fin tissue. The color used for each release group was alternated to distinguish fish from different release groups. Marked fish were held in live boxes kept in areas of low water velocity to reduce stress during their recovery from anesthesia. During typical release conditions, the release location was the river's north bank, but fish were also released in the middle of the channel from a boat during high discharge periods. Before release, a subsample of marked fish (or the entire release group if fewer than 50) was selected to measure fork length and to check for mark retention and mortalities. All releases of marked fish occurred after dark (1 hour after sunset) by releasing subsets of individuals about 30 seconds to 3 minutes apart, depending on how quickly released fish dispersed from the location. Total release time for a given trial ranged from about 8 to 30 minutes, depending on the number of fish released. Occasionally, two releases were performed on the same day. When this occurred, data were pooled to avoid pseudo-replication in the efficiency analyses.

Environmental Data Collection
We obtained provisional daily discharge (m³ s⁻¹) for the Stanislaus River at the Orange Blossom Bridge gauge (OBB) from the California Department of Water Resources (http://cdec.water.ca.gov/cgi-progs/queryF?obb). We used two methods to measure the velocity of water that entered the trap. First, we took instantaneous measurements daily with a Global Flow Probe (Global Water, Fair Oaks, California). Second, we calculated an average daily trap rotation speed for each trap by recording the time, in seconds, for three full revolutions of the cone, once before and once after the morning trap cleaning. We treated the average of the two times as the average daily trap rotation speed.
We measured instantaneous water temperature daily with a thermometer at the trap site. To measure daily instantaneous turbidity, we collected a water sample each morning and tested it with a LaMotte turbidity meter (Model 2020e, LaMotte Company, Chestertown, Maryland). Turbidity was reported in nephelometric turbidity units (NTU). We measured instantaneous dissolved oxygen during trap checks with an ExStik® II D600 Dissolved Oxygen Meter (Extech Instruments Corporation, Waltham, Massachusetts) at the trapping site and recorded it in milligrams per liter (mg L⁻¹).

Annual Abundance Estimates
At the end of each monitoring season, we estimated juvenile production using the Petersen estimator applied to daily catch totals of unmarked fish, where trap efficiencies were stratified within each monitoring year according to size class and discharge categories (herein referred to as Within-Year Stratified Trap Efficiency [WYSTE] method). Occasionally, fish older than young-of-year (YOY) salmon were present in the trap and were excluded from abundance estimates based on fork length. We determined cutoff fork lengths for YOY by plotting length frequencies for each week during the season, and then selecting a value that was greater than or equal to 95% of individuals for that week. Length cutoff values ranged from 54 mm in week 1 to 124 mm in week 18. We estimated the daily trap efficiencies used in the Petersen equation as the mean value after pooling empirically derived efficiency estimates within a monitoring season by size class (fry < 45 mm, parr 45-80 mm, and smolt 81-124 mm) and discharge condition categories (discharge < 21.3 or > 21.3 m³ s⁻¹). We used the mean efficiency across releases of the same size class/discharge category to expand daily catches for all days that met the same conditions. For example, in 2009, the mean trap efficiency for fry at discharge < 21.3 m³ s⁻¹ was 0.423 (mean of eight releases). We then used this efficiency to expand all daily catches that had mean fork lengths < 45 mm and discharge < 21.3 m³ s⁻¹. However, if no empirical efficiency data were available for a given size class and discharge category in a given year, we used average trap efficiencies from the previous year or most recent year with available data. We calculated juvenile abundance estimates using common spreadsheet software (Microsoft Excel®) but did not include estimates of error in trap efficiency or abundance.
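The stratified expansion described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the handling of boundary values (a discharge of exactly 21.3 m³ s⁻¹, or a mean fork length between 80 and 81 mm) is our assumption because the text does not specify it, the fallback to an earlier year's efficiency is not implemented, and all names and numbers are hypothetical.

```python
from collections import defaultdict
from statistics import mean

DISCHARGE_CUTOFF = 21.3  # m^3/s, threshold given in the text


def size_class(fork_length_mm):
    """Life-stage size class from (mean) fork length in mm."""
    if fork_length_mm < 45:
        return "fry"
    if fork_length_mm <= 80:
        return "parr"
    return "smolt"  # assumption: anything above 80 mm is treated as smolt


def discharge_class(discharge):
    # Assumption: a discharge of exactly 21.3 falls in the "high" category.
    return "low" if discharge < DISCHARGE_CUTOFF else "high"


def stratum_efficiencies(releases):
    """Mean efficiency per (size class, discharge class) stratum.

    releases: iterable of (mean_fork_length_mm, discharge, efficiency).
    """
    grouped = defaultdict(list)
    for fork_length, discharge, efficiency in releases:
        key = (size_class(fork_length), discharge_class(discharge))
        grouped[key].append(efficiency)
    return {key: mean(values) for key, values in grouped.items()}


def annual_abundance(days, efficiencies):
    """Sum over days of unmarked catch divided by the stratum efficiency.

    days: iterable of (mean_fork_length_mm, discharge, unmarked_catch).
    A stratum with no releases raises KeyError (no fallback sketched here).
    """
    total = 0.0
    for fork_length, discharge, catch in days:
        key = (size_class(fork_length), discharge_class(discharge))
        total += catch / efficiencies[key]
    return total


# Hypothetical releases and daily catches for one season.
releases = [(40, 15.0, 0.40), (42, 18.0, 0.44), (60, 30.0, 0.10)]
days = [(41, 16.0, 84), (65, 25.0, 10)]

efficiencies = stratum_efficiencies(releases)
total = annual_abundance(days, efficiencies)
print(efficiencies, total)
```

Because each day is assigned to a single stratum by its mean fork length, one efficiency value expands the whole daily catch, which matches the within-year approach as described.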

Trap Efficiency Modeling
Our first objective was to compile 21 years of trap efficiency release data to investigate exogenous and endogenous factors associated with trap efficiency. Generalized linear mixed models (GLMM) are an extension of GLM that allow both fixed effects and random effects to be modeled, as opposed to GLM that can only model fixed effects (Venables and Dichmont 2004; Bolker et al. 2009). We expected discharge and year to have a significant effect on trap efficiency based on findings from Zeug et al. (2014), who analyzed mark-recapture data from the Oakdale trap for a subset of years (1996 and 1998-2009) using logistic regression (i.e., GLM). Although we were not interested in the effect of year per se, its effect on trap efficiency was significant and could not be ignored. Therefore, we used GLMM to model the fixed effects of explanatory variables while accounting for the random effects of year in the multi-year data set.
Exogenous environmental variables whose fixed effects were of interest included mean daily discharge, turbidity, and moon illumination (from U.S. Naval Observatory, http://aa.usno.navy.mil/data/docs/MoonFraction.php). Endogenous variables related to sampling effects, including number of fish in the release group and mean FL (mm) of the release group, were also incorporated in models. We assigned salmon to a life-stage size category based on fork length size classes (fry ≤ 45 mm, parr = 45-80 mm, smolt = 81-124 mm) to investigate if trap efficiency varied by size class, but did not include size class in the same models as mean fork length. We transformed discharge, turbidity, and number of released fish to a natural logarithm scale to ensure a linear response to trap efficiency. We standardized fork length across years and sites with a z-transformation. We assessed the multicollinearity of explanatory variables by examining variance inflation factors (Legendre and Legendre 2012) for each combination of variables.
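The covariate preparation described above can be sketched as follows, using only the Python standard library. The pairwise variance inflation factor is computed here as 1 / (1 - r²) from the Pearson correlation of two predictors, which is an assumption about how the pairwise check could be done rather than a description of the fmsb implementation; the use of the sample standard deviation in the z-transformation is likewise an assumption, and all values are hypothetical.

```python
import math
from statistics import mean, stdev


def log_transform(values):
    """Natural-log transform (values must be positive)."""
    return [math.log(v) for v in values]


def z_scores(values):
    """Standardize to mean 0 and unit (sample) standard deviation."""
    centre = mean(values)
    spread = stdev(values)
    return [(v - centre) / spread for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_vif(x, y):
    """Variance inflation factor for a pair of predictors: 1 / (1 - r^2)."""
    r = pearson_r(x, y)
    return 1.0 / (1.0 - r * r)


# Hypothetical release-level covariates.
discharge = [8.5, 12.0, 21.3, 45.0]   # m^3/s
fork_length = [35, 40, 62, 90]        # mm

ln_discharge = log_transform(discharge)
fl_z = z_scores(fork_length)
vif = pairwise_vif(ln_discharge, fl_z)
print(ln_discharge, fl_z, vif)
```

A VIF near 1 indicates little shared variance between the two predictors; larger values flag pairs that may not belong in the same model.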
Both GLM and GLMM modeled trap efficiency as a binomial probability, bounded by 0 and 1 with a logit link function. Occasionally, release trials resulted in few or no recaptures, causing extremely low estimates of trap efficiency; therefore, we excluded trials in which fewer than 1% of released fish were recaptured. Additionally, years when fewer than 10 release trials were performed were not included in the model data, as recommended by Bolker et al. (2009). Because we were primarily interested in fixed effects on trap efficiency, and because a poorly chosen random component can influence the coefficients and standard errors of the fixed effects, we chose a top-down approach to select the optimal model(s) (Diggle et al. 2002; Zuur et al. 2009). Starting with a beyond-optimal model for the fixed effects (i.e., containing all explanatory variables, excluding life-stage category, and all possible interactions), we evaluated four possible random effect structures that included a random year intercept, a random year intercept and random discharge slope, and the previous two random structures with a random observation intercept to account for over-dispersion. We used Akaike Information Criterion corrected for small sample size (AICc) based on the number of releases to assess the relative importance of each random structure to the beyond-optimal model. We ranked models according to AICc, then selected the best models according to ΔAICc (ΔAICc < 2.0) and model weights (W > 0.2; Burnham and Anderson 2002). After we identified the optimal random structure, we compared nested fixed effects models of the explanatory variables. To reduce the number of fixed effects models to compare, we did not include interaction terms. We again used model selection to assess the relative importance of explanatory variables based on ΔAICc and model weights.
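The ranking step can be sketched with the standard AICc, ΔAICc, and Akaike weight formulas (Burnham and Anderson 2002). The log-likelihoods, parameter counts, number of releases, and model labels below are hypothetical placeholders, not values from this study, and the function names are illustrative.

```python
import math


def aicc(log_likelihood, k, n):
    """AIC with small-sample correction for k parameters and n observations."""
    if n - k - 1 <= 0:
        raise ValueError("n must exceed k + 1 for AICc to be defined")
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)


def rank_models(models, n):
    """Rank candidate models by AICc.

    models: dict mapping a label to (log_likelihood, k).
    Returns a list of (label, aicc, delta, weight), best model first.
    """
    scores = {label: aicc(ll, k, n) for label, (ll, k) in models.items()}
    best = min(scores.values())
    deltas = {label: score - best for label, score in scores.items()}
    relative = {label: math.exp(-0.5 * d) for label, d in deltas.items()}
    total = sum(relative.values())
    rows = [
        (label, scores[label], deltas[label], relative[label] / total)
        for label in scores
    ]
    return sorted(rows, key=lambda row: row[1])


# Hypothetical candidate random-effect structures and fit statistics.
candidates = {
    "year intercept": (-412.3, 8),
    "year intercept + discharge slope": (-409.9, 10),
    "year + observation intercepts": (-405.1, 9),
}
ranked = rank_models(candidates, n=150)
for label, score, delta, weight in ranked:
    print(f"{label}: AICc={score:.2f}, delta={delta:.2f}, weight={weight:.3f}")
```

Models with ΔAICc below 2.0 and weight above 0.2 would then be retained, following the selection thresholds stated in the text.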
We assessed overall model fit by calculating marginal R² for the fixed effects and conditional R², which incorporates the random effects (Nakagawa and Schielzeth 2013). We performed trap efficiency modeling and abundance calculations using R (version 3.4.0, R Core Team 2017) with packages lme4 (Bates et al. 2015) for GLMM, fmsb (Nakazawa 2015) for variance inflation factor, and MuMIn (Bartoń 2016) for model ranking and R² values.

Comparative Evaluation of Methods
Our second objective was to investigate how abundance estimates and estimate uncertainty varied across different estimation methods. Because estimate uncertainty is not quantified using the monitoring program's current method, WYSTE, this comparative study also served to qualitatively evaluate confidence in the program's abundance estimates.
We compared abundance estimates from four additional methods, each with different approaches for applying trap efficiency estimates to catch data (stratified versus modeled) and statistical methods (frequentist versus Bayesian). The four additional methods, described in detail below, were: (1) among-year stratified trap efficiency, (2) simple time-stratified mark-recapture, (3) modeled trap efficiency, and (4) Bayesian analysis of time-stratified Petersen diagonal recaptures experiments. We characterized annual trends for each method by counting the number of times an increase or decrease in abundance occurred from year t to t + 1. We used a Chi-square test to evaluate if the proportion of increasing and decreasing years varied among methods. For each year, we calculated the coefficient of variation (CV) across the five methods to quantify discrepancies among them. To evaluate potential factors associated with abundance discrepancies, we used correlation between annual abundance CV and environmental (i.e., mean discharge during trapping season) and release-related factors. The release-related factors comprised number of release trials per year, total number of fish released, minimum and mean numbers of recaptures, mean fork length of the release group, average number of days between release trials, and the percentage of hatchery-origin fish in the release group.
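The two summary statistics described above, the count of year-to-year increases and decreases and the across-method CV for a single year, can be sketched as follows. The sketch assumes that annual estimates are supplied in chronological order and that consecutive entries are compared, and it uses the sample standard deviation for the CV; neither detail is stated in the text. The Chi-square test itself is not shown, and the numbers are hypothetical.

```python
from statistics import mean, stdev


def count_trends(annual_estimates):
    """Count increases and decreases between consecutive annual estimates."""
    increases = decreases = 0
    for previous, current in zip(annual_estimates, annual_estimates[1:]):
        if current > previous:
            increases += 1
        elif current < previous:
            decreases += 1
    return increases, decreases


def coefficient_of_variation(estimates):
    """CV across methods for one year: sample SD divided by the mean."""
    return stdev(estimates) / mean(estimates)


# Hypothetical annual estimates from one method, in chronological order.
one_method = [100_000, 150_000, 120_000, 200_000]
trend = count_trends(one_method)

# Hypothetical estimates for a single year from five methods.
one_year = [90_000, 95_000, 100_000, 105_000, 110_000]
cv = coefficient_of_variation(one_year)
print(trend, cv)
```

The per-method (increase, decrease) counts would form the rows of the contingency table passed to the Chi-square test.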
Finally, because we expected the number of adult salmon spawning to influence juvenile abundance, we assessed the biological relevance of the different abundance estimates by analyzing the relationship between adult escapement and estimated juvenile abundance, and whether or not the abundance estimation method influenced this relationship. We performed an analysis of covariance (ANCOVA) with the natural log of abundance estimates as the response variable, adult escapement counts as a continuous explanatory variable, and estimation method as a categorical explanatory variable. We obtained adult escapement counts from a fish weir outfitted with a Riverwatcher fish-counting device (VAKI Aquaculture Systems LTD, Iceland) that was operated annually during the adult fall-run spawning season (September to December) starting in 2003 (Peterson et al. 2017). Therefore, this analysis was performed using juvenile abundance estimates from 2004 through 2017, as they corresponded to adult escapement during 2003 through 2016.

Among-Year Stratified Trap Efficiency
In the first two approaches, we applied Bailey's modification to the Petersen formula (Bailey 1951) that forces a finite expectation of trap efficiency. The modification allows for the inclusion of trials with 0 recaptures by adding 1 to the number of marked fish and to the number of recaptures. The among-year stratified trap efficiency approach is similar to the annually calculated method described above (WYSTE) in that empirically derived trap efficiency values are stratified into the same size class/discharge categories. However, instead of pooling trap efficiency within each year, we pooled trap efficiency across all years for which trap efficiency releases were performed. This method also differed from WYSTE because daily catch was proportioned into size classes for different trap efficiencies. In other words, the within-year method used one trap efficiency to expand daily catch to abundance based on mean fork length, whereas the among-year method could use up to three trap efficiencies for a single day if all three size categories were represented in the trap that day. We calculated 95% confidence intervals by a bootstrap procedure. We resampled daily abundance estimates (after efficiency was applied to catch for each size category) 1,000 times with replacement, then summed for an annual estimate. We took confidence intervals from the values that occurred at 2.5% and 97.5% of the bootstrapped distribution of annual estimates. Bootstrapped confidence intervals have been shown to closely approximate the target 95% range when used with stratified methods (Steinhorst et al. 2004).
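As an illustration only, the Bailey-adjusted efficiency and the percentile bootstrap described above might be sketched in Python as follows. The function names, the fixed random seed, and the rank-based percentile rule are assumptions made for this example and are not taken from the monitoring program's code.

```python
import random


def bailey_efficiency(released, recaptured):
    """Trap efficiency with 1 added to both counts, as described in the text.

    Stays above zero even when a trial has 0 recaptures.
    """
    return (recaptured + 1) / (released + 1)


def expand_catch(catch, efficiency):
    """Expand a daily catch to a daily abundance estimate."""
    return catch / efficiency


def bootstrap_annual_ci(daily_estimates, n_boot=1000, seed=0):
    """Approximate 95% percentile interval for the annual total.

    Resamples the daily estimates with replacement, sums each resample,
    and reads off the values near the 2.5% and 97.5% ranks.
    """
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    n_days = len(daily_estimates)
    totals = sorted(
        sum(rng.choices(daily_estimates, k=n_days)) for _ in range(n_boot)
    )
    lower_idx = max(0, round(0.025 * n_boot))
    upper_idx = min(n_boot - 1, round(0.975 * n_boot) - 1)
    return totals[lower_idx], totals[upper_idx]
```

For example, a hypothetical trial with 99 fish released and 9 recaptured gives an efficiency of 0.1, so a catch of 50 fish expands to 500.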

Simple Time-Stratified Mark-Recapture
For the next method, stratified mark-recapture, we applied the trap efficiency derived from the first efficiency release each year to daily catches from the beginning of trap operation until the next release that year was performed. After the next release was performed, the new efficiency was used until subsequent releases were performed and new efficiencies could be estimated. We evaluated the degree of uncertainty in annual migration estimates using this method by calculating the 95% confidence intervals by bootstrapping on daily abundance estimates.
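A minimal Python sketch of this carry-forward rule is given below. The names are hypothetical, days are represented as integers (e.g., day of year), and the sketch assumes that a new efficiency takes effect on the day of its release, which the text does not specify.

```python
from bisect import bisect_right


def efficiency_for_day(day, release_days, efficiencies):
    """Return the efficiency in force on `day`.

    `release_days` must be sorted ascending and aligned with `efficiencies`.
    Days before the first release use the first release's efficiency.
    Assumption: a release's efficiency applies from its own day onward.
    """
    idx = bisect_right(release_days, day) - 1
    return efficiencies[max(idx, 0)]


def annual_abundance(daily_catch, release_days, efficiencies):
    """Sum daily abundance estimates; `daily_catch` maps day -> catch."""
    return sum(
        catch / efficiency_for_day(day, release_days, efficiencies)
        for day, catch in daily_catch.items()
    )
```

With hypothetical releases on days 10, 20, and 30, the catch on day 5 would be expanded with the day-10 efficiency, and the catch on day 45 with the day-30 efficiency.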

Modeled Trap Efficiency
For the modeled trap efficiency method, we used the highest ranked GLMM that explained variation in trap efficiency to develop a predictive model to estimate daily trap efficiency based on the environmental covariates (refer to "Trap Efficiency Modeling" section above). We then used predicted trap efficiencies to estimate daily juvenile abundance from the Petersen estimator described above. We derived confidence intervals for annual abundance estimates from bootstrapping on daily abundance estimates.

Bayesian Time-Stratified Petersen Diagonal Recaptures
Bonner and Schwarz (2011) introduced a Bayesian semiparametric method for estimating abundance from time-stratified Petersen mark-recapture experiments (i.e., releases). Unlike the previous methods, the Bayesian framework takes advantage of the temporal relationship between mark-recapture experiments to model trap efficiencies for time strata when releases were not performed. Their approach explicitly models catch in the trap over time using Bayesian P-splines as the smoothing function. Briefly, the nonlinear P-spline algorithm allows for flexibility in the shape of the spline regression while minimizing overfitting (Lange and Brezger 2004). For this approach, catch and trap efficiency releases were stratified by week. Because recaptures of marked fish occurred within the same stratum (i.e., week) as the release (the majority of fish were recaptured the following day), the recapture data resembled a time-stratified Petersen experiment with diagonal recaptures. Similar to the modeled trap efficiency method, this method can incorporate information on covariates, such as discharge, when modeling trap efficiency. Therefore, we estimated abundance with this method for each year with and without discharge as a covariate using the R package BTSPAS (Bonner and Schwarz 2011) and the program JAGS (version 4.3.0) to sample posterior distributions.
Each run consisted of three Markov chains sampled for 200,000 iterations, with the first 100,000 discarded as the burn-in period. The thinning parameter was set to 50 to reduce autocorrelation in the posterior distributions, resulting in 6,000 simulated samples. Uncertainty in estimates is provided by the 95% credible interval, which contains 95% of the estimate's posterior distribution. Although we performed this analysis with a covariate, abundance estimates were not substantially different when discharge was included, and the deviance information criterion (DIC) we used to compare covariate and non-covariate models did not indicate that discharge was useful for predicting trap efficiency with this method. The only years that exhibited a non-zero coefficient for the effect of intra-annual discharge on trap efficiency were 1999 (b = 0.10, 95% CI = 0.10 to 0.17) and 2008 (b = −0.27, 95% CI = −0.54 to −0.01). Therefore, we only report estimates from this method based on the non-covariate model.
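The count of 6,000 retained samples follows from the sampler settings reported above. The short Python check below uses a hypothetical helper name and assumes that thinning is applied to the post-burn-in iterations of each chain.

```python
def retained_samples(chains, iterations, burn_in, thin):
    """Posterior draws kept after burn-in and thinning, summed over chains.

    Assumption: thinning applies to the iterations remaining after burn-in.
    """
    return chains * ((iterations - burn_in) // thin)


# Settings reported in the text: 3 chains, 200,000 iterations,
# 100,000 burn-in iterations, thinning interval of 50.
n_kept = retained_samples(chains=3, iterations=200_000, burn_in=100_000, thin=50)
print(n_kept)  # 6000
```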

Stanislaus Fall-Run Juvenile Monitoring Program
The average number of trapping days per year was 159 (Table 1). During the 6 years when the trap was deployed before 1 January (1999, 2000, 2001, 2002, 2010, and 2011), 706 juvenile salmon were captured from October through December. Thus, juveniles passing before 1 January were unlikely to contribute substantially to annual abundance estimates. Annual total catch from January until the end of the Stanislaus trap monitoring ranged from 21,450 juvenile salmon in 2008 to 401,903 in 2004 (mean catch = 89,152). In 21 years of monitoring, 131,028 unmarked fish were measured (mean = 6,239 salmon per year), with mean fork length in a year ranging from 38 mm (2017) to 79 mm (1996). Although the magnitude of migration varied from year to year, the peak timing appeared to occur consistently between January and March each year because of high catches of fry during this period. The earliest date of 50% catch occurred in 2000 (January 27) and the latest date was in 2012 (March 19).

Trap Efficiency Modeling
From 1996 to 2017, 387 release trials were performed with a mean of 18 releases per year (range 8-31; Table 2). The mean number of fish released per trial was 419 (range 44-2,931). Naturally produced fish were used in release trials in all years. However, hatchery-reared fish were also released in 11 years, and made up half or more of the released fish in [...]. Across all years, discharge was the primary factor associated with trap efficiency estimates (Figure 2). Efficiency estimates could be high (e > 0.50) under low-discharge conditions (discharge < 5.7 m³ s⁻¹) but were extremely variable. Estimated efficiency declined precipitously as discharge increased to about 22.7 m³ s⁻¹ for all size classes (Figure 2A). We used GLMMs to address the specific question: does the relationship between discharge and trap efficiency estimates vary across salmon size classes? The model with size class and discharge without an interaction had the majority of support (W = 0.74, Table 3), suggesting a common slope between life stages but with different intercepts (Figure 2B). The difference between intercepts across size classes was small, yet statistically distinguishable at α = 0.05. Across the observed levels of discharge, efficiency estimates were greatest for fry, then parr, then smolts.
We used our second set of GLMMs to evaluate the influence of environmental and sampling-related factors on trap efficiency estimates. Explanatory variables used in these models did not show evidence of extreme multicollinearity, because variance inflation factors for the variables ranged from 1.0 to 2.0. Model selection found overwhelming support (W = 1.0) for the random structure that contained the random intercept and random slope for discharge across year, as well as a random intercept for each observation/release. Using model selection to evaluate fixed effects candidate models, there were two competing optimal models (ΔAICc < 2.0, W > 0.2). Both models indicated a negative relationship between discharge and trap efficiency, as well as a negative relationship between fork length and trap efficiency (Table 4). The number of individuals in a release group was present in one optimal model but had weaker associations with estimated trap efficiency than discharge or length. Marginal (R²m) and conditional (R²c) R² values were similar for both optimal models (R²m = 0.31, R²c = 0.42), suggesting that the addition of release group size did not improve the model's explanatory power.

Comparative Evaluation of Methods
Annual abundance estimates were variable across all five methods but were generally within an order of magnitude (Table 5, Figure 3). The modeled trap efficiency method produced the highest estimates in 7 years, and the stratified mark-recapture method produced the highest estimates in 6 years, followed by the within-year stratified trap efficiency method, which produced the highest estimates in 4 years. The among-year stratified trap efficiency method produced the smallest estimates in 7 years, whereas no other method produced consistently low estimates. Uncertainty in annual abundance estimates, measured as 95% confidence intervals, was generally greater for the non-Bayesian methods that used bootstrapping (Figure 3). With the exception of 1999 and 2000, the 95% credible intervals for the Bayesian time-stratified Petersen diagonal recaptures method were narrower than the bootstrapped confidence intervals from the other methods.

Figure 2 The relationship between efficiency estimates of the Oakdale RST and mean daily discharge measured at Orange Blossom Bridge for all efficiency trials performed over the entire monitoring period (1996, 1998-2017). Panel (A) shows the untransformed relationship. Panel (B) shows the transformed relationship for different salmon life stages, based on size categories: fry (< 45 mm FL), parr (≥ 45 and < 80 mm FL), and smolt (≥ 80 and < 124 mm FL).

Table 3 Model selection table comparing fixed effects associated with trap efficiency from the Oakdale RST on the Stanislaus River from 1996 to 2017. Reported are the estimated coefficients and standard errors for models that evaluate the relationship between standardized discharge at Orange Blossom Bridge (QOBB) and trap efficiency by salmon size categories (SizeCat). Size categories are fry, parr, and smolt. The random component for this model series included a random intercept and slope for standardized discharge across years and a random intercept for each release.

The proportion of years that showed an increase in estimated abundance ranged from 0.4 to 0.5 (Figure 4) and was not different across methods (χ² = 0.568, df = 4, P = 0.967). The years with the greatest discrepancy between estimates were 1996 (CV = 0.52), 2003 (CV = 0.62), and 2017 (CV = 0.48; Table 5). Abundance CV showed a positive trend with total number of fish released and mean fork length, and a negative trend over time (years) and with minimum number of recaptures. However, no correlations were significant. Using ANCOVA, we found adult escapement counts to be a significant predictor of estimated juvenile abundance (F(5, 64) = 9.82, P < 0.001). There was no significant interaction term between method and escapement counts, nor was method a significant factor in explaining abundance estimates.
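For readers who want to see how such a test statistic is formed, the following Python sketch computes Pearson's chi-square statistic for a contingency table of methods (rows) by trend direction (columns). The function name is hypothetical. Assuming the test was run on a 5 x 2 table of this kind, the degrees of freedom are (5 − 1)(2 − 1) = 4, which agrees with the value reported above; the P-value step is omitted.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom.

    `table` is a list of rows, e.g. one row per method holding the
    counts of increasing and decreasing years. All expected counts
    must be greater than zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof
```

When every method has the same split of increases and decreases, the statistic is 0, which is the direction in which the small value reported above points.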

DISCUSSION
In this study, we reviewed 21 years of mark-recapture release and RST catch data from the Stanislaus River fall-run Chinook Salmon monitoring program to identify factors associated with trap efficiency estimates and to evaluate the program's current analysis methods for estimating juvenile abundance. Trap efficiency estimates varied widely among trap efficiency releases, but river discharge was the strongest factor associated with efficiency estimates. We also found that abundance estimates were robust across different analysis methods. In most years, the majority of methods produced estimates with overlapping confidence intervals. Trends in juvenile abundance estimates, in terms of increasing or decreasing between years, were not affected by estimation method, and neither was the relationship between adult escapement and estimated juvenile abundance. In spite of the widespread use of RSTs to monitor fish populations, few published studies evaluate screw trap abundance estimates for long-term monitoring programs.
In fitting the GLMMs to explore trap efficiency estimates, we found that the random components of the efficiency models were best explained by a random intercept and slope for river discharge across years, as well as a random intercept for each release. The random slope and intercept for discharge captured what appeared to be a variable relationship between discharge and estimated efficiency among years. Whereas some of this variation could result from varying flow regime characteristics across years (i.e., wet versus dry years, or years with managed pulse flows versus years without), it could also result from inter-annual differences in trap efficiency release strategies. For example, the number of fish released and the number of releases differed across years, as did the timing of the releases. Given that release strategies were frequently dictated by the number of fish available for marking and daily discharge, both release strategy and annual flow characteristics likely contributed to this random variation. This was the year effect observed previously by Zeug et al. (2014) at this trap, and possibly the cause for year effects observed at a second trap operated 51 river kilometers downstream on the Stanislaus River (Zeug et al. 2014) and a multi-year data set from the Tucannon River analyzed by Cheng and Gallinat (2004).
Another source of random variation was from the individual releases themselves. The number of recaptured individuals varied widely across releases, even among releases at similar discharges, and contributed to overdispersion of the efficiency estimates. We excluded estimates when the percent of recaptures was less than 1%. Whereas increasing our cutoff value would reduce the efficiency overdispersion, it would also limit the number of years that could be analyzed using GLMM, unless we also decreased our cutoff value for the number of release trials per year. Our cutoff of a minimum of 10 releases per year (as recommended by Bolker et al. 2009) resulted in the exclusion of 4 years. Increasing the recaptures cutoff to 10% would exclude 8 years from our analyses.
In addition to river discharge being related to random variation, it was also the best predictor of trap efficiency. The negative association between discharge and trap efficiency was also observed at both traps on the Stanislaus River as reported by Zeug et al. (2014). On the Tucannon River, Washington, Cheng and Gallinat (2004) found a significant nonlinear relationship between trap efficiency and discharge when they used a GAM, but the relationship was not significant when they used a GLM. Roper and Scarnecchia (1996) observed no relationship between discharge and trap efficiency with wild salmon on the South Umpqua River, Oregon, but reported lower capture rates of hatchery salmon at lower water velocities. Clearly, the effect of discharge on the efficiency of a specific trap will be idiosyncratic and will need to be evaluated using multiple statistical methods.
We also found evidence that efficiency estimates were size dependent. When efficiency estimates were modeled as a function of discharge and categorical size-classes, efficiency was greatest for fry and lowest for smolts, across all observed discharges. When fork length was included as a continuous variable, it had a negative coefficient, indicating that estimated efficiency decreased as the fork length of the release group increased. This finding is consistent with larger fish having an increased ability to avoid the trap (Tattam et al. 2013). Interestingly, Zeug et al. (2014) did not find a significant effect of fork length at this trap but did find a significant effect at the lower Stanislaus River trap. One explanation for this difference could be that fork length has a weaker effect at the upper Stanislaus River trap than at the downstream trap, which the GLMM was able to detect after accounting for random effects.
Although we identified some factors associated with variation among trap efficiency estimates, there was still unexplained variation that we were concerned could be propagating into abundance estimates. Because the monitoring program's method for estimating abundance did not include uncertainty estimates, we compared its estimates with abundance estimates from four additional methods that varied in how trap efficiency estimates were applied. The confidence intervals provided by the additional methods allowed us to characterize uncertainty in the program's annual abundance estimates, as well as assess systematic bias across different estimation methods. Although confidence intervals could be wide, depending on method and year, different methods generally produced similar abundance estimates with overlapping confidence intervals. Using the Darroch (1961) model and varying the length of stratification intervals, Dempson and Stansbury (1991) estimated abundance of Atlantic Salmon (Salmo salar) in the Conne River, Newfoundland, and found abundance estimates to be similar, regardless of whether stratifications were set at 5 or 14 days. Schwarz and Dempson (1994) developed a likelihood model that could account for daily variation in capture probability and used it to estimate abundance from the same Atlantic Salmon population. Whereas Schwarz and Dempson (1994) concluded that the stratifications used by Dempson and Stansbury (1991) were too long, the estimated abundance and standard error from the Schwarz and Dempson (1994) model were not substantially different from Dempson and Stansbury's (1991) estimates or standard errors. Bonner and Schwarz (2011) also found similar abundance estimates for a subset of the Conne River Atlantic Salmon population, after comparing Bayesian implementations of the pooled Petersen model, the Schwarz and Dempson (1994) model, and the Mäntyniemi and Romakkaniemi (2002) model with their Bayesian P-spline approach. 
These few comparative studies suggest that abundances estimated using mark-recapture procedures are generally robust, regardless of the method used. Obviously, the best results from a chosen estimator should come from a monitoring program designed specifically for that estimator. In the case of long-term data, where monitoring protocols can change annually depending on environmental or institutional conditions, abundance estimates may still be unbiased but will have increased uncertainty.
Evaluating abundance trends based on a single estimation approach without knowing its level of uncertainty is difficult at best, and at worst can provide misleading results about a species' critical life stage. Estimating juvenile abundance using five different methods did not reveal any different trends in abundance over time. Furthermore, abundance estimates were significantly associated with direct counts of adult escapement, and estimation method did not affect this relationship. Results from this comparative study suggest that abundance estimates based on any one of these methods could provide accurate estimates of abundance (assuming the true abundance is contained within the range of confidence intervals across methods), but the estimates' precision will be affected by uncertainty about trap efficiency estimates as well as by the different assumptions each method makes. The primary objective of the Stanislaus River monitoring program has been to estimate abundance, and characterize migration patterns of fall-run Chinook Salmon to ensure that the managed flows meet the conservation requirements of this sensitive species. As such, these data are necessary to evaluate the effectiveness of prescribed flows, and to inform lifecycle models and stock-recruitment forecasts.

CONCLUSION
The decline of culturally and economically important species, such as salmonids, has been an impetus for long-term monitoring of populations (Nichols and Williams 2006). Rotary screw traps are frequently used to monitor salmonids around the world, including juvenile fall-run Chinook Salmon in the California Central Valley. Estimating abundance from an RST requires mark-recapture techniques to provide trap efficiency. However, because trap efficiency is influenced by myriad exogenous and endogenous factors that vary through time, the design of mark-recapture experiments and trapping stratification become the cornerstone for acquiring accurate and precise abundance estimates. When designing an RST monitoring program, adhering to the standardized procedures discussed by Volkhardt et al. (2007) and others will help the trapping design meet the assumptions of the statistical analyses, and in doing so, will help reduce uncertainty. For long-standing monitoring programs, such as the Stanislaus River program, inter-annual variability in mark-recapture releases (whether intentional or inadvertent) decreases the certainty of annual abundance estimates. Although advances in statistical techniques have been useful for handling heterogeneity in trap efficiency, few studies have evaluated how these methods influence temporal abundance trends or how recently derived approaches handle historical mark-recapture data.
Here, we showed that abundance estimates based on mark-recapture data are generally robust, in that different numerical procedures will provide similar results. Understanding the robustness of abundance estimates is particularly important in the Central Valley where there are 17 existing screw trap monitoring programs for Chinook Salmon and Steelhead (Central Valley Salmon and Steelhead Monitoring Programs, 2007). This study was a first step in evaluating how comparable abundance estimates are across different monitoring programs.