Impacts of prior mis-specification on Bayesian
fisheries stock assessment
Yong Chen (A,D), Chi-Lu Sun (B) and Minoru Kanaiwa (C)
(A) School of Marine Sciences, University of Maine, Orono, ME 04469, USA.
(B) Institute of Oceanography, National Taiwan University, No. 1, Section 4, Roosevelt Road, Taipei 10617, Taiwan.
(C) Tokyo University of Agriculture, Department of Aqua-Bioscience and Industry, 196 Yasaka, Abashiri, Hokkaido, Japan 099-2493.
(D) Corresponding author. Email: [email protected]
Abstract. One of the key features of a Bayesian stock assessment is that the modeller needs to provide knowledge on
model parameters. Priors summarise modellers’ understanding of model parameters and are often defined by a probability distribution function. Because our understanding of fisheries ecosystems is limited, priors are often mis-specified, with arbitrary and unrealistic accuracy and precision in describing the state of nature for the parameters. Commonly used probability functions such as the normal distribution tend to be sensitive to prior mis-specification, resulting in large uncertainty and/or errors in Bayesian stock assessment. Fat-tailed functions such as the Cauchy distribution have been found to be robust to prior mis-specification. Using the Maine sea urchin fishery as an example, we evaluated the impacts of mis-specification in defining the prior distributions on Bayesian stock assessment. The present study suggests that priors quantified with a Cauchy distribution tend to be robust to prior mis-specification. Given our limited understanding of fisheries, a function such as the Cauchy distribution that is robust to prior mis-specification tends to be more desirable. Future studies should explore the use of other fat-tailed distribution functions for quantifying priors in fisheries stock assessment.
Additional keywords: Bayesian stock assessment, Cauchy distribution, prior, prior mis-specification, robust, uncertainty.
Introduction
Understanding fish population dynamics is essential in developing an effective fisheries management policy (Smith et al. 1993; Engen et al. 1997; Walters 1998). This is often accomplished by a reliable estimation of vital parameters and their associated uncertainties (Chen and Paloheimo 1998; Walters 1998), which is one of the key factors that ensure the achievement of the defined management objectives (NRC 1997, 1999; Walters 1998).
In general, there are two statistical approaches that can be used for parameter estimation in fisheries stock assessment: frequentist and Bayesian inference. The statistical problem is similar for these two approaches; both are used to make statistical inferences about unknown parameters in the model (Berger 1985; Box and Tiao 1992). Frequentist inference is commonly used in fisheries studies (Hilborn and Walters 1992). It assumes that parameters being estimated are fixed constants, and data are random observations from some unknown statistical populations (Cox and Hinkley 1974; Ellison 1996). An objective function needs to be defined based on assumptions made on random variables (Hilborn and Walters 1992; Chen and Paloheimo 1998). Parameters and their confidence intervals can then be estimated by optimising the objective function (Fournier and Archibald 1982; Deriso et al. 1985). Bayesian inference assumes that parameters are random, as opposed to the constant parameters of frequentist inference. The Bayesian method uses a probability rule (Bayes’ theorem) to calculate a ‘posterior distribution’ from the observed data and a ‘prior distribution’, which summarises the prior knowledge of the parameters (Dennis 1996; Taylor et al. 1996). It has been widely used in fisheries stock assessment (Hilborn et al. 1993; Walters and Ludwig 1994; Kinas 1996; McAllister and Kirkwood 1998; Chen and Hunter 2003).
In addition to input data sampled from fish stocks and fisheries, Bayesian stock assessment requires an extra piece of input information, namely knowledge of model parameters before the stock assessment. This knowledge, referred to as priors, is usually quantified with a statistical distribution function, and is often derived from biological and ecological theory, experience from other fisheries, historical information, previous studies of the same or similar fisheries, fishermen’s experience and scientists’ insights on the fisheries. The most commonly used statistical functions in quantifying prior information of model parameters include normal (or log-normal) and uniform (or log-uniform) distribution functions. Because of biological limitations, these distribution functions are often truncated with an upper and a lower boundary value to avoid biologically unrealistic values in priors.
Prior information about a model parameter can be inaccurate or mis-specified in fisheries owing to, for example, the all-too-common problem that stock assessment scientists choose biased priors (McAllister et al. 1994; Punt and Hilborn 1997; McAllister and Kirkwood 1998; Chen et al. 2000) and overly precise priors with prior variances being much smaller than their real uncertainties (Walters and Ludwig 1994; Adkison and Peterman 1996; McAllister and Kirkwood 1998; Chen et al. 2000). This mainly results from our limited understanding of fisheries and their aquatic ecosystems, and the limited time periods for which we have records. Standard choices of probability functions often result in conjugate priors, which are known to be non-robust to prior mis-specification (Berger 1994; Chen et al. 2000).
The mis-specification of priors can lead to large errors in Bayesian stock assessment. In a well designed Bayesian stock assessment, the sensitivity of posterior distributions to prior distributions for vital stock parameters is often evaluated with a few different choices of priors. However, because the ‘true’ prior distributions are unknown, the sensitivity-test approach may not reveal the magnitude of errors resulting from the prior mis-specification. Few studies have been carried out to systematically evaluate the impacts of prior mis-specification on Bayesian stock assessment and to identify distribution functions that are robust to mis-specification of priors (Chen et al. 2000).
Fat-tailed distribution functions have been suggested as replacements for the normal distribution function in formulating prior distributions (Berger 1994). Such an approach tends to increase the robustness of Bayesian inference with respect to prior mis-specifications (Zellner 1976; Berger 1985, 1994; Berger and Robert 1990). However, few studies have been carried out to evaluate the robustness of fat-tailed prior distribution functions with respect to prior mis-specification in fisheries stock assessment (Chen et al. 2000).
In the present study, we evaluated the impacts of mis-specification of priors quantified with different distribution functions on Bayesian fisheries stock assessment. The robustness of different prior distribution functions was evaluated with respect to prior mis-specification. We then proposed a general approach that could be used to reduce the sensitivity of Bayesian stock assessment to prior mis-specification. Limited by the number of fisheries models and fisheries that could be included in the present study, we used a length-structured population dynamic model developed for the sea urchin fishery in Maine as an example. The number of probability functions evaluated in the present study is also limited, including only those commonly used in fisheries and the Cauchy function, which has been found to be robust to prior mis-specification in other fields (Berger 1994). However, given previous studies identifying probability functions that are robust to prior mis-specification (see Berger 1994 for review), the results derived in the present study should be applicable to other stock assessment models and fisheries.
Bayesian approach potentially robust to prior mis-specification
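For a stock assessment model described as:
$$Y = f(x, \beta) + \varepsilon, \qquad (1)$$
the posterior distribution of parameter β can be estimated as:
$$p(\beta \mid x) = \frac{p(x \mid \beta)\, p(\beta)}{\int p(x \mid \beta)\, p(\beta)\, d\beta}, \qquad (2)$$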
where p(x|β) is the likelihood function and p(β) is the prior for parameter β. The distribution functions commonly used to describe priors include the uniform distribution p(β) ∼ U(µL, µU), with lower (µL) and upper (µU) bounds decided based on our understanding of the parameter, the log-uniform distribution, the normal distribution p(β) ∼ N(µ, σ²), and the log-normal distribution. Parameter µ is believed to be the most likely value for parameter β, and σ defines the standard deviation of the distribution. To avoid biologically unrealistic values, the normal or log-normal distribution is often truncated with upper and lower boundary values. Normal or log-normal priors are often referred to as informative priors and uniform or log-uniform priors as non-informative priors (Hilborn and Walters 1992), although it is clear that both uniform-based and normal-based priors contain useful information.
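As an illustration of the truncation described above, the following minimal Python sketch renormalises a normal prior over a bounded interval. The function name `truncated_normal_pdf` and the numerical values (mean 0.1, standard deviation 0.2, bounds 0.01 and 0.8) are assumptions chosen for illustration only; this is not the implementation used in the assessment model.

```python
from statistics import NormalDist

def truncated_normal_pdf(beta, mu, sigma, lower, upper):
    """Density of N(mu, sigma^2) restricted to [lower, upper] and renormalised."""
    if beta < lower or beta > upper:
        return 0.0  # values outside the bounds receive no prior weight
    dist = NormalDist(mu, sigma)
    kept_mass = dist.cdf(upper) - dist.cdf(lower)  # probability retained after truncation
    return dist.pdf(beta) / kept_mass

# Illustrative (assumed) values: mean 0.1, standard deviation 0.2, bounds 0.01 and 0.8.
inside = truncated_normal_pdf(0.2, 0.1, 0.2, 0.01, 0.8)
outside = truncated_normal_pdf(0.9, 0.1, 0.2, 0.01, 0.8)
print(inside, outside)
```

Dividing by the retained probability mass keeps the truncated prior a proper density, so values inside the bounds receive slightly more weight than under the untruncated normal.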
Although various arguments can be given for the above standard choices for prior p(β) in fisheries, the fact remains that they are quite arbitrary. Furthermore, the standard choices for the priors in Eqn 2 often result in conjugate priors, which are known to be non-robust (Chen et al. 2000, 2003). Conjugate priors can have a pronounced effect on the posteriors even if the data are in conflict with the specified prior information (Berger 1985, 1994).
The prior information about parameter β is likely to be mis-specified as a result of our limited understanding of fish life history and fishery processes. Whether the mis-specification of priors can be automatically discounted in the analysis is the focus of the present study. Studies in other areas (Berger 1985, 1994) suggest that a fat-tailed distribution function such as the Cauchy distribution might be robust to prior mis-specification. Thus, we include the Cauchy distribution in the present study.
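The contrast between a thin-tailed and a fat-tailed prior can be sketched on a toy one-parameter problem by evaluating Eqn 2 on a grid. Everything in this Python sketch is an assumption made for illustration: the normal likelihood, the data centre (0.20) and its standard error (0.016), the deliberately biased and overly precise prior (centre 0.05, standard deviation 0.02), and the function names. The Cauchy scale uses the factor 0.675 from Eqn 4. It is not the sea urchin model.

```python
import math

def grid_posterior_mean(log_prior, log_like, lower, upper, n=2000):
    """Posterior mean of beta from Eqn 2, with the integral replaced by a grid sum."""
    step = (upper - lower) / (n - 1)
    grid = [lower + i * step for i in range(n)]
    log_post = [log_prior(b) + log_like(b) for b in grid]
    peak = max(log_post)  # subtract the peak before exponentiating for numerical stability
    weights = [math.exp(v - peak) for v in log_post]
    # The sum of the weights plays the role of the denominator in Eqn 2.
    return sum(b * w for b, w in zip(grid, weights)) / sum(weights)

# Assumed toy values: the prior is biased low and overly precise relative to the data.
PRIOR_CENTRE, PRIOR_SD = 0.05, 0.02
DATA_CENTRE, DATA_SE = 0.20, 0.016

def log_like(beta):
    """Unnormalised normal log-likelihood (an assumption for this sketch)."""
    return -0.5 * ((DATA_CENTRE - beta) / DATA_SE) ** 2

def normal_prior_log_density(beta):
    """Unnormalised log-density of the normal prior."""
    return -0.5 * ((beta - PRIOR_CENTRE) / PRIOR_SD) ** 2

def cauchy_prior_log_density(beta):
    """Unnormalised log-density of the Cauchy prior with scale 0.675 * sd (Eqn 4)."""
    lam = 0.675 * PRIOR_SD
    return -math.log(1.0 + ((beta - PRIOR_CENTRE) / lam) ** 2)

mean_normal = grid_posterior_mean(normal_prior_log_density, log_like, 0.01, 0.8)
mean_cauchy = grid_posterior_mean(cauchy_prior_log_density, log_like, 0.01, 0.8)
print(mean_normal, mean_cauchy)
```

In this toy setting the normal prior pulls the posterior mean well below the data centre, whereas the posterior mean under the Cauchy prior stays close to it, because the fat tails discount a prior that conflicts with the data.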
A Cauchy distribution that matches the quantiles of a normal distribution has been suggested for quantifying priors (Berger 1994). This distribution is often considered as a possible model wherever one needs a density function with heavier tails than the normal distribution allows. The two-parameter Cauchy distribution, written as C(θ, λ²), is given by the density:
$$f(x) = \left\{ \pi\lambda \left[ 1 + \left( \frac{x - \theta}{\lambda} \right)^{2} \right] \right\}^{-1}, \qquad (3)$$
where θ is the location parameter, describing the location of the peak of the distribution, and λ is the scale parameter that specifies the half-width at half-maximum. To replace a normal distribution N(µ, σ²) with a Cauchy distribution C(θ, λ²) that has the same quantiles as the normal distribution in describing priors, the location and scale parameters for the Cauchy distribution should be calculated (Berger 1994; Chen et al. 2000) as:
$$\theta = \mu, \qquad \lambda = 0.675\,\sigma. \qquad (4)$$
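The effect of Eqn 4 can be checked numerically. The factor 0.675 is approximately the 75th percentile of the standard normal distribution, so the conversion aligns the quartiles of the two distributions while leaving the Cauchy tails much heavier. The following Python sketch uses assumed illustrative values (µ = 0.1, σ = 0.2) and hypothetical helper names.

```python
import math
from statistics import NormalDist

def cauchy_from_normal(mu, sigma):
    """Location and scale of the Cauchy prior obtained from N(mu, sigma^2) via Eqn 4."""
    return mu, 0.675 * sigma

def cauchy_quantile(p, theta, lam):
    """Inverse cumulative distribution function of the Cauchy distribution."""
    return theta + lam * math.tan(math.pi * (p - 0.5))

def cauchy_tail_prob(distance, lam):
    """P(|X - theta| > distance) for a Cauchy distribution with scale lam."""
    return 1.0 - (2.0 / math.pi) * math.atan(distance / lam)

mu, sigma = 0.1, 0.2  # assumed illustrative values
theta, lam = cauchy_from_normal(mu, sigma)

# Upper quartiles of the two distributions (the lower quartiles match by symmetry).
normal_q75 = NormalDist(mu, sigma).inv_cdf(0.75)
cauchy_q75 = cauchy_quantile(0.75, theta, lam)

# Probability of lying more than three normal standard deviations from the centre.
normal_tail = 2.0 * (1.0 - NormalDist().cdf(3.0))
cauchy_tail = cauchy_tail_prob(3.0 * sigma, lam)
print(normal_q75, cauchy_q75, normal_tail, cauchy_tail)
```

The two upper quartiles agree closely, whereas the Cauchy distribution places far more probability beyond three standard deviations than the normal distribution does.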
The parameters defining the Cauchy distribution were calculated from the parameters defining the normal priors using Eqn 4 to ensure that the quantiles of these two distributions are the same, thus ensuring the fairness of comparisons between these two distributions in the present study (Chen et al. 2000).
Example fisheries used in the evaluation
The green sea urchin (Strongylocentrotus droebachiensis) fishery in the state of Maine was used as an example fishery in the present study. The fishery took off in the late 1980s as a result of expanding export markets, and landings peaked in 1992 (Chen and Hunter 2003). Since 1992, the fishery has experienced substantial declines in landings, mainly resulting from a large decrease in urchin stock abundance (Chen and Hunter 2003).
A length-structured model was developed to describe sea urchin population dynamics (Chen and Hunter 2003; Kanaiwa et al. 2005) because sea urchins are difficult to age and have large variations in growth among individuals (Quinn and Deriso 1999; Russell and Meredith 2000). The model consists of various submodels to describe key life history processes (e.g. growth, recruitment, maturation and mortality) and fisheries processes (e.g. fishing selectivity and fishing mortality). These submodels were used to generate a model fishery. The dynamics of the model fishery were driven by reported catch. Various fisheries statistics such as catch size composition and stock biomass could be predicted for the model fishery. Four observational models were also established to relate observed and predicted catch per unit of effort (CPUE), observed and predicted catch size compositions, observed and predicted survey abundance indices, and observed and predicted survey size composition data. These observational models were used for establishing objective functions for parameter estimation. The size classes used in the size-structured model ranged from 40 mm to 100 mm with an interval of 1 mm. The performance of the model in describing the population dynamics of sea urchin was evaluated with respect to different recruitment dynamics (Kanaiwa et al. 2005). That study suggests that the length-structured model can capture the sea urchin stock dynamics well. The detailed description of the model is documented in Kanaiwa et al. (2005).
Bayesian inference was used in the assessment. Data available for the assessment include landings, size composition of catch, fishery-independent survey abundance index, size composition of survey catch and biological information such as size-specific maturation and weight–length relationship. The detailed description of the data can be found in Chen and Hunter (2003) and Kanaiwa et al. (2005). A Student t-distribution function was used in formulating the likelihood function in the Bayesian inference to reduce possible impacts of outliers in the data (Chen and Hunter 2003). Although the impacts of outliers on the Bayesian inference were considered in Chen and Hunter (2003) and Kanaiwa et al. (2005), the impacts of possible prior mis-specifications for key model parameters had not been studied.
In the present study we developed various scenarios, for which different priors were assumed, to test the potential impacts of possible prior mis-specification on the Bayesian inference when different distribution functions were used to quantify priors (Table 1). The scenarios considered include cases in which the variability and mean of prior distributions were ‘correctly’ defined, ‘underestimated’ and ‘overestimated’ (Table 1). For each prior configuration described above, we considered five distributional functions for quantifying the priors: normal, log-normal, Cauchy, log-Cauchy and uniform distributions. There are many parameters in the model, and it is impossible to evaluate the above prior configurations for all of them. We therefore applied all prior configurations defined in Table 1 to natural mortality while setting the priors of all other parameters to uniform distributions. We chose natural mortality because it is important in determining population dynamics and is often not well known (Hilborn and Walters 1992; Kanaiwa et al. 2005). Except for the natural mortality prior, all other settings in the Bayesian parameter estimation were kept constant for all the scenarios defined in Table 1.
With such a design, we can ensure that differences in posterior distributions among scenarios result solely from using different prior configurations for natural mortality. Thus, an evaluation of such differences can reveal how different prior specifications may influence stock assessment and how different prior distribution functions may respond to different prior specifications. Alternatively, the prior configurations defined in Table 1 could also be applied to all other parameters. This, however, would make the interpretation of results difficult because it would be impossible to identify whether differences in posterior distributions among scenarios result from using different prior functions.
Because priors define a modeller’s perceptions about possible parameter values, there are no ‘true’ or ‘correct’ values for priors by definition. In order to compare differences in posterior distributions resulting from using different prior functions, we have to establish a reference posterior distribution. In the present study, the reference posterior distribution was estimated from the base or default scenario, defined as the prior of natural mortality quantified using the log-normal distribution with ‘correctly’ defined standard deviation and mean (i.e. Scenario A; Table 1). We compared the reference posterior distribution with posterior distributions derived for the other prior scenarios defined in Table 1. These other scenarios included mean (location) and standard deviation being smaller or larger than those defined in Scenario A, representing ‘under-specified’ or ‘over-specified’ mean (location) and standard deviation of priors respectively.
The Markov Chain Monte Carlo (MCMC) simulation approach was used to estimate posterior distributions for model parameters, using the Hastings–Metropolis algorithm. The estimation started from the parameters at the mode of the posterior distribution, identified by minimising the objective function consisting of the negative log-likelihood components and the prior probability contributions. The model was implemented in AUTODIFF model builder (Fournier 1996). Previous studies suggest that half a million MCMC runs can yield stable results (Chen and Hunter 2003; Kanaiwa et al. 2005). Thus, half a million MCMC runs were made and the results of every 200th run were saved (i.e. the lag between samples was 200), resulting in 2500 sets of parameter estimates being saved for estimating posterior distributions.
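The sampling and thinning scheme described above can be sketched in Python with a random-walk Metropolis sampler, a special case of the Hastings–Metropolis algorithm with a symmetric proposal. This is a minimal sketch and not the AUTODIFF implementation: the one-parameter target `toy_log_post`, its centre (0.2) and spread (0.02), the bounds (0.01 and 0.8), the proposal step and the function names are all assumptions for illustration. Only the run length (500 000) and the lag (200) are taken from the text.

```python
import math
import random

def metropolis_thinned(log_post, start, n_iter, thin, step, seed=1):
    """Random-walk Metropolis sampler that saves every `thin`-th state of the chain."""
    rng = random.Random(seed)
    current, current_lp = start, log_post(start)
    saved = []
    for it in range(1, n_iter + 1):
        proposal = current + rng.gauss(0.0, step)  # symmetric proposal
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        if it % thin == 0:
            saved.append(current)
    return saved

def toy_log_post(m):
    """Assumed toy log-posterior: bell-shaped around 0.2, zero weight outside [0.01, 0.8]."""
    if m < 0.01 or m > 0.8:
        return -math.inf
    return -0.5 * ((m - 0.2) / 0.02) ** 2

# Half a million runs with a lag of 200, as in the text, give 2500 saved draws.
saved = metropolis_thinned(toy_log_post, start=0.2, n_iter=500_000, thin=200, step=0.03)
print(len(saved), sum(saved) / len(saved))
```

Starting the chain inside the support, here at the mode, mirrors the description above and keeps the first acceptance ratio well defined.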
For a given output parameter, we calculated the following difference index (DI) for each tested scenario to compare the means of the posterior distributions derived for Scenario A and the tested scenario:
$$\mathrm{DI} = \frac{\sum_i \beta_x(i)}{N} - \frac{\sum_i \beta_A(i)}{N}, \qquad (5)$$
Table 1. Design of the simulation study
All of the specifications described are for priors of natural mortality M in the length-structured model for the Maine sea urchin. A detailed description of model and data can be found in Chen and Hunter (2003) and Kanaiwa et al. (2005). For the Cauchy distribution function, θ is the location parameter, specifying the location of the peak of the distribution; and λ is the scale parameter that specifies the half-width at half-maximum. Parameters θ and λ were calculated from the parameters defining the normal priors (mean of µ and standard deviation of σ) using Eqn 4 to ensure that the quantiles of these two distributions are the same

Scenario                              M distribution function   Prior µ(θ)  Prior σ(λ)    Lower bound  Upper bound
A (‘true’)                            Log-normal                0.1         0.2           0.01         0.8
B(I) (σ under-specified)              Normal                    0.1         0.1           0.01         0.8
B(II) (σ over-specified)              Normal                    0.1         0.3           0.01         0.8
B(III) (µ under-specified)            Normal                    0.05        0.2           0.01         0.8
B(IV) (µ over-specified)              Normal                    0.2         0.2           0.01         0.8
B(V) (σ under- & µ over-specified)    Normal                    0.2         0.1           0.01         0.8
C(I) (σ under-specified)              Log-normal                0.1         0.1           0.01         0.8
C(II) (σ over-specified)              Log-normal                0.1         0.3           0.01         0.8
C(III) (µ under-specified)            Log-normal                0.05        0.3           0.01         0.8
C(IV) (µ over-specified)              Log-normal                0.2         0.2           0.01         0.8
C(V) (σ under- & µ over-specified)    Log-normal                0.2         0.1           0.01         0.8
D(I) (λ under-specified)              Cauchy                    0.1         0.1 × 0.675   0.01         0.8
D(II) (λ over-specified)              Cauchy                    0.1         0.3 × 0.675   0.01         0.8
D(III) (θ under-specified)            Cauchy                    0.05        0.3 × 0.675   0.01         0.8
D(IV) (θ over-specified)              Cauchy                    0.2         0.2 × 0.675   0.01         0.8
D(V) (λ under- & θ over-specified)    Cauchy                    0.2         0.1 × 0.675   0.01         0.8
E(I) (λ under-specified)              Log-Cauchy                0.1         0.1 × 0.675   0.01         0.8
E(II) (λ over-specified)              Log-Cauchy                0.1         0.3 × 0.675   0.01         0.8
E(III) (θ under-specified)            Log-Cauchy                0.05        0.3 × 0.675   0.01         0.8
E(IV) (θ over-specified)              Log-Cauchy                0.2         0.2 × 0.675   0.01         0.8
E(V) (λ under- & θ over-specified)    Log-Cauchy                0.2         0.1 × 0.675   0.01         0.8
F (true range)                        Uniform                   –           –             0.01         0.8
F(I) (range under-specified)          Uniform                   –           –             0.01         0.5
F(II) (range over-specified)          Uniform                   –           –             0.01         1.0
where i is the ith run saved in MCMC, N is the total number of runs saved in MCMC (i.e. 2500 in the present study), β is the estimated parameter, x denotes the tested prior scenario defined in Table 1, and A denotes base Scenario A (Table 1). The above difference index was calculated for the mean of the posterior. We also calculated the DI for the median, 97.5th percentile and 2.5th percentile of the posterior distribution. Thus, for a given scenario, the larger the DI, the larger the impact that the prior ‘mis-specification’ has had on the Bayesian estimation of model parameters. A distribution function with a small DI value for all simulation scenarios may suggest that it is robust to prior mis-specification.
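The difference index of Eqn 5 can be computed directly from the saved MCMC draws. The Python sketch below uses hypothetical function names and a handful of made-up draws in place of the 2500 saved runs. The relative form in `percent_difference` is an assumption, added only because the figures report differences in per cent; Eqn 5 as given is an absolute difference.

```python
from statistics import mean, median

def difference_index(samples_x, samples_a, summary=mean):
    """Eqn 5: summary of the tested-scenario draws minus that of the Scenario A draws."""
    return summary(samples_x) - summary(samples_a)

def percent_difference(samples_x, samples_a, summary=mean):
    """Assumed relative form: the DI as a percentage of the Scenario A summary."""
    return 100.0 * difference_index(samples_x, samples_a, summary) / summary(samples_a)

# Made-up posterior draws for illustration only (not results from the study).
draws_a = [0.20, 0.21, 0.19, 0.22]  # base Scenario A
draws_x = [0.25, 0.26, 0.24, 0.27]  # a tested scenario

di_mean = difference_index(draws_x, draws_a)
di_median = difference_index(draws_x, draws_a, summary=median)
pct_mean = percent_difference(draws_x, draws_a)
print(di_mean, di_median, pct_mean)
```

Passing a different `summary` function gives the DI for the median or, with a percentile function, for the 2.5th and 97.5th percentiles.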
Results
The mean of the posterior distribution for M in Scenario A was 0.2079 (Fig. 1), much larger than the mean of the prior distribution defined for M in Scenario A (Table 1). This suggests that the data used in the assessment were influential in the estimation of the posterior distribution of M. The 95% credibility interval (CI) of the M posterior distribution was from 0.1787 to 0.2403. The legal stock biomass increased initially in the late 1980s and reached its highest level in 1992, followed by a large and continuous decrease. The CIs of the posterior distribution for legal stock biomass were large in the initial time period and in the most recent years (Fig. 1). Recruitment had a temporal pattern similar to that of the legal stock biomass. The CIs of the recruitment posterior distribution were large in the early years. The mean and 95% CI of the posterior distributions were 1548 mt and from 1186 to 1989 mt for maximum sustainable yield (MSY), and 0.21 and from 0.18 to 0.25 for FMSY, respectively (Fig. 1).
For Scenarios B(I) to B(V) (Table 1), priors were assumed to be normally distributed, representing the ‘mis-specification’ of the prior distribution function (Table 1). For legal stock biomass posterior estimates, Scenario B(III) had the largest difference from Scenario A, followed by B(I) (Fig. 2a). B(II) had the smallest difference index, followed by B(V), suggesting that their means of posterior distribution were almost identical to that of Scenario A. Similar results could also be observed for recruitment (Fig. 2b). Scenario B(III) also had the largest difference from Scenario A in the mean, median, 97.5th and 2.5th percentiles of the natural mortality posterior distribution (Fig. 2c), MSY posterior distribution (Fig. 2d) and FMSY posterior distribution (Fig. 2e), followed by Scenario B(I). Thus, for Scenario B, for which the priors for M were ‘mis-specified’, underestimating the standard deviation and underestimating the mean of M priors had the largest impacts on the estimation of posterior distributions of the key fisheries parameters. The ranking of the DI values among Scenarios B(I) to B(V) was consistent for all of the key fisheries parameters included in the present study (Fig. 2).
Fig. 1. Mean and 97.5th and 2.5th percentile credibility limits of (a) posterior distribution of legal stock biomass and (b) recruitment, (c) prior distribution of natural mortality M and (d) posterior distributions of natural mortality, (e) FMSY, and (f) maximum sustainable yield (MSY) estimated for the green sea urchin using the Bayesian method with the priors of natural mortality being set according to Scenario A defined in Table 1.
For Scenarios C(I) to C(V), priors were assumed to be log-normally distributed, which was the same as the type of prior distribution function used to define M in base Scenario A (Table 1). Scenario C(I) ‘under-specified’ the standard deviation of the M prior distribution, and had the largest departure from Scenario A in the mean of legal stock biomass posterior estimates (Fig. 3a), followed by Scenario C(III), which ‘under-specified’ the mean of the M priors (Table 1). The difference in the mean of legal stock biomass posterior estimates was smaller for Scenarios C(II), C(IV) and C(V) (Fig. 3a). For recruitment posterior estimates, the difference for Scenario C(I) was much larger than that for the other C Scenarios (Fig. 3b). Similar results could be observed for the mean, median, 97.5th and 2.5th percentiles of the natural mortality posterior distribution (Fig. 3c), MSY posterior distribution (Fig. 3d) and FMSY posterior distribution (Fig. 3e). Thus, for Scenarios C, for which the type of priors for M was ‘correctly’ specified compared with base Scenario A, underestimating the standard deviation of M priors had the largest impacts on the estimation of posterior distributions of the key fisheries parameters.
Fig. 2. Percentage of differences calculated using Eqn 5 for the mean of posterior distributions for (a) legal stock biomass and (b) recruitment and (c) for mean, median, 97.5th and 2.5th percentiles of the posterior distributions for natural mortality, (d) FMSY, and (e) maximum sustainable yield (MSY) for Scenario B(I) to Scenario B(V) defined in Table 1.
For Scenarios D(I) to D(V), priors were assumed to follow the Cauchy distribution (Table 1). Scenarios D(I) to D(V) had similar difference indices for all the key fisheries parameters included in the present study (Fig. 4). This suggested that the estimation of posterior distributions for these parameters was robust to the parameterisation of location and scale values for the Cauchy distribution. Thus, ‘mis-specification’ of location and scale did not greatly influence the posterior estimation in the stock assessment when the Cauchy distribution function was used.
For Scenarios E(I) to E(V), M priors were assumed to follow the log-Cauchy distribution (Table 1). The results were similar to those observed for Scenarios D(I) to D(V). The difference index was similar for Scenarios E(I) to E(V) for all of the key fisheries parameters included in the present study (Fig. 5). Thus the estimation of posterior distributions for the parameters was robust to the parameterisation of location and scale values for the log-Cauchy distribution. The ‘mis-specification’ of location and scale of M priors did not greatly influence the posterior estimation in the stock assessment when the log-Cauchy distribution function was used.
For Scenarios F to F(II), the prior distribution of M was assumed to follow a uniform distribution with different upper and lower boundaries. Scenario F, which had the same lower and upper boundaries as those set for Scenario A, had almost identical differences to Scenario F(I), which had a smaller upper boundary (i.e. under-specified upper boundary value; Fig. 6). Scenario F(II) had the smallest difference from Scenario A, compared with Scenarios F and F(I), in estimating posterior distributions of legal stock biomass, recruitment, M, FMSY and MSY (Fig. 6).
Fig. 3. Percentage of differences calculated using Eqn 5 for the mean of posterior distributions for (a) legal stock biomass and (b) recruitment and (c) for mean, median, 97.5th and 2.5th percentiles of the posterior distributions for natural mortality, (d) FMSY, and (e) maximum sustainable yield (MSY) for Scenario C(I) to Scenario C(V) defined in Table 1.
Discussion
Prior distributions defined for model parameters are an extra piece of information in Bayesian stock assessment compared
with the input data requirement for the traditional frequentist stock assessment (Hilborn and Walters 1992). Although prior distribution reflects modellers’ understanding and knowledge of the model parameters and may not have correct values, for a given dataset and model, a specification of prior distribution that lies close to the posterior distribution would be more valu-able and informative in the stock assessment. A prior that is far from the state of nature, or worse excludes that the state of nature, may result in erroneous conclusions with respect to the status of fish populations, leading to mis-management. A well defined prior close to the state of nature can reduce the uncertainty in stock assessment and provide managers with bet-ter information on the status of fish stocks, thus reducing the risk
⫺40 ⫺20 0 20 40 1985 1990 1995 2000 2005 Year
Difference in legal biomass (%)
⫺40 ⫺20 0 20 40 1985 1990 1995 2000 2005 Year Difference in recruitment (%) (a) (c) (e) (d ) (b) 0 20 40 60
D(I) D(II) D(III) D(IV) D(V)
Scenario Difference in M (%) 0 20 40 60
D(I) D(II) D(III) D(IV) D(V)
Scenario Difference in F MSY (%) 0 20 40 60
D(I) D(II) D(III) D(IV) D(V)
Scenario Difference in MSY (%) D(I) D(II) D(III) D(IV) D(V) D(I) D(II) D(III) D(IV) D(V) Mean Median 97.5% 2.5% Mean Median 97.5% 2.5% Mean Median 97.5% 2.5%
Fig. 4. Percentage of differences calculated using Eqn 1 for the mean of posterior distributions for (a) legal stock biomass and (b) recruitment and (c) for mean, median, 97.5 percentile and 2.5 percentiles of the posterior distributions for natural mortality, (d) FMSY, and (e) maximum sustainable yield (MSY) for Scenario D(I) to Scenario D(V) defined in Table 1.
of overexploitation and improving fisheries management (NRC 1997, 1999). Clearly, like the quality of other input data in stock assessments, the quality of prior information on model parame-ters is also critically important with respect to the outcome of a Bayesian stock assessment (Chen et al. 2000). The present study suggests that a ‘mis-specified’ prior in its statistical property for one key parameter (i.e. M) could have a large impact on the estimation of posterior distribution for key fisheries parameters. Only one model and fishery was evaluated in the present study. The model considered was a complicated length-structured model requiring extensive input data (Kanaiwa et al.
2005) and we only considered priors for one model parame-ter. This may make the priors less influential in determining the posterior distribution. Large differences were observed in mean values between priors and posteriors for natural mortal-ity, suggesting that priors of M were less important and other input data were highly influential in estimating posterior dis-tributions (Berger 1994; Chen et al. 2000). Thus, the impacts of priors on the estimation of posterior distributions could be larger than that identified in the present study. However, it is clear that even with this type of setting the present study suggests that mis-specification of priors can result in large differences in
[Figure 5: panels (a) difference in legal biomass (%) and (b) difference in recruitment (%), on axes from −40 to 40, by year (1985 to 2005) for Scenarios E(I) to E(V); panels (c) difference in M (%), (d) difference in FMSY (%) and (e) difference in MSY (%), on axes from 0 to 60, by Scenario E(I) to E(V); series: mean, median, 97.5% and 2.5% percentiles.]
Fig. 5. Percentage differences calculated using Eqn 1 for the mean of the posterior distributions of (a) legal stock biomass and (b) recruitment, and for the mean, median, 97.5th percentile and 2.5th percentile of the posterior distributions of (c) natural mortality, (d) FMSY and (e) maximum sustainable yield (MSY), for Scenarios E(I) to E(V) defined in Table 1.
posterior distributions when normality-based priors are used, in particular when priors are overly precise (small standard deviations). For simpler models with less extensive input data, the priors are likely to play a more important role in estimating posterior distributions (Chen et al. 2000). In such cases, we can expect the mis-specification of priors to have larger impacts than those identified in the present study.
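The interplay between prior precision and the amount of data can be illustrated with the textbook conjugate update for a normal mean with known data standard deviation. This is only a sketch: the length-structured model used in the present study is far more complex, and the function name and all numerical values below are hypothetical choices made for illustration.

```python
def normal_posterior_mean(prior_mean, prior_sd, data_mean, data_sd, n):
    """Posterior mean of a normal mean under a normal prior (known data sd).

    The posterior mean is a precision-weighted average of the prior mean
    and the sample mean of n observations.
    """
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / data_sd ** 2
    weighted_sum = prior_precision * prior_mean + data_precision * data_mean
    return weighted_sum / (prior_precision + data_precision)


# Hypothetical numbers: the data favour M = 0.2, the prior is centred at 0.05.
for prior_sd in (0.01, 0.05):
    for n in (5, 50):
        post = normal_posterior_mean(0.05, prior_sd, 0.2, 0.1, n)
        print(f"prior sd = {prior_sd:.2f}, n = {n:2d}: posterior mean = {post:.3f}")
```

In this simplified setting, a smaller prior standard deviation or a smaller sample size both move the posterior mean towards the (mis-specified) prior mean.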
The present study suggests that under-specified means and standard deviations of normality-based priors tend to yield large difference indices. An overly precise prior (i.e. one with a small standard deviation) limits the chance of the Bayesian estimator evaluating values far from the prior mean, because the probability density of a normal distribution drops quickly away from the mean. This effectively excludes values far from the mean, leading to a large difference index for normality-based priors. The large difference index for an under-specified mean (i.e. 0.05 in Table 1) may result from the fact that the mean of the posterior distribution for natural mortality was over 0.2, much higher than the 0.1 defined for the base Scenario A. Given the thin tails of normality-based priors, a prior mean of 0.05 makes the evaluation of large values even less likely than the mean of 0.1 used in the base scenario. For the
[Figure 6: panels (a) difference in legal biomass (%) and (b) difference in recruitment (%) by year (1985 to 2005) for Scenarios F, F(I) and F(II); panels (c) difference in M (%), (d) difference in FMSY (%) and (e) difference in MSY (%), on axes from 0 to 60, by Scenario F, F(I) and F(II); series: mean, median, 97.5% and 2.5% percentiles.]
Fig. 6. Percentage differences calculated using Eqn 1 for the mean of the posterior distributions of (a) legal stock biomass and (b) recruitment, and for the mean, median, 97.5th percentile and 2.5th percentile of the posterior distributions of (c) natural mortality, (d) FMSY and (e) maximum sustainable yield (MSY), for Scenarios F to F(II) defined in Table 1.
Cauchy distribution, this rapid decrease in probability away from the centre of the prior distribution is less of a problem because its much fatter tails make it more likely that the Bayesian estimator evaluates values far from the location of the Cauchy distribution. Thus, the posterior distributions derived with priors defined by Cauchy distributions tend to be much more robust to ‘mis-specification’ of priors.
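The contrast between thin-tailed and fat-tailed priors can be sketched with a one-parameter grid approximation. This is not the assessment model of the present study: the normal-shaped likelihood, the grid limits and all numerical values are assumptions chosen so that the data favour M near 0.2 while both priors are centred at 0.05.

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a normal distribution."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def cauchy_pdf(x, location, scale):
    """Density of a Cauchy distribution."""
    z = (x - location) / scale
    return 1.0 / (math.pi * scale * (1.0 + z * z))


def posterior_mean(prior_pdf, like_mean, like_sd, lo=0.001, hi=0.6, steps=3000):
    """Posterior mean of M on a midpoint grid (prior times likelihood)."""
    width = (hi - lo) / steps
    grid = [lo + (i + 0.5) * width for i in range(steps)]
    weights = [prior_pdf(m) * normal_pdf(m, like_mean, like_sd) for m in grid]
    return sum(m * w for m, w in zip(grid, weights)) / sum(weights)


# Hypothetical setting: the data favour M near 0.2, but both priors are
# centred at 0.05 with the same small spread (0.02).
LIKE_MEAN, LIKE_SD = 0.2, 0.03
normal_post = posterior_mean(lambda m: normal_pdf(m, 0.05, 0.02), LIKE_MEAN, LIKE_SD)
cauchy_post = posterior_mean(lambda m: cauchy_pdf(m, 0.05, 0.02), LIKE_MEAN, LIKE_SD)

print(f"normal prior: posterior mean of M = {normal_post:.3f}")
print(f"Cauchy prior: posterior mean of M = {cauchy_post:.3f}")
```

Under these assumptions, the normal prior holds the posterior mean well below the value favoured by the data, whereas the Cauchy prior lets the posterior mean stay close to it.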
For normally distributed priors, the present study suggests that it is important to identify the type of distribution function as well as the mean and standard deviation. The normal distribution function, which is a mismatch to the log-normal function defined for Scenario A, tends to yield larger DIs than the log-normal distribution function. This result, however, does not necessarily suggest that the log-normal function is better than the normal distribution. It may only reflect the fact that the log-normal distribution is skewed to the right (long tail on the right side of the distribution) and is more likely to have large values. This is consistent with the input data that push for a large M in the posterior distribution. Thus, the DI is small for log-normal distributions. If the data were to push the mean M of the posterior distribution below that defined for the priors, the normal distribution would have the smaller DIs.
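The effect of right skew can be checked by comparing tail probabilities of a normal and a log-normal distribution that share the same mean and standard deviation on the original scale. The moment matching used here, the helper names and the numerical values (mean 0.1, standard deviation 0.05, thresholds 0.2 and 0.03) are illustrative assumptions and are not taken from Table 1.

```python
import math


def normal_upper_tail(x, mean, sd):
    """P(M > x) for a normal distribution."""
    return 0.5 * math.erfc((x - mean) / (sd * math.sqrt(2.0)))


def lognormal_upper_tail(x, mean, sd):
    """P(M > x) for a log-normal with the given mean and sd on the original scale."""
    sigma2 = math.log(1.0 + (sd / mean) ** 2)  # variance on the log scale
    mu = math.log(mean) - 0.5 * sigma2         # mean on the log scale
    return 0.5 * math.erfc((math.log(x) - mu) / math.sqrt(2.0 * sigma2))


MEAN, SD = 0.1, 0.05  # hypothetical prior mean and sd for M

# Probability of large values (M > 0.2): larger under the log-normal.
upper_normal = normal_upper_tail(0.2, MEAN, SD)
upper_lognormal = lognormal_upper_tail(0.2, MEAN, SD)

# Probability of small values (M < 0.03): larger under the normal.
lower_normal = 1.0 - normal_upper_tail(0.03, MEAN, SD)
lower_lognormal = 1.0 - lognormal_upper_tail(0.03, MEAN, SD)

print(f"P(M > 0.2):  normal {upper_normal:.4f}, log-normal {upper_lognormal:.4f}")
print(f"P(M < 0.03): normal {lower_normal:.4f}, log-normal {lower_lognormal:.4f}")
```

With these values, the log-normal prior assigns more probability to large M and the normal prior assigns more to small M, in line with the argument that which form yields the smaller DI depends on the direction in which the data pull.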
We used the prior settings defined in Scenario A as the default priors (i.e. base case priors) and compared the results derived from other prior settings with those derived for Scenario A. Because there are no wrong or right priors, one may argue that we should use the prior settings in other scenarios as default priors. We did so, and found that the results were consistent no matter which prior settings defined in Scenarios A, Scenarios B(I) to B(V), and Scenarios C(I) and C(V) were used as base case priors. This suggests that the results derived here are independent of simulation settings, and thus can be applied to other assessments.
By comparing the results for Scenarios F with those for Scenarios B and C, we can conclude that a mis-specified informative prior could be worse than a non-informative prior. This suggests that in a study where model parameters are poorly understood, we may be better off assigning a non-informative prior.
The conclusion of the present study was derived from a simulation study that involves only one set of data and one population model. A question that may arise is whether the conclusion is conditional on the data, model structure and/or simulation configuration used in the present study. To address such a question, we could run an extensive simulation study including more sets of data and models with different simulation configurations. However, even with more extensive simulations it is impossible to cover all possible combinations of data, models and simulation configurations. Studies in other areas suggest that priors quantified with a fat-tailed distribution function such as the Cauchy and t-distribution functions (with small degrees of freedom) tend to be robust to prior mis-specification (e.g. Zellner 1976; Berger 1985, 1994; Berger and Robert 1990; Chen et al. 2000). The results derived in the present study are consistent with those derived in previous studies, suggesting that the conclusion derived in the present study is likely to be applicable to other stock assessment models and fisheries. Having said this, we still suggest that more studies should be carried out to evaluate the robustness of the Cauchy prior distribution with respect to prior mis-specification for different fisheries and stock assessment models.
The Cauchy distribution shows a consistent positive DI value across different simulation configurations. This is different from the results for other prior distributions included in the present study, which have both negative and positive DI values. The DI, as defined in Eqn 5, is an index to compare the difference in means of posterior distributions between a given Scenario and Scenario A (i.e. ‘true’ value). It measures possible biases of posteriors resulting from prior mis-specification. For a given type of prior distribution, there is no connection or association between scenarios of different means or location and variances (or scale parameters for Cauchy). Thus, the DI value does not necessarily have to have both negative and positive values for different specifications within each type of prior distribution (i.e. Scenarios B, C, D, E or F).
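The sign behaviour of the DI can be made concrete with a small sketch. The defining equation is not reproduced in this part of the text, so the code assumes a signed percentage difference relative to the Scenario A posterior mean; the function name and the numerical values are hypothetical.

```python
def difference_index(scenario_mean, base_mean):
    """Signed percentage difference of a scenario's posterior mean
    from the base-case (Scenario A) posterior mean.

    Assumed form: 100 * (scenario - base) / base.
    """
    return 100.0 * (scenario_mean - base_mean) / base_mean


# Hypothetical posterior means of M for the base case and two scenarios.
base = 0.20
print(difference_index(0.22, base))  # positive: scenario mean above the base case
print(difference_index(0.17, base))  # negative: scenario mean below the base case
```

Under this assumed definition, a set of scenarios whose posterior means all lie above the base case yields only positive DI values, which is compatible with the consistently positive values reported for the Cauchy priors.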
The present study suggests that the Cauchy distribution, whether on the log or the original scale, is robust to the mis-specification of the statistical properties of prior distributions. For different combinations of mis-specified location and scale parameters, the changes in the posterior distributions were small. However, for normal or log-normal priors, a mis-specification of the mean and standard deviation can result in large changes in the posterior distributions of key fisheries parameters. In practice, different modellers are likely to have different interpretations of model parameter values and subsequently assign different priors. This may lead to large uncertainty in the posterior distributions and different subsequent interpretations of fish stock status. Given the limitations we may have in understanding fish stocks and ecosystems and the likelihood of ‘mis-specification’ of priors, the use of the Cauchy distribution function in defining informative priors in fisheries is perhaps more desirable. Future studies should explore the use of other fat-tailed distribution functions such as t-distribution functions with small degrees of freedom (Berger 1994) for quantifying priors in fisheries stock assessment.
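The relationship between the t-distribution and the Cauchy distribution can be sketched numerically: the standard t density with one degree of freedom is the standard Cauchy density, and the tails thin towards the normal as the degrees of freedom increase. The function names and the evaluation point (five scale units from the centre) are arbitrary illustrative choices.

```python
import math


def student_t_pdf(x, df):
    """Density of the standard Student t-distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2.0) - math.lgamma(df / 2.0)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2.0 * math.log1p(x * x / df))


def standard_cauchy_pdf(x):
    """Density of the standard Cauchy distribution."""
    return 1.0 / (math.pi * (1.0 + x * x))


def standard_normal_pdf(x):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


# Tail density five scale units from the centre, fattest tails first.
x = 5.0
tail_densities = [student_t_pdf(x, df) for df in (1, 3, 30)]
tail_densities.append(standard_normal_pdf(x))

for label, value in zip(("t, df=1", "t, df=3", "t, df=30", "normal"), tail_densities):
    print(f"{label:9s} density at x = 5: {value:.2e}")
```

A t-distributed prior with small degrees of freedom therefore sits between the Cauchy and the normal in tail weight, which is why it is a natural candidate for the future studies suggested above.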
Acknowledgement
Financial support for the present study was partially provided by the National Science Council of Taiwan (NSC95-2811-B-002-024), National Taiwan University, Maine Sea Grant and Maine Department of Marine Resources. We would like to thank Dr Paul Breen and Dr Neil Andrew for discussing robust priors with us many years ago.
References
Adkison, M. D., and Peterman, R. M. (1996). Results of Bayesian methods depending on details of implementation: an example of estimating salmon escapements. Fisheries Research 25, 155–170. doi:10.1016/0165-7836(95)00405-X
Berger, J. O. (1985). ‘Statistical Decision Theory and Bayesian Analysis.’ (Springer-Verlag: New York.)
Berger, J. O. (1994). An overview of robust Bayesian analysis. Test 3, 5–124. doi:10.1007/BF02562676
Berger, J. O., and Robert, C. (1990). Subjective hierarchical Bayes estimation of a multivariate normal mean: on the frequentist interface. The Annals of Statistics 18, 617–651. doi:10.1214/AOS/1176347619
Box, G., and Tiao, G. (1992). ‘Bayesian Inference in Statistical Analysis.’ (Addison-Wesley: Reading, MA.)
Chen, Y., and Hunter, M. (2003). Assessing the green sea urchin (Strongylocentrotus droebachiensis) stock in Maine, USA. Fisheries Research 60, 527–537. doi:10.1016/S0165-7836(02)00082-6
Chen, Y., and Paloheimo, J. E. (1998). Can a more realistic model error structure improve parameter estimation in modelling the dynamics of fish populations? Fisheries Research 38, 9–19. doi:10.1016/S0165-7836(98)00115-5
Chen, Y., Breen, P., and Andrew, N. (2000). Impacts of outliers and mis-specification of priors on Bayesian fisheries stock assessment. Canadian Journal of Fisheries and Aquatic Sciences 57, 2293–2305. doi:10.1139/CJFAS-57-11-2293
Chen, Y., Jiao, Y., and Chen, L. (2003). A general approach to developing robust frequentist and Bayesian stock assessment methods in fisheries. Fish and Fisheries 4, 105–120. doi:10.1046/J.1467-2979.2003.00111.X
Cox, D. R., and Hinkley, D. V. (1974). ‘Theoretical Statistics.’ (Chapman and Hall: London.)
Dennis, B. (1996). Discussion: should ecologists become Bayesians? Ecological Applications 6, 1095–1103. doi:10.2307/2269594
Deriso, R. B., Quinn, T. J., II, and Neal, P. R. (1985). Catch-age analysis with auxiliary information. Canadian Journal of Fisheries and Aquatic Sciences 42, 815–824.
Ellison, A. M. (1996). An introduction to Bayesian inference for ecological research and environmental decision-making. Ecological Applications 6, 1036–1046. doi:10.2307/2269588
Engen, S., Lande, R., and Saether, B.-E. (1997). Harvesting strategies for fluctuating populations based on uncertain population estimates. Journal of Theoretical Biology 186, 201–212. doi:10.1006/JTBI.1996.0356
Fournier, D. A. (1996). ‘AUTODIFF. A C++ array language extension with automatic differentiation for use in nonlinear modeling and statistics.’ (Otter Res. Ltd.: Nanaimo, BC, Canada.)
Fournier, D. A., and Archibald, C. (1982). A general theory for analyzing catch at age data. Canadian Journal of Fisheries and Aquatic Sciences 39, 1195–1207.
Hilborn, R., Pikitch, E. K., and Francis, R. C. (1993). Current trends in including risk and uncertainty in stock assessment and harvest decisions. Canadian Journal of Fisheries and Aquatic Sciences 50, 874–880.
Hilborn, R., and Walters, C. J. (1992). ‘Quantitative Fisheries Stock Assessment: Choice, Dynamics, & Uncertainty.’ (Chapman and Hall: New York.)
Kanaiwa, M., Chen, Y., and Hunter, M. (2005). An evaluation of a complex length-based fisheries stock assessment model for the green sea urchin fishery in Maine, USA. Fisheries Research 74, 96–115. doi:10.1016/J.FISHRES.2005.03.006
Kinas, P. G. (1996). Bayesian fishery stock assessment and decision making using adaptive importance sampling. Canadian Journal of Fisheries and Aquatic Sciences 53, 414–423. doi:10.1139/CJFAS-53-2-414
McAllister, M. K., and Kirkwood, G. P. (1998). Using Bayesian decision analysis to help achieve a precautionary approach for managing developing fisheries. Canadian Journal of Fisheries and Aquatic Sciences 55, 2642–2661. doi:10.1139/CJFAS-55-12-2642
McAllister, M. K., Pikitch, E. K., Punt, A. E., and Hilborn, R. (1994). A Bayesian approach to stock assessment and harvest decisions using the sampling/importance resampling algorithm. Canadian Journal of Fisheries and Aquatic Sciences 51, 2673–2688.
NRC (National Research Council) (1997). ‘Improving Fish Stock Assessments.’ (National Academic Press: Washington D.C.)
NRC (National Research Council) (1999). ‘Sustaining Marine Fisheries.’ (National Academic Press: Washington D.C.)
Punt, A. E., and Hilborn, R. (1997). Fisheries stock assessment and decision analysis: the Bayesian approach. Reviews in Fish Biology and Fisheries 7, 35–63. doi:10.1023/A:1018419207494
Quinn, T. J., II, and Deriso, R. B. (1999). ‘Quantitative Fish Dynamics.’ (Oxford University Press: New York.)
Russell, M., and Meredith, R. (2000). Natural growth lines in echinoid ossicles are not reliable indicators of age: a test using Strongylocentrotus droebachiensis. Invertebrate Biology 119, 410–420.
Smith, S. J., Hunt, J. J., and Rivard, D. (1993). Risk evaluation and bio-logical reference points for fisheries management. Canadian Special Publication of Fisheries and Aquatic Science, 120.
Taylor, B. L., Wade, P. R., Stehn, R. A., and Cochrane, J. E. (1996). A Bayesian approach for classification criteria for Spectacled Eiders. Ecological Applications 6, 1077–1089. doi:10.2307/2269592
Walters, C. J. (1998). Evaluation of quota management policies for developing fisheries. Canadian Journal of Fisheries and Aquatic Sciences 55, 2691–2705. doi:10.1139/CJFAS-55-12-2691
Walters, C. J., and Ludwig, D. (1994). Calculation of Bayes posterior probability distribution for key population parameters: a simplified approach. Canadian Journal of Fisheries and Aquatic Sciences 51, 713–722.
Zellner, A. (1976). Bayesian and non-Bayesian analysis of regression models with multivariate student-t error terms. Journal of the American Statistical Association 71, 400–405. doi:10.2307/2285322