# Patch occupancy models of metapopulation dynamics: efficient parameter estimation using implicit statistical inference.

INTRODUCTIONIn the past 10 yr, interest in metapopulation biology has rapidly increased, and empirical evidence for metapopulation dynamics has gradually accumulated. Metapopulation studies are now becoming more diversified, research in metapopulation genetics and evolution is expanding, and connections to other branches of ecology, such as landscape ecology and community ecology, are being sought (Hanski and Simberloff 1997, Harrison and Taylor 1997, Hanski 1998). In many studies, including those that are related to conservation biology, there is a need to use quantitative and predictive metapopulation models (McCullough 1996). As collecting empirical data from metapopulations requires substantial resources, the amount of data needed for successful model parameterization is an important consideration in choosing the modeling approach. So-called stochastic patch occupancy models (SPOM; Caswell and Etter 1993, Gyllenberg and Silvestrov 1994, Hanski 1994a, Day and Possingham 1995, Hanski 1997) do not include any description of local dynamics; only the presence of local populations in habitat patches is modeled. Major motivations for using SPOMs are the relative simplicity of these models and the prospect of parameterizing them with a reasonable amount of data. This study presents an efficient parameter estimation method for SPOMs, which is applied to a well-defined SPOM, the incidence function model (IFM; Hanski 1994a).

A notable feature of the IFM is that it uses spatial information on patch occupancy patterns for parameter estimation. In principle, parameter estimation can be accomplished with a single snapshot of patch occupancy, although better estimates can be obtained if more data are available (Hanski et al. 1996b, ter Braak et al. 1998). Other metapopulation models use data on population turnover in parameter estimation (Sjogren Gulve and Ray 1996). However, gathering turnover data is a slow process if the turnover rate itself is slow, and one has to survey the metapopulation at least twice. Also, during a short period of time, the numbers of extinction and colonization events in a classical metapopulation will rarely balance out, even if the metapopulation is at long-term stochastic quasi-equilibrium. (By quasi-equilibrium, I mean the "typical" dynamic state of the metapopulation before inevitable eventual extinction, characterized by a stationary distribution of the number of turnover events per time unit and a characteristic pattern of patch occupancy, reflecting the long-term probabilities of different spatial patterns of occupancy.) Therefore, using turnover data in estimation is liable to produce parameter estimates that predict a trend in metapopulation size, which is desirable only if there is reason to assume that an environmental change has caused a real long-term trend.

The original parameter estimation method used for the IFM is based on nonlinear regression (NLR) between empirically observed patch occupancy and the long-term equilibrium patch occupancy (incidence) predicted by the model (Hanski 1994a). The NLR method assumes patch-specific colonization probabilities that are invariant in time, and it ignores spatial and temporal autocorrelations in patch occupancy. Therefore, the NLR method maximizes a pseudolikelihood instead of a true likelihood (ter Braak et al. 1998). The possible consequences of violating these assumptions are discussed in detail in IMF: Problems in estimation.

Diggle and Gratton (1984) define a prescribed statistical model as a parametric specification of the distribution of a random vector, whereas an implicit statistical model is defined by a generating stochastic process. Prescribed statistical models are analyzed and parameterized with traditional statistical methods. However, the distribution theory of implicit statistical models, which include many models involving space, is mathematically intractable. Fortunately, a unique likelihood function underlies every implicit statistical model, so this function can be approximated by simulation techniques, at least when the dimensionality of the model is low (Diggle and Gratton 1984). In this study, the SPOM defines the generating stochastic process. The parameter estimation method proposed here is based on Monte Carlo inference for statistically implicit models and is called the MC method. It has several useful features: (1) exact likelihoods for transitions between sequential patch occupancy patterns are used, (2) the method includes spatial autocorrelation between patches and temporal autocorrelation between sequential patch occupancy patterns into parameter estimation, (3) known changes in the spatial configuration of the habitat patch network can be incorporated into parameter estimation, and (4) the amplitude of regional stochasticity can be estimated if data for many years are available.

Here, the IFM is briefly reviewed, and the old and new parameter estimation methods are described in detail. Then the parameter estimation methods are compared using data obtained by simulating the IFM in hypothetical patch networks with known parameter values. Revised parameter estimates are obtained for two species living in well-studied metapopulations: the American pika (Ochotona princeps) at Bodie, California, and the false heath fritillary butterfly (Melitaea diamina) in the Tampere region in Finland.

THE INCIDENCE FUNCTION MODEL (IFM)

Basics

The IFM is based on a first-order linear time-homogeneous Markov chain for changes in the state (occupied/empty) of individual habitat patches (Hanski 1994a, b, Hanski 1997, ter Braak et al. 1998). At any one time t each patch i is either empty, [p.sub.i](t) = O, or occupied, [p.sub.i](t) = 1. In each time unit, an extinction probability is calculated for each occupied patch, and a colonization probability is calculated for each empty habitat patch. The extinction probability of a population in patch i, [E.sub.i], is constant in time and is assumed to decrease with increasing patch area, [Mathematical Expression Omitted], where [A.sub.i] is the area of patch i, and x and e are parameters. The colonization probability of patch i at time t, [C.sub.i](t), is assumed to be a sigmoid function increasing with connectivity, [Mathematical Expression Omitted], where [S.sub.i](t) is the connectivity of patch i and y is a parameter.

A key element in the IFM is the measure of patch connectivity, which takes into account the sizes and spatial positions of all occupied patches j:

[Mathematical Expression Omitted] (1)

where [p.sub.j](t) is the state of occupancy of patch j at time t, [d.sub.ij] is the distance between patches i and j, and [Alpha] is a constant that sets the distance-dependent migration rate. The [A.sub.j]s are included in Eq. 1 because the sizes of the source populations evidently affect the migration rate to the focal patch i. The exponent b scales the expected population size to patch area; it can be estimated from empirical data on population sizes. The model thus has five parameters: x, e, y, [Alpha], and b. Two parameters, b and [Alpha], are often estimated with independent biological data; the remaining parameters, x, e, and y, are estimated with patch occupancy data (Hanski 1994a, Moilanen et al. 1998). The model can be simulated by first calculating [S.sub.i](t) and [C.sub.i](t) and then determining the extinction and colonization events by comparing [E.sub.i] and [C.sub.i](t) to random numbers. For some studies using the IFM see Hanski (1994a, b), Cook and Hanski (1995), Moilanen and Hanski (1995), Hanski et al. (1996a, b), Wahlberg et al. (1996), Moilanen et al. (1998), and ter Braak et al. (1998).

Problems in original parameter estimation

The original method of estimating the parameters of the IFM uses nonlinear regression to minimize the difference between the empirically observed patch occupancy [p.sub.i] and the model-predicted long-term probability of patch occupancy (incidence) [J.sub.i]:

min [summation over all i] - [p.sub.i][log.sub.e]([J.sub.i]) - (1 - [p.sub.i])[log.sub.e](1 - [J.sub.i]). (2)

In Expression (2), [p.sub.i] is the average observed patch occupancy, which is calculated as the average over t. Similarly, [p.sub.i](t) in Eq. 1 is replaced by the average [p.sub.i] before calculating [C.sub.i] and [E.sub.i] for Eq. 3a, b. The [J.sub.i] values are derived from the steady-state expression of the two-state Markov chain for one patch, [J.sub.i] = [C.sub.i](1 - [J.sub.i]) + (1 - [E.sub.i])[J.sub.i], from which solving for [J.sub.i] gives:

[Mathematical Expression Omitted]. (3a)

For metapopulations with a moderate or high turnover rate, it is desirable to use a model variant (Hanski 1994a) that includes the so-called rescue effect (Brown and Kodric-Brown 1977, Hanski and Gilpin 1997), stipulating that a local population can be saved from extinction by increased population size due to past migration. One mechanistic way to model the rescue effect is to replace [E.sub.i] by [E.sub.i](t) = (1 - [C.sub.i](t))[E.sub.i] (for a justification see Hanski 1997). Incorporating the rescue effect, solving for [J.sub.i] gives

[Mathematical Expression Omitted] (3b)

where e[prime] = e[y.sup.2]. Parameters e and y can be separated from the product e[prime] by defining a minimum patch area, [A.sub.0], for which the extinction probability in unit time is 1, [Mathematical Expression Omitted], from which [Mathematical Expression Omitted] (Hanski 1994a). (Parameter values reported in this study assume that patch areas, including [A.sub.0], enter the model in hectares and distances enter the model in kilometers. The example about the American pika is an exception. There, patch "area" enters the model in kilometers, because pikas only utilize the perimeter of the habitat patch [Smith and Gilpin 1997].)

Eq. 3 assumes that [E.sub.i] and [C.sub.i] are constant in time. Especially for [C.sub.i], the validity of this assumption can be questioned (ter Braak et al. 1998). If the metapopulation is large, extinction colonization stochasticity causes only a little fluctuation in the fraction of occupied patches; the [S.sub.i] values (given by Eq. 1) and the consequent [C.sub.i] values will remain relatively constant. However, if the metapopulation is small, it is likely that large fluctuations in the fraction of occupied patches will occur (e.g., see Moilanen et al. 1998), so that the [C.sub.i] values cannot be taken as constants. Also, if there is strong distance dependence on dispersal (large [Alpha], Eq. 1), the [C.sub.i] value of a patch will be predominantly affected by nearby populations; localized changes in patch occupancy will cause fluctuations in the [C.sub.i] values, even if the overall patch occupancy in the metapopulation remains relatively constant. Furthermore, because of distance-dependent dispersal and the rescue effect, fluctuations in the [C.sub.i] values will likely be correlated in space. In summary, the assumption of relatively constant [C.sub.i] values can be expected to hold only in large metapopulations and for species that are good dispersers in comparison with the average interpatch distance.

Another problem is temporal autocorrelation between sequential snapshots of patch occupancy. Assume a metapopulation with slow turnover. When observing a patch for a few years, its state of occupancy is not likely to change. Therefore, even if the true average occupancy of a patch was 0.5, one is likely to observe a Pi value of either 0 or 1. This implies that, for a metapopulation with slow turnover, observational data for several years does not contain much more information than data for one year.

PARAMETER ESTIMATION USING IMPLICIT STATISTICAL INFERENCE

SPOMs assume that the pattern of patch occupancy at time t + 1 depends only on the pattern of occupancy at time t, which is the so-called Markov condition. (If occupancy at time t - 1 had an explicit effect on the transition, we would use information on how long a patch has been occupied, which would be comparable to modeling local population dynamics.) Therefore, a SPOM can be formulated as a Markov chain with a state space of size [2.sup.N], where N is the number of patches in the network. Different SPOMs can then be constructed by making different assumptions about factors that affect the transitions between subsequent occupancy patterns. The following method for parameterizing the IFM is based on properties of Markov chains and statistically implicit models, and, therefore, it can be adapted to any SPOM.

The general estimation problem in maximum likelihood estimation is

max L([Theta]) = In f(y; [Theta]) (4)

where y is a set of data, [Theta] is a vector of parameters, and f([center dot]) is a distribution assigning a likelihood to y for a given [Theta]. In statistically implicit models, the distribution f([center dot]) is mathematically intractable, but it can be approximated by simulating the model. Theoretically, Eq. 4 can always be written as follows (Diggle and Gratton 1984):

max L([Theta]) = [Sigma] ln([f.sub.i]([y.sub.i][where][y.sub.1], . . ., [y.sub.i-1]; [Theta])). (5)

If the conditional probabilities [f.sub.i]([center dot]) in Eq. 5 can be obtained by simulation, then statistically implicit inference can be applied to the problem (Diggle and Gratton 1984). In this study, the data points [y.sub.i] are the observed patch occupancy patterns, [Theta] is the parameter vector of the SPOM, and the SPOM itself is the mechanism that generates the likelihoods [f.sub.i]([center dot]).

Before going to the details of the MC method, let us introduce some notation. Let [p.sub.i](t) be the state of patch i at time t, i [element of] [1, . . ., N], where N is the number of patches in the network. X(t) is a vector ([p.sub.1](t), . . ., [p.sub.N](t)) describing the state of the Markov chain at time t. I use X(t) specifically to denote patch occupancy patterns produced by simulations with the IFM. For clarity, I denote by O([t.sub.n]) the patch occupancy pattern observed empirically at time [t.sub.n], even though the vector O([t.sub.n]) is of the same form as X(t). Let n = 0, . . ., M - 1, where M is the number of times the metapopulation was surveyed. Let B be the state space of the Markov chain (size [2.sup.N]).

Let [S.sup.k] be the k-step state transition probability matrix (size [2.sup.N] X [2.sup.N]) of the Markov chain, and let [S.sup.k][X, Y] be the element of the matrix giving the probability of moving from state X(t) at time t to Y(t + k) at time t + k. Note that the (k + 1)-step transition matrix is given by the matrix product [S.sup.1][S.sup.k] (for general transition properties see the Chapman-Kolmogorov equation). Finally, let [Theta] [subset or equal to] {x, e[prime], [A.sub.0], [Alpha]} be the parameter set of the IFM (including the rescue effect) to be optimized.

Maximum likelihood estimation aims to find the parameter set [Theta] that maximizes the likelihood of the empirically observed sequence of patch occupancy patterns. This is the probability of arriving at the first observed state multiplied by the probabilities of the subsequent transitions:

max P[O([t.sub.0])] X P[O([t.sub.1])[where]O([t.sub.0])]

x . . . x P[O([t.sub.M-1])[where]O([t.sub.M-2])]. (6)

Because of the Markov condition, the probabilities P[[center dot]], which are analogous to probabilities [f.sub.i]([center dot]) in Eq. 5, can be calculated independently. The most important case is the transition between two patterns that are separated by a single time step. The probability of such a transition is the product of the probabilities for the respective transitions for each patch:

[Mathematical Expression Omitted]. (7)

Note that Eq. 7 holds for the transition between two empirical observations in consecutive years as well as for the transition from a pattern X generated by model simulation to an observed pattern O. In many empirical studies, the metapopulation will be surveyed in each year. In this case, the transition probabilities between the sequential patterns can be calculated exactly, and a true likelihood for observing the chain of transitions exists.

The problems start when there is more than one time unit between two observations. First consider the case where the interval is 2 yr. Now the probability of starting from X(t) and ending at Y(t + 2) is a sum over all possible chains of transition:

P[Y(t + 2)[where]X(t)] = [summation over Z[element of]B] P[Z(t + 1)[where]X(t)]

x P[Y(t + 2)[where]Z(t + 1)]. (8)

In Eq. 8 one has to sum over all possible states Z(t + 1), for which one full row and one column of the one-step state transition matrix is required. Unfortunately, for most metapopulations, calculating even one row of the matrix [S.sup.1] is practically impossible, due to the huge length, [2.sup.N], of the vector. This is where Monte Carlo inference becomes necessary.

Consider the situation with two observations O(t) and O(t + n), where n [greater than or equal to] 2. One can calculate the probability of this transition using a Monte Carlo method. The possibility that comes to mind first is to start from pattern O(t), simulate the metapopulation model for n yr, repeat this many times, and calculate the proportion of replicates that end up in pattern O(t + n). Unfortunately, the probability of ending at exactly pattern O(t + n) will generally be vanishingly small if N is even a modest number. Thus, the above idea will not work in practice. However, the situation can be saved by a slight modification to the above procedure. Instead of simulating the metapopulation for n yr, let us simulate it for n - 1 yr to arrive at pattern X(t + n - 1). We can then calculate the exact probability of the desired final transition P[O(t + n)[where]X(t + n - 1)] by applying Eq. 7. By doing a large number L (I used L = 250) of Monte Carlo replicate simulations, the required probability can be calculated as the average:

P[O(t + n)[where]O(t)] [approximately equal to] 1/L [summation of] P[O(t + n) [where][X.sup.u](t + n - 1)] where u = 1 to L (9)

where [X.sup.u](t + n - 1) is the pattern produced by the Monte Carlo replicate u.

The final unresolved part of Eq. 6 is the probability of reaching the first observed state, P[O([t.sub.0])]. Including this term in Eq. 6 implies the assumption that the metapopulation is in quasi-equilibrium. This is because one must now obtain parameter estimates that can produce the first observed pattern O([t.sub.0]) with a relatively large probability. P[O([t.sub.0])] can be calculated as a sum over the column O([t.sub.0]) of the k-step state transition matrix of the ergodic part of the Markov chain (state [Phi], metapopulation extinction, is excluded), where k [approaches] [infinity]. As before, this calculation is not feasible in practice. However, this probability can be tackled with almost the same method as used in Eq. 9. Instead of doing many replicate simulations, we just do one long (L time units) simulation run, and calculate the required probability as

P[O([t.sub.0])] [approximately equal to] 1/L [summation of] P[[O.sub.o](u + 1) [where]X(u)] where u = 1 to L (10)

where X(u) is the simulated patch occupancy pattern in year u, and [O.sub.0](u + 1) denotes the pattern O([t.sub.0]) at time u + 1. Thus we calculate the average probability of jumping to pattern O([t.sub.0]) from all patterns X(u) in a long simulation run (I used L = 1000).

One final consideration has yet to be taken into account. Let [R.sub.i] = [C.sub.i](1 - [J.sub.i]) + (1 - [C.sub.i])[E.sub.i][J.sub.i] be the turnover rate of patch i. Next, assume two independently sampled patch occupancy patterns, O([t.sub.x]) and O([t.sub.y]). The probability of patch i being in different states in these patterns is [Mathematical Expression Omitted]. By direct substitution of Eq. 3b into the expressions of [R.sub.i] and [D.sub.i] it can be calculated that [R.sub.i] = [[C.sub.i] + (1 - [C.sub.i])[E.sub.i]][D.sub.i], and therefore [R.sub.i] [less than or equal to] [D.sub.i]. This means that the average number of patches in different states between two independent patch occupancy patterns taken from the same metapopulation is practically always greater than the turnover rate of the metapopulation.

What does this have to do with parameter estimation? When calculating P[O([t.sub.0])], we are not counting the occurrences of pattern O([t.sub.0]) itself but instead we are calculating transition probabilities from simulated patch occupancy patterns X(u) to the first observed state O([t.sub.0]) (Eq. 10). Due to the reasons stated, the number of differences in patch states between X(u) and O([t.sub.0]) is generally greater than the empirically observed R, even for the true parameter values. Therefore, because we are trying to find parameters that maximize P[[O.sub.0](u + 1)[where]X(u)], we will arrive at parameters that overestimate the turnover rate of the metapopulation.

To avoid overestimating the turnover rate, I imposed a condition called turnover limitation, meaning that the average turnover rate in the simulation from which P[O([t.sub.0])] is calculated is not allowed to exceed the empirically observed turnover rate. Turnover limitation was implemented in the numerical optimization routine by disqualifying trial parameter combinations that violated this condition. This completes the approximation of Eq. 6. Turnover limitation can be applied also to the NLR parameter estimation method. There is no reason to suspect that the NLR method systematically overestimates turnover, but for the sake of a fair comparison of the methods, two variants of turnover limitation were used with the NLR method. In the NLRU method, the upper limit of turnover events was set as in the MC method. In the NLRB method, turnover was constrained to be between 0.9 and 1.1 times the observed turnover rate. As expected, the NLRB method generally produced better parameter estimates than the NLR and NLRU methods, and only the results for NLR and NLRB are reported in this study. Turnover limitation was implemented in the NLR method as it was in the MC method.

If one has data for many sequential years, one may omit the probability P[O([t.sub.0])] from the calculation of Eq. 6 and use only the information in the transitions, thus obtaining a true likelihood (here called the TMC method). This is similar to using turnover data (see, e.g., Sjogren Gulve and Ray 1996, ter Braak et al. 1998), but here we use information on turnover of patterns, not information on individual assumedly independent patches. As a possible disadvantage, the TMC method may estimate parameters that will predict a trend in metapopulation size, because quasi-stationarity is not assumed. For a summary of the different estimation methods used in this study, see Table 1.

Confidence limits for the parameter estimates can be calculated for the parameters based on likelihood ratio tests. Let L([[Theta].sup.*]) be the log-likelihood of the optimal parameter set [[Theta].sup.*]. Assume that we are computing confidence limits for a particular parameter, say x, which in [[Theta].sup.*] has the optimal value of [x.sup.*]. To find confidence limits for x, we vary the value of x around [x.sup.*]. With each x[prime] [not equal to] [x.sup.*] the values of all other parameters in [Theta] are re-estimated, giving a parameter set [Theta][prime] with a loglikelihood L([Theta][prime]). The difference 2[L([[Theta].sup.*]) - L([Theta][prime])] follows the [[Chi].sup.2]-distribution with one degree of freedom (Rao 1973). Thus x[prime] belongs to the 95% confidence limits of x if 2[L([[Theta].sup.*]) - L([Theta][prime])] [less than] 3.841 (Rao 1973). Diggle and Gratton (1984) note that confidence limits based on log-likelihoods derived by simulation techniques tend to be conservative. Confidence limits were not calculated in this study, as computing them for numerous data sets would require excessive computer time.

TABLE 1. Summary of the different parameter estimation methods used in this study. Method Explanation NLR(*) The original parameter estimation method based on NonLinear Regression used for the Incidence Function Model (IFM). NLRB The NLR method with the average turnover rate limited to 0.9-1.1X the empirically observed turnover rate. MC The new parameter estimation method proposed in this study, which is based on implicit statistical inference and Monte Carlo methods. MC uses the optimality measure of Eq. 6. TMC A variant of the MC method, in which the first term of Eq. 6 is ignored, and, thus, quasi- stationarity is not assumed. * Source: Hanski 1994a.

Regional stochasticity

Extinction-colonization stochasticity leads to fluctuations in patch occupancy. These fluctuations may be amplified by regional stochasticity, which is defined as spatially correlated environmental stochasticity affecting the quality of many or all patches simultaneously (Hanski 1991), caused for example by variation in climatic conditions. Regional stochasticity has been implemented in the IFM by assuming synchronous variation in the effective areas of habitat patches (Hanski et al. 1996b): the area of each patch is multiplied yearly with the same factor v, which is drawn from a [log.sub.[10.sup.-]] normal distribution with unit mean and variance [[Sigma].sup.2]. Estimation of the magnitude of regional stochasticity from empirical data is of interest, since the presence of such stochasticity will affect predictions of metapopulation persistence. Generally, increasing regional stochasticity increases the risk of metapopulation extinction. The amplitude of regional stochasticity may be estimated using the basic MC procedure outlined in Eqs. 6-9. By simply setting [Sigma] as a free parameter in the optimization algorithm, the Monte Carlo simulations (now including regional stochasticity) are done exactly in the same manner as before. Naturally, the reliable estimation of [Sigma] requires data from many years.

Computational issues

The optimality measure used in this study is stochastic due to the Monte Carlo simulations. Therefore it cannot be solved analytically, and some method of global optimization is needed. One reasonable choice is simulated annealing (Metropolis et al. 1953, see also Kirkpatrick et al. 1983), which was used in this study. Any commonly available variant of simulated annealing will work as the optimization procedure for this problem, as one will maximally optimize only five parameters. During optimization, a parameter set was accepted as the new optimum only if the average likelihood of three replicate function evaluations was higher than the likelihood of the current optimum. Hence, only parameter sets producing consistently high likelihoods were accepted as optima. As the optimality measure is stochastic, estimations were repeated twice, and parameter values from the replicate yielding the higher likelihood for the data were used. See the Appendix for information on obtaining supplementary material.

METHODS

Several simulated data sets were generated for this study using the habitat configuration of patch network 43 in Hanski et al. (1996b). This patch network was chosen because it has substantial variation in patch sizes (60-8200 [m.sup.2]) and isolations, and has a size (53 patches) that is in the typical range of metapopulation studies (Verboom et al. 1991, Hanski 1994a, Kindvall 1996, Sjogren Gulve and Ray 1996, Wahlberg et al. 1996, Smith and Gilpin 1997). Smaller patch networks would hardly provide enough data for reliable parameter estimation using any method.

To generate simulated data, parameter values (x = 0.9, e = 0.01, y = 3.5, A0 = 0.0060, b = 0.5, and [Alpha] = 1.5) were selected that caused patch occupancy to fluctuate around 0.5. Fig. 1 shows a sample time series, which gives an idea about the amplitude of fluctuations with and without regional stochasticity. First simulating the dynamics for 500 yr to allow the metapopulation to reach quasi-equilibrium, and then sampling the simulation, generated hypothetical data sets. To mimic typical empirical situations, the number of snapshots recorded was kept low with the snapshots spaced sequentially or a few years apart. Table 2 summarizes the hypothetical data sets used to generate Table 4 and Figs. 3-5. To ensure that results are applicable to varying conditions, 48 additional data sets were generated by using all combinations of four dissimilarly aggregated patch networks [ILLUSTRATION FOR FIGURE 2 OMITTED], four sampling schemes (Table 3) and three different sets of parameter values (Table 3). These data sets were used to produce Table 5.

The parameter estimation procedures were also applied to two empirical data sets, one on the American pika (Ochotona princeps) and the other on the false heath fritillary butterfly (Melitaea diamina). The data for the American pika is from a network with 76 patches at Bodie, California, USA. The metapopulation has been surveyed four times in the years 1972, 1977, 1989, and 1991. For description and analysis of these data and the metapopulation see Smith and Gilpin (1997) and Moilanen et al. (1998). The data used for the brief analysis on the false heath fritillary butterfly, Melitaea diamina, is the same as used in Wahlberg et al. (1996).

RESULTS

Simulated data

Simulated data were used to compare different estimation methods. Fig. 3 gives the mean value and the standard deviation for model parameter x, when estimated with different methods from simulated data sets A, B, and C (Table 2). In these comparisons only x and e[prime] were estimated from the data, while values for [A.sub.0] and [Alpha] were assumed to be known. The main purpose of Fig. 3 is to illustrate some consequences of the number and spacing of patch occupancy patterns used for parameter estimation. Since these results are based on data generated by simulating a metapopulation in one particular patch network with one set of parameter values, the differences between different estimation methods are only suggestive. A more comprehensive comparison of the estimation methods is presented in Table 5.

Several results suggesting general patterns are noteworthy in Fig. 3. First, the results of NLR1A (original estimation method, one snapshot of hypothetical data A), NLR5A and NLR10A show how the parameter estimates converge when more data become available. A comparison between NLR5A and NLR5B shows how temporal autocorrelation adversely affects parameter estimation; the standard deviation of parameter estimates is lower when the five snapshots were spaced 50 yr apart than when they were sequential and, hence, affected by temporal autocorrelation. The turnover-based tests TMC5A and TMC10A did not produce good parameter estimates. This may seem strange, because the data used in those tests consisted of series of sequential snapshots, and, therefore, the measure to be optimized was the exact likelihood calculated using Eq. 7. The likely explanation for the large variance in TMC5A is that 5 yr is a short enough period to include a spurious trend, which increases variation in parameter estimates. The smallest variance was obtained with MC5A. (Monte Carlo methods MC5A and TMC5A differ only in whether the first term in Eq. 8 is included in the calculation of the likelihood; its inclusion imposes the condition of quasi-stationarity into parameter estimation using MC.) An F-ratio test for equality of variances between NLR5A and MC5A gives F = 2.7, df = 24, 24, and P = 0.009, showing that estimates obtained with the MC method have a significantly smaller variance than those obtained with the NLR method. The last two columns, NLR2C and MC2C, use 2 yr worth of data with a spacing of 5 yr. Here the standard deviations for x are 0.210 and 0.137 for NLR and MC, respectively, again showing an advantage for the MC method (F = 2.35, df = 24, 24, and P = 0.02).

TABLE 2. Summary of the composition of hypothetical data sets A-E. For example, data set A contains 25 replicate simulations without regional stochasticity, each of which was sampled during 10 consecutive years after year 500. Data Replicates Snapshots Interval [Sigma] A 25 10 1 0 B 25 5 50 0 C 25 2 5 0 D 10 50 1 0 E 10 50 1 0.5

[TABULAR DATA FOR TABLE 3 OMITTED]

The least amount of data needed for parameter estimation is an interesting question. Column NLR1A in Fig. 3 uses only one snapshot of data, and the accuracy of parameter estimates is not very good. Encouragingly, estimation with the same data using the MC method (MC1A) produces much better parameter estimates. However, this estimation contains a little cheating, since the use of the MC method requires information about the turnover rate of the metapopulation. In this case the turnover rate was calculated from simulations and given as extra information to the estimation procedure. Nevertheless, this success of the MC method implies that, minimally, it may be possible to use the MC method with 1.5 yr of data (i.e., one complete metapopulation survey and a partial survey the following year) to obtain an approximate estimate of turnover. It is worth noting that MC1A is based only on the Monte Carlo estimation of the first term of Eq. 6.

Fig. 4A presents another comparison of the NLR5A and MC5A results; Fig. 4B presents another comparison of the NLR5C, and MC2C results. The estimated parameters x and e[prime] are shown. In both cases, the scatter in the results is smaller with the MC method. In all the estimations conducted, it appeared that the more parameters that were estimated simultaneously, the greater the advantage for the MC method. Fig. 5 shows results for simultaneously estimating x, e[prime], and [A.sub.0] from 5 yr of simulated data (data set A, Table 2). It should be noted that the correct value of [A.sub.0], which is used to separate the values of e and y in the product e[prime], is not given to the optimization algorithm. The NLR method does much worse than the MC method, which is not surprising, as the NLR method does not use information about the turnover rate. Thus it is more appropriate to compare MC with NLRB, but even here MC is clearly superior.

Results related to the estimation of regional stochasticity are presented in Table 4. The two data sets used in this analysis, D and E, consist of 50 yr long sequences (10 replicates). Data E includes strong regional [TABULAR DATA FOR TABLE 4 OMITTED] stochasticity, with [Sigma] = 0.5. I first checked that the parameter estimation without stochasticity performed satisfactorily. With data set D, MC produced the best results, and NLR produced the worst results: the standard deviations of parameter estimates produced by the MC method were less than half of those produced by the NLR method.

The second section in Table 4 deals with the situation where there is regional stochasticity in the data, but parameter estimation is performed without assuming regional stochasticity. Again the best estimates are produced by the MC method, with the worst estimates coming from the NLR method. Interestingly, x and [Alpha] are consistently underestimated, while e[prime] is consistently overestimated. There is a plausible explanation for this result: regional stochasticity, as implemented here, strongly increases the extinction probabilities of large populations and generally increases the turnover rate [ILLUSTRATION FOR FIGURE 1 OMITTED]. A way to increase the extinction probability of large patches is to decrease x, and a way to increase the turnover rate is to increase e[prime] and [Alpha]. However, the ultimate consequences of these parameter changes are different from the consequences of adding regional stochasticity to the system. The essential point is that regional stochasticity is synchronous across the network, which greatly increases the probability of extinction of the entire metapopulation. Elevating the turnover rate and the extinction risk of large patches by changing model parameters will not introduce the element of synchrony, and therefore dynamics inferred from these parameter estimates can be misleading.

The third and fourth sections of Table 4 report the results for parameter estimation when regional stochasticity is assumed. In these cases, there are no results for the NLR and NLRB methods, which do not allow the estimation of regional stochasticity. When there is no regional stochasticity in the data ([Sigma] = 0.0), but regional stochasticity is yet assumed in parameter estimation, the parameter estimates remain good. Encouragingly, the MC method also produces accurate parameter estimates when regional stochasticity is present and it is estimated.

The tests described in Table 4 are based on a single habitat configuration. To extend the generality of the results, a more extensive test was performed, in which four different habitat configurations, four different sampling schemes, and three sets of parameter values were used. The habitat configuration studied covered a range of habitat aggregations [ILLUSTRATION FOR FIGURE 2 OMITTED]. The sampling schemes were chosen to be realistic for studies (Table 3). The sets of parameter values (Table 3) were chosen to generate different turnover rates and levels of aggregation; parameter values in set 1 produce relatively slow turnover with dispersal strongly limited by distance, while parameter values in set 3 describe a species with good dispersal ability and fast turnover rate. All combinations of habitat configuration, sampling, and parameter values were used to generate 48 simulated data sets. Table 5 summarizes results from estimating parameters using these data sets.

Table 5 gives the results in terms of the absolute distances between the true and estimated parameter values, with x, e[prime], [A.sub.0], and [Alpha] all estimated from the data. Parts of the table show results for different subsets of the 48 estimations performed with each method. For instance, the row 'MC habitat 1' gives the results for the 12 combinations of sampling and parameter values that were obtained using the MC method and habitat configuration 1 [ILLUSTRATION FOR FIGURE 2 OMITTED]. The MC method produced the most accurate parameter estimates for parameters x and [TABULAR DATA FOR TABLE 5 OMITTED] [A.sub.0] for all habitat configurations, sampling schemes, and parameter values. For [Alpha] and e[prime], the MC method produced most accurate estimates for 8 of the 11 combinations. Thus, the habitat configuration and parameter values used had little effect on the relative performance of the MC and NLRB methods; MC prevailed. The sampling scheme had a large effect on the performance of the estimation methods, which was expected, since sampling affects the amount of information available for parameter estimation. When there were only two snapshots of data available (sampling schemes 2 and 4), MC was clearly better than NLRB. Also, when there were five sequential snapshots (sampling scheme 1), MC was superior. Only in the case when there were five snapshots spaced 5 yr apart (sampling scheme 3), the MC and NLRB methods performed equally well. This is consistent with Fig. 3. In summary, the MC method had clearly superior overall performance.

Empirical data sets

I apply the MC method to empirical data to see whether the new parameter estimates agree with those obtained earlier by the NLR method and with independent biological knowledge. The first empirical example is the American pika. This metapopulation is an excellent example of a classical mammalian metapopulation (Moilanen et al. 1998). Table 6 gives the parameter estimates, including an estimate for regional stochasticity. The estimates produced by the MC method are consistent with those obtained with the other methods in Moilanen et al. (1998).

The endangered false heath fritillary butterfly (Melitaea diamina), is known to occur at only two localities in Finland. The other locality has a network of 93 patches, which was surveyed for the presence of the species in 1995 (Wahlberg et al. 1996). As the second empirical example, I present a reanalysis of this data set, which highlights some difficulties in parameter estimation and demonstrates the use of the MC method with data for a single year. The problems are as follows.

First, in this metapopulation there are three areas where the density of habitat patches is high and patch occupancy tends to be high. Additionally, there are a few large populations scattered around the landscape, situated many kilometers away from other occupied populations. The problem is that biological knowledge about the species (Wahlberg et al. 1996) suggests that it is not a good enough disperser to have colonized the most isolated habitat patches. One possible explanation for the presence of the isolated populations is that in [TABULAR DATA FOR TABLE 6 OMITTED] the past there were many more suitable habitat patches (moist meadows), the local populations were then less isolated. Hence, they may represent remnants of a once larger metapopulation.

How does this affect parameter estimation? When testing the MC method with hypothetical data, the migration parameter [Alpha] was generally well estimated. In the case of M. diamina the original data gives an estimate of [Alpha] [approximately equal to] 0.4. This is a very small value, suggesting that the species is a substantially better disperser than Melitaea cinxia, which is contrary to what we expect on the grounds of independent biological knowledge (Wahlberg et al. 1996). Hence, let us assume that the isolated populations are indeed remnants of a once larger metapopulation. If seven very isolated and relatively large populations are removed from the data set, our estimate of [Alpha] is revised to 1.5, which is close to what could be expected based on our knowledge about Melitaea cinxia ([Alpha] [approximately equal to] 1.0).

The second problem with the M. diamina data is that we have no direct information on turnover. This fact does not prevent us from attempting to estimate model parameters that are not directly related to the turnover rate. Table 7 gives the parameter estimates for M. diamina while allowing for different maximum turnover rates. The values of x, [A.sub.0], and [Alpha] remain consistent for a large range of turnover rates, suggesting that reasonable estimates for these parameters can be obtained with the MC method, even with just a single snapshot of spatial data.

DISCUSSION

In this study, a new parameter estimation method (MC) for stochastic patch occupancy models (SPOMs) has been described and tested. With all the data sets used, the new method produced parameter estimates for the IFM that were more accurate than those produced by the original method (Hanski 1994a). Some likely reasons for the success of the new method are the following: (1) fluctuations and spatial autocorrelations in patch connectivity are included in estimation, (2) temporal autocorrelations between snapshots of data are included in estimation, and (3) an exact likelihood, instead of a pseudolikelihood, is calculated for transitions between sequential patch occupancy patterns. Two additional advantages of the MC method are that changes in the spatial structure of the environment can be included in parameter estimation, and that the MC method lends itself to the estimation of the amplitude of regional stochasticity.

The IFM assumes particular mathematical functions to describe various biological relationships, such as those between probability of dispersal and distance, and between probability of colonization and number of immigrants (see Hanski 1997 for discussion of these relationships). Studies of the form of such relationships should precede any parameter estimation procedure. A strength of the Monte Carlo method proposed here is that it can easily be adapted to estimate parameters in any SPOM.

Perhaps the most important feature of the MC method is the maximal efficiency of information usage in parameter estimation. Because of problems with parameter estimation using pattern data, it has been suggested that when more data are available, one should model extinctions and colonizations separately, and one should use turnover information to parameterize models (ter Braak et al. 1998). Estimations using the TMC method, which does not impose the condition of quasi-stationarity, demonstrate that the unintended estimation of a trend induces extra variation into parameter estimates (compare the MC and TMC methods in Fig. 3 and Table 4). As an extreme example, we may consider a situation where either extinctions or colonizations only are observed between two successive snapshots. Then it makes a huge difference for parameter estimation whether quasi-stationarity is assumed or not. Turnover-based parameter estimation methods are likely to suffer from this same problem. Consider for example the MC2C estimations in Figs. 3 and 4. The data used for these estimations includes two snapshots spaced 5 yr apart. The number of turnover events during the 5-yr interval in the 25 replicate data sets varied from 11-21. It seems unlikely that turnover-based estimation with so little data would be reliable.

TABLE 7. Parameter values obtained for Melitaea diamina with the MC method, using one snapshot of data and allowing for different maximum rates of population turnover. Maxi- mum turnover x e[prime] [Alpha] [A.sub.0] ([m.sup.2]) 5/86 1.20 0.039 1.62 165 10/86 1.10 0.060 1.51 219 20/86 1.24 0.045 1.50 216 30/86 1.06 0.054 1.56 194 40/86 1.24 0.015 2.32 170 Note: The largest turnover rate allowed (40 events for the 86 patches) is already very high, since the maximum average long-term turnover rate is 0.5 for the model variant that includes the rescue effect.

Regional stochasticity is a phenomenon likely to be present in most natural environments. It has important consequences for a metapopulation, because the more or less synchronous variation in population size will reduce the persistence time of the metapopulation. In metapopulation models, the presence of regional stochasticity has often been ignored, perhaps because including it would complicate models and mathematical analysis, as well as make parameter estimation more difficult. The MC method can be used to estimate the amplitude of regional stochasticity, although the amount of data required for this is quite large. Also, the effects of regional stochasticity may vary when the quality of patches differs. Uncovering such interactions from real data is likely to be very difficult. An alternative way to examine the effects of regional stochasticity is sensitivity analysis: a model can be parameterized without regional stochasticity, various levels of regional stochasticity can then be added to the model, thus permitting an exploration of the consequences for metapopulation dynamics (Moilanen et al. 1998).

The MC method was applied to two empirical data sets, which produced parameter estimates that were consistent with pre-existing estimates and other biological knowledge. The brief analysis on the false heath fritillary butterfly demonstrates how one may use one year of data combined with sensitivity analysis to infer model parameters that are unrelated to the turnover rate. It is interesting that the value of parameter [Alpha] was generally estimated very well with the MC method. In several earlier studies, it had been suggested that [Alpha] should be estimated from independent biological (mark-recapture) data. If [Alpha] can be estimated from spatial pattern data it may be unnecessary to do the mark-recapture study, and, even if the mark-recapture study is performed, it is useful to obtain another estimate of [Alpha] with a completely different method.

ACKNOWLEDGMENTS

I thank Andrew T Smith for providing the pika data and P. Abrams, J. Alho, C. ter Braak, I. Hanski and an anonymous reviewer for helpful comments on the manuscript.

LITERATURE CITED

Brown, J. H., and A. Kodric-Brown. 1977. Turnover rates in insular biogeography: effect of immigration on extinction. Ecology 58:445-449.

Caswell, H., and R. J. Etter. 1993. Ecological interactions in patchy environments, from patch occupancy models to cellular automata. Lecture Notes in Biomathematics 96:93-109.

Cook, R. R., and I. Hanski. 1995. On expected lifetimes of small-bodied and large-bodied species of birds on islands. American Naturalist 145:307-315.

Day, J., and H. P. Possingham. 1995. A stochastic metapopulation model with variability in patch size and position. Theoretical Population Biology 48:333-360.

Diggle, P. J., and R. J. Gratton. 1984. Monte Carlo methods of inference for implicit statistical models. Journal of the Royal Statistical Society, Series B 46:193-227.

Gyllenberg, M., and D. S. Silvestrov. 1994. Quasi-stationary distributions of a stochastic metapopulation model. Journal of Mathematical Biology 33:35-70.

Hanski, I. 1991. Single-species metapopulation dynamics: concepts, models and observations. Pages 3-16 in M. Gilpin and I. Hanski, editors. Metapopulation dynamics. Academic Press, London, UK.

-----. 1994a. A practical model of metapopulation dynamics. Journal of Animal Ecology 63:151-162.

-----. 1994b. Patch-occupancy dynamics in fragmented landscapes. Trends in Ecology and Evolution 9:131-135.

-----. 1997. Predictive and practical metapopulation models: the incidence function approach Pages 21-45 in D. Tilman and P. Kareiva, editors. Spatial ecology. Princeton University Press, Princeton, New Jersey, USA.

-----. 1998. Metapopulation dynamics. Nature 396:41-49.

Hanski, I., and M. Gilpin. 1997. Metapopulation biology: ecology, genetics and evolution Academic Press, London, UK.

Hanski, I., A. Moilanen, and M. Gyllenberg. 1996a. Minimum viable metapopulation size. American Naturalist 147: 527-541.

Hanski, I., A. Moilanen, T. Pakkala, and M. Kuussaari. 1996b. Metapopulation persistence of an endangered butterfly: a test of the quantitative incidence function model. Conservation Biology 10:578-590.

Hanski, I., and D. Simberloff 1997. The metapopulation approach, its history, conceptual domain and application to conservation. Pages 5-26 in I. Hanski and M. Gilpin, editors. Metapopulation biology: ecology, genetics and evolution. Academic Press, London, UK.

Harrison, S., and A. Taylor. 1997. Empirical evidence for metapopulation dynamics. Pages 27-68 in I. Hanski and M. Gilpin, editors. Metapopulation biology: ecology, genetics and evolution. Academic Press, London, UK.

Kindvall, O. 1996. Habitat heterogeneity and survival in a bush cricket metapopulation. Ecology 77:207-214.

Kirkpatrick, S., C. D. Gelatt, Jr., and M. P. Vecchi. 1983. Optimization by simulated annealing. Science 220:671-680.

McCullough, D. R. 1996. Metapopulations and wildlife conservation. Island Press, Washington, D.C., USA.

Metropolis, N., A. Rosenbluth, M. Rosenbluth, A. Teller, and E. Teller. 1953. Equation of state calculations by fast computing machines. Journal of Chemical Physics 21:1087-1092.

Moilanen, A., and I. Hanski. 1995. Habitat destruction and competitive coexistence in a spatially realistic metapopulation model Journal of Animal Ecology 64:141-144.

Moilanen, A., Smith, A. T. and I. Hanski. 1998. Long-term dynamics in a metapopulation of the American pika. American Naturalist 152:530-542.

Rao, C. R. 1973. Linear statistical inference and its applications. Second edition. Wiley, New York, New York, USA.

Sjogren Gulve, P., and C. Ray. 1996. Large scale forestry extirpates the pool frog: using logistic regression to model metapopulation dynamics Pages 111-137 in D. R. McCullough, editor. Metapopulations and wildlife conservation and management. Island Press, Washington, D.C., USA.

Smith, A. T, and M. Gilpin. 1997. Spatially correlated dynamics in a pika metapopulation. Pages 407-428 in I. Hanski and M. Gilpin, editors. Metapopulation biology: ecology, genetics and evolution. Academic Press, London, UK.

ter Braak, C. J. F., I. Hanski, and J. Verboom. 1998. The incidence function approach to the modeling of metapopulation dynamics. Pages 167-188 in J. Bascompte and R. V. Sole, editors. Modeling spatiotemporal dynamics in ecology. Springer-Verlag, Berlin, Germany.

Verboom, J., A. Schotman, P. Opdam, and J. A. J. Metz. 1991. European nuthatch metapopulations in a fragmented agricultural landscape. Oikos 61:149-156.

Wahlberg, N., A. Moilanen, and I. Hanski. 1996. Predicting the occurrence of species in fragmented landscapes. Science 273:1536-1538.

APPENDIX

A self-extracting executable computer program from the metapopulation research group that implements the NLR and MC methods, including sample data and instructions, is available in ESA's Electronic Data Archive: Ecological Archives 080-003.

Printer friendly Cite/link Email Feedback | |

Author: | Moilanen, Atte |
---|---|

Publication: | Ecology |

Date: | Apr 1, 1999 |

Words: | 8517 |

Previous Article: | Search strategies for landscape-level interpatch movements. |

Next Article: | Effects of sampling effort on characterization of food-web structure. |

Topics: |