Key-factor/key-stage analysis for life table data.
Organisms are subjected to various mortality factors, such as natural enemies and climate, each of which causes different mortality in each life stage of the organism. Some factors may affect the mortality of both the egg stage and the larval stage, while other factors may affect only the larval stage. The factor that contributes most to the population fluctuation should be called the "key factor" (Morris 1959). Similarly, the stage that contributes most to the population fluctuation should be called the "key stage." To learn why population density varies in place as well as in time, life tables should be compiled simultaneously for a number of different environments. These data could be used to show which factor is the key factor, which stage is the key stage, and through which stage the key factor exerts its greatest effect.
It generally is laborious to examine various environmental factors simultaneously. For this reason, Morris (1959) proposed a "single-factor analysis." In this analysis, only the mortality caused by a single factor, which is suspected to be a key factor, is analyzed. If the variability of mortality caused by this factor is sufficiently large, it can be confirmed that the factor is a key factor. Morris applied this analysis to the percentage of parasitism for a black-headed budworm, Acleris variana (Fern.). However, the mortality caused by a specific factor is not always measurable. In many cases, the mortality of each life stage can only be estimated, since the mortality may be caused by a mixture of several factors. If the single-factor analysis were applied on a stage-by-stage basis, the analysis would identify the key stage, but would not identify the key factor, unless the stage is subjected to only a single mortality factor. Subsequent to Morris's work, several authors improved the single-factor analysis under the terminology of "key-factor analysis" (Varley and Gradwell 1960, Mott 1966, Metcalfe 1972, Smith 1973, Podoler and Rogers 1975, Manly 1977). In these studies, however, factors and stages were confused, and this confusion seems to remain today.
In this paper, I emphasize the importance of discriminating between the key factor and the key stage. First, I discuss the conventional key-factor analyses that should be called "key-stage analyses." Second, I discuss the utilization of ANOVA as a key-factor analysis. Third, I propose a "key-factor/key-stage analysis" by integrating the conventional key-factor analysis and ANOVA. The effectiveness of the analysis is demonstrated by using the life table data of Pieris rapae crucivora Boisduval (Lepidoptera, Pieridae).
KEY-STAGE ANALYSIS
The conventional key-factor analyses can be summarized as follows. Let [N.sub.ij] be the population entering the ith life stage of the jth observation. Let [k.sub.ij] be the mortality of the ith life stage of the jth observation, defined as the negative logarithm of the survival rate: [k.sub.ij] = -log([N.sub.i+1,j]/[N.sub.ij]), (i = 1, 2, . . ., s; j = 1, 2, . . ., n). Let [K.sub.j] be the mortality through all stages, i.e., $K_j = \sum_{i=1}^{s} k_{ij}$. Here, "mortality" will be used in a broad sense to cover any loss in a given population, whether this loss results from direct mortality, from dispersal, or from reduced fecundity (Morris 1957). The mean of [k.sub.ij] is denoted by $\bar{k}_i$; the mean of [K.sub.j] is denoted by $\bar{K}$. Also, the variance of [K.sub.j] is denoted by V(K), the variance of [k.sub.ij] by V([k.sub.i]), and the covariance between [k.sub.ij] and [k.sub.hj] by cov([k.sub.i], [k.sub.h]). The relative contribution of the ith-stage mortality to the fluctuation of the total mortality can be evaluated by comparing the deviation of the ith-stage mortality, $k_{ij} - \bar{k}_i$, and the deviation of the total mortality, $K_j - \bar{K}$. Varley and Gradwell (1960) plotted the deviations of mortality on a graph and visually compared the synchronization of the fluctuation.
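As a small illustration of these definitions, the following sketch computes the stage mortalities and the total mortality for a single observation. The counts are hypothetical, and the base-10 logarithm is an assumption, since the text writes only "log"; the function name `k_values` is likewise introduced here for illustration.

```python
import math

def k_values(counts, base=10.0):
    """Return stage mortalities k_i = -log(N_{i+1} / N_i) for one life table.

    `counts` lists the numbers entering stages 1..s followed by the number
    completing stage s, so it has s + 1 entries. The base-10 default is an
    assumption; the text does not state the base of the logarithm.
    """
    return [-math.log(after / before, base)
            for before, after in zip(counts, counts[1:])]

# Hypothetical counts for one observation j (s = 3 stages).
counts_j = [1000, 400, 100, 50]
k_j = k_values(counts_j)   # one k-value per stage
K_j = sum(k_j)             # total mortality through all stages
```

Because the k-values are additive, `K_j` equals the negative logarithm of the overall survival rate, here -log(50/1000).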
Several other methods enabling a quantitative comparison are based on the division of the variance into components. Mott (1966) divided V(K) into V([k.sub.i]) and cov([k.sub.i], [k.sub.h]):
$$V(K) = \sum_{i=1}^{s} V(k_i) + 2 \sum_{i=1}^{s-1} \sum_{h=i+1}^{s} \mathrm{cov}(k_i, k_h). \tag{1}$$
This division elucidates the influence of the interaction between the mortality of different stages, as well as the influence of each stage. However, the interpretation of Eq. 1 is not straightforward, since the effect of each stage is scattered among many terms. Smith (1973) and Harcourt (1986) bundled the terms of Eq. 1 symmetrically among stages:
$$V(K) = \sum_{i=1}^{s} \left[ V(k_i) + \sum_{g=1,\, g \neq i}^{s} \mathrm{cov}(k_i, k_g) \right], \tag{2}$$
which can be expressed in a simpler form given by
$$V(K) = \sum_{i=1}^{s} \mathrm{cov}(k_i, K). \tag{3}$$
Podoler and Rogers (1975) used the regression coefficient of [k.sub.ij] against [K.sub.j] to evaluate the influence of [k.sub.ij] on [K.sub.j]. Their method is identical to the division used by Smith (1973), since the regression coefficients are proportional to cov([k.sub.i], K). These kinds of analyses divide the variance into stages, but do not divide it into factors, unless the stage is subjected to only a single mortality factor. Therefore, I call these analyses key-stage analyses instead of key-factor analyses.
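The division in Eq. 3 and its relation to the regression coefficients can be checked numerically. The sketch below uses hypothetical k-values for three stages and five observations; the helper names are introduced only for illustration. The divisor used for the covariance (n - 1 here) does not affect the relative shares.

```python
def mean(xs):
    return sum(xs) / len(xs)

def cov(xs, ys):
    """Sample covariance with divisor n - 1."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical k-values: rows are stages i, columns are observations j.
k = [
    [0.30, 0.35, 0.28, 0.40, 0.33],
    [0.10, 0.12, 0.09, 0.11, 0.10],
    [0.50, 0.80, 0.45, 0.95, 0.60],
]
K = [sum(col) for col in zip(*k)]      # total mortality per observation
V_K = cov(K, K)                        # variance of the total mortality

contrib = [cov(k_i, K) for k_i in k]   # Eq. 3 terms, one per stage
slopes = [c / V_K for c in contrib]    # regression slope of k_i on K
key_stage = max(range(len(k)), key=lambda i: contrib[i]) + 1  # 1-based index
```

The terms in `contrib` sum to `V_K`, and the slopes sum to one, which is why the regression approach and the covariance division rank the stages identically.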
KEY-FACTOR ANALYSIS
The deviation of the total mortality, $K_j - \bar{K}$, which is influenced by the mortality of each stage, is also influenced by various environmental factors, e.g., rainfall, temperature, food quality, and natural enemies. As an illustration, imagine the following field conditions in four plots (j = 1, 2, 3, 4): j = 1, higher rainfall and higher temperature; j = 2, higher rainfall and lower temperature; j = 3, lower rainfall and higher temperature; j = 4, lower rainfall and lower temperature. Assume that $K_j - \bar{K}$ increases by the increment [B.sub.1] when the rainfall is higher and by the increment [B.sub.2] when the temperature is higher. What is of interest is how variation in the environmental factors affects the variation of [K.sub.j]. Hence, the effects of each environmental factor are defined to sum to zero over its levels. Accordingly, assume that $K_j - \bar{K}$ decreases by [B.sub.1] when the rainfall is lower and by [B.sub.2] when the temperature is lower.
The deviation of the total mortality, $K_j - \bar{K}$, is further subject to variability that cannot be explained by known factors. Hence, $K_j - \bar{K}$ should be written as follows:
$$\begin{aligned} K_1 - \bar{K} &= B_1 + B_2 + U_1, \\ K_2 - \bar{K} &= B_1 - B_2 + U_2, \\ K_3 - \bar{K} &= -B_1 + B_2 + U_3, \\ K_4 - \bar{K} &= -B_1 - B_2 + U_4, \end{aligned} \tag{4}$$
where [U.sub.j] are the unknown variabilities. It is convenient to express these equations in matrix form:
$$\begin{pmatrix} K_1 - \bar{K} \\ K_2 - \bar{K} \\ K_3 - \bar{K} \\ K_4 - \bar{K} \end{pmatrix} = \begin{pmatrix} 1 & 1 \\ 1 & -1 \\ -1 & 1 \\ -1 & -1 \end{pmatrix} \begin{pmatrix} B_1 \\ B_2 \end{pmatrix} + \begin{pmatrix} U_1 \\ U_2 \\ U_3 \\ U_4 \end{pmatrix}. \tag{5}$$
Generally, the deviation $K_j - \bar{K}$ can be expressed by
$$\mathbf{K} - \bar{\mathbf{K}} = \mathbf{X}\mathbf{B} + \mathbf{U}, \tag{6}$$
where K is a column vector with elements [K.sub.j], $\bar{\mathbf{K}}$ is a column vector whose elements all equal $\bar{K}$, X is a design matrix determining the arrangement of environmental factors, B is a column vector expressing the effects of factors, and U is a column vector with elements [U.sub.j]. It is assumed that the number of observations is the same for all combinations of the levels of factors.
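The four-plot illustration can be written out as a short sketch. The mortalities below are hypothetical, and the +1/-1 coding of the higher and lower levels is an assumption that follows from defining the effects of each factor to sum to zero.

```python
# Effect-coded design for the four hypothetical plots: columns are
# (rainfall, temperature); +1 = higher level, -1 = lower level.
X = [[+1, +1],
     [+1, -1],
     [-1, +1],
     [-1, -1]]

# Hypothetical total mortalities K_j for the four plots.
K = [1.9, 1.3, 1.1, 0.9]
n = len(K)
K_bar = sum(K) / n
dev = [K_j - K_bar for K_j in K]   # deviations K_j - K_bar

# The columns of X are orthogonal and each has squared length n, so the
# least-squares estimate of each effect is the contrast X_h'dev / n.
B_hat = [sum(X[j][h] * dev[j] for j in range(n)) / n for h in range(2)]

# Unknown variability: what remains after removing the factor effects.
U_hat = [dev[j] - sum(X[j][h] * B_hat[h] for h in range(2)) for j in range(n)]
```

With these numbers the estimated rainfall effect is 0.3 and the temperature effect is 0.2; the remainder `U_hat` sums to zero and is orthogonal to both columns of `X`.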
To identify the key factor, I divide the variance, V(K), into factors. A simple principle to achieve the division is to use the sum of squares in the ANOVA table. When performing the ANOVA, treat each factor as an independent variable, and treat K (not $\mathbf{K} - \bar{\mathbf{K}}$) as the dependent variable. Then, the column of the sum of squares in the ANOVA table gives a partition of the sum of squared deviations, $\sum_{j} (K_j - \bar{K})^2$. To clarify the relation between the ANOVA and the conventional key-factor analyses in the next section, [X.sub.h] is defined as the X matrix in which only the columns related to the hth factor are included and the other columns are replaced by zeros (h = 1, 2, . . ., f). For the example, these matrices are as follows:
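$$X_1 = \begin{pmatrix} 1 & 0 \\ 1 & 0 \\ -1 & 0 \\ -1 & 0 \end{pmatrix}, \qquad X_2 = \begin{pmatrix} 0 & 1 \\ 0 & -1 \\ 0 & 1 \\ 0 & -1 \end{pmatrix},$$
$$X = \sum_{h=1}^{f} X_h. \tag{7}$$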
TABLE 1. Key-factor/key-stage table for the mortality of larvae of P. rapae crucivora.
                                     Instar
Factor                 df    1st    2nd    3rd    4th   Total(a)
Block                   3   -114    105    272    423      686
Spacing                 1     56    -76    172    181      332
Season                  1     29    -22    -16    575      565
Spacing x season        1     67    -29     62     13      114
Unknown variability     9      4    -41    361    742     1066
Total(b)               15     42    -64    850   1933     2762
Notes: The variance of the total mortality, V(K), is divided into five factors x four stages. All values are multiplied by [10.sup.4] to facilitate the comparison.
(a) This column shows which factor is the key factor.
(b) This row shows which stage is the key stage.
Then, the sum of squares in the ANOVA table is expressed by [Mathematical Expression Omitted], and the residual sum of squares is expressed by [Mathematical Expression Omitted], where a hat (^) indicates an estimate obtained by ANOVA and a prime (') indicates transposition (see Searle 1971). Hence, V(K) can be expressed as
[Mathematical Expression Omitted]. (8)
Eq. 8 divides the variance into (f + 1) components separated by brackets, each corresponding to a factor and unknown variability, where f is the total number of factors. Hence, the relative contribution of each factor can be evaluated by comparing these (f + 1) terms. This kind of analysis should be called key-factor analysis, since it divides the variance into factors.
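A minimal sketch of this division is given below for two-level factors. It rests on several assumptions that are not taken from the text: the sample variance uses the divisor n - 1, each factor is coded as a single +1/-1 column, and the columns are balanced and mutually orthogonal, so that the sum of squares of a factor is $(X_h'K)^2 / (X_h'X_h)$. The data and the function name are hypothetical.

```python
def key_factor_partition(X_cols, K):
    """Split V(K) into one share per factor plus an unexplained share.

    X_cols maps a factor name to its effect-coded column (+1/-1). The
    columns are assumed balanced (each sums to zero) and mutually
    orthogonal, so each factor's sum of squares is (X_h'K)^2 / (X_h'X_h).
    """
    n = len(K)
    K_bar = sum(K) / n
    total_ss = sum((v - K_bar) ** 2 for v in K)
    shares = {}
    for name, col in X_cols.items():
        proj = sum(c * v for c, v in zip(col, K))
        shares[name] = proj ** 2 / sum(c * c for c in col) / (n - 1)
    # Residual share: total variance minus the shares of the known factors.
    shares["unknown"] = total_ss / (n - 1) - sum(shares.values())
    return shares

# Hypothetical balanced design: 2 rainfall levels x 2 temperature levels,
# two observations per combination.
design = {
    "rainfall":    [+1, +1, +1, +1, -1, -1, -1, -1],
    "temperature": [+1, +1, -1, -1, +1, +1, -1, -1],
}
K = [2.0, 1.8, 1.4, 1.6, 1.2, 1.0, 0.8, 0.6]
shares = key_factor_partition(design, K)
key_factor = max(shares, key=shares.get)
```

Because the columns sum to zero, using K or its deviation from the mean gives the same projections, which is consistent with treating K itself as the dependent variable.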
KEY-FACTOR/KEY-STAGE ANALYSIS
The key-stage analysis given by Eq. 3 and the key-factor analysis given by Eq. 8 can be combined in the following manner. I have considered the effects of environmental factors on the deviation of the total mortality, $K_j - \bar{K}$. In an actual situation, however, the environmental factors affect each of the life stages. To take this into account, I express the effects of environmental factors on the ith stage by
$$\mathbf{k}_i - \bar{\mathbf{k}}_i = \mathbf{X}\mathbf{b}_i + \mathbf{u}_i, \tag{9}$$
where [k.sub.i] is a column vector with elements [k.sub.ij], $\bar{\mathbf{k}}_i$ is a column vector whose elements all equal $\bar{k}_i$, [b.sub.i] is a column vector expressing the effects of factors in the ith stage, and [u.sub.i] is a column vector expressing the unknown variability in the ith stage. As a logical consequence, $\mathbf{B} = \sum_{i=1}^{s} \mathbf{b}_i$ and $\mathbf{U} = \sum_{i=1}^{s} \mathbf{u}_i$. To evaluate the effect of each combination of factor x stage on V(K), the variance and covariance of [k.sub.i] in Eq. 2 should be divided into factors. A simple principle to achieve the division is to use the sum of squares and sum of products calculated in MANOVA. The [k.sub.i] (not $\mathbf{k}_i - \bar{\mathbf{k}}_i$) of the different stages are to be treated as different variables when performing the MANOVA. Then, the variance is expressible as
[Mathematical Expression Omitted] (10)
where [Mathematical Expression Omitted] and [Mathematical Expression Omitted] are the elements of the SSP matrix (sum of squares and products matrix) calculated in MANOVA (see Chatfield and Collins 1980). Eq. 10 divides the variance into s(f + 1) components, separated by brackets, each corresponding to a combination of factors x stages. These s(f + 1) terms may be easily obtained by using statistical software such as SAS, JMP, or Systat; [Mathematical Expression Omitted] is obtained by summing the elements of the ith column of the SSP matrix of the hth factor; and [Mathematical Expression Omitted] is obtained by summing the elements of the ith column of the error SSP matrix. In the usual MANOVA, the SSP matrix is used to test the effects of factors. However, such kinds of statistical tests are not of interest at present, since the manner in which the total variance is attributed to factors and stages is under investigation.
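The calculation of the s(f + 1) components can be sketched without statistical software for the simplest case. The sketch assumes, beyond what the text states, that each factor has two levels coded as one +1/-1 column, that the design is balanced and orthogonal, and that the variance uses the divisor n - 1. Under these assumptions the component for factor h and stage i reduces to $(X_h'k_i)(X_h'K) / (X_h'X_h) / (n - 1)$, and the unknown-variability component is the remainder of cov([k.sub.i], K). A factor with more than two levels, such as a block factor, would need several columns and is not covered here. All data and names are hypothetical.

```python
def centered(xs):
    m = sum(xs) / len(xs)
    return [x - m for x in xs]

def dot(xs, ys):
    return sum(x * y for x, y in zip(xs, ys))

def key_factor_key_stage_table(X_cols, k):
    """Return {row name: [entry per stage]} splitting V(K) by factor x stage.

    X_cols: factor name -> effect-coded (+1/-1) column; columns are assumed
            balanced and mutually orthogonal.
    k:      list of stages, each a list of k_ij over the observations j.
    """
    n = len(k[0])
    K = [sum(col) for col in zip(*k)]
    Kc = centered(K)
    table = {}
    for name, x in X_cols.items():
        xx = dot(x, x)
        # Sum of the ith column of the factor's SSP matrix, divided by n - 1.
        table[name] = [dot(x, k_i) * dot(x, K) / xx / (n - 1) for k_i in k]
    explained = [sum(row[i] for row in table.values()) for i in range(len(k))]
    # Under orthogonality the residual cross-products equal cov(k_i, K)
    # minus the part explained by the known factors.
    table["unknown"] = [dot(centered(k_i), Kc) / (n - 1) - e
                        for k_i, e in zip(k, explained)]
    return table

design = {
    "rainfall":    [+1, +1, +1, +1, -1, -1, -1, -1],
    "temperature": [+1, +1, -1, -1, +1, +1, -1, -1],
}
k = [
    [0.5, 0.4, 0.5, 0.6, 0.3, 0.3, 0.4, 0.2],   # stage 1
    [1.5, 1.4, 0.9, 1.0, 0.9, 0.7, 0.4, 0.4],   # stage 2
]
table = key_factor_key_stage_table(design, k)
grand_total = sum(sum(row) for row in table.values())
```

Row sums give the key-factor analysis, column sums give the key-stage analysis, and `grand_total` recovers V(K). Individual cells can be negative, as with temperature in stage 1 here, when a factor shifts a stage's mortality in the direction opposite to its effect on the total.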
The effectiveness of this method will be demonstrated using data from an experiment on the effect of plant spacing on the mortality of P. rapae crucivora (Yamamura and Yano, unpublished data). The data are listed in the Appendix so that readers can verify the calculations. In this experiment, cabbage seedlings were planted at two levels of plant spacing: 2 x 2 m (sparse) and 0.5 x 0.5 m (dense). The experimental plots, each 10 x 10 m, were replicated in four blocks, and the numbers of first-, second-, third-, fourth-, and fifth-instar larvae were estimated in each plot. The experiment was conducted twice in 1989, starting on May 25 (spring experiment) and on July 12 (summer experiment).
The result of the analysis is expressed in a "key-factor/key-stage table," with columns indicating stages and rows indicating factors (Table 1). The "Total" column lists the sum of squares of each factor given by Eq. 8, showing the degree of influence of each factor on the fluctuation of total mortality. The largest component is the unknown variability (1066) in this example. The "Total" row lists the result of the key-stage analysis given by Eq. 3, showing the degree of influence of each life stage on the fluctuation of total mortality. The mortality of the fourth-instar larvae is most influential (1933) in this example. Furthermore, the key-factor/key-stage table shows which combination of factor x stage is influential in determining the total mortality. The largest source of the variance of K is the effect of unknown variability through the mortality of the fourth-instar larvae (742) in this example. The variation in season also substantially increases the variance of K through the mortality of the fourth-instar larvae (575). The effects of block, i.e., the effects of spatial variation, through the mortality of the third- and fourth-instar larvae are also relatively large (272 and 423, respectively).
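The reading of Table 1 described above can be reproduced mechanically from the cell values. The sketch below only re-adds the printed cells; the recomputed margins differ from the printed totals by at most one unit, presumably because the printed cells are rounded.

```python
stages = ["1st", "2nd", "3rd", "4th"]

# Cell values copied from Table 1 (variance components multiplied by 10^4).
table1 = {
    "Block":               [-114, 105, 272, 423],
    "Spacing":             [56, -76, 172, 181],
    "Season":              [29, -22, -16, 575],
    "Spacing x season":    [67, -29, 62, 13],
    "Unknown variability": [4, -41, 361, 742],
}

# Row sums correspond to the "Total" column, column sums to the "Total" row.
factor_totals = {name: sum(row) for name, row in table1.items()}
stage_totals = [sum(row[i] for row in table1.values())
                for i in range(len(stages))]

key_factor = max(factor_totals, key=factor_totals.get)
key_stage = stages[max(range(len(stages)), key=lambda i: stage_totals[i])]

# Largest single factor x stage combination.
top_cell = max(((name, stages[i], v)
                for name, row in table1.items()
                for i, v in enumerate(row)),
               key=lambda t: t[2])
```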
The results of the key-factor/key-stage analysis and those of the conventional key-factor analyses may sometimes be contradictory, since the key-factor/key-stage analysis distinguishes between cause and effect. For example, the percentage of parasitism that is directly measured in the field is not a factor in the key-factor/key-stage analysis; it is an effect caused by factors such as the number of parasites. As an illustration, consider a situation in which the activity of parasites is highly influenced by weather conditions (Kiritani and Hokyo 1970, for example). If the weather fluctuates widely, the percentage of parasitism fluctuates widely. In this case, the conventional key-factor analyses identify parasitism as the key factor. In contrast, the key-factor/key-stage analysis identifies the weather as the key factor and states that the variability of the weather increases the variance of K through parasitism. Thus, if the environmental factors as well as the mortality of populations could be measured, the key-factor/key-stage analysis could quantitatively identify the ultimate cause of population dynamics.
The key-factor/key-stage analysis can be applied to numerical factors as well as nominal factors if only a single factor is considered. For example, it is possible to evaluate the effect of temperature, given by numerical values instead of nominal levels, on the population variability. In this case, X in Eq. 6 is a column vector whose elements are the temperatures with the mean subtracted. The variance, V(K), can be divided into the effect of temperature and the effect of unknown variability on each stage, using the SSP matrix generated by multivariate linear regression analysis. When more than one factor is analyzed, however, numerical factors will frequently present problems: if the components of the design matrix, [X.sub.h], are not mutually orthogonal, the key factor cannot be determined, since V(K) cannot be divided algebraically. This difficulty can be intuitively understood by considering an imaginary situation where the rainfall and the temperature are closely correlated, e.g., more observations are obtained for the combination of high rainfall and high temperature. In such a case, even if the rainfall appears to make a large contribution to the population variability, it cannot be judged whether this contribution derives from the rainfall or from the temperature. Therefore, the experiment should be carefully designed when more than one factor is being considered. It is preferable that the number of observations be the same for all combinations of levels of factors.
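Whether a given arrangement of observations permits the division can be checked before the analysis. The sketch below tests the mutual orthogonality of mean-centered factor columns for a balanced and an unbalanced hypothetical design; the function name and the +1/-1 coding are assumptions made for illustration.

```python
def is_orthogonal(cols, tol=1e-12):
    """True if every pair of mean-centered columns has zero inner product."""
    centered = [[v - sum(c) / len(c) for v in c] for c in cols]
    return all(
        abs(sum(a * b for a, b in zip(centered[p], centered[q]))) <= tol
        for p in range(len(centered))
        for q in range(p + 1, len(centered))
    )

# Balanced: each rainfall x temperature combination is observed once.
balanced = [[+1, +1, -1, -1],     # rainfall
            [+1, -1, +1, -1]]     # temperature

# Unbalanced: high rainfall with high temperature is observed three times.
unbalanced = [[+1, +1, +1, +1, -1, -1],
              [+1, +1, +1, -1, +1, -1]]
```

For the unbalanced arrangement the two columns are correlated, so the share of V(K) attributed to one factor depends on whether the other factor is accounted for first.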
It is also possible to analyze the variability of the population that completed the sth stage, [N.sub.s+1,j], instead of the variability of mortality. For convenience, let [k.sub.0j] be the negative logarithm of the number of individuals entering the first stage, i.e., [k.sub.0j] = -log([N.sub.1j]). Then, the following relation holds: $-\log(N_{s+1,j}) = \sum_{i=0}^{s} k_{ij}$. Hence, the variance can be divided into the contribution of each stage by a form similar to Eq. 3:
$$V[\log(N_{s+1})] = \sum_{i=0}^{s} \mathrm{cov}[k_i, -\log(N_{s+1})]. \tag{11}$$
The variance can be further divided into each combination of factor x stage by a form similar to Eq. 10:
[Mathematical Expression Omitted] (12)
where the (f + 1)[(s + 1).sup.2] terms seen inside the summation sign are the elements of the SSP matrix calculated in MANOVA in which [k.sub.i] of the different stages are treated as different variables.
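The stage-wise division in Eq. 11 can be verified with a few lines of code. The counts below are hypothetical, and the base-10 logarithm is an assumption; the identity holds for any base.

```python
import math

def cov(xs, ys):
    """Sample covariance with divisor n - 1."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical counts: rows are observations j; columns are N_1, ..., N_{s+1}.
N = [
    [1000, 400, 100],
    [1500, 450, 90],
    [800, 400, 160],
    [1200, 300, 60],
]
s = len(N[0]) - 1

# k_0j = -log N_1j, and k_ij = -log(N_{i+1,j} / N_ij) for i = 1..s.
k = [[-math.log10(row[0]) for row in N]]
k += [[-math.log10(row[i + 1] / row[i]) for row in N] for i in range(s)]

neg_log_final = [-math.log10(row[-1]) for row in N]
terms = [cov(k_i, neg_log_final) for k_i in k]        # Eq. 11, i = 0..s
var_log_final = cov(neg_log_final, neg_log_final)     # V[log N_{s+1}]
```

The first element of `terms` is the contribution of the initial density, and the remaining elements are the contributions of the s stage mortalities.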
The estimates of population size are sometimes subjected to large sampling errors. The population data used for the calculation of Table 1 also contain sampling errors; the survival rate exceeds 1 in several elements. If [N.sub.ij] contains such sampling errors, the cov([k.sub.i], K) in Eq. 3 is overestimated for i = 1 and i = s, and the cov[[k.sub.i], -log([N.sub.s + 1])] in Eq. 11 is overestimated for i = s (see Kuno 1971). Similarly, the components of the first and final stages are overestimated in Eq. 10, as are components of the final stage in Eq. 12. Thus, the results of the analysis may be biased in this case. Careful identification of the key factor and key stage should be made if the estimates of population sizes contain large sampling errors.
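The direction of this bias can be illustrated by a thought experiment in which the true population sizes are identical in every observation, so that any apparent contribution of a stage is an artifact. The simulation below assumes independent, normally distributed errors on the log-counts; this error model, the parameter values, and the counts are illustrative assumptions and are not taken from Kuno (1971).

```python
import math
import random

def cov(xs, ys):
    """Sample covariance with divisor n - 1."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

random.seed(1)
n, s, sigma = 20000, 4, 0.1
true_counts = [1000, 500, 250, 125, 60]   # identical in every observation

k = [[] for _ in range(s)]
for _ in range(n):
    # Observed log-count = true log-count + independent sampling error.
    obs = [math.log10(N) + random.gauss(0.0, sigma) for N in true_counts]
    for i in range(s):
        k[i].append(obs[i] - obs[i + 1])   # k = -log(N_next / N_current)

K = [sum(col) for col in zip(*k)]
contrib = [cov(k_i, K) for k_i in k]       # Eq. 3 terms under pure error
```

Under this error model the expected value of cov([k.sub.i], K) is sigma squared for the first and the last stage and zero for the intermediate stages, because the errors of the intermediate counts cancel in K. The first and final stages therefore appear influential even though the true mortality never varies.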
I thank K. Kiritani and K. Iwao for their valuable comments on the manuscript.
Chatfield, C., and A. J. Collins. 1980. Introduction to multivariate analysis. Chapman and Hall, London, UK.
Harcourt, D. G. 1986. Population dynamics of the diamondback moth in southern Ontario. Pages 3-15 in T. D. Griggs, editor. Diamondback moth management: proceedings of the first international workshop. Asian Vegetable Research and Development Center, Shanhua, Taiwan.
Kiritani, K., and N. Hokyo. 1970. Studies on the population ecology of the southern stink bug, Nezara viridula L. (Heteroptera: Pentatomidae). In Japanese. Ministry of Agriculture, Forestry, and Fisheries, Tokyo, Japan.
Kuno, E. 1971. Sampling error as a misleading artifact in "key factor analysis." Researches on Population Ecology 13:28-45.
Manly, B. F. J. 1977. The determination of key factors from life table data. Oecologia 31:111-117.
Metcalfe, J. R. 1972. An analysis of the population dynamics of the Jamaican sugar-cane pest Saccharosydne saccharivora (Westw.) (Hom., Delphacidae). Bulletin of Entomological Research 62:73-85.
Morris, R. F. 1957. The interpretation of mortality data in studies on population dynamics. Canadian Entomologist 89:49-69.
-----. 1959. Single-factor analysis in population dynamics. Ecology 40:580-588.
Mott, D. G. 1966. The analysis of determination in population systems. Pages 179-194 in K. E. F. Watt, editor. Systems analysis in ecology. Academic Press, New York, New York, USA.
Podoler, H., and D. Rogers. 1975. A new method for the identification of key factors from life table data. Journal of Animal Ecology 44:85-114.
Searle, S. R. 1971. Linear models. Wiley, New York, New York, USA.
Smith, R. H. 1973. The analysis of intra-generation change in animal populations. Journal of Animal Ecology 42:611-622.
Varley, G. C., and G. R. Gradwell. 1960. Key factors in population studies. Journal of Animal Ecology 29:399-401.
Date: Mar 1, 1999