
Two Types of GGE Biplots for Analyzing Multi-Environment Trial Data.

MULTIENVIRONMENT TRIALS (MET) are conducted for all major crops throughout the world. The main purpose of MET is to identify superior cultivars for recommendation to farmers and to identify sites that best represent the target environment. Usually, a large number of genotypes are tested over a number of sites and years, and it is often difficult to determine the pattern of genotypic responses across environments without a graphical display of the data.

Yan et al. (2000) developed a "GGE biplot" methodology for graphical analysis of MET data. "GGE" refers to the genotype main effect (G) plus the genotype x environment interaction (GE), which are the two sources of variation that are relevant to cultivar evaluation. A biplot (Gabriel, 1971) is a plot that simultaneously displays both the genotypes and the environments (or in more general terms, both the row and the column factors). The GGE biplot is a biplot that displays the GGE of MET data. It is constructed by plotting the first two principal components (PC1 and PC2, also referred to as primary and secondary effects, respectively) derived from singular value decomposition (SVD) of the environment-centered data. Models that decompose the environment-centered data are commonly referred to as sites regression models or SREG, and SREG with two PCs is referred to as [SREG.sub.2]. SREG can be used on scaled or non-scaled data. When replicated data are available, SREG on scaled data (Crossa and Cornelius, 1997) is more desirable because it deals with any heterogeneity of within-site error variance.

One unique merit of a GGE biplot is that it can graphically show the which-won-where patterns of the data, as first described in Yan et al. (2000). Briefly, markers of the cultivars furthest from the plot origin (0,0) are connected with straight lines to form a polygon such that markers of all other cultivars are contained in the polygon. To each side of the polygon, a perpendicular line, starting from the origin of the biplot, is drawn and extended beyond the polygon so that the biplot is divided into several sectors and the markers of the test sites are separated into different sectors. The cultivar at the vertex for each sector is the best performer at sites included in that sector, provided that the GGE is sufficiently approximated by PC1 and PC2. Thus, groups of sites that share the same best performers are graphically identified.

If the which-won-where patterns identified by a biplot are repeatable over years, different mega-environments (subregions) can be defined. By selecting superior cultivars for each mega-environment, both G and GE can be effectively exploited. The GGE biplot is still useful even in cases where the which-won-where patterns are not repeatable over years, which suggests that the tested environments belong to a single mega-environment. It can be used to identify superior cultivars and test environments that facilitate identification of such cultivars, provided that the target mega-environment is sufficiently sampled and that the genotype PC1 scores have near-perfect correlation (say, r > 0.95) with the genotype main effects. Ideal cultivars should have large PC1 scores (higher average yield) and near-zero PC2 scores (more stable). Similarly, ideal test environments should have large PC1 scores (more discriminating of the cultivars) and near-zero PC2 scores (more representative of an average environment). (Note that a "test environment" refers to a year-site combination; it does not necessarily correspond to a "test site".) Thus, the GGE biplot allows many important questions to be addressed effectively and graphically.

However, the requirement for a near-perfect correlation between genotype PC1 scores and genotype main effects is not always met, which restricts the utility of the [SREG.sub.2] based GGE biplot. Analysis of the yearly MET data of the Ontario winter wheat performance trials during 1989-1999, and of winter wheat performance trials from several states of the USA (Yan, unpublished) indicates that the genotype PC1 scores are usually highly correlated with the genotype main effects. Poor correlations between genotype PC1 scores and genotype main effects, however, do occur for some years. Moreover, when multiple years of data are analyzed together, poor correlation becomes the norm rather than the exception because of large and complex GE interaction (discussed later). In such cases, the genotype PC1 scores cannot be interpreted as representing the same information as the genotype main effects. Consequently, the yielding ability and stability of the genotypes, and the discriminating ability and the representativeness of the test environments, cannot be readily visualized.

To avoid these possible exceptions, in this paper we report an alternative GGE biplot. It is constructed with Mandel's sites regression on genotype main effects as the primary effect, and with the first principal component from SVD of the regression residual as the secondary effect. Such a GGE biplot is referred to as a [SREG.sub.M+1] biplot, with the subscript "M" referring to Mandel's solution. In a [SREG.sub.M+1] biplot, the primary effects are the genotype main effects per se; it is, therefore, free from the problem discussed above for the [SREG.sub.2] biplot. However, it is not clear if a [SREG.sub.M+1] biplot is as effective as the [SREG.sub.2] biplot in explaining the GGE and in displaying the which-won-where patterns of the data. This study was initiated to answer these questions by comparing the [SREG.sub.2] biplot and the [SREG.sub.M+1] biplot applied to several datasets that showed different relations between genotype PC1 scores of [SREG.sub.2] and the genotype main effects.


The [SREG.sub.2] Biplot

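The [SREG.sub.2] based GGE biplot is derived from Eq. [1]:

[1] [Y.sub.ij] - [[Beta].sub.j] = [[Lambda].sub.1][[Xi].sub.i1][[Eta].sub.j1] + [[Lambda].sub.2][[Xi].sub.i2][[Eta].sub.j2] + [[Epsilon].sub.ij]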

where [Y.sub.ij] is the average yield of Genotype i in Environment j, [[Beta].sub.j] is the average yield of all genotypes in Environment j, [[Lambda].sub.n] is the singular value for principal component PCn, [[Xi].sub.in] and [[Eta].sub.jn] are scores for Genotype i and Environment j on PCn, respectively, and [[Epsilon].sub.ij] is the residual associated with Genotype i in Environment j. The values of [[Lambda].sub.n], [[Xi].sub.in], and [[Eta].sub.jn] are simultaneously obtained by subjecting the environment-centered yield (i.e., [Y.sub.ij] - [[Beta].sub.j]) to SVD. This can be achieved by principal component analysis of the environment-centered yield using the SAS procedure PRINCOMP. The PRINCOMP generates [[Xi]], as the genotype scores and ([[Lambda].sub.n][[Xi]]) as the environment scores. Alternatively, [[Lambda].sub.n], [[Xi].sub.in], and [[Eta].sub.jn] can be obtained by the SVD function within the SAS procedure IML, which is a basic function in many SAS procedures related to principal component analysis. A SAS program for principal component analysis of MET data is available from the senior author of this paper.
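The decomposition described above can be sketched outside SAS. The following Python/NumPy fragment is an illustration only, not the authors' SAS program: the function name `sreg2` and the toy yield table are hypothetical. It centers each environment (column) on its mean, applies SVD, and keeps the first two terms.

```python
import numpy as np

def sreg2(yields):
    """Fit a two-term sites regression model to a genotype x environment table."""
    Y = np.asarray(yields, dtype=float)      # rows: genotypes, columns: environments
    beta = Y.mean(axis=0)                    # environment means (beta_j)
    P = Y - beta                             # environment-centered yields
    U, s, Vt = np.linalg.svd(P, full_matrices=False)
    lam = s[:2]                              # singular values for PC1 and PC2
    xi = U[:, :2]                            # genotype singular vectors
    eta = Vt[:2].T                           # environment singular vectors
    explained = lam**2 / np.sum(s**2)        # share of GGE sum of squares per PC
    return beta, lam, xi, eta, explained

# Hypothetical yields for 4 genotypes in 3 environments (not from the trials).
toy_yields = [[5.0, 6.1, 4.2],
              [4.6, 6.8, 3.9],
              [5.4, 5.5, 4.8],
              [4.0, 6.0, 3.5]]

beta, lam, xi, eta, explained = sreg2(toy_yields)
fitted = beta + (xi * lam) @ eta.T           # rank-two reconstruction of the yields
print("share of GGE explained by PC1, PC2:", explained)
```

The sum of squares of the environment-centered table is the total GGE variation, so `explained` corresponds to the kind of percentages reported for PC1 and PC2.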

To display results of fitting Eq. [1] in a biplot, the singular value [[Lambda].sub.n] has to be absorbed by the singular vector for cultivars [[Xi].sub.in] and that for environments [[Eta].sub.jn]. That is, [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. [A.sub.n] is chosen such that the range of the environment markers is equal to the range of the cultivar markers:





[2] [A.sub.n] = 0.5 {1 + ln[(max([[Eta].sub.jn]) - min([[Eta].sub.jn])) / (max([[Xi].sub.in]) - min([[Xi].sub.in]))] / ln [[Lambda].sub.n]}.
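As a check on Eq. [2], the sketch below assumes that the cultivar markers are the singular vector multiplied by the singular value raised to the power [A.sub.n], and the environment markers are the singular vector multiplied by the singular value raised to the power 1 - [A.sub.n]. This partition is inferred from Eq. [2]; the corresponding expression is not reproduced in the text. The vectors and the singular value are hypothetical, and the formula requires a singular value different from 1.

```python
import math

def partition_exponent(xi, eta, lam):
    """Exponent that equalizes the ranges of cultivar and environment markers."""
    range_xi = max(xi) - min(xi)
    range_eta = max(eta) - min(eta)
    # Undefined when lam == 1, because log(1) = 0.
    return 0.5 * (1.0 + math.log(range_eta / range_xi) / math.log(lam))

# Hypothetical singular vectors and singular value, for illustration only.
xi = [0.6, -0.2, -0.4]
eta = [0.8, -0.1, -0.5, -0.2]
lam = 9.0

a = partition_exponent(xi, eta, lam)
genotype_markers = [lam**a * x for x in xi]                # assumed scaling
environment_markers = [lam**(1.0 - a) * h for h in eta]    # assumed scaling

genotype_range = max(genotype_markers) - min(genotype_markers)
environment_range = max(environment_markers) - min(environment_markers)
print(a, genotype_range, environment_range)
```

Under this assumption the two printed ranges coincide, which is the stated purpose of choosing [A.sub.n] this way.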

The [SREG.sub.M+1] Biplot

Mandel (1961) presented the following model for analysis of non-additivity of two-way data:

[3] [Y.sub.ij] = [[Beta].sub.j] + [b.sub.j][[Alpha].sub.i] + [[Epsilon].sub.ij]

where [Y.sub.ij] and [[Beta].sub.j] are the same as in Eq. [1], [[Alpha].sub.i] is the main effect of Genotype i, and [b.sub.j] is the regression coefficient of the environment-centered yields (i.e., [Y.sub.ij] - [[Beta].sub.j]) within Environment j on the genotype main effects ([[Alpha].sub.i]). Equation [3] is similar to the well-known model of Finlay and Wilkinson (1963), but the roles of cultivars and sites are exchanged.
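A minimal sketch of the regression in Eq. [3] follows, assuming a complete (balanced) table with genotypes in rows and environments in columns. The function name and the toy yields are hypothetical. Because both the genotype main effects and the environment-centered yields sum to zero over genotypes, the slope can be computed without an intercept.

```python
def mandel_regression(yields):
    """Return environment means, genotype main effects, and per-environment slopes."""
    g, e = len(yields), len(yields[0])
    beta = [sum(row[j] for row in yields) / g for j in range(e)]   # environment means
    grand = sum(beta) / e                                          # grand mean
    alpha = [sum(row) / e - grand for row in yields]               # genotype main effects
    ss_alpha = sum(a * a for a in alpha)                           # must be nonzero
    b = [sum(alpha[i] * (yields[i][j] - beta[j]) for i in range(g)) / ss_alpha
         for j in range(e)]
    return beta, alpha, b

# Hypothetical yields for 4 genotypes in 3 environments (not from the trials).
toy_yields = [[5.0, 6.1, 4.2],
              [4.6, 6.8, 3.9],
              [5.4, 5.5, 4.8],
              [4.0, 6.0, 3.5]]

beta, alpha, b = mandel_regression(toy_yields)
print("slopes b_j:", b)
```

For a balanced table the slopes average to 1, so an environment with a slope above 1 spreads the genotypes more widely along their main effects than the average environment does.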

If the first principal component ([[Lambda].sub.1][[Xi].sub.i1][[Eta].sub.j1]) from SVD of the residual from Eq. [3], i.e., ([Y.sub.ij] - [[Beta].sub.j] - [b.sub.j][[Alpha].sub.i]), is added, then

[4] [Y.sub.ij] = [[Beta].sub.j] + [b.sub.j][[Alpha].sub.i] + [[Lambda].sub.1][[Xi].sub.i1][[Eta].sub.j1] + [[Epsilon].sub.ij] or [Y.sub.ij] - [[Beta].sub.j] = [b.sub.j][[Alpha].sub.i] + [[Lambda].sub.1][[Xi].sub.i1][[Eta].sub.j1] + [[Epsilon].sub.ij]

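where all terms are the same as defined in Eq. [1] or [3]. To construct a [SREG.sub.M+1] biplot, Eq. [4] is written as

[5] [Y.sub.ij] - [[Beta].sub.j] = ([b.sub.j]B)([[Alpha].sub.i]/B) + ([[Lambda].sub.1]^[A.sub.1] [[Xi].sub.i1])([[Lambda].sub.1]^(1 - [A.sub.1]) [[Eta].sub.j1]) + [[Epsilon].sub.ij]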


[6] B = square root of [(max([[Alpha].sub.i]) - min([[Alpha].sub.i])) / (max([b.sub.j]) - min([b.sub.j]))].

[A.sub.1] and B are chosen such that the plot space used by genotypes is the same as that used by environments. Analogous to PC1 and PC2 in the [SREG.sub.2] model, [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] and [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] are referred to as the primary and secondary effects, respectively. All analyses were conducted using SAS (SAS Institute, 1996).
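The construction of [SREG.sub.M+1] biplot coordinates can be sketched as follows. This is an illustration under stated assumptions, not the authors' SAS code: genotype primary scores are taken as the main effects divided by B, environment primary scores as the regression slopes multiplied by B, and the secondary scores are scaled with the exponent from Eq. [2]. These scalings are chosen because they equalize the marker ranges as the text requires. The function names and the toy data are hypothetical.

```python
import numpy as np

def span(v):
    """Range (max minus min) of a vector."""
    return v.max() - v.min()

def sreg_m1(yields):
    """Genotype and environment coordinates for a Mandel-plus-one-PC biplot."""
    Y = np.asarray(yields, dtype=float)
    P = Y - Y.mean(axis=0)                     # environment-centered yields
    alpha = P.mean(axis=1)                     # genotype main effects
    b = P.T @ alpha / (alpha @ alpha)          # Mandel regression slopes
    R = P - np.outer(alpha, b)                 # residual from the regression
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    lam1, xi1, eta1 = s[0], U[:, 0], Vt[0]     # first PC of the residual
    B = np.sqrt(span(alpha) / span(b))         # assumes the slopes are not all equal
    A1 = 0.5 * (1 + np.log(span(eta1) / span(xi1)) / np.log(lam1))  # needs lam1 != 1
    geno = np.column_stack([alpha / B, lam1**A1 * xi1])
    env = np.column_stack([b * B, lam1**(1 - A1) * eta1])
    return geno, env

# Hypothetical yields for 4 genotypes in 3 environments (not from the trials).
toy_yields = [[5.0, 6.1, 4.2],
              [4.6, 6.8, 3.9],
              [5.4, 5.5, 4.8],
              [4.0, 6.0, 3.5]]

geno, env = sreg_m1(toy_yields)
print(geno)
print(env)
```

The first column of each array is the primary effect (abscissa) and the second is the secondary effect (ordinate). The inner product of a genotype row and an environment row gives the fitted environment-centered yield.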

The Data

The data used in this study were from the 1989 to 1999 Ontario winter wheat performance trials (Yan, 1999). Each year, 10 to 33 winter wheat (Triticum aestivum L.) cultivars are tested with four to six replicates in seven to 14 sites representing the Ontario winter wheat growing areas. Previous analysis indicated that the yearly variance components due to environment (E) dominated the total yield variation, ranging from 55 to 91% and averaging 80% of the total variance. The variance component due to G ranged from 1.8 to 28.5%, whereas that due to GE ranged from 7.3 to 15.1% (Yan, 1999). G ranged from 13 to 65% of the total GGE. Analysis with the [SREG.sub.2] biplot revealed that in all years except 1995 the environmental PC1 scores were of the same sign; and in all years except 1995 and 1996 the genotype PC1 scores showed high correlation with the mean yield of the genotypes (r > 0.93). Thus, in this study the 1995, 1996, and 1998 datasets, representing different types of relations between genotype PC1 scores and genotype main effects, were chosen to compare the GGE biplot based on [SREG.sub.M+1] with one based on [SREG.sub.2]. In addition, a complete subset of 11 cultivars by 34 environments (year-site combinations) extracted from the 1996 to 1999 trials was also used in the comparison.


For all datasets, both [SREG.sub.2] and [SREG.sub.M+1] use the same number of degrees of freedom [(g + e - 2) + (g + e - 4), or 2(g + e) - 6, where g is the number of genotypes and e the number of environments] (Table 1). With the same number of degrees of freedom, [SREG.sub.2] is theoretically the most effective model for explaining the variation due to GGE, because the first two principal components are computed to explain the maximum amount of variation. Nevertheless, [SREG.sub.M+1] explained only slightly smaller amounts of GGE. When averaged over the 12 datasets, [SREG.sub.2] explained 69.1%, whereas [SREG.sub.M+1] explained 67.8% of the total GGE (Table 1). Thus, [SREG.sub.M+1] is nearly as effective as [SREG.sub.2] in explaining the variation of GGE. The discussion will therefore focus on whether the [SREG.sub.M+1] biplot displays a which-won-where pattern similar to that of the [SREG.sub.2] biplot.

Table 1. Proportions of GGE SS explained by [SREG.sub.2] and
[SREG.sub.M+1] for 12 datasets from the 1989-1999 Ontario winter
wheat performance trials.

                                            % of GGE explained     % of GGE explained
                                              by [SREG.sub.2]       by [SREG.sub.M+1]

           No. of     No. of   Degrees of
Year      cultivars   sites     freedom     PC1    PC2    Total   Primary   Secondary   Total

1989         10         9          32       42.5   21.3    63.8    40.7       21.9       62.6
1990         10         7          28       59.7   21.2    80.9    53.5       25.1       78.6
1991         10         9          32       53.3   20.7    74.0    49.1       22.1       71.2
1992         10        10          34       57.0   19.9    76.9    56.4       20.1       76.5
1993         18         9          48       56.8   20.0    76.8    55.4       21.2       76.6
1994         14        11          44       45.6   16.2    61.8    41.6       16.8       58.4
1995         14        14          50       54.2   13.4    67.6    40.8       25.2       66.0
1996         23         9          56       29.6   24.5    54.1    26.7       25.3       52.0
1997         28         8          66       55.0   15.9    70.9    54.0       15.9       69.9
1998         33         8          76       71.5   14.7    86.2    71.0       15.2       86.2
1999         31         9          74       51.5   17.4    68.9    50.7       17.7       68.4
1996-99      11        34          84       24.5   22.7    47.2    23.0       23.9       46.9

Average      --        --          --       50.1   19.0    69.1    46.9       20.9       67.8
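
The degrees-of-freedom count stated in the text, (g + e - 2) + (g + e - 4), can be checked against a few rows of Table 1 with a short sketch. The function name is hypothetical, and only the rows listed in the code are checked here.

```python
def sreg_degrees_of_freedom(g, e):
    """Degrees of freedom used by two multiplicative terms, as stated in the text."""
    primary = g + e - 2      # first term
    secondary = g + e - 4    # second term
    return primary + secondary

# (cultivars, sites, degrees of freedom) copied from selected rows of Table 1.
rows = {
    "1989": (10, 9, 32),
    "1990": (10, 7, 28),
    "1998": (33, 8, 76),
    "1996-99": (11, 34, 84),
}

for year, (g, e, df) in rows.items():
    print(year, sreg_degrees_of_freedom(g, e), df)
```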

1998 Data

The PC1 scores of the [SREG.sub.2] model had near-perfect correlation (r = 0.99) with the genotypic main effects for this dataset. Consequently, the [SREG.sub.2] biplot and the [SREG.sub.M+1] biplot look almost exactly alike. They were, therefore, equally effective in displaying the GGE information (Fig. 1A and 1B).


The GGE biplot is constructed by plotting the primary effect scores of each genotype (as x-axis) and each site against their respective secondary effect scores (as y-axis) such that each genotype and each test site is represented by a "marker." For visualizing the which-won-where pattern, the genotype markers located away from the plot origin were first visually identified and connected with straight lines to form a polygon, within which the markers of all other genotypes are contained. These away-from-origin genotypes, namely 6, 9, 29, 33, 27, 28, 20, and 2 in Fig. 1A, are called "corner" or "vertex" genotypes because they are at the corners of the polygon. Next, starting from the origin, lines perpendicular to the sides of the polygon are drawn to, and extended beyond, each side of the polygon dividing the plot into several sectors; each site will fall into one of the sectors (note that only perpendiculars relevant to discussion were drawn). Assuming that the biplot sufficiently approximates the variation of GGE, it can be mathematically proven that all sites in the same sector share the same winning genotype, which is the vertex genotype for that sector (Yan et al., 2000).
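The polygon construction can be mimicked numerically. In the sketch below, which uses hypothetical marker coordinates rather than values from Fig. 1A, the vertex genotypes are found as the convex hull of the genotype markers, and the winner at each site is the genotype whose marker has the largest inner product with the site marker. The inner product is the yield fitted by the two-dimensional biplot, so the winner is always one of the vertex genotypes.

```python
def cross(o, a, b):
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_vertices(points):
    """Names of the markers on the convex hull (Andrew's monotone chain)."""
    pts = sorted(points.items(), key=lambda kv: kv[1])
    def half(seq):
        chain = []
        for name, p in seq:
            while len(chain) >= 2 and cross(chain[-2][1], chain[-1][1], p) <= 0:
                chain.pop()
            chain.append((name, p))
        return chain[:-1]
    return [name for name, _ in half(pts) + half(reversed(pts))]

def winners(genotypes, sites):
    """Best genotype per site according to the biplot approximation."""
    return {
        site: max(genotypes, key=lambda g: genotypes[g][0] * sv[0] + genotypes[g][1] * sv[1])
        for site, sv in sites.items()
    }

# Hypothetical biplot coordinates (primary, secondary), for illustration only.
genotypes = {"G1": (2.0, 1.0), "G2": (1.5, -1.5), "G3": (-2.0, 0.5),
             "G4": (0.2, 0.1), "G5": (-1.0, -2.0)}
sites = {"S1": (1.0, 1.0), "S2": (1.0, -1.0), "S3": (0.5, 0.2)}

vertex = hull_vertices(genotypes)
best = winners(genotypes, sites)
print(vertex)
print(best)
```

In this toy example G4 lies inside the polygon and cannot win at any site, while sites S1 and S3 fall in the sector of G1 and site S2 in the sector of G2.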

In Fig. 1A, the sites fell into three sectors: the winning genotype for sites RN, WE, ID, and NN was Genotype 6; the winning genotype for sites WK, HN, and EA was Genotype 9; and the winning genotype for site OA was Genotype 29. Note that Genotype 9 was the best performer for WK, HN, and EA because the markers of these sites were on Genotype 9's side of the perpendicular to the line that connects the marker of Genotype 9 and that of Genotype 6. Vertex genotypes without any site in their sectors were not the highest yielding genotypes at any site; moreover, they were the poorest genotypes at all or some sites. Genotypes within the polygon, particularly those located near the plot origin, were less responsive than the vertex genotypes. It can be appreciated that the supplementary lines on the biplot are critical for visual analysis of the MET data.

In addition, a near-perfect correlation between genotype primary effect scores and the genotype main effects allows both biplots, Fig. 1A as well as Fig. 1B, to be used to evaluate cultivars for their yielding ability and stability and to evaluate environments for their discriminating ability and representativeness. Genotypes 6 and 9 gave the highest average yields (largest primary scores) and were relatively stable over the sites (small absolute secondary scores). In contrast, three non-adapted genotypes, 27, 28, and 31, yielded poorly at all sites, as indicated by their small primary scores (low yielding) and relatively small secondary scores (relatively stable). Cultivars 1 and 20 had below-average yields (primary scores < 0) and were highly unstable (large absolute secondary values). The biplots show not only the average yield of a genotype (the primary effect), but also how it was achieved. That is, the biplots also show the yield of a genotype at individual sites. For example, Cultivar 6 had the highest average yield because it yielded the highest at sites RN, WE, ID, and NN, and yielded above average at all other sites. On the other hand, the average yield of Cultivar 20 was below average, because it yielded below average at sites OA, EA, HN, WK, and NN, even though it was quite good at RN. A below-average yield is indicated if the virtual line from the origin to the marker of a genotype forms an obtuse angle with the virtual line from the origin to the marker of a test site. Likewise, an above-average yield is indicated by an acute angle. Supplementary lines, not presented in the biplots, are required to explicitly determine these relationships.
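The angle rule follows from the inner product: the biplot approximates the environment-centered yield of a genotype at a site by the inner product of the two markers, which is positive for an acute angle and negative for an obtuse one. A small sketch with hypothetical coordinates (not read from Fig. 1):

```python
import math

def angle_deg(genotype, site):
    """Angle in degrees between the origin-to-marker vectors of a genotype and a site."""
    dot = genotype[0] * site[0] + genotype[1] * site[1]
    cosine = dot / (math.hypot(*genotype) * math.hypot(*site))
    return math.degrees(math.acos(max(-1.0, min(1.0, cosine))))

# Hypothetical markers (primary, secondary), for illustration only.
site = (0.8, -0.2)
genotype_a = (1.0, 0.5)     # inner product with site: 0.7, above average
genotype_b = (-0.5, 1.0)    # inner product with site: -0.6, below average

angle_a = angle_deg(genotype_a, site)
angle_b = angle_deg(genotype_b, site)
print(angle_a, angle_b)
```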

With respect to the test sites, RN was most discriminating, as indicated by the longest distance between its marker and the origin. However, due to its large secondary score, cultivar differences observed at RN may not exactly reflect the cultivar differences in average yield over all sites. Site NN was not the most discriminating, but cultivar differences at NN should be highly consistent with those averaged over sites because it had a near-zero secondary effect score. At a site with a near-zero secondary effect score, the genotypes are essentially ranked according to their primary effect scores (i.e., genotype main effects, since the two were almost perfectly correlated in this dataset) and the differences among genotypes are in proportion to the primary effect score of the site. Thus, a genotype that yielded well at such a site has a large average yield. In contrast, site OA was neither discriminating (small primary effect score) nor representative (large secondary effect score); therefore, cultivars that had high yield at OA did not necessarily give high average yield over sites. Analysis of multiple-year data indicated that OA represented a different mega-environment (eastern Ontario) from the major winter wheat growing regions in Ontario (Yan et al., 2000; Yan, 1999).
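These two site criteria can be expressed as a simple screening step. The sketch below uses hypothetical site labels and coordinates, not the values of Fig. 1. It measures discriminating ability by the distance of the marker from the origin and representativeness by the absolute secondary score.

```python
import math

def rank_test_sites(sites):
    """Return the most discriminating site and the most representative site."""
    discriminating = max(sites, key=lambda s: math.hypot(*sites[s]))
    representative = min(sites, key=lambda s: abs(sites[s][1]))
    return discriminating, representative

# Hypothetical site markers (primary, secondary), for illustration only.
sites = {"A": (2.0, 1.2), "B": (1.5, 0.05), "C": (0.4, 1.0)}

most_discriminating, most_representative = rank_test_sites(sites)
print(most_discriminating, most_representative)
```

Here site A plays the role described for RN (long vector, sizable secondary score), site B the role of NN, and site C the role of OA.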

1996 Data

As with most datasets, the [SREG.sub.2] biplot (Fig. 2A) for 1996 indicates that all PC1 scores of the sites were of the same sign, which was arbitrarily assigned positive so that the genotype PC1 scores correlated positively with the genotype main effects. However, as mentioned earlier, the correlation between the genotype PC1 scores and the genotype main effects for this dataset was only 0.85. The relatively poor correlation is associated with the fact that the GGE explained by PC1 is only slightly greater than that explained by PC2 (29.6 vs. 24.5%). The poor correlation prevents the genotype PC1 scores of the [SREG.sub.2] solution from being interpreted as representing the genotype main effects; in fact, PC1 alone is not interpretable in known biological and agricultural terms. In such cases, the utility of a [SREG.sub.2] biplot is limited to investigation of the which-won-where patterns. Based on Fig. 2A, Cultivar 1 was the best performer at sites RN, LN, ID, and WE; and Cultivar 2 was the best performer at sites EA, WK, CA, and OA, and nearly the best at HW.


The [SREG.sub.M+1] biplot (Fig. 2B) explained slightly less GGE, but revealed the same which-won-where patterns as the [SREG.sub.2] biplot. It indicates that Cultivar 1 won at sites RN, LN, WE, and ID, and Cultivar 2 won at sites EA, WK, CA, HW, and OA. In addition, the [SREG.sub.M+1] biplot is more interpretable. By definition, the primary effects of the [SREG.sub.M+1] biplot are the cultivar main effects, and its secondary effects are deviations from the main effects of the cultivars. Thus, the [SREG.sub.M+1] biplot explicitly showed that Cultivars 1 and 2 were the highest yielding cultivars on average, but neither was very stable, as evidenced by their relatively large secondary effects. With respect to the sites, the [SREG.sub.M+1] biplot indicated that site EA was highly discriminating, but not representative of the average environment, whereas WK and RN were both discriminating and representative.

1995 Data

The 1995 dataset was the only dataset found during the 1989 to 1999 Ontario winter wheat performance trials in which the site PC1 scores of the [SREG.sub.2] model differ in sign (Fig. 3A). Among the 14 test sites, four (Sites 4, 6, 7, and 10) had negative PC1 scores, though their absolute values were small. This led to a poor correlation between the cultivar PC1 scores and the cultivar main effects (r = 0.83). The [SREG.sub.2] biplot indicates that Cultivar G6 was the best for nearly all sites except Sites 4, 6, and 7, at which Cultivar G4 (and also G10) was better than G6. Cultivar G7 was as good as G6 for Sites 5 and 12. These patterns are similar in the [SREG.sub.M+1] biplot (Fig. 3B). It indicates that Cultivar G6 was on average the best and Cultivar G12 the second best, and that Sites 5 and 12 were highly discriminating but neither was representative. Interestingly, all sites had positive primary effects in the [SREG.sub.M+1] biplot, as compared with the site PC1 scores of different signs in the [SREG.sub.2] biplot.


1996-1999 Data

Although the environmental PC1 scores in the [SREG.sub.2] model tend to be of the same sign for yearly MET, they often take different signs when multi-year data are jointly analyzed. For this dataset, among all 34 year-site combinations, 9 had negative PC1 scores and the rest had positive PC1 scores (Fig. 4A). Like the 1996 data, the GGE explained by PC1 was only slightly greater than that by PC2 (24.5 vs. 22.7%). As a result, the correlation between cultivar PC1 scores and cultivar main effects was only 0.58. This low correlation prevents visual identification of cultivars with high average yield based on the [SREG.sub.2] biplot. Nevertheless, as with all previous datasets, both biplots displayed very similar which-won-where patterns (Fig. 4A and 4B). The [SREG.sub.2] biplot predicted that cultivar "2533" was the best performer in about half of the 34 environments while cultivar "Men" was the best in the other half. Therefore, it can be inferred that cultivars "2533" and "Men" must be the two best performers on average. This, however, is explicitly indicated only in the [SREG.sub.M+1] biplot. As for the 1995 dataset, while the primary effects of the environments were of different signs in the [SREG.sub.2] biplot, they were all positive in the [SREG.sub.M+1] biplot.



Merits of the Two Types of GGE Biplots

This study indicates that both the [SREG.sub.2] biplot and the [SREG.sub.M+1] biplot explained similar amounts of variation due to GGE, although the former tends to explain slightly more in most cases. Both biplots displayed the same which-won-where pattern and indicated the same winning cultivars in individual environments. Therefore, the two biplots can be considered as equally effective in these regards.

The [SREG.sub.M+1] biplot was designed to be more interpretable than the [SREG.sub.2] biplot. First, since the genotypic scores for the primary effect of [SREG.sub.M+1] are designated to indicate the average yield (general adaptation) of the cultivars, the genotypic scores of the secondary effect must indicate GE interaction associated with the cultivars, which is an indicator of selective or specific adaptation. Thus, the [SREG.sub.M+1] biplot simultaneously displays both general adaptation and specific adaptation (stability) of the cultivars. The ideal cultivars are those with large primary effect scores but near-zero secondary scores. Second, because the genotypic primary effects indicate general adaptation of the cultivars, the environmental primary effects must indicate the ability of the environments to discriminate among the cultivars in terms of general adaptation. Environments with larger primary effects would thus facilitate identification of cultivars with better general adaptation. Third, analogous to the genotypic secondary effects, the environmental secondary effects must indicate the tendency of each environment to cause GE interaction. Environments with large (absolute) secondary effects should favor the performance of some cultivars, but disfavor others at the same time. Thus, cultivars selected under environments with large secondary effects may be highly specific to these environments but lack general adaptation or stability. Therefore, from the perspective of selection for high yielding and stable cultivars, the ideal test environments should have large primary effects, but near-zero secondary effects.

Why Correlation between Genotype Scores of PC1 in [SREG.sub.2] and Genotype Main Effects Varies with Datasets

It was concluded that the [SREG.sub.M+1] biplot is more desirable than the [SREG.sub.2] biplot for MET data analysis because the interpretability of the latter is impacted by the uncertain relations between its primary effects and the genotype main effects. On the basis of the trials investigated in this study, Fig. 5 indicates that this correlation is strongly determined by the relative importance of G in GGE. Near-perfect correlation occurs when G is 40% or more of GGE (the 1992, 1993, 1997-1999 datasets), and poor correlation occurs when G is 20% or less of GGE (the 1995, 1996 and 1996-1999 datasets). The essence of principal component analysis is to pick up the most important pattern in the data using the smallest number of degrees of freedom. PC1 picks up the largest pattern, PC2 picks up the second largest pattern, and so on. A close correlation between PC1 scores and genotype main effects occurs only when the genotype main effect is large enough to be the most important component of GGE. A poor correlation occurs otherwise, which suggests strong and complex GE interaction in the data. Therefore, it is not surprising that the correlation between PC1 scores of [SREG.sub.2] and genotype main effect is typically poor when multi-year data are analyzed in a genotype x environment (year-site) fashion, because greater and more complex GE interactions are sampled in a multi-year MET than in a single year MET. Complex GE interaction is usually accompanied by similar amounts of GGE explained by PC1 and PC2 (as for the 1996 and 1996-1999 datasets, Table 1), as opposed to much more GGE explained by PC1 than by PC2 (e.g., the 1998 dataset).
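The dependence of this correlation on the share of G in GGE can be illustrated with a constructed example. The sketch below builds an environment-centered table from a genotype main-effect pattern plus a single interaction pattern that is orthogonal to it, and varies the size of the interaction. The data are synthetic and chosen only to make the mechanism visible; they are not from the trials.

```python
import numpy as np

def pc1_vs_main_effect(table):
    """Return |r| between genotype PC1 scores and main effects, and the G share of GGE."""
    P = table - table.mean(axis=0)               # environment-centered data
    main = P.mean(axis=1)                        # genotype main effects
    U, s, _ = np.linalg.svd(P, full_matrices=False)
    r = np.corrcoef(U[:, 0], main)[0, 1]
    g_share = P.shape[1] * np.sum(main**2) / np.sum(P**2)
    return abs(r), g_share

alpha = np.array([3.0, 1.0, -1.0, -3.0])         # genotype main effects
u = np.array([1.0, -1.0, -1.0, 1.0])             # genotype interaction pattern
v = np.array([1.0, -1.0, 1.0, -1.0])             # environment interaction pattern

def make_table(c):
    """Main effects in every environment plus an interaction of size c."""
    return np.outer(alpha, np.ones(4)) + c * np.outer(u, v)

r_strong, share_strong = pc1_vs_main_effect(make_table(1.0))   # G is about 83% of GGE
r_weak, share_weak = pc1_vs_main_effect(make_table(5.0))       # G is about 17% of GGE
print(r_strong, share_strong)
print(r_weak, share_weak)
```

When the main-effect pattern is the largest pattern in the table, PC1 picks it up and the correlation is essentially perfect. When the interaction pattern is larger, PC1 picks up the interaction instead and the correlation collapses. Real datasets fall between these extremes.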


The Usefulness of the GGE Biplot Based on a Single Year MET

As a graphic approach to MET data analysis, the GGE biplot can be useful in two major respects. The first is to display the which-won-where pattern of the data, which may lead to identification of different mega-environments. The second is to identify high-yielding and stable cultivars and discriminating and representative test environments. However, both promises rest on the assumption that the data are sufficiently representative of the target environment; a conclusion can never go beyond what the data allow. While multi-year MET data are required for any decisive cultivar and site evaluation, they are normally unbalanced, and therefore the biplot technique cannot be readily applied; single-year data are usually balanced but they may not be representative of future years. Thus, a question arises whether biplot analysis of single-year MET data is really useful if the which-won-where pattern is not repeatable over years.

Data from a single year may indeed have limited value because of year-to-year variation. Nevertheless, we believe biplot analysis of single-year MET data is worthwhile for the following reasons. First, the GGE biplot is a graphic display of the G and GE of the data, which are relevant to cultivar evaluation and mega-environment identification. Therefore, if the researcher believes that a single-year MET is worthy of analysis, and we believe most researchers do, the GGE biplot technique should be the first choice. Although the biplot does not add new information to the data, it does help the researcher quickly view the patterns that are in the data. The biplot gives the researcher the power to "see" what was going on in a particular year. Some may question the usefulness of the single-year patterns if they are not repeatable over years. But without knowing the patterns from individual years, how could one know if they are repeatable or not? Second, the biplot can be used to identify research problems. For example, if two cultivars were found to perform the best in two different groups of locations in a particular year, one might want to know what the underlying reasons were, and answers to this question may lead to valuable findings. By relating biplot scores to explanatory variables collected in the trials, Yan and Hunt (2001) were able to show that in Ontario, Canada, tall and late winter wheat cultivars tended to be favored in seasons with cold winters and cool summers, whereas early and short cultivars tended to be favored in seasons with warm winters and hot summers. Third, the biplot patterns based on a single-year MET can serve as hypotheses, which can be tested using extended data and more critical statistics. For example, biplots based on yearly data from the Ontario winter wheat performance trials led to the hypothesis that two eastern Ontario sites (Ottawa and Kemptville) constituted a mega-environment different from the rest of the Ontario winter wheat growing region, which was subsequently tested and supported by variance component analysis based on pooled data from 11 yr of performance trials (Yan, 1999). Thus, although conclusions from a single-year MET may not be decisive, they are valuable suggestions. Fourth, even if the which-won-where pattern is proven to be unrepeatable over years, the researcher would still want to know the average yield and the stability of the cultivars based on each year's MET. These two aspects of cultivar performance are graphically depicted by the abscissa and ordinate of the biplot, respectively. Finally, although a biplot from a single year may not be very informative, biplots constructed from several years can be highly valuable.

Moreover, the biplot technique is not limited to single-year MET data analysis. It can also be applied to balanced subsets extracted from multiple years of trials. In Ontario, for example, over 20 winter wheat cultivars are common to three to four years of performance trials, and a balanced subset from such a database should contain valuable information. Furthermore, the biplot technique is not even limited to genotype x environment data analysis. It can also be used in displaying and analyzing other types of two-way data such as genotype x trait data and diallel cross data (Yan, unpublished research). In conclusion, the GGE biplot is a useful tool for, but not limited to, MET data analysis.


Crossa, J., and P.L. Cornelius. 1997. Sites regression and shifted multiplicative model clustering of cultivar trial sites under heterogeneity of error variances. Crop Sci. 37:405-415.

Finlay, K.W., and G.N. Wilkinson. 1963. The analysis of adaptation in a plant breeding program. Aust. J. Agric. Res. 14:742-754.

Gabriel, K.R. 1971. The biplot graphic display of matrices with application to principal component analysis. Biometrika 58:453-467.

Mandel, J. 1961. Non-additivity in two-way analysis of variance. J. Am. Stat. Assoc. 56:878-888.

SAS Institute. 1996. SAS/STAT user's guide, second edition. SAS Institute Inc., Cary, NC.

Yan, W. 1999. A study on the methodology of yield trial data analysis--with special reference to winter wheat in Ontario. Ph.D. diss., University of Guelph, Guelph, Ontario, Canada.

Yan, W., L.A. Hunt, Q. Sheng, and Z. Szlavnics. 2000. Cultivar evaluation and mega-environment investigation based on the GGE biplot. Crop Sci. 40:597-605.

Yan, W., and L.A. Hunt. 2001. Genetic and environmental causes of genotype x environment interaction for winter wheat yield in Ontario. Crop Sci. 41:19-25.

Abbreviations: E, environment main effect; G, genotypic main effect; GE, genotype x environment interaction; GGE, genotype main effects plus genotype x environment interaction; MET, multienvironment trials; PC, principal component; [SREG.sub.2], sites regression model with two multiplicative terms; [SREG.sub.M+1], Mandel's sites regression model with one additional multiplicative term; SVD, singular value decomposition.

Weikai Yan,(*) Paul L. Cornelius, Jose Crossa, and L. A. Hunt

W. Yan and L.A. Hunt, Dep. of Plant Agriculture, Univ. of Guelph, Guelph, Ontario, Canada N1G 2W1; P.L. Cornelius, Dep. of Agronomy and Dep. of Statistics, Univ. of Kentucky, Lexington, KY 40546-0091; Jose Crossa, Biometrics and Statistics Unit, International Maize and Wheat Improvement Center (CIMMYT), Lisboa 27, Apdo. Postal 6-641, 06600 Mexico D.F., Mexico. Received 14 Feb. 2000. (*) Corresponding author (
COPYRIGHT 2001 Crop Science Society of America
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2001 Gale, Cengage Learning. All rights reserved.

Article Details
Author:Yan, Weikai; Cornelius, Paul L.; Crossa, Jose; Hunt, L.A.
Publication:Crop Science
Article Type:Statistical Data Included
Geographic Code:1USA
Date:May 1, 2001