
Comparison of Selection Strategies for Marker-Assisted Backcrossing of a Gene.

THE BACKCROSS PROCEDURE is used in plant breeding to transfer favorable alleles from a donor genotype, which has mostly poor agronomic properties, into a recipient elite genotype (Allard, 1960, p. 150). Marker assays can be advantageous in backcross breeding for foreground selection and background selection (Hospital and Charcosset, 1997). In the first approach, the presence of a target allele in an individual is diagnosed by monitoring the genotype at flanking markers for alleles of the donor parent. This is a powerful tool for manipulation of oligogenic traits in numerous situations in plant breeding (for review see Melchinger, 1990), but also for manipulation of quantitative trait loci (QTL) (Stuber, 1995). The second approach, devised by Tanksley et al. (1989), accelerates recovery of the recurrent parent genome (RPG). Individuals are selected that are homozygous for the alleles of the recurrent parent at a large number of marker loci covering the entire genome. Marker-assisted background selection has meanwhile been established as a standard tool in plant breeding (see, e.g., Ragot et al., 1995).

Computer simulations have proved to be a powerful tool for investigating the design and efficiency of marker-assisted selection programs (for review see Visscher et al., 1996). These authors studied marker-assisted QTL introgression in an animal breeding context, using an infinitesimal model to explain differences among breeds. Hospital and Charcosset (1997) determined the optimal position and number of marker loci for manipulating QTL in foreground selection. Further, they investigated the combination of foreground and background selection in QTL introgression. Openshaw et al. (1994) determined the population size and marker density required in background selection. They recommended the use of four markers per chromosome (of 200-cM length) and a selection strategy for proximal recombinants of the target allele.

Although efficient PCR-based DNA markers such as simple sequence repeats and amplified fragment length polymorphisms are available (Ribaut et al., 1997), their use in background selection is restricted by the large number of marker data points (MDP) required. In this study, we investigate strategies for reducing the total number of MDP needed in background selection. Our research objectives were to (i) determine the number of MDP required in background selection, (ii) investigate the effects of varying population sizes from early to late backcross generations on the level of RPG and the MDP required, and (iii) compare a two-stage selection procedure, consisting of one foreground and one background selection step, with alternative selection procedures consisting of one foreground selection step and two or three background selection steps.


Genetic Map

Our simulations were based on a published linkage map of maize (Schon et al., 1994) constructed from a population of 380 [F.sub.2] individuals derived from the cross of two flint inbred lines. The total map length was 1612 cM. On the basis of previous investigations (Openshaw et al., 1994; Visscher et al., 1996; Frisch et al., 1998), an average marker density of about 20 cM is sufficient to ensure good coverage of the genome in marker-assisted selection programs. Hence, 80 of the 89 polymorphic restriction fragment length polymorphism markers used by Schon et al. (1994) were chosen to obtain an average marker density of 20 cM. Markers umc128, umc5, umc175, bnl6.06, umc54, umc51, umc110, bnl7.61, and bnl9.44 were tightly linked to other markers and were therefore excluded from the present study. There were two larger gaps on this map: one 90-cM marker interval on Chromosome 3 and one 89-cM marker interval on Chromosome 9. The target locus was assumed to be located on Chromosome 5, 30 cM from the telomere. In our simulations, the entire map was additionally covered with equally spaced (1 cM) background loci to monitor the parental origin of the whole genome.


Software

PLABSIM (Frisch et al., 1999b), a computer program written in C++, was used to simulate the recombination process during meiosis. Crossover events were generated by a random-walk algorithm (Crosby, 1973, p. 237). Recombination frequencies required for the random walk were calculated from the map distance by Haldane's (1919) mapping function. This assumes that neither chiasma interference nor chromatid interference (Stam, 1979) occurs. To check our simulation software, the original linkage map of Schon et al. (1994), which was based on experimental [F.sub.2] data, was compared with a linkage map constructed from simulated data of [F.sub.2] individuals by MAPMAKER software (Lander et al., 1987). Both maps were in excellent agreement, confirming that the models underlying the two software packages were similar.
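
As a rough illustration of the recombination model described above, the following Python sketch converts map distances to recombination frequencies with Haldane's mapping function and walks along a chromosome, switching parental strands between adjacent loci with that probability. It is not the PLABSIM implementation; the function names, the 0/1 strand coding, and the example locus positions are assumptions made for illustration only.

```python
import math
import random


def haldane_recombination_frequency(distance_cm):
    """Haldane's mapping function: r = 0.5 * (1 - exp(-2d)), with d in Morgan."""
    d = distance_cm / 100.0  # centimorgan -> Morgan
    return 0.5 * (1.0 - math.exp(-2.0 * d))


def simulate_gamete(positions_cm, rng):
    """Return, for each locus, which parental strand (0 or 1) the gamete carries."""
    strand = rng.randrange(2)  # the starting strand is chosen at random
    gamete = [strand]
    for left, right in zip(positions_cm, positions_cm[1:]):
        r = haldane_recombination_frequency(right - left)
        if rng.random() < r:  # recombination between the two adjacent loci
            strand = 1 - strand
        gamete.append(strand)
    return gamete


rng = random.Random(1)  # fixed seed so the sketch is reproducible
loci_cm = [0, 20, 40, 60, 80, 100]  # hypothetical loci spaced 20 cM apart
print(round(haldane_recombination_frequency(20), 4))
print(simulate_gamete(loci_cm, rng))
```

Treating adjacent intervals independently is only valid because Haldane's function assumes no interference; a mapping function with interference would need a different walk.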

Simulation Runs

Each simulation of a backcross program started by the cross of two parents, which were assumed to be homozygous and polymorphic at all loci (target locus, marker loci, background loci). The recurrent parent was assumed to carry the desirable alleles at all loci of the genome except for the target locus. The donor parent was assumed to carry the desirable allele at the target locus in homozygous state. One heterozygous [F.sub.1] individual was backcrossed with the recurrent parent and [n.sub.1] [BC.sub.1] individuals were produced. The best [BC.sub.1] individual was selected according to the selection strategies described below and, for production of generation [BC.sub.2], backcrossed with the recurrent parent. This procedure was repeated for t backcross generations. For the selected individual in each generation [BC.sub.t], the percentage of the RPG was determined by dividing the number of loci (marker and background loci) homozygous for the recurrent parent allele by the total number of loci monitored. Furthermore, each analysis of a marker locus in a backcross individual was counted as a MDP. In [BC.sub.1], the entire set of markers was analyzed (at least in the individual selected as parent for producing generation [BC.sub.2]). In the following generations, only those markers not fixed for the recurrent parent allele in the nonrecurrent parent (i.e., individual selected in the previous generation) were analyzed. The number of MDP required in each generation was counted and summed over the whole backcross program. The simulation of each backcross program was repeated 10 000 times to reduce sampling effects and obtain results with sufficient numerical accuracy.
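
The two bookkeeping quantities described above can be sketched in a few lines of Python. The function names and the example genotype are hypothetical. The MDP estimate additionally assumes that, in expectation, half of the [BC.sub.1] individuals carry the target allele and that every carrier is assayed at all 80 markers, which is one reading of the two-stage procedure rather than a rule stated in the text.

```python
def rpg_percent(is_recurrent_homozygous):
    """Percentage of monitored loci homozygous for the recurrent parent allele.

    is_recurrent_homozygous: one 0/1 flag per monitored locus.
    """
    return 100.0 * sum(is_recurrent_homozygous) / len(is_recurrent_homozygous)


def expected_bc1_mdp_two_stage(n_individuals, n_markers=80):
    """Expected MDP in BC1 when every target-allele carrier is assayed at every marker.

    Assumption: half of the BC1 individuals are expected to carry the target allele.
    """
    return n_individuals / 2 * n_markers


print(rpg_percent([1, 1, 0, 1]))        # hypothetical individual with four loci
print(expected_bc1_mdp_two_stage(20))   # compare with Table 3, two-stage, BC1
```

Under these assumptions the estimate for [n.sub.t] = 20 is 800 MDP, the value listed for two-stage selection in [BC.sub.1] in Table 3.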

Threshold for the RPG

The values gained from these 10 000 repetitions can be regarded as realizations of random variables that describe the proportion of RPG and the total number of MDP required after t generations in a backcross program with the parameter settings considered. The 10% percentile of the empirical distribution of the RPG in the selected individual (Q10) is used as an estimator for the amount of RPG reached after selection in generation [BC.sub.t] with probability 0.90. Compared with arithmetic means, percentiles have two advantages.
   1. The skewness of the RPG distribution increases in advanced backcross
   generations. Percentiles are more suitable than arithmetic means for
   comparison of skewed distributions.

   2. Inferences about the probability to achieve a certain goal can be made.
   For example, a Q10 value of 98% means that "with probability 0.90 an RPG
   proportion greater than 98% is attained" under the considered parameter
   settings.
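
A minimal sketch of how Q10 could be obtained from the simulated RPG values is given below. The percentile estimator used in the study is not stated, so a nearest-rank definition is assumed here; the function name and the ten example values are hypothetical.

```python
def q10(values):
    """Nearest-rank 10% percentile of a sample (assumed estimator)."""
    ordered = sorted(values)
    rank = max(1, (len(ordered) + 9) // 10)  # ceil(n / 10) in integer arithmetic
    return ordered[rank - 1]


# Hypothetical RPG percentages of the selected individual in ten simulation runs.
rpg_runs = [97.1, 98.4, 95.2, 99.0, 96.3, 98.8, 97.7, 94.9, 98.1, 99.2]
print(q10(rpg_runs))
```

With 10 000 repetitions the same rule returns the 1000th smallest RPG value, so about 90% of the runs reach at least that level.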

Simulations to Determine Threshold Values

A full backcross program usually consists of six generations (Allard, 1960, p. 155). Hence, the Q10 value reached in generation [BC.sub.6] by applying random selection among all individuals carrying the target allele was used as a termination threshold for a marker-assisted backcross program. This threshold was determined by simulations with selection only for presence of the target allele but no selection for any marker loci.

Selection Strategies

For describing our selection strategies in general terms, we consider a chromosome carrying the target locus (carrier chromosome) of length [l.sub.0] and c further chromosomes (non-carrier chromosomes) with length [l.sub.c]. Positions on the chromosomes are represented by a scale in Morgan units ranging from 0 to [l.sub.c]. The target locus is located at position x on the carrier chromosome and two flanking markers at positions [y.sub.1] and [y.sub.2]; i additional markers on the carrier chromosome are located at positions [z.sub.i]. On the non-carrier chromosomes there are altogether m markers at positions [u.sub.ck]. Let X, [Y.sub.1], [Y.sub.2], [Z.sub.i], and [U.sub.ck] be indicator variables, which take the value 1 if the corresponding locus is homozygous for the recurrent parent allele and 0 otherwise. From these random variables we obtain the count variables Y = [Y.sub.1] + [Y.sub.2] and U = [Y.sub.1] + [Y.sub.2] + [[Sigma].sub.i] [Z.sub.i] + [[Sigma].sub.c][[Sigma].sub.k] [U.sub.ck]. Furthermore, we define the indicator variable Z, which is 1 if all i additional markers on the carrier chromosome are homozygous for the recurrent parent allele and 0 otherwise.

By means of the random variables X, Y, Z, and U as selection indices, three sequential selection strategies were applied. The first step always involved selection of individuals carrying the target allele (X = 0). Subsequently one, two, or three steps with background selection followed (Table 1). In each selection step, only those individuals selected in the previous step are subjected to marker assays. In the individual selected for producing the next backcross generation, all markers not fixed in the previous generation(s) are assayed to determine which of them are now homozygous for the recurrent parent allele and, hence, need not be assayed in the following generation(s).

Table 1. Description of selection steps and their sequence in the three selection strategies investigated.
                                                          Sequence of steps in

                                                      Two-stage    Three-stage   Four-stage
Selection step                   Condition([dagger])  selection    selection     selection

Select individuals carrying
 the target allele                    X = 0               1             1            1

Select individuals homozygous
 for the recurrent parent allele
 at the maximum number of
 flanking markers                     max(Y)      --([double dagger])   2            2

Select individuals homozygous
 for the recurrent parent allele
 at all additional markers on
 the carrier chromosome               max(Z)              --            --           3

Select one individual which is
 homozygous for the recurrent
 parent allele at the maximum
 number of all markers across
 the genome                           max(U)              2             3            4

([dagger]) X, [Y.sub.1], [Y.sub.2], [Z.sub.i], and [U.sub.ck] are indicator variables, which take the value 1 if the loci at positions x, [y.sub.1], [y.sub.2], [z.sub.i], and [u.sub.ck] are homozygous for the recurrent parent allele and 0 otherwise. From these random variables the count variables Y = [Y.sub.1] + [Y.sub.2] and U = [Y.sub.1] + [Y.sub.2] + [[Sigma].sub.i] [Z.sub.i] + [[Sigma].sub.c] [[Sigma].sub.k] [U.sub.ck] are obtained. The indicator variable Z is 1 if all i additional markers on the carrier chromosome are homozygous for the recurrent parent allele and 0 otherwise.

([double dagger]) Not carried out.

The selection strategies differ in the selection pressure applied to carrier versus non-carrier chromosomes. In two-stage selection, selection in the second step is based on the Index U, which takes into account all marker loci irrespective of their position in the genome. In three-stage selection, the second selection step rests on the flanking markers (Index Y), while the final step is again based on all markers (Index U) irrespective of their genomic location. Four-stage selection is similar to three-stage selection, but inserts after the second step one additional selection exclusively based on the markers located on the carrier chromosome (Index Z). Hence, emphasis given to RPG recovery on the carrier chromosome increases from two- to four-stage selection. A selection procedure preferring recombinants at flanking markers similar to our three-stage selection was proposed by various authors (Tanksley et al., 1989; Hospital et al., 1992; Openshaw et al., 1994; Hospital and Charcosset, 1997).
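
The sequential logic of the three strategies can be sketched as successive filters on the indices X, Y, Z, and U. This is an illustrative Python sketch, not the authors' code: the dictionary layout, the `id` field, the example values, and the rule of taking the first individual when several tie on U are all assumptions.

```python
def select_parent(individuals, stages):
    """Pick the parent of the next backcross generation.

    individuals: list of dicts with the indices 'X', 'Y', 'Z' and 'U'.
    stages: 2, 3 or 4, naming the selection strategy.
    """
    # Step 1 (all strategies): keep carriers of the target allele (X == 0).
    pool = [ind for ind in individuals if ind["X"] == 0]
    if not pool:
        return None  # no carrier in this generation
    # Background selection steps, applied one after the other.
    indices = {2: ["U"], 3: ["Y", "U"], 4: ["Y", "Z", "U"]}[stages]
    for index in indices:
        best = max(ind[index] for ind in pool)
        pool = [ind for ind in pool if ind[index] == best]
    return pool[0]  # assumed tie-breaking: first remaining individual


# Hypothetical BC individuals.
population = [
    {"id": "a", "X": 0, "Y": 0, "Z": 0, "U": 50},
    {"id": "b", "X": 0, "Y": 2, "Z": 0, "U": 40},
    {"id": "c", "X": 0, "Y": 2, "Z": 1, "U": 35},
    {"id": "d", "X": 1, "Y": 2, "Z": 1, "U": 70},  # does not carry the target allele
]
for stages in (2, 3, 4):
    print(stages, select_parent(population, stages)["id"])
```

In this toy population the three strategies choose different individuals, which mirrors the increasing emphasis on the carrier chromosome from two- to four-stage selection.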

Population Size

Backcrossing with a constant number of individuals in each generation [BC.sub.t] ([n.sub.t] = 20, 40, 60, 80, 100, 125, 150, 200) was compared with backcrossing in which the population size [n.sub.t] varied from [BC.sub.1] to [BC.sub.3]. The total number of individuals [Sigma][n.sub.t] = 300 was allocated to backcross generations [BC.sub.1]:[BC.sub.2]:[BC.sub.3] with ratios of 3:2:1, 1:1:1, 2:3:4, 1:2:3, 1:3:5, 1:2:4, and 1:3:9.


In backcrossing, when selection is performed only for the presence of the target allele, the mean of the RPG was about 1% below the theoretical values expected without selection (Table 2). After six generations of backcrossing, a Q10 value of 96.7% was reached. This value was subsequently used as a threshold to determine the termination of a marker-assisted backcrossing program. Over the four further generations [BC.sub.7] to [BC.sub.10], Q10 increased by only 2.0 percentage points (from 96.7 to 98.7%), with marginal gains in the most advanced generations.

Table 2. Simulation results for the mean and 10% percentile (Q10) of the distribution of the recurrent parent genome in generation [BC.sub.t] with random selection of individuals carrying the target allele and expected values for the mean without selection.
                 No selection          Selection
Generation           Mean          Mean          Q10

[BC.sub.1]           75.0          74.0         67.4
[BC.sub.2]           87.5          86.1         80.7
[BC.sub.3]           93.8          92.4         88.3
[BC.sub.4]           96.9          95.6         92.7
[BC.sub.5]           98.4          97.3         95.2
[BC.sub.6]           99.2          98.2         96.7([dagger])
[BC.sub.7]           99.6          98.7         97.6
[BC.sub.8]           99.8          99.0         98.1
[BC.sub.9]           99.9          99.1         98.5
[BC.sub.10]         100.0          99.3         98.7

([dagger]) Used as threshold in subsequent tables.
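
The "No selection" column of Table 2 follows from the expectation that the donor genome proportion is halved in every backcross generation, starting from one half in the [F.sub.1]. The short Python sketch below reproduces these expected means; the function name is hypothetical.

```python
def expected_rpg_percent(t):
    """Expected RPG (%) in generation BC_t without any selection."""
    return 100.0 * (1.0 - 0.5 ** (t + 1))


for t in range(1, 11):
    print(f"BC{t}: {expected_rpg_percent(t):.1f}")
```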

Under two-stage selection with a constant population size, Q10 amounted to 97.8% with [n.sub.t] = 20 in [BC.sub.4] and 97.1% with [n.sub.t] = 60 in [BC.sub.3] (Table 3). The first parameter setting resulted in saving two backcross generations and required a total of 1210 MDP, while the second parameter setting saved three generations and required 3340 MDP. Even with [n.sub.t] = 200, the Q10 value did not exceed the threshold of 96.7% in [BC.sub.2]. For [n.sub.t] = 150 and 7990 MDP, Q10 reached 97.6% in [BC.sub.3]. This equals the Q10 value reached in [BC.sub.7] without background selection (Table 2) and thus corresponds to a saving of four backcross generations.

Table 3. Simulation results for the 10% percentile (Q10) of the distribution of the recurrent parent genome (RPG) and total number of marker data points (MDP) required in a backcross program to introgress one target allele, using constant population size [n.sub.t] in all backcross generations. Values for MDP are rounded to multiples of ten.
                                  Number of individuals [n.sub.t]
                                     per backcross generation

Generation                           20      40      60      80

                                         Q10 (%) of the RPG

Two-stage selection
[BC.sub.1]                         76.7    78.7    79.7    80.3
[BC.sub.2]                         90.3    91.9    92.8    93.3
[BC.sub.3]                         95.8    96.2    97.1    97.3
[BC.sub.4]               97.8([dagger])    97.9    98.4    98.5
[BC.sub.5]                         98.7    98.9    99.0    99.0

Three-stage selection
[BC.sub.1]                         71.2    72.7    73.4    73.6
[BC.sub.2]                         86.1    87.2    88.5    89.3
[BC.sub.3]                         94.4    95.7    96.5    96.9
[BC.sub.4]                         97.7    98.2    98.4    98.4
[BC.sub.5]                         98.7    98.8    98.9    98.9

Four-stage selection
[BC.sub.1]                         71.0    71.9    72.1    71.7
[BC.sub.2]                         85.5    86.2    87.2    87.6
[BC.sub.3]                         93.7    95.0    96.0    96.5
[BC.sub.4]                         97.6    98.2    98.3    98.4
[BC.sub.5]                         98.7    98.8    98.9    98.9

                                  Number of MDP required in total

Two-stage selection
[BC.sub.1]                          800    1560    2400    3200
[BC.sub.2]                         1010    2130    3150    4170
[BC.sub.3]                         1180    2280    3340    4390
[BC.sub.4]                         1210    2310    3380    4430
[BC.sub.5]                         1220    2320    3380    4430

Three-stage selection
[BC.sub.1]                          250     320     420     510
[BC.sub.2]                          440     610     830    1100
[BC.sub.3]                          550     820    1130    1470
[BC.sub.4]                          590     860    1170    1500
[BC.sub.5]                          590     860    1170    1500

Four-stage selection
[BC.sub.1]                          230     270     340     390
[BC.sub.2]                          370     460     590     750
[BC.sub.3]                          460     660     900    1140
[BC.sub.4]                          500     710     950    1190
[BC.sub.5]                          510     710     950    1190

                                  Number of individuals [n.sub.t]
                                     per backcross generation

Generation                          100     125     150       200

                                         Q10 (%) of the RPG

Two-stage selection
[BC.sub.1]                         80.7    81.3    81.7      82.2
[BC.sub.2]                         93.6    93.9    94.0      94.6
[BC.sub.3]                         97.4    97.5    97.6      97.8
[BC.sub.4]                         98.5    98.6    98.6      98.7
[BC.sub.5]                         99.0    99.0    99.0      99.0

Three-stage selection
[BC.sub.1]                         73.3    73.2    72.8      72.2
[BC.sub.2]                         90.2    90.7    91.3      91.8
[BC.sub.3]                         97.2    97.3    97.5      97.6
[BC.sub.4]                         98.4    98.5    98.5      98.5
[BC.sub.5]                         98.9    98.9    99.0      99.0

Four-stage selection
[BC.sub.1]                         71.6    71.5    71.2      71.0
[BC.sub.2]                         88.2    88.7    89.1      89.8
[BC.sub.3]                         96.8    97.0    97.2      97.4
[BC.sub.4]                         98.4    98.4    98.4      98.5
[BC.sub.5]                         98.9    98.9    98.9      98.9

                                  Number of MDP required in total

Two-stage selection
[BC.sub.1]                         4000    5000    5990     8 000
[BC.sub.2]                         5180    6430    7670    10 100
[BC.sub.3]                         5430    6720    7990    10 500
[BC.sub.4]                         5470    6750    8030    10 600
[BC.sub.5]                         5470    6760    8030    10 600

Three-stage selection
[BC.sub.1]                          590     690     750       840
[BC.sub.2]                         1390    1780    2210     3 110
[BC.sub.3]                         1810    2260    2740     3 740
[BC.sub.4]                         1840    2280    2760     3 760
[BC.sub.5]                         1840    2280    2760     3 760

Four-stage selection
[BC.sub.1]                          430     470     480       520
[BC.sub.2]                          910    1140    1360     1 900
[BC.sub.3]                         1390    1710    2020     2 690
[BC.sub.4]                         1430    1740    2050     2 720
[BC.sub.5]                         1430    1740    2050     2 720

([dagger]) Q10 values exceeding for the first time the threshold of 96.7% and the respective total number of MDP required are printed in italics.

After generation [BC.sub.3], the required number of MDP increased slowly for all values of [n.sub.t] (Table 3), because a large proportion of markers was already fixed for the recurrent parent allele in the individual selected in generation [BC.sub.3]. Increasing [n.sub.t] beyond 100 had little effect on the recovery of the RPG but consumed many more MDP. For example, a two-stage selection program with constant [n.sub.t] = 100 resulted in Q10 = 97.4% in [BC.sub.3] and required 5430 MDP, while [n.sub.t] = 200 resulted in Q10 = 97.8% but required 10 500 MDP. The total number of MDP required in two-stage selection with constant population size was approximately proportional to [n.sub.t]. The greatest proportion of total MDP was consumed in generation [BC.sub.1]: about 60% for [n.sub.t] = 20 and about 80% for [n.sub.t] = 200.

Three-stage selection with constant [n.sub.t] yielded lower Q10 values than two-stage selection in [BC.sub.1] and [BC.sub.2], but in subsequent backcross generations the difference was only marginal, especially for greater [n.sub.t] values (Table 3). Increasing [n.sub.t] from 20 to 60 resulted in a substantial increase of Q10 values only up to [BC.sub.3] but not in later backcross generations. Likewise, increasing [n.sub.t] beyond 60 resulted only in marginal gains in Q10. In comparison with two-stage selection, less than half the total number of MDP was required in a three-generation backcross program for all values of [n.sub.t]. This reduction was attributable to considerable savings in [BC.sub.1] (Table 3).

For four-stage selection with constant [n.sub.t], the Q10 values followed the same trends as for three-stage selection. Corresponding Q10 values never exceeded those for the latter procedure, but differences were negligible after generation [BC.sub.2], irrespective of the choice of [n.sub.t] (Table 3). However, the total MDP number was reduced, compared with three-stage selection (about 15% for [n.sub.t] = 20 and 28% for [n.sub.t] = 200), and even more when compared with two-stage selection.

Variation in [n.sub.t] values for [BC.sub.1] to [BC.sub.3] with the restriction [Sigma][n.sub.t] = 300 hardly influenced the Q10 values reached in [BC.sub.3] under two-stage selection (Table 4). In contrast, the number of MDP required was strongly reduced with larger values for [n.sub.t] in advanced backcross generations. In comparison to the ratio 1:1:1, increasing ratios of [n.sub.t] reduced the required number of MDP up to 50%, while decreasing ratios of [n.sub.t] increased the required number of MDP up to 150%. Variation of [n.sub.t] in three- and four-stage selection had only marginal influence on both the RPG and the required number of MDP for ratios of 3:2:1 to 1:2:4. A reduction in RPG was observed for the ratio 1:3:9 (Table 4).

Table 4. Simulation results for the 10% percentile (Q10) of the distribution of the recurrent parent genome (RPG) and total number of marker data points (MDP) required in a backcross program to introgress one target allele, for increasing and decreasing population sizes [n.sub.t]. Values for MDP are rounded to multiples of ten.
                          Ratio [n.sub.1]: [n.sub.2]: [n.sub.3]

Generation                    3:2:1   1:1:1   2:3:4   1:2:3

                              Number of individuals [n.sub.t]

[BC.sub.1]                      150     100      66      50
[BC.sub.2]                      100     100     100     100
[BC.sub.3]                       50     100     133     150

                                    Q10 (%) of the RPG

Two-stage selection
[BC.sub.1]                     81.6    80.7    80.0    79.3
[BC.sub.2]                     93.8    93.6    93.2    93.1
[BC.sub.3]                     97.3    97.4    97.4    97.4

Three-stage selection
[BC.sub.1]                     72.8    73.1    73.7    73.1
[BC.sub.2]                     90.5    90.0    89.5    88.8
[BC.sub.3]                     97.0    97.1    97.1    97.0

Four-stage selection
[BC.sub.1]                     71.2    71.6    72.0    72.0
[BC.sub.2]                     88.5    88.3    88.0    87.4
[BC.sub.3]                     96.5    96.7    96.8    96.8

                              Number of MDP required in total

Two-stage selection
[BC.sub.1]                     6010    4000    2680    2000
[BC.sub.2]                     7120    5180    3910    3290
[BC.sub.3]                     7240    5430    4280    3720

Three-stage selection
[BC.sub.1]                      750     590     450     370
[BC.sub.2]                     1740    1390    1070     930
[BC.sub.3]                     1930    1820    1690    1660

Four-stage selection
[BC.sub.1]                      480     430     350     300
[BC.sub.2]                     1070     910     740     640
[BC.sub.3]                     1310    1390    1400    1400

                          Ratio [n.sub.1]: [n.sub.2]: [n.sub.3]

Generation                     1:3:5   1:2:4   1:3:9

                            Number of individuals [n.sub.t]

[BC.sub.1]                       33      43      23
[BC.sub.2]                      100      86      68
[BC.sub.3]                      166     171     209

                                Q10 (%) of the RPG

Two-stage selection
[BC.sub.1]                     78.3    78.9    77.1
[BC.sub.2]                     92.8    92.8    91.9
[BC.sub.3]                     97.4    97.4    97.3

Three-stage selection
[BC.sub.1]                     72.3    72.8    71.4
[BC.sub.2]                     88.1    88.3    86.9
[BC.sub.3]                     96.9    97.0    96.7

Four-stage selection
[BC.sub.1]                     71.5    71.9    71.1
[BC.sub.2]                     87.0    87.0    86.0
[BC.sub.3]                     96.6    96.6    96.3

                         Number of MDP required in total

Two-stage selection
[BC.sub.1]                     1370    1720     920
[BC.sub.2]                     2720    2850    1900
[BC.sub.3]                     3230    3380    2650

Three-stage selection
[BC.sub.1]                      290     340     250
[BC.sub.2]                      740     790     580
[BC.sub.3]                     1620    1680    1760

Four-stage selection
[BC.sub.1]                      260     290     240
[BC.sub.2]                      540     570     440
[BC.sub.3]                     1400    1450    1500


Recurrent Parent Genome

In analogy to response to selection for a quantitative character with a normal distribution (Falconer and Mackay, 1996, p. 185), response to selection for the RPG in background selection can be calculated as R = i [Sigma] r. Here, i denotes the selection intensity, [Sigma] the standard deviation of the RPG, and r the correlation between the proportion of recurrent parent alleles at marker loci and the proportion of recurrent parent alleles across the whole genome. Values of [Sigma] and r for the three selection strategies are given in Table 5.

Table 5. Factors determining response to marker-assisted selection for the recurrent parent genome (RPG) in backcrossing: [Sigma] = standard deviation of the RPG and r = correlation between the proportion of recurrent parent alleles at marker loci and the proportion of recurrent parent alleles across the whole genome are given for the carrier chromosome, the non-carrier chromosomes, and for all chromosomes. Only individuals carrying the target allele are considered.
                                                 Standard deviation [Sigma]                  Correlation r

[n.sub.1]:[n.sub.2]:[n.sub.3]   Chromosomes    [BC.sub.1]   [BC.sub.2]   [BC.sub.3]    [BC.sub.1]   [BC.sub.2]   [BC.sub.3]

Two-stage selection
100:100:100                     carrier          0.125        0.112        0.067         0.964        0.947        0.894
                                non-carrier      0.055        0.029        0.013         0.911        0.813        0.642
                                all              0.051        0.027        0.012         0.913        0.814        0.681
50:100:150                      carrier          0.125        0.117        0.068         0.964        0.948        0.899
                                non-carrier      0.055        0.031        0.013         0.911        0.830        0.669
                                all              0.051        0.029        0.013         0.913        0.830        0.700
150:100:50                      carrier          0.125        0.113        0.067         0.964        0.947        0.896
                                non-carrier      0.055        0.028        0.012         0.911        0.807        0.642
                                all              0.051        0.026        0.012         0.913        0.807        0.683

Three-stage selection
100:100:100                     carrier          0.125        0.096        0.055         0.964        0.918        0.698
                                non-carrier      0.055        0.041        0.020         0.910        0.884        0.795
                                all              0.051        0.037        0.019         0.913        0.877        0.795

Four-stage selection
100:100:100                     carrier          0.125        0.088        0.036         0.964        0.887        0.380
                                non-carrier      0.055        0.043        0.024         0.911        0.896        0.883
                                all              0.051        0.039        0.022         0.913        0.887        0.830
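
To make the relation R = i [Sigma] r concrete, the following Python sketch evaluates it with the Table 5 values for all chromosomes under two-stage selection with 100:100:100 individuals. The selection intensity i is not reported in this section, so i = 1 is used purely as a placeholder; only the ratio between generations is meaningful here. The function name is hypothetical.

```python
def response_to_selection(intensity, sd_rpg, correlation):
    """R = i * sigma * r for background selection on the RPG."""
    return intensity * sd_rpg * correlation


i = 1.0  # placeholder selection intensity (assumption, not from the source)
r_bc1 = response_to_selection(i, 0.051, 0.913)  # BC1, all chromosomes
r_bc3 = response_to_selection(i, 0.012, 0.681)  # BC3, all chromosomes
print(round(r_bc1, 4), round(r_bc3, 4), round(r_bc1 / r_bc3, 1))
```

At equal selection intensity the response per generation is thus several times larger in [BC.sub.1] than in [BC.sub.3], because both [Sigma] and r decline in advanced generations.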

In addition to background selection for RPG, the backcross process itself increases the RPG values in each backcross generation. By expectation, the donor genome proportion is halved with each backcross generation, irrespective of its amount present in the nonrecurrent parent. This implies that increasing the RPG proportion by selection in a backcross generation has a carry-over rate of one half to the next backcross generation. Consequently, increasing the RPG by selection is more effective (with regard to the RPG in the end product of the breeding program), if it is realized in an advanced backcross generation. This proposition can be proved analytically and is a generalization of results of Hospital et al. (1992). They demonstrated that a single generation background selection is most efficient if selection is performed in the last backcross generation.
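
The carry-over rate of one half can be illustrated with a small worked example in Python. The function name and the gain of 4 percentage points are hypothetical; the sketch only applies the expected halving per generation described above.

```python
def gain_in_final_generation(gain, generation, final_generation):
    """Expected part of a selection gain in RPG that remains in the end product.

    gain: RPG increase (percentage points) achieved by selection in BC_generation.
    The gain is halved by each further backcross up to BC_final_generation.
    """
    return gain * 0.5 ** (final_generation - generation)


# A gain of 4 percentage points, realized in BC1, BC2 or BC3 of a three-generation program.
for t in (1, 2, 3):
    print(f"BC{t}: {gain_in_final_generation(4.0, t, 3)}")
```

The same gain contributes in full only when it is realized in the last backcross generation.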

Marker-assisted selection is different from selection for a quantitative character, where a high selection intensity in early generations can take advantage of the large segregation variance among individuals. There is no such optimum generation for applying high selection intensities in marker-assisted background selection. If large [BC.sub.1] population sizes are chosen, the response to selection is high due to large values of [Sigma] and r (Table 5). However, in each of the following backcross generations this initial gain in RPG is halved. In contrast, the response to background selection achieved by large population sizes in the last backcross generation is fully recovered in the breeding product and not diluted by further backcrossing, even if due to smaller [Sigma] and r values (Table 5) the absolute values of the response to selection are smaller in advanced backcross generations. A compensation of both effects explains why in [BC.sub.3] the content of RPG in the selected individual is hardly influenced by the ratio of population sizes used in [BC.sub.1] to [BC.sub.3], given a constant total number of individuals.

Compared with two-stage selection, three-stage and four-stage selection place greater emphasis on the carrier chromosome in generation [BC.sub.1]. This is illustrated by the low value of r = 0.38 for the carrier chromosome in [BC.sub.3] under four-stage selection (Table 5). Because of the high selection pressure in early backcross generations, almost all markers on the carrier chromosome are homozygous for the recurrent parent allele. Hence, they capture only poorly the differences in RPG that still exist among individuals.

Preferential selection of individuals with high RPG content on the carrier chromosome in [BC.sub.1] and [BC.sub.2] results in a lower overall RPG content, because the non-carrier chromosomes, on which only a reduced selection pressure is applied, form the major part of the genome. In three- or four-stage selection, selection on the non-carrier chromosomes is less intensive in [BC.sub.1]. Therefore, the corresponding value of r in [BC.sub.3] is distinctly higher. This results in efficient [BC.sub.3] selection, which compensates for the lower RPG values derived from [BC.sub.1] and [BC.sub.2].

Number of Marker Data Points Required

The major portion of the MDP required in a two-stage selection program with constant [n.sub.t] is required in generation [BC.sub.1] (Table 4). Its expectation is m[n.sub.1]/2, where m is the total number of marker loci. A reduction in [n.sub.1] results in a proportional reduction of the MDP required in generation [BC.sub.1] (Table 4). In advanced backcross generations, many marker loci are already fixed for the recurrent parent allele. This results in a substantial MDP decrease if larger population sizes are used in advanced backcross generations instead of [BC.sub.1] or [BC.sub.2].

In the second selection step of three-stage selection, only the flanking markers are analyzed in all carriers of the target allele. Hence, instead of m[n.sub.1]/2 MDP, only [n.sub.1] MDP are required by expectation. Subsequently, analysis of the remaining marker loci in the third selection step requires (m - 2)a MDP for the a preselected individuals. This smaller number of MDP in generation [BC.sub.1] results in the observed overall MDP reduction (up to 50%) (Table 4). In four-stage selection, a further MDP reduction is achieved by investigating only the i non-flanking markers on the carrier chromosome in the third selection step. This requires ia MDP instead of (m - 2)a. The whole marker set is analyzed only on the b individuals preselected in the third step, which requires (m - 2 - i)b MDP.
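The expected MDP counts for generation [BC.sub.1] given above can be compared side by side in a small sketch. The symbols m, [n.sub.1], a, b, and i follow the text; the function names and the numerical values in the example are hypothetical and are not taken from the simulations of this study.

```python
def mdp_two_stage(m, n1):
    """All m markers are scored on the expected n1/2 carriers of the target allele."""
    return m * n1 / 2


def mdp_three_stage(m, n1, a):
    """Two flanking markers on the expected n1/2 carriers (n1 MDP),
    then the remaining m - 2 markers on the a preselected individuals."""
    return n1 + (m - 2) * a


def mdp_four_stage(m, n1, a, b, i):
    """Flanking markers on the expected n1/2 carriers (n1 MDP), the i non-flanking
    markers of the carrier chromosome on a individuals, and the remaining
    m - 2 - i markers on the b individuals preselected in the third step."""
    return n1 + i * a + (m - 2 - i) * b


if __name__ == "__main__":
    # Hypothetical design: 80 markers, 100 BC1 individuals, 20 and 5 preselected
    # individuals, 6 non-flanking markers on the carrier chromosome.
    m, n1, a, b, i = 80, 100, 20, 5, 6
    print("two-stage:  ", mdp_two_stage(m, n1))
    print("three-stage:", mdp_three_stage(m, n1, a))
    print("four-stage: ", mdp_four_stage(m, n1, a, b, i))
```

These expressions apply to [BC.sub.1] only, where no marker locus is yet fixed for the recurrent parent allele; in later generations the counts are lower because fixed loci need not be scored again.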

Transferability to Other Situations in Breeding

Like simulations in general, the results presented in this study depend on the underlying model. In the present context, simulation results are influenced by (i) the theoretical assumptions underlying the simulation of the meiotic recombination and (ii) the choice of genetic and dimensioning parameters.

We chose the map of Schon et al. (1994) because it represents a typical linkage map used in breeding programs. To investigate the robustness of our results with regard to the target allele position, we analyzed two additional scenarios.
   1. The target locus was located on Chromosome 7, at a distance of 40 cM
   from the telomere.

   2. The target locus was assigned to a random position on the genome in each
   repetition of the simulation.

While the absolute Q10 values under these scenarios differed slightly from the results presented here, the general trends were the same (data not shown).

Simulations with varying linkage maps demonstrated that an average marker density higher than one marker per 20 cM results in only a marginal increase of Q10 values, but requires a substantially larger number of MDP (Frisch et al., 1998). In generations [BC.sub.1] and [BC.sub.2], a chromosome consists of only a few segments of different origin (for a chromosome of length l, measured in Morgan, the expected number of segments in [BC.sub.1] is l + 1). Hence, the bottleneck limiting marker-assisted selection in early backcross generations is the number of chromosome segments itself, not the number of markers used for monitoring the composition of the chromosomes.
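The expectation of l + 1 segments can be checked with a small Monte Carlo sketch. It assumes crossovers without interference, so that the number of crossovers on a chromosome of length l Morgan follows a Poisson distribution with mean l; this is a simplifying assumption for illustration and not necessarily the recombination model used in the simulations of this study. All function names are hypothetical.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate with Knuth's multiplication method (small means only)."""
    limit = math.exp(-mean)
    count, product = 0, rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def mean_segments(length_morgan, n_gametes=100_000, seed=1):
    """Average number of segments of alternating origin per BC1 chromosome.

    A gamete with k crossovers carries k + 1 segments.
    """
    rng = random.Random(seed)
    total = sum(sample_poisson(length_morgan, rng) + 1 for _ in range(n_gametes))
    return total / n_gametes


if __name__ == "__main__":
    length = 1.6  # Morgan; 16 Morgan spread evenly over 10 chromosomes
    print(f"simulated mean: {mean_segments(length):.3f}, expected: {length + 1:.3f}")
```

With so few segments per chromosome, adding markers beyond a moderate density mostly yields redundant information about segments that are already tagged.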

With a linkage map with equally spaced markers (Frisch et al., 1998), smaller population sizes and fewer MDP were required than with the linkage map underlying this study, which has regions of 60 or 80 cM length not covered by markers. For example, with a linkage map uniformly covered by markers, a saving of four backcross generations can be achieved with population sizes that resulted in a saving of three backcross generations with the linkage map used in this study (Frisch et al., 1998). This shows that an equally covered linkage map is mandatory for obtaining maximum RPG values in [BC.sub.2] and [BC.sub.3].

The differences in Q10 and MDP values between the selection strategies are caused by a different treatment of carrier and non-carrier chromosomes. Hence, the ratio between carrier and non-carrier chromosomes determines the different outcome of the selection strategies. The amount of reduction in the required number of MDP reported here is specific to a genome of 10 chromosomes and a map length of 16 Morgan. In crops with genomes consisting of fewer than 10 chromosomes, the differences are expected to be smaller, because the ratio between carrier and non-carrier chromosomes increases. For more than 10 chromosomes, the proportion of the genome on the non-carrier chromosomes increases and, consequently, the differences between the selection strategies are expected to be greater.

The presented results should cover a wide range of gene introgression programs in crops with 2x = 20 and also 2x = 18 chromosomes, such as maize or sugar beet (Beta vulgaris L.). For different linkage maps, our simulation software PLABSIM (Frisch et al., 1999b) can be used for conducting simulations to compare the effect of selection strategies or breeding designs in marker-assisted backcrossing.

Design of Marker-Assisted Backcross Programs

Tanksley et al. (1989) stated that a sufficiently high proportion of the RPG is recovered after three generations of marker-assisted backcrossing. Hospital et al. (1992) expected a saving of two backcross generations because of marker-assisted background selection. This is in accordance with our simulations, resulting in a saving of two to four backcross generations in the transfer of a single target allele (Table 3).

The backcross procedure can be terminated after four instead of six backcross generations even with small population sizes and a limited number of MDP (Table 2). This demonstrates that marker technology can be advantageous even when the resources in a breeding program are limited. A shortening from six to three backcross generations can be regarded as a realistic goal for practical breeders, because only moderate population sizes and numbers of MDP are required, and the breeding program takes half as many generations as it does without markers. As demonstrated by our results, marker-assisted selection has the potential to reach in generation [BC.sub.3] the same level of RPG as reached in [BC.sub.7] without use of markers. However, large numbers of MDP are required to unlock this potential. With the marker systems presently available, this application is still unrealistic or at least not economical.

In generations [BC.sub.1] and [BC.sub.2], two-stage selection is superior to three- and four-stage selection because it reaches a larger RPG proportion with a given population size (Table 3). Thus, two-stage selection seems appropriate in two-generation backcross programs with limited population size. Furthermore, it can be applied without information about the marker linkage map and, hence, is the only option for application in generation [BC.sub.1], if no marker linkage map is available.

An increasing population size [n.sub.t] is preferable over a constant population size in a two-stage selection program, because the number of marker analyses is reduced without reducing the Q10 values. Limits for varying [n.sub.t] are the practical restrictions on handling large values of [n.sub.3] and the risk of losing the target allele in [BC.sub.1] with low values of [n.sub.1]. With probability P = [(1/2).sup.n1], none of the [n.sub.1] backcross individuals carries the target allele. Hence, a minimum of 15 to 20 individuals per generation should be produced to obtain, with near certainty, at least one carrier of the target allele.
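The risk of losing the target allele can be evaluated directly from this probability. The sketch below assumes that each backcross individual carries the target allele independently with probability 1/2; the function names and the success probabilities in the example are hypothetical.

```python
def prob_no_carrier(n):
    """Probability that none of n backcross individuals carries the target allele."""
    return 0.5 ** n


def min_population_size(success_prob):
    """Smallest n giving at least one carrier with probability >= success_prob."""
    if not 0.0 < success_prob < 1.0:
        raise ValueError("success_prob must lie strictly between 0 and 1")
    n = 1
    while 1.0 - prob_no_carrier(n) < success_prob:
        n += 1
    return n


if __name__ == "__main__":
    for n in (5, 10, 15, 20):
        print(f"n = {n:2d}: P(no carrier) = {prob_no_carrier(n):.2e}")
    for q in (0.99, 0.9999):
        print(f"success probability {q}: n >= {min_population_size(q)}")
```

For 15 to 20 individuals the probability of obtaining no carrier falls to roughly 3 in 100,000 or below, which is consistent with the recommendation above.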

Reduction of the linkage drag is one of the main goals in marker-assisted backcrossing (Tanksley et al., 1989). Theoretical results (Stam and Zeven, 1981) show that the donor segment attached to the target allele remains surprisingly large in backcrossing without marker-assisted selection, even in advanced backcross generations. In introgression of target alleles from unadapted germplasm, linkage drag is the main cause of the differences between the recipient line and the converted line. Tightly linked flanking markers can be used for a substantial reduction of the linkage drag. Individuals with recombination between tightly linked loci have a low frequency in backcross populations and may not be selected when two-stage selection is applied. Hence, if reduction of the linkage drag has high priority, three- or four-stage selection should be applied. This avoids the need for additional backcross generations at the end of the breeding program to ensure detection of a recombination event between tightly linked flanking markers and the target locus.

While three- and four-stage selection yield considerably lower RPG values in [BC.sub.2] than two-stage selection, the slightly lower Q10 values reached in [BC.sub.3] can be compensated for by larger population sizes [n.sub.3]. Thus, without restrictions on [n.sub.3], applying three- or four-stage selection in three-generation backcross programs results in a reduction of the required number of MDP by as much as 50 or 75% (Table 3). These strategies combine economical marker use with the possibility of efficiently reducing the linkage drag.

In a separate paper (Frisch et al., 1999a), we give equations for calculating the minimal population size for obtaining at least one carrier of the target allele homozygous for the recurrent parent allele at one or both flanking markers. The required population size depends on (i) the map distances between the flanking markers and the target allele and (ii) the chosen probability of success. These results can be used for the design of efficient three- and four-stage selection backcross programs in marker-assisted background selection.


The financial support from fellowships by KWS Kleinwanzlebener Saatzucht AG, Einbeck, Germany, and Pioneer Hi-Bred Intl. Inc., Johnston, IA, USA, to M. Frisch is gratefully acknowledged.

Abbreviations: [BC.sub.t], tth backcross generation; cM, centimorgan; MDP, marker data points; QTL, quantitative trait locus; RPG, recurrent parent genome.


Allard, R.W. 1960. Principles of plant breeding. Wiley, New York.

Crosby, J.L. 1973. Computer simulation in genetics. Wiley, New York.

Falconer, D.S., and T.F. Mackay. 1996. Introduction to quantitative genetics. Longman Group Limited, Harlow, UK.

Frisch, M., M. Bohn, and A.E. Melchinger. 1998. Markerdichte und Anzahl benötigter Markeranalysen in markergestützten Rückkreuzungsprogrammen. Vorträge für Pflanzenzüchtung 42:1-3.

Frisch, M., M. Bohn, and A.E. Melchinger. 1999a. Minimum sample size and optimal positioning of flanking markers in marker-assisted backcrossing for transfer of a target gene. Crop Sci. 39:967-975.

Frisch, M., M. Bohn, and A.E. Melchinger. 1999b. PLABSIM: Software for simulations of marker-assisted backcrossing. J. Heredity (In press).

Haldane, J.B.S. 1919. The combination of linkage values and the calculation of distance between the loci of linkage factors. J. Genet. 8:299-309.

Hospital, F., and A. Charcosset. 1997. Marker-assisted introgression of quantitative trait loci. Genetics 147:1469-1485.

Hospital, F., C. Chevalet, and P. Mulsant. 1992. Using markers in gene introgression breeding programs. Genetics 132:1199-1210.

Lander, E.S., P. Green, J. Abrahamson, A. Barlow, M.J. Daly, S.E. Lincoln, and L. Newburg. 1987. MAPMAKER: An interactive computer package for constructing primary genetic linkage maps of experimental and natural populations. Genomics 1:174-181.

Melchinger, A.E. 1990. Use of molecular markers in breeding for oligogenic disease resistance. Plant Breeding 104:1-19.

Openshaw, S.J., S.G. Jarboe, and W.D. Beavis. 1994. Marker-assisted selection in backcross breeding. In Proceedings of the Symposium "Analysis of Molecular Marker Data", Corvallis, OR. 5-6 Aug. 1994. Am. Soc. Hortic. Sci. and Crop Sci. Soc. Am.

Ragot, M., M. Biasiolli, M.F. Delbut, A. Dell'Orco, L. Malgarini, P. Thevenin, J. Vernoy, J. Vivant, R. Zimmermann, and G. Gay. 1995. Marker-assisted backcrossing: a practical example. In Techniques et utilisations des marqueurs moléculaires. Montpellier, France. 29-31 March 1994. INRA, Paris.

Ribaut, J.M., X. Hu, D. Hoisington, and D. Gonzalez-de-Leon. 1997. Use of STS and SSRs as rapid and reliable preselection tools in a marker-assisted selection backcross scheme. Plant Mol. Biol. Rep. 15:154-162.

Schon, C.C., A.E. Melchinger, J. Boppenmaier, E. Brunklaus-Jung, R.G. Herrmann, and J.F. Seitzer. 1994. RFLP mapping in maize: Quantitative trait loci affecting testcross performance of elite European flint lines. Crop Sci. 34:378-389.

Stam, P. 1979. Interference in genetic crossing over and chromosome mapping. Genetics 92:573-594.

Stam, P., and A.C. Zeven. 1981. The theoretical proportion of the donor genome in near-isogenic lines of self-fertilizers bred by backcrossing. Euphytica 30:227-238.

Stuber, C.W. 1995. Mapping and manipulating quantitative traits in maize. Trends Genetics 11:477-481.

Tanksley, S.D., N.D. Young, A.H. Paterson, and M.W. Bonierbale. 1989. RFLP mapping in plant breeding: new tools for an old science. Bio/Technology 7:257-263.

Visscher, P.M., C.S. Haley, and R. Thompson. 1996. Marker-assisted introgression in backcross breeding programs. Genetics 144:1923-1932.

Matthias Frisch, Martin Bohn, and Albrecht E. Melchinger (*)

Institute of Plant Breeding, Seed Science, and Population Genetics, University of Hohenheim, 70593 Stuttgart, Germany. Received 24 Nov. 1998. (*) Corresponding author (
COPYRIGHT 1999 Crop Science Society of America

Author:Frisch, Matthias; Bohn, Martin; Melchinger, Albrecht E.
Publication:Crop Science
Date:Sep 1, 1999