Single-nucleotide polymorphism allele frequencies determined by quantitative kinetic assay of pooled DNA.
The 5' nuclease assay (TaqMan®) uses fluorogenic allele-specific detection probes that allow PCR amplification and allele detection in a single procedure (11). To discriminate alleles, each fluorogenic probe is labeled with a different fluorescent reporter dye. The reaction is kinetically monitored in "real time" to quantify the ratio of the two alleles (12). Taking advantage of the specificity and sensitivity of kinetic sequence detection, we developed a method for estimating SNP allele frequencies in pooled DNA samples. Accuracy was evaluated using constructed pools in which allele frequencies were incrementally varied and by comparing the allele frequencies determined from pooled DNA with allele frequencies determined by individual genotyping. For allele frequency determinations, we found that the pooled DNA sample approach is rapid, accurate, and reproducible and requires no post-PCR processing.
Genomic DNA from 460 Finnish individuals was used. Before creating pools, we measured the DNA concentration in each sample by SYBR Green I (Molecular Probes) fluorescence (13) in a CytoFluor 4000 multiwell plate reader (PerSeptive Biosystems). The DNA concentrations for the calibrators used for these assays were determined spectrophotometrically. Coefficients of determination (R²) between SYBR Green I fluorescence and DNA concentration were >0.99.
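The calibration step can be illustrated with a minimal sketch, assuming a simple linear standard curve fitted by least squares. The calibrator readings, the sample reading, and the function names below are hypothetical and are not data or software from the study.

```python
def fit_standard_curve(conc, signal):
    """Least-squares line signal = slope * conc + intercept, plus R^2."""
    n = len(conc)
    mean_x = sum(conc) / n
    mean_y = sum(signal) / n
    sxx = sum((x - mean_x) ** 2 for x in conc)
    syy = sum((y - mean_y) ** 2 for y in signal)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, signal))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # For a straight-line fit, R^2 equals the squared Pearson correlation.
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def concentration_from_signal(signal, slope, intercept):
    """Invert the standard curve to estimate DNA concentration."""
    return (signal - intercept) / slope


# Hypothetical calibrators: concentration in ng/uL, fluorescence in arbitrary units.
calibrator_conc = [0.0, 2.5, 5.0, 10.0, 20.0]
calibrator_signal = [12.0, 262.0, 510.0, 1015.0, 2010.0]

slope, intercept, r_squared = fit_standard_curve(calibrator_conc, calibrator_signal)
sample_conc = concentration_from_signal(760.0, slope, intercept)
print(f"R^2 = {r_squared:.4f}, sample = {sample_conc:.2f} ng/uL")
```

A curve that falls below the stated R² of 0.99 would suggest repeating the calibrators before any sample is diluted for pooling.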
Individual DNA samples were diluted to 5 mg/L (5 ng/µL) with Tris-EDTA buffer. To prepare dilution pools, we mixed different amounts of genomic DNA from known homozygotes for the two alleles. In this way, we created pools with allele 1 and allele 2 frequencies ranging from 0% to 100% in increments of 10%. To prepare population pools, we added 5 ng (1 µL) of each DNA to the pool.
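Because every sample is at the same concentration, the volume ratio of the two homozygote DNAs equals the intended allele ratio. The sketch below works through that arithmetic; the 100 µL pool volume and the function names are assumptions made for illustration, as the source does not state the volumes that were mixed.

```python
STOCK_NG_PER_UL = 5.0  # stated in the text: samples diluted to 5 ng/uL


def dilution_pool_volumes(allele1_percent, total_ul=100.0):
    """Volumes (uL) of allele 1 and allele 2 homozygote DNA for one pool.

    total_ul is a hypothetical pool volume chosen for illustration.
    """
    if not 0 <= allele1_percent <= 100:
        raise ValueError("allele frequency must be between 0 and 100%")
    allele1_ul = total_ul * allele1_percent / 100.0
    return allele1_ul, total_ul - allele1_ul


def dilution_series(step=10, total_ul=100.0):
    """Pools from 0% to 100% allele 1 in equal increments."""
    return {pct: dilution_pool_volumes(pct, total_ul) for pct in range(0, 101, step)}


def population_pool(n_individuals, ul_each=1.0):
    """Total volume (uL) and DNA mass (ng) when each sample adds ul_each."""
    total_ul = n_individuals * ul_each
    return total_ul, total_ul * STOCK_NG_PER_UL


series = dilution_series()
for pct, (v1, v2) in series.items():
    print(f"{pct:3d}% allele 1: {v1:5.1f} uL homozygote 1 + {v2:5.1f} uL homozygote 2")

pool_ul, pool_ng = population_pool(460)
print(f"460-sample pool: {pool_ul:.0f} uL, {pool_ng:.0f} ng")
```

The equal-volume rule holds only if the concentration measurements are accurate, which is why quantification precedes pooling.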
Three SNPs were allelotyped using a quantitative kinetic PCR assay. Two SNPs were derived from the catechol-O-methyltransferase gene (COMT 1883C → G and COMT 1243G → A). The third SNP was from the dopamine receptor 2 gene (DRD2 TaqIB A → G). PCR primers and allele-specific detection probes were designed based on the nucleotide sequences of COMT (GenBank accession no. A0005663) and DRD2 (GenBank accession no. AF050737), respectively, using Primer Express software [Applied Biosystems (ABI)]. The primer and probe sequences for COMT 1883C → G were as follows (the lowercase letters indicate the SNPs):
Forward primer, 5'-GGGGGCCTACTGTGGCTACT-3'
Reverse primer, 5'-TCAGGCATGCACACCTTGTC-3'
Allele C probe, FAM-CGA000TcATCACCATCGAGATCA-TAMRA
Allele G probe, VIC-CGA000TgATCACCATCGAGATCA-TAMRA
The primer and probe sequences for COMT 1243G → A were as follows:
Forward primer, 5'-AGGCACAAGGCTGGCATT-3'
Reverse primer, 5'-CCACACGCCCCTTTGCT-3'
Allele G probe, FAM-T000CCTCTGCgAACACAAGG-TAMRA
Allele A probe, VIC-ACCTT000CCTCTGCaAACACAAG-TAMRA
The primer and probe sequences for DRD2 TaqIB A → G were as follows:
Forward primer, 5'-GCCCCTCCTCTCCGTTCTC-3'
Reverse primer, 5'-TCATGTGGTTCCTGCTGCC-3'
Allele A probe, FAM-TCAGAATCACCTATTCaAAAGGCGAATCC-TAMRA
Allele G probe, VIC-CAGAATCACCTATTCgAAAGGCGAATCC-TAMRA
FAM is 6-carboxyfluorescein, and TAMRA is 6-carboxytetramethylrhodamine.
All PCR primers and probes were obtained from ABI. PCR amplification and allele detection were performed with an ABI Prism 7700 Sequence Detection System. The 25-µL reaction was composed of PCR Master Mixture (ABI) diluted in Tris-EDTA containing 900 nM each of the forward and reverse primers, 100-200 nM each of the allele-specific probes, depending on the assay, and 10 ng of pooled DNA template. The general amplification conditions were an initial incubation step of 2 min at 50 °C to allow uracil DNA glycosylase-mediated elimination of exogenous PCR product contamination (14) and an enzyme heat activation step of 5 min at 95 °C, followed by 35 cycles of 30 s at 95 °C for denaturation and 1 min at 60 °C for annealing and extension. To decrease crossover hybridization of allele-specific probes, the annealing temperature was optimized at 60-65 °C. For each experiment, pools were quantified in triplicate, and four independent experiments were performed for each SNP.
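The reaction composition and cycling program above reduce to simple arithmetic, sketched here. The 25 µL volume, the final concentrations, the 10 ng template, and the cycling steps come from the text; the stock concentrations (10 µM primer, 5 µM probe) are hypothetical values picked for the example.

```python
REACTION_UL = 25.0     # reaction volume stated in the text
POOL_NG_PER_UL = 5.0   # pooled DNA concentration stated in the text

PRIMER_STOCK_UM = 10.0  # hypothetical stock concentration
PROBE_STOCK_UM = 5.0    # hypothetical stock concentration


def stock_volume_ul(final_nm, stock_um, reaction_ul=REACTION_UL):
    """Stock volume giving the requested final concentration (C1*V1 = C2*V2)."""
    return final_nm * reaction_ul / (stock_um * 1000.0)


def template_volume_ul(mass_ng, conc_ng_per_ul=POOL_NG_PER_UL):
    """Volume of pooled DNA that delivers the requested mass."""
    return mass_ng / conc_ng_per_ul


def programmed_minutes(cycles=35):
    """Programmed hold time: 2 min at 50 C, 5 min at 95 C, then
    cycles of 30 s + 1 min. Temperature ramping is ignored."""
    return 2.0 + 5.0 + cycles * (0.5 + 1.0)


primer_ul = stock_volume_ul(900, PRIMER_STOCK_UM)   # each primer
probe_ul = stock_volume_ul(200, PROBE_STOCK_UM)     # upper end of 100-200 nM
template_ul = template_volume_ul(10)

print(f"primer {primer_ul:.2f} uL, probe {probe_ul:.2f} uL, template {template_ul:.1f} uL")
print(f"programmed time {programmed_minutes():.1f} min")
```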
For allele-specific quantification, the threshold amplification cycle (Ct) was determined based on relative fluorescence intensity values (ΔRn) for each allele. Ct values were determined from the first PCR amplification cycle at which ΔRn was detected above a threshold. Ct can be a fractional cycle number. The mean Ct values from pools prepared in triplicate were used to calculate allele frequencies based on the equation of Germer et al. (8): allele frequency = 1/(2^(ΔCt′) + 1), where ΔCt is the Ct of allele 1 minus the Ct of allele 2 and ΔCt′ is ΔCt corrected by subtracting the ΔCt determined for the particular allele in heterozygous reference DNA. The "2" in the denominator is properly "1 + the initial replication efficiency", but initial efficiency is close to 100%. The average error (discrepancy) of allele frequency estimation was calculated as the mean of the absolute values for differences between measured allele frequencies from expected values. The percentage of relative error was calculated using the absolute value of the discrepancy, divided by the expected frequency, multiplied by 100. The SD was determined from four independent experiments.
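The calculation described above can be sketched as follows. The function names and the triplicate Ct values are hypothetical; the formula, the heterozygote correction, and the error definitions follow the text, with the `efficiency` parameter standing in for the "initial replication efficiency" (1.0 corresponds to 100%, which gives the base of 2).

```python
from statistics import mean


def allele1_frequency(ct_allele1, ct_allele2, delta_ct_het=0.0, efficiency=1.0):
    """Allele 1 frequency from replicate Ct values.

    delta_ct = mean Ct(allele 1) - mean Ct(allele 2); the heterozygote
    delta Ct is subtracted to correct for unequal probe performance.
    """
    delta_ct = mean(ct_allele1) - mean(ct_allele2)
    corrected = delta_ct - delta_ct_het
    return 1.0 / ((1.0 + efficiency) ** corrected + 1.0)


def relative_error_percent(measured, expected):
    """Absolute discrepancy divided by the expected frequency, times 100."""
    return abs(measured - expected) / expected * 100.0


def mean_abs_discrepancy(measured, expected):
    """Mean of the absolute differences between measured and expected values."""
    return mean(abs(m - e) for m, e in zip(measured, expected))


# Hypothetical triplicate Ct values: allele 1 crosses threshold two cycles earlier.
ct1 = [24.0, 24.1, 23.9]
ct2 = [26.0, 26.1, 25.9]
freq = allele1_frequency(ct1, ct2)
print(f"allele 1 frequency = {freq:.3f}")

# Table 1 entry: expected 10%, measured 7.50% for COMT 1883C -> G.
print(f"relative error = {relative_error_percent(7.5, 10.0):.1f}%")
```

A ΔCt′ of zero returns 0.5, the value expected for heterozygous reference DNA, which gives a quick sanity check on the sign convention.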
As shown in Table 1, minor alleles with frequencies >10% were accurately quantified. The mean discrepancy between expected and measured allele frequencies among pools was 1.7% for COMT 1883C → G, 3.4% for COMT 1243G → A, and 2.7% for DRD2 TaqIB A → G. At an allele frequency of 10%, the discrepancies between expected and measured allele frequencies were 2.5%, 2.3%, and 3.0% for COMT 1883C → G, COMT 1243G → A, and DRD2 TaqIB A → G, respectively, but the relative errors were high: 25% for COMT 1883C → G, 23% for COMT 1243G → A, and 30% for DRD2 TaqIB A → G. The relative errors for each SNP are shown in Table 1. The coefficient of determination (R²) between expected and measured allele frequencies was high across the full range of allele frequencies: 0.997, 0.985, and 0.994 for COMT 1883C → G, COMT 1243G → A, and DRD2 TaqIB A → G, respectively.
Five pools ranging in size from 67 to 460 individuals were created and allelotyped for the three SNPs (Table 1). Each pool was allelotyped in triplicate, and individual experiments were repeated four times. Allele frequencies in pools were compared with frequencies obtained by individual genotyping using the same amplification primers and detection probe combinations and using identical reaction conditions. Individual genotypes were validated as >99% reproducible by duplicate genotyping of 384 samples for each locus. Allele frequency comparisons of pooled samples and known estimates are shown in Table 1. The mean absolute discrepancies were 2.4% for COMT 1883C → G, 2.2% for COMT 1243G → A, and 2.7% for DRD2 TaqIB A → G. The relative errors for each SNP are also shown in Table 1. The SDs for four independent experiments were 1.4% for COMT 1883C → G, 2.0% for COMT 1243G → A, and 2.5% for DRD2 TaqIB A → G.
Accuracy was unrelated to the number of individuals in pools (r² between accuracy and sample pool size = 0.117).
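The agreement statistics reported above can be recomputed from tabulated values along the following lines. This sketch uses the constructed-pool entries for COMT 1883C → G as they appear in Table 1 (control excluded) and assumes that R² is the squared Pearson correlation. The table lists no 0% pool, so summary values computed this way need not match the figures quoted in the text exactly.

```python
from statistics import mean


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def mean_abs_discrepancy(measured, expected):
    """Mean absolute difference between measured and expected frequencies."""
    return mean(abs(m - e) for m, e in zip(measured, expected))


# Constructed pools, COMT 1883C -> G, allele 1 frequency in percent (Table 1).
expected = [100, 90, 80, 70, 60, 50, 40, 30, 20, 10]
measured = [99.1, 90.9, 78.7, 71.9, 63.3, 47.4, 38.9, 27.0, 18.4, 7.5]

r2 = r_squared(expected, measured)
mad = mean_abs_discrepancy(measured, expected)
print(f"R^2 = {r2:.3f}, mean absolute discrepancy = {mad:.2f}%")
```

The same two functions apply to the population pools, with the individually genotyped frequency taking the place of the expected value.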
Reproducibility and accuracy are critical aspects to consider in allelotyping and genotyping. Our purpose here was to evaluate the accuracy of pooled 5' nuclease allelotyping. Comparison with individual 5' nuclease genotyping was thus most appropriate for this purpose, but we cannot rule out the possibility of systematic error in any 5' nuclease assay that might be developed. This is because the 5' nuclease assay has the potential for a genotyping "miscall" if a second unknown SNP is located close to the targeted SNP (15), where it might affect the annealing of probe or primers. However, none of the three SNPs we typed has an adjacent abundant second SNP, as revealed by searches in the Celera and dbSNP databases. The reproducibility of individual genotyping with the 5' nuclease assay is >99% based on our observations and those of Ranade et al. (16). In comparison with the quantitative accuracy of pooled allelotyping with 5' nuclease, the accuracy with pooled allele frequencies for microsatellite polymorphisms is lower because of DNA polymerase "stutter" and differential allele amplification. In contrast, SNPs have a simple sequence structure that is readily amenable to the resolution of individual quantitative allele signals. When we used the optimized 5' nuclease assay conditions for the three SNPs allelotyped in this study, there was little crossover signal between alleles, and because SNPs are diallelic, there is only one allele-allele interaction to consider.
The main advantage of 5' nuclease pooled allelotyping is that it is a rapid, single-step procedure, requiring a minimum of reagents and no post-PCR processing. Another pooled allelotyping method, which uses kinetic PCR and fluorescence-coupled detection of allele-specific products, requires separate reactions for each allele (8). The separate reactions create the possibility for variation in detection efficiency of the two alleles, e.g., if one reaction fails. Most critically, the 5' nuclease method is well suited for real-time monitoring of allele-specific signals during DNA amplification, permitting more accurate allele frequency determinations than end-point methods and enabling the assay to be applied across a large dynamic range of allele frequencies. For a representative SNP, the pooled 5' nuclease assay should permit accurate allelotyping in the allele frequency range of 10-90%. A drawback of 5' nuclease genotyping is the need for two specific fluorescently tagged probes for each SNP. The initial cost of these individual probes is intrinsically high. With greater use of the 5' nuclease assay, however, large-scale applications will lead to lower costs.
In conclusion, DNA pooling with the 5' nuclease assay is a highly accurate and potentially inexpensive method for allelotyping. This pooled allelotyping method makes use of the same reagents, equipment, and general methodology required for individual genotyping. Accurate measurement of DNA concentrations and careful preparation of pools are critical for the success of pooled allelotyping. Application of 5' nuclease allelotyping to large numbers of SNP loci, using both pooled and individual samples, would probably depend on the commercial availability of the fluorescently labeled probes at lower cost, a practical possibility. This method can markedly reduce the costs of reagents, save time, and conserve precious DNA samples. It may also be a practical approach for use in genome-wide association studies.
(1.) Wang DG, Fan JB, Siao CJ, Berno A, Young P, Sapolsky R, et al. Large-scale identification, mapping, and genotyping of single-nucleotide polymorphisms in the human genome. Science 1998;280:1077-82.
(2.) Arnheim N, Strange C, Erlich H. Use of pooled DNA samples to detect linkage disequilibrium of polymorphic restriction fragments and human disease: studies of HLA class II loci. Proc Natl Acad Sci 1985;82:6970-4.
(3.) Risch N, Teng J. The relative power of family-based and case-control designs for linkage disequilibrium studies of complex diseases I. DNA pooling. Genome Res 1998;8:1273-88.
(4.) Shaw SH, Carrasquillo MM, Kashuk C, Puffenberger EG, Chakravarti A. Allele frequency distributions in pooled DNA samples: applications to mapping complex disease genes. Genome Res 1998;8:111-23.
(5.) Barcellos LF, Klitz W, Field LL, Tobias R, Bowcock AM, Wilson R, et al. Association mapping of disease loci, by use of a pooled DNA genomic screen. Am J Hum Genet 1997;61:734-47.
(6.) Collins HE, Li H, Inda SE, Anderson J, Laiho K, Tuomilehto J, et al. A simple and accurate method for determination of microsatellite total allele content differences between DNA pools. Hum Genet 2000;106:218-26.
(7.) Giordano M, Mellai M, Hoogendoorn B, Monigliano-Ricciardi P. Determination of SNP allele frequencies in pooled DNAs by primer extension genotyping and denaturing high-performance liquid chromatography. J Biochem Biophys Methods 2001;47:101-10.
(8.) Germer S, Holland MJ, Higuchi R. High-throughput SNP allele frequency determination in pooled DNA samples by kinetic PCR. Genome Res 2000;10:258-66.
(9.) Zhou G, Kamahori M, Okano K, Chuan G, Harada K, Kambara H. Quantitative detection of single nucleotide polymorphisms for a pooled sample by a bioluminometric assay coupled with modified primer extension reactions (BAMPER). Nucleic Acids Res 2001;29:e93.
(10.) Uhl GR, Liu Q, Walther D, Hess J, Naiman D. Polysubstance abuse-vulnerability gene: genome scans for association, using 1,004 subjects and 1,494 single-nucleotide polymorphisms. Am J Hum Genet 2001;69:1290-300.
(11.) Livak KJ. Allelic discrimination using fluorogenic probes and the 5'-nuclease assay. Genet Anal 1999;14:143-9.
(12.) Heid CA, Stevens J, Livak KJ, Williams PM. Real time quantitative PCR. Genome Res 1996;6:986-94.
(13.) Vitzthum F, Geiger G, Bisswanger H, Brunner H, Bernhagen J. A quantitative fluorescence-based microplate assay for the determination of double-stranded DNA using SYBR Green I and a standard ultraviolet transilluminator gel imaging system. Anal Biochem 1999;276:59-64.
(14.) Longo MC, Berninger MS, Hartley JL. Use of uracil DNA glycosylase to control carry-over contamination in polymerase chain reactions. Gene 1990;93:125-8.
(15.) Teupser D, Rupprecht W, Lohse P, Thiery J. Fluorescence-based detection of the CETP TaqIB polymorphism: false positive with the TaqMan-based exonuclease assay attributable to a previously unknown gene variant. Clin Chem 2001;47:852-7.
(16.) Ranade K, Chang MS, Ting CT, Pei D, Hsiao CH, Pesich R, et al. High-throughput genotyping with single nucleotide polymorphisms. Genome Res 2001;11:1262-8.
Ke Xu,* Robert H. Lipsky, Walid Mangal, Erica Ferro, and David Goldman (Laboratory of Neurogenetics, National Institute on Alcohol Abuse and Alcoholism, Rockville, MD 20852; * address correspondence to this author at: NIH/NIAAA/DICBR/LNG, 12420 Parklawn Dr., Park 5 Bldg., Room 451, MSC 8110, Rockville, MD 20852; fax 301-443-8579, e-mail firstname.lastname@example.org)
Table 1. Allele frequencies of three SNPs in constructed and population-based sample pools. (a)

Constructed pools (frequency of allele 1)

               COMT 1883C → G               COMT 1243G → A               DRD2 TaqIB A → G
Expected, %    Measured, %  Rel. error, % (b)  Measured, %  Rel. error, %    Measured, %  Rel. error, %
100            99.1         0.9             96.3         3.7              97.1         2.9
90             90.9         1.0             91.6         1.8              86.2         4.2
80             78.7         1.6             83.3         4.1              81.0         1.3
70             71.9         2.7             68.5         2.1              69.4         0.9
60             63.3         5.5             53.2         1.1              64.9         8.2
50             47.4         5.0             43.5         1.3              49.0         2.0
Control (c)    50.0         0.0             53.6         7.2              43.9         12.2
40             38.9         2.8             32.5         18.8             42.6         6.5
30             27.0         1.0             31.1         3.7              28.9         3.7
20             18.4         8.0             17.3         13.5             20.8         4.0
10             7.50         25.0            7.70         23.0             7.00         30.0

Population-based pools (frequency of allele 1 from individual and pooled samples)

               COMT 1883C → G                        COMT 1243G → A                        DRD2 TaqIB A → G
Sample pool    Individual, %  Pooled, %  Rel. error, %  Individual, %  Pooled, %  Rel. error, %  Individual, %  Pooled, %  Rel. error, %
A (d)          67.2           64.8       3.6            33.3           36.6       9.9            17.9           16.7       6.7
B              79.0           80.0       1.3            36.9           39.1       6.0            16.0           11.9       25.6
C              67.9           68.3       0.6            30.6           29.6       3.3            16.2           14.0       13.6
D              75.0           79.0       5.3            30.3           32.4       6.9            16.9           13.6       19.5
E              74.5           78.5       5.4            29.7           27.5       7.4            18.0           15.3       15.0

(a) Allele frequencies and discrepancies are shown as percentages. 0.0% is a rounded value.
(b) Relative error was calculated using the discrepancy of expected allele 1 frequencies minus measured allele 1 frequencies divided by the expected allele 1 frequencies. The data are expressed as percentages.
(c) Control sample was from a heterozygous individual; its data are shown for purposes of comparison.
(d) Sample pool sizes for the three SNPs genotyped are given as a range. The sample pool for each SNP was slightly different: A, 67-83 samples; B, 83-87 samples; C, 136-153 samples; D, 147-170 samples; E, 460 samples.
Title Annotation: Technical Briefs
Author: Xu, Ke; Lipsky, Robert H.; Mangal, Walid; Ferro, Erica; Goldman, David
Date: Sep 1, 2002