Genomewide pattern of synonymous nucleotide substitution in two complete genomes of Mycobacterium tuberculosis. (Dispatches)

Comparison of the pattern of synonymous nucleotide substitution between two complete genomes of Mycobacterium tuberculosis at 3,298 putatively orthologous loci showed a mean proportion of difference per synonymous site of 0.000328 ± 0.000022. Although 80.5% of loci showed no synonymous or nonsynonymous nucleotide differences, the level of polymorphism observed at other loci was greater than suggested by previous studies of a small number of loci. This level of nucleotide difference leads to the conservative estimate that the common ancestor of these two genotypes occurred approximately 35,000 years ago, which is twice as high as some recent estimates of the time of origin of this species. Our results suggest that a large number of loci should be examined for an accurate assessment of the level of nucleotide diversity in natural populations of pathogenic microorganisms.

Surveys of genetic diversity in the pathogenic bacterium Mycobacterium tuberculosis have revealed a contradictory picture. In spite of known polymorphism at the phenotypic level and abundant polymorphism associated with repetitive elements (1), surveys of single nucleotide polymorphism in protein-coding genes have shown surprisingly low levels of polymorphism in comparison with other eubacterial species (2). The apparent low level of nucleotide polymorphism has led to the hypothesis that the ancestor of this species occurred quite recently, perhaps 15,000-20,000 years ago (2,3). However, if the number of substitutions per site is low, the error of estimation of this number would be expected to be substantial unless a very large number of sites are surveyed. We addressed the question of polymorphism in M. tuberculosis by comparing protein-coding genes in two completely sequenced genotypes, H37Rv and CDC1551 (4).

Methods

We applied the BLASTP program (5) to identify, for each predicted protein sequence in the H37Rv genome (GenBank accession no. AL123456), the closest homolog in the CDC1551 genome (GenBank accession no. AE000516). Following GenBank annotations, we compared 3,972 predicted proteins in H37Rv with 4,187 predicted proteins in CDC1551. We used a strict search criterion (E = $10^{-5}$) to identify truly orthologous gene pairs. We aligned (6) the putative orthologous pairs of amino acid sequences (n=3,428), then imposed this alignment on the DNA sequences.
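Imposing an amino acid alignment on the corresponding DNA sequences can be sketched as follows. This is a minimal illustration, not the authors' software: the function name `impose_alignment` is hypothetical, and the sketch assumes that gaps are written as "-" and that the coding sequence starts in frame with the first aligned residue.

```python
def impose_alignment(aligned_protein: str, dna: str) -> str:
    """Thread an ungapped coding sequence onto a gapped protein alignment.

    Each aligned residue consumes one codon; each protein gap ("-")
    becomes a three-nucleotide gap ("---") in the DNA alignment.
    """
    # Split the DNA into complete codons, ignoring any trailing partial codon.
    usable = len(dna) - len(dna) % 3
    codons = [dna[i:i + 3] for i in range(0, usable, 3)]

    residues = sum(1 for aa in aligned_protein if aa != "-")
    if residues > len(codons):
        raise ValueError("DNA sequence is too short for the aligned protein")

    codon_iter = iter(codons)
    aligned_dna = []
    for aa in aligned_protein:
        aligned_dna.append("---" if aa == "-" else next(codon_iter))
    return "".join(aligned_dna)


if __name__ == "__main__":
    # Toy example: a one-residue gap in the protein alignment.
    print(impose_alignment("M-K", "ATGAAA"))
```

Working from the protein alignment keeps codons intact, so every aligned column of three nucleotides can be compared as a synonymous or nonsynonymous change.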
Visual inspection of amino acid alignments showed that certain alignments, usually near the N-terminus or C-terminus, had regions of very low sequence identity. Examination of the DNA sequences of the corresponding genes showed that these regions of low identity were typically caused by a frameshift in one of the two genomes relative to the other. Whether these frameshifts are biologically real or result from sequencing error was uncertain; therefore, we eliminated 119 such gene pairs from our data set. For the remaining gene pairs (n=3,309), we computed the proportion of synonymous substitutions per synonymous site ($p_S$) and the proportion of nonsynonymous substitutions per nonsynonymous site ($p_N$) by using Nei and Gojobori's method (7). Because values of $p_S$ and $p_N$ were very low in most cases, we did not correct for multiple hits.

Because $p_S$ values appeared to fall into two groups (see Results), we used a simple probabilistic model to separate these two sets of gene pairs. We assumed that the probability of synonymous substitution followed two separate binomial distributions, designated models A and B, with probabilities of "success" (i.e., of a synonymous difference) designated $p_A$ and $p_B$, respectively.
Using the Bayes equation, for each gene pair with a given $p_S$ value, we computed the probability that model A applies, given the observed $p_S$:

$$P(A \mid p_S) = \frac{p_{SA} f_A}{p_{SA} f_A + p_{SB} f_B},$$

where $f_A$ is the frequency of cases to which model A applies; $f_B$ is the frequency of cases to which model B applies; $p_{SA}$ is the binomial probability of obtaining the observed $p_S$, given the number of synonymous sites in the gene and a probability of a synonymous difference equal to $p_A$; and $p_{SB}$ is the binomial probability of obtaining the observed $p_S$, given the number of synonymous sites in the gene and a probability of a synonymous difference equal to $p_B$. The probability that model B applies, given $p_S$, is

$$P(B \mid p_S) = \frac{p_{SB} f_B}{p_{SA} f_A + p_{SB} f_B}.$$

By comparing these two probabilities, we assigned each gene pair to one of two groups assumed to evolve according to the two models, respectively (Groups A and B). We reassigned group membership in iterative fashion, computing $p_A$ and $p_B$ from the mean $p_S$ values for each group. We started the process with $f_A$ = 0.995 and continued until group memberships were stable.

Results

Of 3,309 pairs of putatively homologous protein-coding genes in the H37Rv and CDC1551 genomes of M. tuberculosis, 2,662 (80.5%) showed no synonymous or nonsynonymous nucleotide differences between the two genomes, and 3,010 (91.0%) showed no synonymous differences between the two genomes. However, in a small number of gene pairs, the proportion of synonymous differences per synonymous site ($p_S$) was surprisingly high. In 13 (0.4%) gene pairs, $p_S$ was >0.01, and in 3 gene pairs $p_S$ exceeded 0.04. These extreme $p_S$ values seen in a small number of gene pairs are much higher than generally observed between alleles at neutrally evolving loci in eukaryotes (8).

Thus, the comparison of protein-coding genes between the two M. tuberculosis genomes suggested the existence of two distinct groups of gene pairs: a large group having few or no synonymous differences and a much smaller group with a substantial degree of synonymous divergence. We used a simple probabilistic model (see Methods) to separate these two sets of gene pairs, designated Group A and Group B, respectively (Figure). The application of this method showed 11 loci with unusually high $p_S$ values and probabilities of assignment to Group A of <50% (Figure).

[FIGURE OMITTED]

We assumed that Group A members are truly orthologous gene pairs that diverged at the time of the common ancestor of the H37Rv and CDC1551 genomes. Group A included 3,298 pairs, with mean $p_S$ for all genes of 0.000328 ± 0.000022 (standard error). When $p_S$ was estimated for the 3,298 genes concatenated together (a total of 934,413 synonymous sites), an estimate of $p_S$ = 0.000348 ± 0.00019 was obtained. The range of $p_S$ values in Group A was between zero and 0.012; a total of 288 loci in Group A had $p_S$ values other than zero. These results show a substantial level of nucleotide diversity, approximately half the level of nucleotide diversity in humans (9).
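The iterative two-group assignment described in the Methods can be sketched as follows. This is a hedged illustration rather than the authors' implementation. Several details are assumptions because the text does not state them: the starting values of $p_A$ and $p_B$ are supplied by the caller, $f_A$ is updated each round to the current fraction of genes in Group A, and each gene is summarized as a pair (number of synonymous differences, number of synonymous sites). All function names and the toy data are hypothetical.

```python
from math import exp, lgamma, log


def log_binom_pmf(k: float, n: float, p: float) -> float:
    """Log of the binomial probability of k differences in n sites."""
    if p <= 0.0:
        return 0.0 if k == 0 else float("-inf")
    if p >= 1.0:
        return 0.0 if k == n else float("-inf")
    log_coeff = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_coeff + k * log(p) + (n - k) * log(1.0 - p)


def posterior_a(k, n, p_a, p_b, f_a):
    """P(A | observed p_S) from the Bayes equation, computed in log space."""
    log_a = log(f_a) + log_binom_pmf(k, n, p_a)
    log_b = log(1.0 - f_a) + log_binom_pmf(k, n, p_b)
    top = max(log_a, log_b)
    if top == float("-inf"):
        return 0.5  # neither model can produce the observation
    a, b = exp(log_a - top), exp(log_b - top)
    return a / (a + b)


def classify(genes, p_a, p_b, f_a=0.995, max_iter=100):
    """Assign each (k, n) gene pair to Group A (True) or Group B (False)."""
    groups = None
    for _ in range(max_iter):
        new = [posterior_a(k, n, p_a, p_b, f_a) >= 0.5 for k, n in genes]
        if new == groups:
            break  # memberships are stable
        groups = new
        in_a = [k / n for (k, n), g in zip(genes, groups) if g]
        in_b = [k / n for (k, n), g in zip(genes, groups) if not g]
        if not in_a or not in_b:
            break  # one group is empty; nothing left to re-estimate
        # Re-estimate p_A and p_B as the mean p_S of each group.
        p_a, p_b = sum(in_a) / len(in_a), sum(in_b) / len(in_b)
        f_a = len(in_a) / len(genes)  # assumption: f_A tracks group size
    return groups


if __name__ == "__main__":
    # Toy data: many near-identical genes plus two divergent ones.
    toy = [(0, 300)] * 50 + [(1, 300)] * 3 + [(12, 300), (15, 280)]
    result = classify(toy, p_a=0.0003, p_b=0.03)
    print(result.count(True), "genes in Group A;", result.count(False), "in Group B")
```

Working in log space avoids numerical underflow for long genes, and the 0.5 cutoff mirrors the statement that Group B loci had probabilities of assignment to Group A below 50%.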
Rates of nucleotide substitution per unit time are difficult to estimate in bacteria given the lack of calibration from the fossil record (10). To obtain an estimate of the rate of synonymous nucleotide substitution, we used published data on comparisons of Escherichia coli and Salmonella typhimurium (11,12), which are believed to have diverged approximately 100 million years ago (13,14) (Table 1). This procedure yielded estimates for the last common ancestor of H37Rv and CDC1551 in the range of 34,000-38,000 years (Table 1). These estimates are approximately twice previous estimates of the age of the common ancestor of worldwide M. tuberculosis (2,3). To obtain the observed mean $p_S$ value between H37Rv and CDC1551 within 15,000-20,000 years would require a rate of synonymous substitution approximately twice that observed in Enterobacteria.

Group B consisted of 11 gene pairs with mean $p_S$ of 0.0286 ± 0.0050 (Table 2). In Enterobacteria, a negative correlation exists between observed proportions of synonymous difference and codon bias (11). In the case of Mycobacterium, codon bias results mainly from the very high third position G+C content of most genes (15). In our data, however, we observed no correlation between $p_S$ and proportion G+C at third codon positions (r = -0.010; not significant).

Discussion

A number of additional possibilities may explain the occurrence of gene pairs with higher than expected $p_S$ values:

1) Balanced polymorphism. Selectively maintained polymorphisms are expected to be much older than neutral polymorphisms and may even predate speciation events (16). In the case of haploid organisms such as bacteria, balancing selection would take the form of frequency-dependent selection rather than overdominant selection.

2) Differential deletion. In a multi-gene family, if one member of an orthologous pair of genes were deleted in one genotype, the gene pairs would involve paralogous, not orthologous comparisons.

3) Horizontal gene transfer. A gene obtained by one of the two genotypes from another bacterial species would be expected to be more divergent than other genes in that genotype.

One indication of a balanced polymorphism is a higher rate of nonsynonymous than synonymous substitution (8). There was no strong evidence of such selection in the present case; $p_S$ was greater than $p_N$ at 10 of the 11 loci, and $p_N$ exceeded $p_S$ only slightly at one locus (Table 2). In addition, we compared $p_S$ and $p_N$ in sliding windows of 30 codons along the length of these genes. No regions were observed in which $p_N$ was greater than $p_S$ (data not shown). Thus, there was no evidence of positive selection acting on specific regions of these genes. On the other hand, differential deletion can probably explain some cases, most notably members of the PE multi-gene family (11) (Table 2). The remaining gene pairs are possibly cases of horizontal gene transfer (Table 2), for which there is some recent evidence in M. tuberculosis (17). Presumably a related species of Mycobacterium was the source of such gene transfers.

Our results did not support the hypothesis that the common ancestor of M. tuberculosis was relatively recent (2). Rather, the pattern of nucleotide substitution at synonymous sites suggested a divergence time for the two available genotypes of this species approximately 35,000 years ago.
Since H37Rv and CDC1551 represent two genotypes sampled from within the species, they are probably not the most divergent genotypes possible. Thus, the last common ancestor of M. tuberculosis likely occurred considerably earlier than 35,000 years ago. While the difference between an estimate of 15,000-20,000 years and one of 35,000 years is not large on an evolutionary time scale, such a difference is substantial on the scale of human history. For example, the existence of two genotypes in the current population of M. tuberculosis with a common ancestor at 35,000 years is evidence against the hypothesis that M. tuberculosis arose, presumably from M. bovis, at the time of human domestication of cattle (18). Our result is thus consistent with phylogenetic analyses based on insertion-deletion events, which suggest that the M. tuberculosis lineage was a human pathogen well before the origin of M. bovis (19). Thus, along with recent evidence of an ancient origin and extensive polymorphism in the malaria parasite Plasmodium falciparum (20,21), our study provides evidence against the long-held view that virulent pathogens are invariably evolutionarily recent (22).

Our estimate is conservative because the rate of synonymous substitution may actually be lower in Mycobacterium than in Enterobacteriaceae, given the highly skewed G+C content in the former. Furthermore, our estimate of the mean proportion of synonymous difference was conservative because we excluded 119 loci with potential frameshifts between the two genotypes as well as a set of 11 loci with unusually high $p_S$ values. In addition to the 11 loci assigned to our Group B, certain other loci might also have originated from horizontal gene transfer. However, even if horizontal gene transfer has occurred at other loci besides those in Group B, eliminating further loci with relatively high $p_S$ values from Group A will not affect the results greatly. For example, if we eliminate the 10 loci with highest $p_S$ values from Group A, mean $p_S$ will be reduced only to 0.000299, and the estimated age of the common ancestor will be barely affected.

The degree of polymorphism observed in this study is unlikely to have been substantially influenced by sequencing errors. The error rate for finished sequences from The Institute for Genomic Research (where CDC1551 was sequenced) has been independently estimated at <1 in 88,000 bases (23). Assuming a similar error rate for both 4.4-Mbp M. tuberculosis genomes, we would expect to see approximately 100 differences between them due to sequencing errors. Approximately 21 such differences would be expected in the 938,778 synonymous sites in Group A and Group B genes.
In fact, 411 synonymous differences were observed at these sites; thus, even if present, sequencing errors are likely to have made up only a small fraction (approximately 5%) of the total synonymous polymorphism. At such a rate, sequencing errors would have little effect on our estimates of nucleotide diversity at synonymous sites or the age of the common ancestor of the two genomes. In addition, the hypothesis that the single nucleotide polymorphisms (SNPs) observed between these genotypes are real received strong support from a recent study that observed a number of the same SNPs in clinical isolates (24). Moreover, since sequencing errors are expected to occur at random with respect to the reading frame of coding sequences, the fact that mean $p_S$ exceeded mean $p_N$ in both Group A and Group B was strong evidence against the hypothesis that a substantial proportion of the observed polymorphism was due to sequencing error.

Simple considerations of probability can explain why earlier studies produced relatively low estimates of this species' age. If we assume that the per-site probability of a synonymous difference between two M. tuberculosis genomes is equal to the mean $p_S$ observed between H37Rv and CDC1551 (0.000328), then the probability is approximately 95% that no synonymous differences will be seen in a gene with 150 synonymous sites. The probability that no synonymous differences will be seen in 10 such loci chosen at random is approximately 60%, and the probability that no synonymous differences will be observed at 20 such loci is approximately 37%. On the other hand, the probability that no synonymous differences will be seen at 100 such loci is <1%.
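The probabilities quoted above follow from treating every synonymous site as an independent trial with the same chance of differing. A minimal sketch under that assumption (the function name is hypothetical; the inputs are the mean $p_S$ and the 150-site gene length used in the text):

```python
def prob_no_differences(p_s: float, sites_per_locus: int, n_loci: int) -> float:
    """Probability that no synonymous difference is seen across n_loci genes,
    assuming independent sites that each differ with probability p_s."""
    total_sites = sites_per_locus * n_loci
    return (1.0 - p_s) ** total_sites


if __name__ == "__main__":
    MEAN_P_S = 0.000328  # mean p_S between H37Rv and CDC1551 (Group A)
    for loci in (1, 10, 20, 100):
        prob = prob_no_differences(MEAN_P_S, 150, loci)
        print(f"{loci:>3} loci: P(no synonymous differences) = {prob:.3f}")
```

Because the exponent grows with the number of sites surveyed, the chance of seeing no variation at all falls off quickly only once many loci are included.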
These calculations emphasize the need to examine a very large number of nucleotide sites to obtain a reliable estimate of nucleotide diversity and thus of the age of the most recent common ancestor in cases where the frequency of substitution is less than one in a thousand. Even when the frequency of substitution is between one in a thousand and one in a hundred, substantial stochastic error is possible if the number of loci examined is small. Thus, any study that estimates population parameters from nucleotide sequence data needs to survey a substantial number of loci. These considerations are particularly important in the case of pathogenic microorganisms, where a number of factors (including both natural selection and horizontal gene transfer) may lead to substantial differences among loci with respect to the level of nucleotide diversity.

Comparison of two complete genomes of M. tuberculosis showed a greater extent of sequence polymorphism than would be expected on the basis of previous studies, in turn suggesting that analysis of additional genomes will likely show further polymorphism. Polymorphism in any species of pathogen may complicate therapeutic strategies because it implies the existence of variation on which selection can act, including selection imposed by human vaccines and pharmacologic agents (20). On the other hand, known polymorphisms may prove useful to investigators in reconstructing the evolutionary relationships among clinical isolates and in providing markers for understanding the genetic basis of complex phenotypic traits.

Table 1. Estimates(a) of the divergence time of the H37Rv and CDC1551 genotypes of Mycobacterium tuberculosis

Reference   No. loci   Synonymous substitutions/site/yr   Divergence time, yr (H37Rv and CDC1551)
11          67         (4.7 ± 0.2) × $10^{-9}$            34,900 ± 2,300 (b) (33,500-36,400) (c)
12          128        4.4 × $10^{-9}$                    37,300 ± 2,500 (b)

(a) Based on synonymous substitutions between Escherichia coli and Salmonella typhimurium, assumed to have diverged 100 million years ago (13,14).
(b) Estimates are shown ± standard error, based on standard error of mean $p_S$.
(c) Range based on standard error of rate estimate.
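
The arithmetic behind Table 1 is not spelled out in the text. The sketch below assumes that divergence time is T = $p_S$ / (2r), where r is the synonymous substitution rate per site per year and the factor of 2 reflects substitutions accumulating along both lineages since the common ancestor. That formula is an inference, offered because it reproduces the tabulated values; the function name is hypothetical.

```python
def divergence_time(p_s: float, rate: float) -> float:
    """Years since the common ancestor, assuming T = p_S / (2 * rate)."""
    return p_s / (2.0 * rate)


MEAN_P_S = 0.000328  # mean p_S for Group A
SE_P_S = 0.000022    # standard error of mean p_S

if __name__ == "__main__":
    for ref, rate in ((11, 4.7e-9), (12, 4.4e-9)):
        years = divergence_time(MEAN_P_S, rate)
        se_years = divergence_time(SE_P_S, rate)  # SE propagated from p_S only
        print(f"Reference {ref}: {years:,.0f} ± {se_years:,.0f} years")

    # Range for reference 11 from the standard error of the rate (± 0.2e-9).
    low = divergence_time(MEAN_P_S, 4.9e-9)
    high = divergence_time(MEAN_P_S, 4.5e-9)
    print(f"Reference 11 range: {low:,.0f} to {high:,.0f} years")
```

Halving the assumed rate doubles the estimated age, which is why an estimate of 15,000-20,000 years would require roughly twice the enterobacterial rate.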

Table 2. Proteins for which the nearest homologous comparison between the H37Rv and CDC1551 genotypes of Mycobacterium tuberculosis has a high $p_S$ value (Group B)

Accession nos.            Protein function       $p_S$              $p_N$
Probable differential deletion
NP_216309, NP_335079      unknown                0.0470             0.0000
NP_215713, NP_335504      unknown                0.0628             0.0043
NP_216319, NP_336310      PE repeat family       0.0115             0.0094
NP_215965, NP_335949      PE repeat family       0.0448             0.0185
Possible horizontal gene transfer
NP_214910, NP_334815      unknown                0.0226             0.0105
NP_216104, NP_336077      unknown                0.0265             0.0084
NP_216281, NP_336535      unknown                0.0210             0.0161
NP_215835, NP_335809      adenylate cyclase      0.0229             0.0068
NP_216564, NP_336573      polyketide synthase    0.0093             0.0036
NP_217029, NP_337080      unknown                0.0148             0.0156
NP_216862, NP_335679      unknown                0.0313             0.0000
Mean ± S.E.                                      0.0286 ± 0.0050    0.0085 ± 0.0019 (a)

(a) Paired sample t-test of the hypothesis that $p_S$ = $p_N$, p<0.01. The quantities $p_S$ and $p_N$ are the proportion of nucleotide difference per synonymous site and per nonsynonymous site, respectively.
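
The summary row and the paired t-test in Table 2 can be checked from the tabulated values. A minimal sketch using only the Python standard library; the helper names are hypothetical, and rather than computing an exact p value it compares the t statistic with 3.169, the standard two-tailed 1% critical value of Student's t distribution with 10 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

# Values copied from Table 2, in row order.
P_S = [0.0470, 0.0628, 0.0115, 0.0448, 0.0226, 0.0265,
       0.0210, 0.0229, 0.0093, 0.0148, 0.0313]
P_N = [0.0000, 0.0043, 0.0094, 0.0185, 0.0105, 0.0084,
       0.0161, 0.0068, 0.0036, 0.0156, 0.0000]

T_CRIT_1PCT_DF10 = 3.169  # two-tailed 1% critical value, 10 degrees of freedom


def standard_error(values):
    """Standard error of the mean (sample standard deviation / sqrt(n))."""
    return stdev(values) / sqrt(len(values))


def paired_t(x, y):
    """Paired-sample t statistic for the hypothesis mean(x - y) = 0."""
    diffs = [a - b for a, b in zip(x, y)]
    return mean(diffs) / standard_error(diffs)


if __name__ == "__main__":
    print(f"mean p_S = {mean(P_S):.4f} ± {standard_error(P_S):.4f}")
    print(f"mean p_N = {mean(P_N):.4f} ± {standard_error(P_N):.4f}")
    t_stat = paired_t(P_S, P_N)
    print(f"paired t = {t_stat:.2f}; exceeds 1% critical value: "
          f"{abs(t_stat) > T_CRIT_1PCT_DF10}")
```

Pairing the two values for each locus removes between-gene variation in overall divergence, so the test asks only whether synonymous differences exceed nonsynonymous ones within the same genes.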

This research was supported by National Institutes of Health grants GM34940 and GM66710 to A.L.H. and AI01430 and AI1046669 to M.M.

References

(1.) Kato-Maeda M, Bifani PJ, Kreiswirth BN, Small PM. The nature and consequence of genetic diversity within Mycobacterium tuberculosis. J Clin Invest 2001;107:533-7.
(2.) Sreevatsan S, Pan X, Stockbauer KE, Connell ND, Kreiswirth BN, Whittam TS, et al. Restricted structural gene polymorphism in the Mycobacterium tuberculosis complex indicates evolutionarily recent dissemination. Proc Natl Acad Sci U S A 1997;94:9869-74.
(3.) Kapur V, Whittam TS, Musser JM. Is Mycobacterium tuberculosis 15,000 years old? J Infect Dis 1994;170:1348-9.
(4.) Betts JC, Dodson P, Quan S, Lewis AP, Thomas PJ, Duncan K, et al. Comparison of the proteome of Mycobacterium tuberculosis strain H37Rv with clinical isolate CDC 1551. Microbiology 2000;146:3205-16.
(5.) Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, et al. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res 1997;25:3389-402.
(6.) Thompson JD, Higgins DG, Gibson TJ. CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Res 1994;22:4673-80.
(7.) Nei M, Gojobori T. Simple methods for estimating the numbers of synonymous and nonsynonymous nucleotide substitutions. Mol Biol Evol 1986;3:418-26.
(8.) Hughes AL. Adaptive evolution of genes and genomes. New York: Oxford University Press; 1999.
(9.) The International SNP Map Working Group. A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms. Nature 2001;409:928-33.
(10.) Ochman H, Elwyn S, Moran NA. Calibrating bacterial evolution. Proc Natl Acad Sci U S A 1999;96:12638-43.
(11.) Sharp PM. Determinants of DNA sequence divergence between Escherichia coli and Salmonella typhimurium: codon usage, map position, and concerted evolution. J Mol Evol 1991;33:23-33.
(12.) Smith NGC, Eyre-Walker A. Nucleotide substitution rate estimation in enterobacteria: approximate and maximum-likelihood methods lead to similar conclusions. Mol Biol Evol 2001;18:2124-6.
(13.) Ochman H, Wilson AC. Evolution in bacteria: evidence for a universal substitution rate in cellular organisms. J Mol Evol 1987;26:74-86.
(14.) Doolittle RF, Feng D-F, Tsang S, Cho G, Little E. Determining divergence times of the major kingdoms of living organisms with a protein clock. Science 1996;271:470-7.
(15.) Cole ST, Brosch R, Parkhill J, Garnier T, Churcher C, Harris D, et al. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 1998;393:537-44.
(16.) Takahata N, Nei M. Allelic genealogy under overdominant and frequency-dependent selection and polymorphism of major histocompatibility complex loci. Genetics 1990;124:967-78.
(17.) Le Dantec C, Winter N, Gicquel B, Vincent V, Picardeau P. Genomic sequence and transcriptional analysis of a 23-kilobase mycobacterial linear plasmid: evidence for horizontal transfer and identification of plasmid maintenance systems. J Bacteriol 2001;183:2157-64.
(18.) Stead WW. The origin and erratic global spread of tuberculosis. How the past explains the present and is the key to the future. Clin Chest Med 1997;18:65-77.
(19.) Brosch R, Gordon SV, Marmiesse M, Brodin P, Buchrieser C, Eiglmeier K, et al. A new evolutionary scenario for the Mycobacterium tuberculosis complex. Proc Natl Acad Sci U S A 2002;99:3684-9.
(20.) Hughes AL, Verra F. Very large long-term effective population size in the virulent human malaria parasite Plasmodium falciparum. Proc R Soc Lond B Biol Sci 2001;268:1855-60.
(21.) Mu J, Duan J, Makova KD, Joy DA, Huynh CQ, Branch OH, et al. Chromosome-wide SNPs reveal an ancient origin for Plasmodium falciparum. Nature 2002;418:323-6.
(22.) Burnet M. Natural history of infectious disease. Cambridge: Cambridge University Press; 1940.
(23.) Read TD, Salzberg SL, Pop M, Shumway M, Umayam L, Jiang L, et al. Comparative genome sequencing for discovery of novel polymorphisms in Bacillus anthracis. Science 2002;296:2028-33.
(24.) Fleischmann RD, Alland D, Eisen JA, Carpenter L, White O, Peterson J, et al. Sequencing of the M. tuberculosis genome: comparison of a recent clinical isolate with the laboratory strain. J Bacteriol 2002. In press.

Austin L. Hughes, * Robert Friedman, * and Megan Murray †

* University of South Carolina, Columbia, South Carolina, USA; and † Harvard School of Public Health, Boston, Massachusetts, USA

Dr. Hughes is director of the Biotechnology Institute and professor in the Department of Biological Sciences at the University of South Carolina. His research uses computational analysis of molecular sequence data to understand host-parasite co-evolution, genome evolution, and the population biology of human pathogens.

Address for correspondence: Austin L. Hughes, Department of Biological Sciences, University of South Carolina, Coker Life Sciences Bldg., 700 Sumter St., Columbia, SC 29208, USA; fax: 803-777-4002; e-mail: austin@biol.sc.edu