
Global analysis of microbial translation initiation regions.


The availability of genomic sequences from multiple bacteria has allowed global comparisons of patterns. Here we present a graphical comparison of normalized base frequencies in the vicinity of translation starts for both eubacteria and archaea. The results show that most eubacterial Open Reading Frames (ORFs) are preceded by a distinctly recognizable Shine-Dalgarno (SD) sequence pattern. However, some eubacteria deviate from this arrangement and have diminished SD patterns or completely lack this sequence. On the other hand, some archaea seem to use both SD sequences and leaderless transcripts in their translation initiation process, depending on the position of a gene within an operon. Most archaea seem to have other regular sequences located upstream from the typical SD location. Both eubacteria and archaea have a surprising repetitive pattern seen within the averaged ORFs. The eubacterial and archaeal averaged patterns are slightly different from each other, and individual organisms within each domain vary from the averages. Nevertheless, the existence of such a periodicity within ORFs may allow the development of new techniques to identify actual genes from ORFs.

Keywords: translation initiation, eubacterial, archaeal, Shine-Dalgarno, alignment

**********

Eubacteria initiate their translational process by binding mRNA to the small ribosomal subunit. This occurs because of a complementarity between a sequence at the 3' end of 16S rRNA and the Shine-Dalgarno (SD) sequence just 5' to the initiation codon (Shine and Dalgarno, 1974; Gualerzi and Pon, 1990). In rare cases, however, eubacteria have been shown to lack an untranslated leader of sufficient length to contain an SD sequence (Van Etten and Janssen, 1998). In addition, there is some evidence that Mycoplasma species may have a high proportion of leaderless transcripts. On the other hand, archaea may more routinely use heterogeneous mechanisms for translation initiation (Saito and Tomita, 1999). Work on two different species, Pyrobaculum aerophilum (Slupska et al., 2001) and Sulfolobus solfataricus (Tolstrup et al., 2000), has shown that while many genes seem to have SD sequences in the proper location, a significant number of others are likely to have leaderless transcripts.

The availability of large numbers of complete genomic sequences provides the opportunity to search for patterns within and among genomes. In particular there were 56 eubacterial and 11 archaeal sequences that were published by the middle of January, 2002 (http://www.ncbi.nlm.nih.gov:80/PMGifs/Genomes/micr.html). Newly sequenced genomes are subjected to computational methods that produce a collection of open reading frames (ORFs) that presumably correspond to the genes in the organisms. This type of analysis has already led to insights into the translational process (Saito and Tomita, 1999; Sakai et al., 2001; Ma et al., 2002).

We construct a matrix using a maximum likelihood statistical approach (Hertz and Stormo, 1996) and combine it with a graphical representation to show results averaged over all ORFs for all available sequenced microbial genomes. This approach can reveal common patterns and deviations from these patterns for microorganisms. We discuss the results as they relate to translation initiation mechanisms.

MATERIALS AND METHODS

Sequences for the following genomes were available as of January 22, 2002, at the Entrez Genomes section of the National Center for Biotechnology Information (NCBI) Web site (http://www.ncbi.nlm.nih.gov:80/PMGifs/Genomes/micr.html) (hereafter referred to as the NCBI Microbial Genomes web site):

Agrobacterium tumefaciens Cereon circular chromosome (Goodner et al., 2001)

Agrobacterium tumefaciens Dupont circular chromosome (Wood et al., 2001)

Aeropyrum pernix K1 (Kawarabayasi et al., 1999)

Aquifex aeolicus chromosome (Deckert et al., 1998)

Archaeoglobus fulgidus (Klenk et al., 1997)

Bacillus halodurans C-125 (Takami et al., 2000)

Bacillus subtilis (Kunst et al., 1997)

Borrelia burgdorferi chromosome (Fraser et al., 1997)

Brucella melitensis chromosome I, chromosome II (DelVecchio et al., 2002)

Buchnera sp. APS (Shigenobu et al., 2000)

Campylobacter jejuni (Parkhill et al., 2000)

Caulobacter crescentus (Nierman et al., 2001)

Chlamydophila pneumoniae CWL029 (Kalman et al., 1999)

Chlamydophila pneumoniae AR39 (Read et al., 2000)

Chlamydophila pneumoniae J138 (Shirai et al., 2000)

Chlamydia trachomatis (Stephens et al., 1998)

Chlamydia muridarum chromosome (Read et al., 2000)

Clostridium acetobutylicum chromosome (Nolling et al., 2001)

Deinococcus radiodurans R1 chromosome 1, chromosome 2 (White et al., 1999)

Escherichia coli K12 (Blattner et al., 1997)

Escherichia coli O157:H7 EDL933 (Perna et al., 2001)

Escherichia coli O157:H7 (Hayashi et al., 2001)

Halobacterium sp. NRC-1 (Ng et al., 2000)

Haemophilus influenzae (Fleischmann et al., 1995)

Helicobacter pylori 26695 (Tomb et al., 1997)

Helicobacter pylori J99 (Alm et al., 1999)

Lactococcus lactis subsp. lactis (Bolotin et al., 2001)

Listeria monocytogenes EGD-e (Glaser et al., 2001)

Listeria innocua (Glaser et al., 2001)

Methanobacterium thermoautotrophicum (Smith et al., 1997)

Methanococcus jannaschii chromosome (Bult et al., 1996)

Mesorhizobium loti chromosome (http://www.kazusa.or.jp/rhizobase/)

Mycobacterium tuberculosis H37Rv (Cole et al., 1998)

Mycobacterium tuberculosis CDC1551 (Fleischmann, R.D., D. Alland, J.A. Eisen, L. Carpenter, O. White, J. Peterson, R. DeBoy, R. Dodson, M. Gwinn, D. Haft, et al., Whole genome comparison of Mycobacterium tuberculosis clinical and laboratory strains. Unpublished; listed on the NCBI Microbial Genomes web site)

Mycobacterium leprae (Cole et al., 2001)

Mycoplasma genitalium (Fraser et al., 1995)

Mycoplasma pneumoniae (Himmelreich et al., 1996)

Mycoplasma pulmonis (Chambaud et al., 2001)

Neisseria meningitidis MC58 (Tettelin et al., 2000)

Neisseria meningitidis Z2491 (Parkhill et al., 2000)

Nostoc sp. PCC 7120 (Kaneko et al., 2001)

Pasteurella multocida (May et al., 2001)

Pseudomonas aeruginosa (Stover et al., 2000)

Pyrococcus abyssi (Heilig, R., Pyrococcus abyssi genome sequence: insights into archaeal chromosome structure and evolution. Unpublished; listed on the NCBI Microbial Genomes web site)

Pyrococcus horikoshii (Kawarabayasi et al., 1998)

Rickettsia conorii Malish 7 (Ogata et al., 2001)

Rickettsia prowazekii (Ogata et al., 2001)

Ralstonia solanacearum (Salanoubat, M., S. Genin, F. Artiguenave, J. Gouzy, S. Mangenot, M. Arlat, A. Billault, P. Brottier, J.C. Camus, L. Cattolico, et al., Genome sequence of the plant pathogen Ralstonia solanacearum. Unpublished; listed on the NCBI Microbial Genomes web site)

Salmonella typhimurium LT2 (McClelland et al., 2001)

Salmonella typhi (Parkhill et al., 2001)

Sinorhizobium meliloti (Capela et al., 2001)

Staphylococcus aureus N315 (Kuroda et al., 2001)

Staphylococcus aureus Mu50 (Kuroda et al., 2001)

Streptococcus pneumoniae TIGR4 (Tettelin et al., 2001)

Streptococcus pneumoniae R6 (Hoskins et al., 2001)

Streptococcus pyogenes (Ferretti et al., 2001)

Sulfolobus solfataricus (She et al., 2001)

Sulfolobus tokodaii (Kawarabayasi et al., 2001)

Synechocystis PCC6803 (Kaneko et al., 1996)

Thermoplasma acidophilum (Ruepp et al., 2000)

Thermoplasma volcanium (Kawashima et al., 1999)

Treponema pallidum (Fraser et al., 1998)

Thermotoga maritima (Nelson et al., 1999)

Ureaplasma urealyticum (Glass et al., 2000)

Vibrio cholerae chromosome I, chromosome II (Heidelberg et al., 2000)

Xylella fastidiosa chromosome (Simpson et al., 2000)

Yersinia pestis chromosome (Parkhill et al., 2001)

Software was developed using Perl and run on SunOS 5.7. Data was read from standard formatted data files found at the Entrez-Genome site hosted by the NCBI (NCBI Microbial Genomes website) and the information needed to align the open reading frames was extracted. Each eubacterial and archaeal DNA sequence was downloaded in FASTA format (*.fna) and the open reading frame information was selected from the same site in Protein Table (*.ptt) format. No reformatting of the data was required. The only input required by the software was the location of the genome files and the start/stop locations for the alignment (in this case -70 to +50, where 0 is the first base in the start codon). The program first removed all the end-line characters and calculated the G-C content of the strain. Next, each ORF segment was aligned with the start codon beginning at the zero location (ORFs indicated as being in the opposite direction were computed as reverse complements and aligned). The bases at each position were then totaled by summing over all ORFs. The sums were divided by the total number of ORFs to find the real frequency and then normalized by dividing by the expected base content at each position, derived from the G-C content of the organism. Next, a log probability of any given base in the region was calculated by taking the log of these normalized values (Staden, 1984; Hertz and Stormo, 1996; Stormo, 2000). The values were then output into a data file. This was repeated for each genomic file. An average for each bacterial domain was calculated from all the frequencies in that domain. A total of 58 eubacterial sequences were run requiring 1598 seconds and totaling 146,335 ORFs. A total of 11 archaeal sequences were run requiring 235 seconds and totaling 23,361 ORFs.
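The alignment and normalization steps described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' program: the function names are hypothetical, the G-C content is assumed to be split equally between G and C (and the remainder equally between A and T), the logarithm is assumed to be base 2 because the text does not state the base, and a base that never occurs at a position is assigned negative infinity.

```python
import math
from collections import Counter

BASES = "ACGT"
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def expected_frequencies(genome):
    """Expected base frequencies from G-C content (equal split is an assumption)."""
    gc = (genome.count("G") + genome.count("C")) / len(genome)
    return {"G": gc / 2, "C": gc / 2, "A": (1 - gc) / 2, "T": (1 - gc) / 2}


def start_window(genome, start, strand, upstream=70, downstream=50):
    """Return the window from -upstream to +downstream around a start codon.

    `start` is the 0-based forward-strand index of the first base of the
    start codon (for '-' strand ORFs, the forward index that pairs with it).
    Returns None if the window would run off either end of the sequence.
    """
    if strand == "+":
        lo, hi = start - upstream, start + downstream + 1
    else:
        lo, hi = start - downstream, start + upstream + 1
    if lo < 0 or hi > len(genome):
        return None
    window = genome[lo:hi]
    return window if strand == "+" else reverse_complement(window)


def log_odds_matrix(windows, expected):
    """Per-position log2(observed frequency / expected frequency) for each base."""
    n = len(windows)
    matrix = []
    for pos in range(len(windows[0])):
        counts = Counter(w[pos] for w in windows)
        row = {}
        for base in BASES:
            ratio = (counts[base] / n) / expected[base]
            row[base] = math.log2(ratio) if ratio > 0 else float("-inf")
        matrix.append(row)
    return matrix
```

With the default arguments each window holds 121 positions, so index 70 of the resulting matrix corresponds to position 0, the first base of the start codon.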

Subtraction of individual organismal patterns from the domain average was accomplished by direct subtraction of frequencies at each base. The result was replotted to show the differences in frequencies.
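As a rough illustration of the averaging and subtraction steps, the sketch below represents each profile as a list of per-position dictionaries keyed by base. The function names are hypothetical. Two details are assumptions rather than facts from the text: the domain average is weighted by the number of ORFs in each genome, and the difference is taken as organism minus average.

```python
BASES = "ACGT"


def domain_average(profiles, orf_counts):
    """ORF-weighted mean of several per-position frequency profiles.

    Weighting by ORF count is an assumption; an unweighted mean over
    organisms is the other plausible reading of the method.
    """
    total = sum(orf_counts)
    width = len(profiles[0])
    return [
        {
            base: sum(p[pos][base] * n for p, n in zip(profiles, orf_counts)) / total
            for base in BASES
        }
        for pos in range(width)
    ]


def difference_profile(organism, average):
    """Position-by-position, base-by-base difference (organism minus average)."""
    if len(organism) != len(average):
        raise ValueError("profiles must cover the same positions")
    return [
        {base: org[base] - avg[base] for base in BASES}
        for org, avg in zip(organism, average)
    ]
```

Positive values in the difference mark bases that are more frequent in the organism than in the domain as a whole at that position.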

Sulfolobus solfataricus and Halobacterium sp. NRC-1 genomes were examined to identify ORFs that were likely to be members of operons. In order to qualify, ORFs must have been assigned an identity at the NCBI Microbial Genomes web site and be part of a recognizable cluster of related genes. The selected genes were related by function (as in collections of ribosomal protein genes or subunits of a protein like NADH dehydrogenase) or by being part of a pathway (like proteins in the cobalamin biosynthetic pathway). The gene cluster had to be closely linked with fewer than ten bases separating the genes (many overlapped slightly). The initial gene in the putative operon had to be separated by 50 to 100 bases from the preceding gene and had to have no identifiable functional relationship with it. No hypothetical protein genes were used.
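The spacing part of these criteria lends itself to a small sketch. The code below covers only the distance rules; the functional-relationship and annotation requirements were judged from the annotations and are not modeled here. The function names and the "unclassified" label are hypothetical, and coordinates are assumed to be 1-based and inclusive.

```python
def intergenic_gap(prev_end, next_start):
    """Bases between two neighboring genes; negative values mean overlap."""
    return next_start - prev_end - 1


def classify_by_spacing(gap_before, gap_after):
    """Apply only the spacing rules for operon membership.

    gap_before: bases separating the gene from the preceding gene.
    gap_after:  bases separating the gene from the following gene.
    """
    if gap_before < 10:
        # Closely linked to (or overlapping) the preceding gene.
        return "internal"
    if 50 <= gap_before <= 100 and gap_after < 10:
        # Set apart from the preceding gene, tightly followed by the next.
        return "initial"
    return "unclassified"
```

For example, a gene that begins 75 bases after its predecessor and overlaps the next gene by a few bases would be a candidate initial gene, subject to the functional checks described above.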

A version of the program used for this work is accessible at the following web site: http://www.msstate.edu/dept/biochemistry/CBIG/. In addition the aligned genomes of all microbial species examined are available at this site.

RESULTS

Examination of individual eubacterial genomic sequence patterns and the eubacterial average over all reported ORFs (Figure 1) reveals distinctive patterns near the start codon. The averaged start codons themselves show the expected overwhelming presence of A, T, and especially G in the third position (base number 2 in the representation used). There is a general enrichment of A's both upstream and downstream from the start codon with the obvious exception of the Shine-Dalgarno region and a decline in A as the last base before the start. There is also a general decrease in G's in this start codon proximal region except for SD. The expected SD sequences upstream from start contribute to a distinctive pattern. The canonical sequence of AGGAGG may vary somewhat in its location with respect to translation start. The net result is a distribution of this sequence over a range of locations in the genomic patterns and in the eubacterial average. The high G content of SD yields a broad G peak centered around -9 to -10 bases upstream from start. There is a rise in A content upstream of this peak and a definite decline within the peak. T and C frequencies drop off dramatically in the SD region.

Individual eubacterial genomes vary from the average pattern. A few eubacteria maintain a preponderance of A's over G's throughout the SD region. This is true for Caulobacter crescentus, Mycobacterium tuberculosis, and Xylella fastidiosa (Figure 2). These still have the decline in T's and C's as observed for others. Some eubacteria, e.g. Bacillus subtilis (Figure 3) and Staphylococcus aureus (see Figure 1), have SD regions whose center is shifted slightly upstream from the average and have a higher frequency of G's and lower frequency of T's and C's in this region as compared to the average. This is highlighted by subtracting the organismal pattern from the eubacterial average (Figure 3).

Synechocystis PCC6803 has a clear deviation from the average pattern (Figure 4). While there is a rise in G's at SD, it is not nearly so pronounced as in the average. The decline in C's is also not conspicuous. There is no obvious explanation for this anomaly. Deinococcus radiodurans also presents an anomalous pattern (Figure 4). Instead of a G peak in the SD region, it displays a prominent A peak and only small decreases in T and C. There is a T peak at position -7. The patterns are consistent for both chromosomes.

Other notable exceptions from the eubacterial pattern are seen in two Mycoplasma species, Mycoplasma genitalium and Mycoplasma pneumoniae (Figure 4). While both show the general rise in A's and decline in G's near the start codon, each lacks evidence of SD sequences. In contrast, Mycoplasma pulmonis fits the standard eubacterial pattern except that a larger proportion of its ORFs start with something other than ATG. Removal of the M. genitalium and M. pneumoniae frequencies from the eubacterial average has little effect on the pattern since together they contribute only 1173 ORFs out of the almost 150,000 found in the overall average.

The average over all archaeal sequences shows a much less uniform pattern than for eubacteria (Figure 5). While this could be due to the smaller sample size of archaeal sequences, the number of ORFs involved is still over 23,000. The variations are more likely due to the diverse nature of the archaeal kingdom. They are also likely due to diverse translation initiation mechanisms for different classes of genes within individual archaea (Tolstrup et al., 2000; Slupska et al., 2001).

There are some similarities between the archaeal and eubacterial averages. There is an enrichment in A's immediately upstream and downstream from the start codon. The last base before this codon is depleted in A. There is a decrease in G before the start, but, in contrast to eubacteria, there is no general depletion in G after the start codon. While there is a clear G peak and decline in A's and C's in the SD region, there is no drop in T's here. In addition the SD pattern is not so pronounced as in the eubacterial case. Individual archaea lack distinct SD patterns. Thermoplasma acidophilum, Aeropyrum pernix, and Halobacterium sp. NRC-1 have none of the standard base distributions here. On the other hand, with the exception of Aeropyrum pernix, they and the other archaea have apparent consistent structure in the region upstream from the SD location that is seen in no eubacteria (Figure 5). From -27 to -30 there is a preponderance of T's, and the regions from -19 to -26 and -31 to -35 show a preference for A in all the archaea. These regions seem to show a diminished amount of C and G although this is not pronounced.

Examination of the results for Sulfolobus solfataricus reveals some of the complexity of archaeal genes (Figure 6). Its genomic average is very similar to the overall archaeal average. However, a sampling of 63 genes that are likely to be transcribed as internal members of polycistronic mRNAs shows a pattern that presents a clear SD signal with little upstream structure. Sampling 40 genes that are likely to be the initial sequences in polycistrons presents a noisy average but reveals no SD pattern and suggests the possible presence of the A-T rich and C-G poor region from -35 to -19. Combination of the patterns from the two gene classes would yield the sequence structures found upstream from the translation start signal in the overall genomic pattern for Sulfolobus.

Sampling 72 genes that appear to be internal and 29 that appear to be the first in an operon in Halobacterium sp. NRC-1 gives a similar result (data not shown). An A-T rich region is seen from -38 to -25 with the typical T peak surrounded by A peaks when the operon initiating genes are averaged. A G rich and correspondingly C poor region from -12 to -8 is suggestive of an SD region for genes internal to operons. This region is obscured in the overall average.

The gene sequences downstream from the start codon show a surprising regularity in all bacteria examined. This is not an artifact of the alignment; combination of random genomic sequences (data not shown) presents only noise. A similar regularity has been observed by others for individual genes and organisms (Fickett, 1984; Tsonis et al., 1991; Suckow et al., 1998).

DISCUSSION

The statistical and graphical approach used here shows averages of the sequenced eubacterial and archaeal genomes. Patterns that relate to the translational processes are readily visualized. Comparisons with individual organisms reveal possible deviations from the standard processes. As expected, most eubacteria have a distinct indication of a Shine-Dalgarno region located just upstream from the initiation codon. There is also a general enrichment of A's near the start with the exception of a clear decrease at position -1. There is no clear evidence of a downstream box that has been suggested to be a translation enhancer in the region of +7 to +12 (Sprengart et al., 1996).

Deviations from the average results for eubacteria could be caused by several factors. Misidentification of ORFs by various gene finding programs could produce noise in the data. An ORF may be incorrectly extended in the 5' direction beyond the actual start codon simply because there is an open reading frame available beyond that codon. This would place the real SD site within the gene when the alignment is done. Some sequences identified as ORFs may not be real genes. This would also introduce noise in the alignment. Conceivably, some genes may contain translational elements that are quite different from the average. It might be an interesting application of this technique to identify ORFs that deviate in some statistical way from the average. Do they cluster in some fashion? Do they have unusual characteristics as genes or do they look like misassigned sequences?
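One possible way to look for deviating ORFs, offered here only as a sketch and not as a method used in this work, is to score each aligned window against a position-specific log-odds matrix and flag windows whose total score falls well below the mean. The function names, the z-score cutoff of 2.0, and the assumption that every matrix entry is finite are all illustrative choices.

```python
import statistics


def window_score(window, matrix):
    """Sum of log-odds values for the bases observed in one aligned window.

    `matrix` is a list of dicts, one per position, mapping base to log-odds.
    """
    return sum(row[base] for base, row in zip(window, matrix))


def flag_outliers(windows, matrix, z_cutoff=2.0):
    """Indices of windows scoring more than z_cutoff SDs below the mean."""
    scores = [window_score(w, matrix) for w in windows]
    mean = statistics.mean(scores)
    sd = statistics.pstdev(scores)
    if sd == 0:
        return []  # All windows score identically; nothing stands out.
    return [i for i, s in enumerate(scores) if (s - mean) / sd < -z_cutoff]
```

The flagged indices could then be examined for clustering on the chromosome or for signs of misassigned start codons, which are the questions raised above.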

Some organisms deviate in definite but not critical ways from the average. A few have a G peak in the SD region that is less than the A peak but still have clear decreases in T and C. B. subtilis and S. aureus have intense but shifted SD signals. This is most clearly seen by subtracting the average results from the organismal results. These variations from the average are not likely to represent large scale differences in translation initiation.

Some organismal averages are distinct from the eubacterial average. Synechocystis shows an exceptionally weak SD pattern and also has reasonably prominent C frequencies on either side of the start codon. Other work has suggested that this bacterium is somewhat different in its translation initiation and may have different classes of genes (Osada et al., 1999; Sakai et al., 2001). However, examination of only the highly expressed class of genes (Mrazek et al., 2001) (data not shown) does not show much variation from the overall average. Deinococcus radiodurans shows the typical lower values for C and T frequencies in the SD region, but there is an A peak here for both chromosomes. G tends to increase around -8 to -9. This is unlike any other eubacterial genome. Examination of the 16S ribosomal RNA genes in both Synechocystis and Deinococcus shows the expected AGGAGG sequence (NCBI Microbial Genomes website) and so altered complementarity is not to be expected. The regularity seen within the coding region is also clearly different for Deinococcus. These two eubacteria are not related in any fashion, with Synechocystis being a cyanobacterium and Deinococcus being one of the most unusual eubacterial extremophiles (White et al., 1999). It will be interesting to compare these results with any newly sequenced cyanobacteria to see if a unique translation initiation pattern emerges.

The results seen with Mycoplasma genitalium and Mycoplasma pneumoniae highlight their high frequency of leaderless transcripts (Weiner et al., 2000). This phenomenon, whereby an mRNA is produced with few bases in front of the start codon, has been observed in every type of organism but is usually rare (Van Etten and Janssen, 1998). Translation initiation seems to be accomplished through the actions of either bacterial or eukaryotic equivalents of IF-2 (Kyrpides and Woese, 1998; Grill et al., 2000; Grill et al., 2001). These two Mycoplasma species show no indication at all of an SD region. While Synechocystis and Deinococcus are unusual, they still show a diminished G peak; M. genitalium and M. pneumoniae have no distinguishable G peak. The other species of mycoplasma in this study, Mycoplasma pulmonis and the related organism Ureaplasma urealyticum, have a readily distinguishable SD pattern and therefore are not likely to have leaderless transcripts.

Archaeal translation initiation is not as well understood as eubacterial initiation. Genomic analysis has suggested that at least some archaea use a combination of eubacterial and eukaryotic mechanisms (Salin et al., 1991; Saito and Tomita, 1999). Some archaeal initiation factors are homologs of the eukaryotic equivalents (Bult et al., 1996; Keeling et al., 1998; Kyrpides and Woese, 1998), yet many archaeal genes have clear SD regions, appropriately complementary regions at the 3' end of their 16S rRNA molecules, and no 5' cap structure. Some archaeal transcripts are leaderless (May and Dennis, 1990; Condo et al., 1999; Slupska et al., 2001). Recent work has suggested that in at least some archaea, the specific translation initiation mechanism depends on whether the gene in question is located internal to an operon or is an isolated gene or the first gene in an operon (Tolstrup et al., 2000; Slupska et al., 2001).

The results here reflect both the diversity of the archaeal domain and the diversity of mechanisms within individual organisms. Several archaea lack a clearly defined SD region, most notably Halobacterium and, to a lesser extent, Thermoplasma acidophilum and Aeropyrum pernix. However, with the exception of Aeropyrum pernix, all the archaeal genomes have at least some indication of common sequence elements further upstream from an expected SD site. There are A-rich regions located around -34 and -24 and a T-rich region centered at -28. The T region and the -24 A region could readily correspond to an A box structure (consensus TTTA(A or T)A) (Wich et al., 1986; Reiter et al., 1988; Thomm and Wich, 1988; Reiter et al., 1990) that appears slightly diffuse because of heterogeneity in location. This element represents a transcription start signal yielding leaderless transcripts having no SD region and is found upstream of isolated genes and first genes in operons in Pyrobaculum aerophilum (Slupska et al., 2001) and Sulfolobus solfataricus (Tolstrup et al., 2000; She et al., 2001). It was not found upstream of internal genes in operons. The results presented here for Sulfolobus and Halobacterium support this conclusion and lead to the suggestion that each genome result and the overall archaeal average represent a superposition of at least the two classes of genes. The SD region centered at -9 is from internal members of operons; the putative A box (the combination of the T peak at -28 and the A peak at -24) is the transcription signal for leaderless products of isolated genes and first genes in operons. The A region at -34 may be another type of transcriptional signal, since it is likely to be too far upstream to be involved in translation initiation. Based on these results, most if not all of the archaea studied here appear to use at least two mechanisms for translation initiation.
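The kind of averaged upstream profile discussed above can be illustrated with a minimal sketch. It assumes a set of equal-length sequence windows already aligned on the start codon, and it normalizes each positional base frequency by the overall frequency of that base in the windows. The function name, the simple ratio normalization, and the convention that position 0 is the first base of the start codon are assumptions for illustration, not the statistical treatment used in this study.

```python
from collections import Counter


def positional_enrichment(windows, upstream):
    """Return {position: {base: observed/background}} for aligned windows.

    windows  -- equal-length DNA strings, each aligned so that index
                `upstream` is the first base of the start codon
    upstream -- number of bases preceding the start codon in each window
    Position 0 is the first base of the start codon (assumed convention);
    negative positions lie upstream of it.
    """
    if not windows:
        raise ValueError("no windows supplied")
    length = len(windows[0])
    if any(len(w) != length for w in windows):
        raise ValueError("all windows must have the same length")

    # Background frequency of each base over all windows combined.
    background = Counter("".join(windows))
    total = sum(background[b] for b in "ACGT")

    profile = {}
    for i in range(length):
        column = Counter(w[i] for w in windows)
        profile[i - upstream] = {
            b: (column[b] / len(windows)) / (background[b] / total)
            for b in "ACGT"
            if background[b]  # skip bases absent from the data set
        }
    return profile


# Toy example: four windows with a conserved GG just upstream of ATG.
toy = ["AAGGATG", "CTGGATG", "TCGGATG", "GAGGATG"]
enrichment = positional_enrichment(toy, upstream=4)
print(enrichment[-2]["G"])  # ratio above 1 marks a G-rich position
```

In such a profile, an SD-like element would appear as a run of positions with G ratios well above 1, and an A box as neighboring T-rich and A-rich positions further upstream.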

The repetitive pattern seen with the averaged coding sequences is consistently present in all organisms examined. It is especially interesting that a pattern is seen even when all ORFs over both the eubacteria and archaea are averaged (Figures 1 and 5). However, the observed patterns are slightly different in each case. There is a clear periodicity of three bases, and this is likely due to codon structure. Nevertheless, there is no reason to expect such a regularity in base positions. Both eubacteria and archaea have higher than expected G frequencies in the first codon position and lower than expected G frequencies in the second position. This begins immediately after the start codon in archaea but becomes prominent in eubacteria only after several codons. Conversely, T is elevated in the second position but diminished in the first position. The other bases show smaller effects in the averages. Individual organisms can show patterns that are distinct from the averages. Figure 2 shows that Caulobacter crescentus has regularly higher C and lower A in the third position. Figure 5 shows that Halobacterium has higher C and lower T in the third position and both higher A and T in the second position. The set of all 69 organisms examined shows much variety, but each member always shows some regular pattern.
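The three-base periodicity described above can be made concrete with a short sketch that tallies base usage separately for the first, second, and third codon positions and compares each tally with the overall base composition of the same coding sequences. The function name and the observed/overall ratio are illustrative assumptions; the start codon is skipped so that it does not dominate the first codon.

```python
from collections import Counter


def codon_position_bias(orfs):
    """Return three dicts (codon positions 1, 2, 3) of observed/overall ratios.

    orfs -- DNA strings, each beginning with its start codon.
    A ratio above 1 means the base is over-represented at that codon
    position relative to its overall frequency in the coding sequences.
    """
    counts = [Counter(), Counter(), Counter()]
    for orf in orfs:
        body = orf[3:]                               # skip the start codon
        body = body[: len(body) - len(body) % 3]     # keep whole codons only
        for i, base in enumerate(body):
            counts[i % 3][base] += 1

    overall = counts[0] + counts[1] + counts[2]
    total = sum(overall[b] for b in "ACGT")
    if total == 0:
        raise ValueError("no coding bases to analyse")

    bias = []
    for position_counts in counts:
        n = sum(position_counts[b] for b in "ACGT")
        bias.append({
            b: (position_counts[b] / n) / (overall[b] / total)
            for b in "ACGT"
            if overall[b]  # skip bases absent from the data set
        })
    return bias


# Toy example: every codon after the start codon begins with G.
toy_orfs = ["ATGGCTGATGAA", "ATGGTTGCAGGC"]
first, second, third = codon_position_bias(toy_orfs)
print(first["G"], second["G"])  # G enriched in position 1, depleted in position 2
```

Applied to a real genome, a first-position G ratio above 1 together with a second-position G ratio below 1 would correspond to the pattern reported here for both domains.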

The pattern observed internal to the ORFs is not seen upstream, nor is it seen in random sequences. The pattern is observable even if few ORFs are examined, as is seen in Figure 6 when 63 and 40 ORFs are aligned. In particular, the G repeat stands out from the noisy background. Preliminary work shows that alignment of multiple codons within one gene also shows periodicity. This agrees with work done for some genes in Pyrococcus (Suckow et al., 1998). There has been some suggestion that examination of periodicity within ORFs could assist in gene identification (Tiwari et al., 1997). It remains to be seen whether eukaryotes will have similar patterns.

Detailed analysis should allow comparison of individual ORFs to organismal averages. This may prove useful in identifying ORFs that represent outliers in the data. Are they atypical because they are members of a particular gene class? The Sulfolobus and Halobacterium genes that begin operons show this characteristic. On the other hand, further study of some abnormal ORFs might eventually show that they have been incorrectly identified as genes.

The combination of a visual approach with a maximum likelihood statistical treatment of aligned ORFs can be powerful in revealing global patterns around translation initiation sites in both eubacteria and archaea. The availability of large numbers of genomes allows formulation of reasonable averages for the two domains considered here. Since a broad range of bacteria was included, it is not likely that the results are skewed in any particular way. No bacterium contributed more than 8% of the total ORFs in question.

ACKNOWLEDGMENTS

The authors would like to thank Dr. Carolyn Boyle for her advice on statistical validity of methods used. This work was supported by NSF Award No. EPS 0082979. This is publication number J10086 of the Mississippi Agricultural and Forestry Experiment Station.

LITERATURE CITED

Alm, R.A., L.-S.L. Ling, D.T. Moir, B.L. King, E.D. Brown, P.C. Doig, D.R. Smith, B. Noonan, B.C. Guild, B.L. deJonge, et al. 1999. Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori. Nature 397:176-180.

Blattner, F.R., G. Plunkett, III, C.A. Bloch, N.T. Perna, V. Burland, M. Riley, J. Collado-Vides, J.D. Glasner, C.K. Rode, G.F. Mayhew, et al. 1997. The complete genome sequence of Escherichia coli K-12. Science 277:1453-1474.

Bolotin, A., P. Wincker, S. Mauger, O. Jaillon, K. Malarme, J. Weissenbach, S.D. Ehrlich, and A. Sorokin. 2001. The complete genome sequence of the lactic acid bacterium Lactococcus lactis ssp. lactis IL1403. Genome Res. 11:731-753.

Bult, C.J., O. White, G.J. Olsen, L. Zhou, R.D. Fleischmann, G.G. Sutton, J.A. Blake, L.M. FitzGerald, R.A. Clayton, J.D. Gocayne, et al. 1996. Complete genome sequence of the methanogenic archaeon, Methanococcus jannaschii. Science 273:1058-1073.

Capela, D., F. Barloy-Hubler, J. Gouzy, G. Bothe, F. Ampe, J. Batut, P. Boistard, A. Becker, M. Boutry, E. Cadieu, et al. 2001. Analysis of the chromosome sequence of the legume symbiont Sinorhizobium meliloti strain 1021. Proc. Natl. Acad. Sci. USA 98:9877-9882.

Chambaud, I., R. Heilig, S. Ferris, V. Barbe, D. Samson, F. Galisson, I. Moszer, K. Dybvig, H. Wroblewski, A. Viari, et al. 2001. The complete genome sequence of the murine respiratory pathogen Mycoplasma pulmonis. Nucleic Acids Res. 29:2145-2153.

Cole, S.T., R. Brosch, J. Parkhill, T. Garnier, C. Churcher, D. Harris, S.V. Gordon, K. Eiglmeier, S. Gas, C.E. Barry, III, et al. 1998. Deciphering the biology of Mycobacterium tuberculosis from the complete genome sequence. Nature 393:537-544.

Cole, S.T., K. Eiglmeier, J. Parkhill, K.D. James, N.R. Thomson, P.R. Wheeler, N. Honore, T. Garnier, C. Churcher, D. Harris, et al. 2001. Massive gene decay in the leprosy bacillus. Nature 409:1007-1011.

Condo, I., A. Ciammaruconi, D. Benelli, D. Ruggero, and P. Londei. 1999. Cis-acting signals controlling translational initiation in the thermophilic archaeon Sulfolobus solfataricus. Mol. Microbiol. 34:77-84.

Deckert, G., P.V. Warren, T. Gaasterland, W.G. Young, A.L. Lenox, D.E. Graham, R. Overbeek, M.A. Snead, M. Keller, M. Aujay, et al. 1998. The complete genome of the hyperthermophilic bacterium Aquifex aeolicus. Nature 392:353-358.

DelVecchio, V.G., V. Kapatral, R.J. Redkar, G. Patra, C. Mujer, T. Los, N. Ivanova, I. Anderson, A. Bhattacharyya, A. Lykidis, et al. 2002. The genome sequence of the facultative intracellular pathogen Brucella melitensis. Proc. Natl. Acad. Sci. USA 99:443-448.

Ferretti, J.J., W.M. McShan, D. Ajdic, D. Savic, G. Savic, K. Lyon, C. Primeaux, S.S. Sezate, A.N. Suvorov, S. Kenton, et al. 2001. Complete genome sequence of an M1 strain of Streptococcus pyogenes. Proc. Natl. Acad. Sci. USA 98:4658-4663.

Fickett, J.W. 1984. Fast optimal alignment. Nucleic Acids Res. 12:175-179.

Fleischmann, R.D., M.D. Adams, O. White, R.A. Clayton, E.F. Kirkness, A.R. Kerlavage, C.J. Bult, J.-F. Tomb, B.A. Dougherty, J.M. Merrick, et al. 1995. Whole-genome random sequencing and assembly of Haemophilus influenzae Rd. Science 269:496-512.

Fraser, C.M., S. Casjens, W.M. Huang, G.G. Sutton, R.A. Clayton, R. Lathigra, O. White, K.A. Ketchum, R. Dodson, E.K. Hickey, M. Gwinn, B. Dougherty, et al. 1997. Genomic sequence of a Lyme disease spirochaete, Borrelia burgdorferi. Nature 390:580-586.

Fraser, C.M., J.D. Gocayne, O. White, M.D. Adams, R.A. Clayton, R.D. Fleischmann, C.J. Bult, A.R. Kerlavage, G.G. Sutton, J.M. Kelley, et al. 1995. The minimal gene complement of Mycoplasma genitalium. Science 270:397-403.

Fraser, C.M., S.J. Norris, G.M. Weinstock, O. White, G.G. Sutton, R. Dodson, M. Gwinn, E.K. Hickey, R. Clayton, K.A. Ketchum, et al. 1998. Complete genome sequence of Treponema pallidum, the syphilis spirochete. Science 281:375-388.

Glaser, P., L. Frangeul, C. Buchrieser, A. Amend, F. Baquero, P. Berche, H. Bloecker, P. Brandt, T. Chakraborty, A. Charbit, et al. 2001. Comparative genomics of Listeria species. Science 294:849-852.

Glass, J.I., E.J. Lefkowitz, J.S. Glass, C.R. Heiner, E.Y. Chen, and G.H. Cassell. 2000. The complete sequence of the mucosal pathogen Ureaplasma urealyticum. Nature 407:757-762.

Goodner, B., G. Hinkle, S. Gattung, N. Miller, M. Blanchard, B. Qurollo, B.S. Goldman, Y. Cao, M. Askenazi, C. Halling, et al. 2001. Genome Sequence of the Plant Pathogen and Biotechnology Agent Agrobacterium tumefaciens C58. Science 294:2323-2328.

Grill, S., C.O. Gualerzi, P. Londei, and U. Blasi. 2000. Selective stimulation of translation of leaderless mRNA by initiation factor 2: evolutionary implications for translation. EMBO J. 19(15):4101-4110.

Grill, S., I. Moll, D. Hasenohrl, C.O. Gualerzi, and U. Blasi. 2001. Modulation of ribosomal recruitment to 5'-terminal start codons by translation initiation factors IF2 and IF3. FEBS Lett. 495:167-171.

Gualerzi, C.O., and C.L. Pon. 1990. Initiation of mRNA translation in prokaryotes. Biochemistry 29:5881-5889.

Hayashi, T., K. Makino, M. Ohnishi, K. Kurokawa, K. Ishii, K. Yokoyama, C.-G. Han, E. Ohtsubo, K. Nakayama, T. Murata, et al. 2001. Complete genome sequence of enterohemorrhagic Escherichia coli O157:H7 and genomic comparison with a laboratory strain K-12. DNA Res. 8:11-22.

Heidelberg, J.F., J.A. Eisen, W.C. Nelson, R.A. Clayton, M.L. Gwinn, R.J. Dodson, D.H. Haft, E.K. Hickey, J.D. Peterson, L.A. Umayam, et al. 2000. DNA sequence of both chromosomes of the cholera pathogen Vibrio cholerae. Nature 406:477-483.

Hertz, G.Z., and G.D. Stormo. 1996. Escherichia coli promoter sequences: analysis and prediction. Methods Enzymol. 273:30-42.

Himmelreich, R., H. Hilbert, H. Plagens, E. Pirkl, B.C. Li, and R. Herrmann. 1996. Complete sequence analysis of the genome of the bacterium Mycoplasma pneumoniae. Nucleic Acids Res. 24:4420-4449.

Hoskins, J.A., W. Alborn, Jr., J. Arnold, L. Blaszczak, S. Burgett, B.S. DeHoff, S. Estrem, L. Fritz, D.-J. Fu, W. Fuller, et al. 2001. The Genome of the Bacterium Streptococcus pneumoniae strain R6. J. Bacteriol. 183:5709-5717.

Kalman, S., W. Mitchell, R. Marathe, C. Lammel, J. Fan, R.W. Hyman, L. Olinger, J. Grimwood, R.W. Davis, and R.S. Stephens. 1999. Comparative genomes of Chlamydia pneumoniae and C. trachomatis. Nat. Genet. 21:385-389.

Kaneko, T., Y. Nakamura, C.P. Wolk, T. Kuritz, S. Sasamoto, A. Watanabe, M. Iriguchi, A. Ishikawa, K. Kawashima, T. Kimura, et al. 2001. Complete Genomic Sequence of the Filamentous Nitrogen-fixing Cyanobacterium Anabaena sp. strain PCC 7120. DNA Res. 8:205-213.

Kaneko, T., S. Sato, H. Kotani, A. Tanaka, E. Asamizu, Y. Nakamura, N. Miyajima, M. Hirosawa, M. Sugiura, S. Sasamoto, et al. 1996. Sequence analysis of the genome of the unicellular cyanobacterium Synechocystis sp. strain PCC6803. II. Sequence determination of the entire genome and assignment of potential protein-coding regions. DNA Res. 3:109-136.

Kawarabayasi, Y., Y. Hino, H. Horikawa, K. Jin-no, M. Takahashi, M. Sekine, S. Baba, A. Ankai, H. Kosugi, A. Hosoyama, et al. 2001. Complete genome sequence of an aerobic thermoacidophilic crenarchaeon, Sulfolobus tokodaii strain 7. DNA Res. 8:123-140.

Kawarabayasi, Y., Y. Hino, H. Horikawa, S. Yamazaki, Y. Haikawa, K. Jin-no, M. Takahashi, M. Sekine, S. Baba, A. Ankai, et al. 1999. Complete Genome Sequence of an Aerobic Hyper-thermophilic Crenarchaeon, Aeropyrum pernix K1. DNA Res. 6:83-101.

Kawarabayasi, Y., M. Sawada, H. Horikawa, Y. Haikawa, Y. Hino, S. Yamamoto, M. Sekine, S. Baba, H. Kosugi, A. Hosoyama, et al. 1998. Complete Sequence and Gene Organization of the Genome of a Hyper-thermophilic Archaebacterium, Pyrococcus horikoshii OT3. DNA Res. 5:55-76.

Kawashima, T., Y. Yamamoto, H. Aramaki, T. Nunoshiba, T. Kawamoto, K. Watanabe, M. Yamazaki, K. Kanehori, N. Amano, Y. Ohya, K. Makino, and M. Suzuki. 1999. Determination of the complete genomic DNA sequence of Thermoplasma volcanium GSS1. Proc. Jpn. Acad. 75:213-218.

Keeling, P.J., N.M. Fast, and G.I. McFadden. 1998. Evolutionary relationship between translation initiation factor eIF2gamma and selenocysteine-specific elongation factor SELB: change of function in translation factors. J. Mol. Evol. 47:649-655.

Klenk, H.P., R.A. Clayton, J.-F. Tomb, O. White, K.E. Nelson, K.A. Ketchum, R.J. Dodson, M. Gwinn, E.K. Hickey, J.D. Peterson, et al. 1997. The complete genome sequence of the hyperthermophilic, sulphate-reducing archaeon Archaeoglobus fulgidus. Nature 390:364-370.

Kunst, F., N. Ogasawara, I. Moszer, A.M. Albertini, G. Alloni, V. Azevedo, M.G. Bertero, P. Bessieres, A. Bolotin, S. Borchert, et al. 1997. The complete genome sequence of the gram-positive bacterium Bacillus subtilis. Nature 390:249-256.

Kuroda, M., T. Ohta, I. Uchiyama, T. Baba, H. Yuzawa, I. Kobayashi, L. Cui, A. Oguchi, K. Aoki, Y. Nagai, et al. 2001. Whole genome sequencing of meticillin-resistant Staphylococcus aureus. The Lancet 357:1225-1240.

Kyrpides, N.C., and C.R. Woese. 1998. Archaeal translation initiation revisited: the initiation factor 2 and eukaryotic initiation factor 2B alpha-beta-delta subunit families. Proc. Natl. Acad. Sci. USA 95:3726-3730.

Ma, J., A. Campbell, and S. Karlin. 2002. Correlations between Shine-Dalgarno sequences and gene features such as predicted expression levels and operon structures. J. Bacteriol. 184:5733-5745.

May, B.P., and P.P. Dennis. 1990. Unusual evolution of a superoxide dismutase-like gene from the extremely halophilic archaebacterium Halobacterium cutirubrum. J. Bacteriol. 172:3725-3729.

May, B.J., Q. Zhang, L. Li, M.L. Paustian, T.S. Whittam, and V.S. Kapur. 2001. Complete nucleotide sequence of an avian isolate of Pasteurella multocida. Proc. Natl. Acad. Sci. USA 98:3460-3465.

McClelland, M., K.E. Sanderson, J. Spieth, S.W. Clifton, P. Latreille, L. Courtney, S. Porwollik, J. Ali, M. Dante, F. Du, et al. 2001. The complete genome sequence of Salmonella enterica serovar Typhimurium LT2: features revealed by comparison to related genomes. Nature 413:852-856.

Mrazek, J., D. Bhaya, A.R. Grossman, and S. Karlin. 2001. Highly expressed and alien genes of the Synechocystis genome. Nucleic Acids Res. 29:1590-1601.

Nelson, K.E., R.A. Clayton, S.R. Gill, M.L. Gwinn, R.J. Dodson, D.H. Haft, E.K. Hickey, J.D. Peterson, W.C. Nelson, K.A. Ketchum, et al. 1999. Evidence for lateral gene transfer between Archaea and Bacteria from genome sequence of Thermotoga maritima. Nature 399:323-329.

Ng, W.V., S.P. Kennedy, G.G. Mahairas, B. Berquist, M. Pan, H.D. Shukla, S.R. Lasky, N.S. Baliga, V. Thorsson, J. Sbrogna, et al. 2000. Genome sequence of Halobacterium species NRC-1. Proc. Natl. Acad. Sci. USA 97:12176-12181.

Nierman, W.C., T.V. Feldblyum, I.T. Paulsen, K.E. Nelson, J. Eisen, J.F. Heidelberg, M. Alley, N. Ohta, J.R. Maddock, I. Potocka, et al. 2001. Complete Genome Sequence of Caulobacter crescentus. Proc. Natl. Acad. Sci. USA 98:4136-4141.

Nolling, J., G. Breton, M.V. Omelchenko, K.S. Makarova, Q. Zeng, R. Gibson, H.M. Lee, J. Dubois, D. Qiu, J. Hitti, et al. 2001. Genome Sequence and Comparative Analysis of the Solvent-Producing Bacterium Clostridium acetobutylicum. J. Bacteriol. 183:4823-4838.

Ogata, H., S. Audic, P. Renesto-Audiffren, P.-E. Fournier, V. Barbe, D. Samson, V. Roux, P. Cossart, J. Weissenbach, J.M. Claverie, and D. Raoult. 2001. Mechanisms of Evolution in Rickettsia conorii and Rickettsia prowazekii. Science 293:2093-2098.

Osada, Y., R. Saito, and M. Tomita. 1999. Analysis of base-pairing potentials between 16S rRNA and 5' UTR for translation initiation in various prokaryotes. Bioinformatics 15:578-581.

Parkhill, J., M. Achtman, K.D. James, S.D. Bentley, C. Churcher, S.R. Klee, G. Morelli, D. Basham, D. Brown, T. Chillingworth, et al. 2000. Complete DNA sequence of a serogroup A strain of Neisseria meningitidis Z2491. Nature 404:502-506.

Parkhill, J., G. Dougan, K.D. James, N.R. Thomson, D. Pickard, J. Wain, C. Churcher, K.L. Mungall, S.D. Bentley, M.T.G. Holden, et al. 2001. Complete genome sequence of a multiple drug resistant Salmonella enterica serovar Typhi CT18. Nature 413:848-852.

Parkhill, J., B.W. Wren, K. Mungall, J.M. Ketley, C. Churcher, D. Basham, T. Chillingworth, R.M. Davies, T. Feltwell, S. Holroyd, et al. 2000. The genome sequence of the food-borne pathogen Campylobacter jejuni reveals hypervariable sequences. Nature 403:665-668.

Parkhill, J., B.W. Wren, N.R. Thomson, R.W. Titball, M.T.G. Holden, M.B. Prentice, M. Sebaihia, K.D. James, C. Churcher, K.L. Mungall, et al. 2001. Genome sequence of Yersinia pestis, the causative agent of plague. Nature 413:523-527.

Perna, N.T., G. Plunkett, III, V. Burland, B. Mau, J.D. Glasner, D.J. Rose, G.F. Mayhew, P.S. Evans, J. Gregor, H.A. Kirkpatrick, et al. 2001. Genome sequence of enterohaemorrhagic Escherichia coli O157:H7. Nature 409:529-533.

Read, T.D., R.C. Brunham, C. Shen, S.R. Gill, J.F. Heidelberg, O. White, E.K. Hickey, J. Peterson, L.A. Umayam, T. Utterback, et al. 2000. Genome sequences of Chlamydia trachomatis MoPn and Chlamydia pneumoniae AR39. Nucleic Acids Res. 28:1397-1406.

Reiter, W.D., U. Hudepohl, and W. Zillig. 1990. Mutational analysis of an archaebacterial promoter: essential role of a TATA box for transcription efficiency and start-site selection in vitro. Proc. Natl. Acad. Sci. USA 87:9509-9513.

Reiter, W.D., P. Palm, and W. Zillig. 1988. Transcription termination in the archaebacterium Sulfolobus: signal structures and linkage to transcription initiation. Nucleic Acids Res. 16:2445-2459.

Ruepp, A., W. Graml, M.L. Santos-Martinez, K.K. Koretke, C. Volker, H.W. Mewes, D. Frishman, S. Stocker, A.N. Lupas, and W. Baumeister. 2000. The genome sequence of the thermoacidophilic scavenger Thermoplasma acidophilum. Nature 407:508-513.

Saito, R., and M. Tomita. 1999. Computer analyses of complete genomes suggest that some archaebacteria employ both eukaryotic and eubacterial mechanisms in translation initiation. Gene 238:79-83.

Sakai, H., C. Imamura, Y. Osada, R. Saito, T. Washio, and M. Tomita. 2001. Correlation between Shine-Dalgarno sequence conservation and codon usage of bacterial genes. J. Mol. Evol. 52:164-170.

Salin, M.L., M.V. Duke, D.P. Ma, and J.A. Boyle. 1991. Halobacterium halobium Mn-SOD gene: archaebacterial and eubacterial features. Free Rad. Res. Commun. 12-13 Pt 1:443-449.

She, Q., R.K. Singh, F. Confalonieri, Y. Zivanovic, G. Allard, M.J. Awayez, C.C. Chan-Weiher, I.G. Clausen, B.A. Curtis, A. De Moors, et al. 2001. The complete genome of the crenarchaeon Sulfolobus solfataricus P2. Proc. Natl. Acad. Sci. USA 98:7835-7840.

Shigenobu, S., H. Watanabe, M. Hattori, Y. Sakaki, and H. Ishikawa. 2000. Genome sequence of the endocellular bacterial symbiont of aphids Buchnera sp. APS. Nature 407:81-86.

Shine, J., and L. Dalgarno. 1974. The 3'-terminal sequence of Escherichia coli 16S ribosomal RNA: complementarity to nonsense triplets and ribosome binding sites. Proc. Natl. Acad. Sci. USA 71:1342-1346.

Shirai, M., H. Hirakawa, M. Kimoto, M. Tabuchi, F. Kishi, K. Ouchi, T. Shiba, K. Ishii, M. Hattori, S. Kuhara, and T. Nakazawa. 2000. Comparison of whole genome sequences of Chlamydia pneumoniae J138 from Japan and CWL029 from USA. Nucleic Acids Res. 28:2311-2314.

Simpson, A.J.G., F.C. Reinach, P. Arruda, F.A. Abreu, M. Acencio, R. Alvarenga, L.M.C. Alves, J.E. Araya, G.S. Baia, C.S. Baptista, et al. 2000. The genome sequence of the plant pathogen Xylella fastidiosa. Nature 406:151-157.

Slupska, M.M., A.G. King, S. Fitz-Gibbon, J. Besemer, M. Borodovsky, and J.H. Miller. 2001. Leaderless transcripts of the crenarchaeal hyperthermophile Pyrobaculum aerophilum. J. Mol. Biol. 309:347-360.

Smith, D.R., L.A. Doucette-Stamm, C. Deloughery, H.-M. Lee, J. Dubois, T. Aldredge, R. Bashirzadeh, D. Blakely, R. Cook, K. Gilbert, et al. 1997. Complete genome sequence of Methanobacterium thermoautotrophicum deltaH: functional analysis and comparative genomics. J. Bacteriol. 179:7135-7155.

Sprengart, M.L., E. Fuchs, and A.G. Porter. 1996. The downstream box: an efficient and independent translation initiation signal in Escherichia coli. EMBO J. 15:665-674.

Staden, R. 1984. Computer methods to locate signals in nucleic acid sequences. Nucleic Acids Res. 12:505-519.

Stephens, R.S., S. Kalman, C. Lammel, J. Fan, R. Marathe, L. Aravind, W. Mitchell, L. Olinger, R.L. Tatusov, Q. Zhao, et al. 1998. Genome sequence of an obligate intracellular pathogen of humans: Chlamydia trachomatis. Science 282:754-759.

Stormo, G.D. 2000. DNA binding sites: representation and discovery. Bioinformatics 16:16-23.

Stover, C.K., X.-Q.T. Pham, A.L. Erwin, S.D. Mizoguchi, P. Warrener, M.J. Hickey, F.S.L. Brinkman, W.O. Hufnagle, D.J. Kowalik, M. Lagrou, et al. 2000. Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen. Nature 406:959-964.

Suckow, J.M., N. Amano, Y. Ohfuku, J. Kakinuma, H. Koike, and M. Suzuki. 1998. A transcription frame-based analysis of the genomic DNA sequence of a hyper-thermophilic archaeon for the identification of genes, pseudo-genes and operon structures. FEBS Lett. 426:86-92.

Takami, H., K. Nakasone, Y. Takaki, G. Maeno, Y. Sasaki, N. Masui, F. Fuji, C. Hirama, Y. Nakamura, N. Ogasawara, et al. 2000. Complete genome sequence of the alkaliphilic bacterium Bacillus halodurans and genomic sequence comparison with Bacillus subtilis. Nucleic Acids Res. 28:4317-4331.

Tettelin, H., K.E. Nelson, I.T. Paulsen, J.A. Eisen, T.D. Read, S. Peterson, J. Heidelberg, R.T. DeBoy, D.H. Haft, R.J. Dodson, et al. 2001. Complete genome sequence of a virulent isolate of Streptococcus pneumoniae. Science 293:498-506.

Tettelin, H., N.J. Saunders, J. Heidelberg, A.C. Jeffries, K.E. Nelson, J.A. Eisen, K.A. Ketchum, D.W. Hood, J.F. Peden, R.J. Dodson, et al. 2000. Complete genome sequence of Neisseria meningitidis serogroup B strain MC58. Science 287:1809-1815.

Thomm, M., and G. Wich. 1988. An archaebacterial promoter element for stable RNA genes with homology to the TATA box of higher eukaryotes. Nucleic Acids Res. 16:151-163.

Tiwari, S., S. Ramachandran, A. Bhattacharya, S. Bhattacharya, and R. Ramaswamy. 1997. Prediction of probable genes by Fourier analysis of genomic sequences. Comput. Appl. Biosci. 13:263-270.

Tolstrup, N., C.W. Sensen, R.A. Garrett, and I.G. Clausen. 2000. Two different and highly organized mechanisms of translation initiation in the archaeon Sulfolobus solfataricus. Extremophiles 4:175-179.

Tomb, J.-F., O. White, A.R. Kerlavage, R.A. Clayton, G.G. Sutton, R.D. Fleischmann, K.A. Ketchum, H.P. Klenk, S. Gill, B.A. Dougherty, et al. 1997. The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388:539-547.

Tsonis, A.A., J.B. Elsner, and P.A. Tsonis. 1991. Periodicity in DNA coding sequences: implications in gene evolution. J. Theor. Biol. 151:323-331.

Van Etten, W.J., and G.R. Janssen. 1998. An AUG initiation codon, not codon-anticodon complementarity, is required for the translation of unleadered mRNA in Escherichia coli. Mol. Microbiol. 27:987-1001.

Weiner, J., 3rd, R. Hermann, and G.F. Browning. 2000. Transcription in Mycoplasma pneumoniae. Nucleic Acids Res. 28:4488-4496.

White, O., J.A. Eisen, J.F. Heidelberg, E.K. Hickey, J.D. Peterson, R.J. Dodson, D.H. Haft, M.L. Gwinn, W.C. Nelson, D.L. Richardson, et al. 1999. Genome Sequence of the Radioresistant Bacterium Deinococcus radiodurans R1. Science 286:1571-1577.

Wich, G., H. Hummel, M. Jarsch, U. Bar, and A. Bock. 1986. Transcription signals for stable RNA genes in Methanococcus. Nucleic Acids Res. 14:2459-2479.

Wood, D.W., J.C. Setubal, R. Kaul, D. Monks, L. Chen, G.E. Wood, Y Chen, L. Woo, J.P. Kitajima, V.K. Okura, et al. 2001. The Genome of the Natural Genetic Engineer Agrobacterium tumefaciens C58. Science 294:2317-2323.

Web Site References

http://www.ncbi.nlm.nih.gov:80/PMGifs/Genomes/micr.html
NCBI Entrez-Genome Microbial Genomes

http://www.kazusa.or.jp/rhizobase/
Rhizobase

http://www.msstate.edu/dept/biochemistry/CBIG/
Mississippi State University Computational Biology and Informatics Group

Alan P. Boyle, John A. Boyle (1)

(1.) Corresponding author. (FAX 662-325-8664; email jab@ra.msstate.edu)
COPYRIGHT 2003 Mississippi Academy of Sciences
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2003, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Author: Boyle, John A.
Publication: Journal of the Mississippi Academy of Sciences
Geographic Code: 1U6MS
Date: Jul 1, 2003