Using DNA Microarrays to Study Host-Microbe Interactions

Complete genomic sequences of microbial pathogens and hosts offer sophisticated new strategies for studying host-pathogen interactions. DNA microarrays exploit primary sequence data to measure transcript levels and detect sequence polymorphisms, for every gene, simultaneously. The design and construction of a DNA microarray for any given microbial genome are straightforward. By monitoring microbial gene expression, one can predict the functions of uncharacterized genes, probe the physiologic adaptations made under various environmental conditions, identify virulence-associated genes, and test the effects of drugs. Similarly, by using host gene microarrays, one can explore host response at the level of gene expression and provide a molecular description of the events that follow infection. Host profiling might also identify gene expression signatures unique to each pathogen, thus providing a novel tool for diagnosis, prognosis, and clinical management of infectious disease.

The complex interaction between a microbial pathogen and a host is the underlying basis of infectious disease. By understanding the molecular details of this interaction, we can identify virulence-associated microbial genes and host-defense strategies and characterize the cues to which they respond and mechanisms by which they are regulated. This information will guide the design of a new generation of medical tools. Genomic sequencing will provide the data needed to unravel the complexities of the host-pathogen interaction.
As of August 10, 2000, draft sequence was available for 87% of the human genome (http://www.ncbi.nlm.nih.gov/genome/seq/), and at least 39 prokaryotic genomes, including those of more than a dozen human pathogens, had been completely sequenced (http://www.tigr.org/tdb/mdb/mdbcomplete.html). The pace of gene discovery is accelerating rapidly, but its potential for explaining life at the molecular level remains largely unrealized because our understanding of gene function lags increasingly far behind. For example, even in the heavily studied Escherichia coli, no function has been assigned to more than one third of its genes (1). High-throughput methods for assessment of function are clearly required if this wealth of primary sequence information is to be used. Global profiling of gene expression is one attractive approach to assessing function. Because a gene is usually transcribed only when and where its function is required, determining the locations and conditions under which a gene is expressed allows inferences about its function. Several independent high-throughput methods for differential gene expression (including SAGE and differential display) may enable function annotation of sequenced genomes (2). DNA microarray hybridization analysis stands out for its simplicity, comprehensiveness, data consistency, and high throughput.
Transcription control plays a key role in host-pathogen interaction (3,4); thus, genomewide transcription profiling seems particularly appropriate for the study of this process. This review focuses on microarray-based approaches for studying transcription response because they hold exceptional promise for the study of infectious disease. Microarray-based genotyping applications, although expected to make substantial contributions in this field, are covered only briefly here.

High-Density DNA Microarrays: Basic Tools

First described in 1995 (5), high-density DNA microarray methods have already made a marked impact on many fields, including cellular physiology (6-11), cancer biology (12-17), and pharmacology (18,19). The first results of gene expression profiling of the host-pathogen interaction have just begun to emerge. Before exploring these results, we briefly review the methods.

Technology

The key unifying principle of all microarray experiments is that labeled nucleic acid molecules in solution hybridize, with high sensitivity and specificity, to complementary sequences immobilized on a solid substrate, thus facilitating parallel quantitative measurement of many different sequences in a complex mixture (20,21). Although several methods for building microarrays have been developed (22,23), two have prevailed. In one, DNA microarrays are constructed by physically attaching DNA
fragments such as library clones or polymerase chain reaction (PCR) products to a solid substrate (5) (Figure 1). By using a robotic arrayer and capillary printing tips, we can print at least 23,000 elements on a microscope slide (P. Brown, pers. comm.; Figure 2). In the other method, arrays are constructed by synthesizing single-stranded oligonucleotides in situ by use of photolithographic techniques (24,25). Advantages of the former method include relatively low cost and substantial flexibility (which explain its wide implementation in the academic setting); in addition, primary sequence information is not needed to print a DNA element. Advantages of the latter method include higher density (>280,000 features on a 1.28 x 1.28-cm array) and elimination of the need to collect and store cloned DNA or PCR products. Continued commercial interest in microarray technology promises increasing array element density, better detection sensitivity, and cheaper, faster methods. Technical descriptions of microarray construction methods and hybridization protocols are available (26-28; and http://cmgm.stanford.edu/pbrown/mguide/index.html).

[Figures 1-2 ILLUSTRATION OMITTED]

Messenger RNA
from eukaryotic cells is usually specifically labeled by affinity purification of mRNA with an oligo-dT resin, followed by incorporation of dye-labeled nucleotides into cDNA molecules by reverse transcriptase (RT) with random or oligo-dT oligonucleotide primers (Figure 1). In prokaryotes, the absence of polyadenylation on transcripts makes labeling of mRNA more difficult. One method is labeling of total RNA, either by covalent linkage (29) or by incorporating dye-labeled nucleotides into complementary DNA through RT and random oligonucleotide primers (30). In spite of the high copy number of labeled ribosomal and tRNA molecules in the hybridization reactions, specific hybridization of mRNA to the array can be achieved under appropriate stringency. An alternative method is to prime reverse transcription with a mixture of reverse-strand oligonucleotides specific for open reading frames (ORFs), either those used to construct the microarray (M. Laub and L. Shapiro, pers. comm.) or a minimally complex mixture of octamers sufficient to hybridize to the 3' end of every ORF (31). This method results in higher signal-to-noise ratios by preferentially synthesizing cDNA from coding regions.
For printed DNA microarrays, relative transcript abundance is measured by labeling two samples with different fluorescent dyes (e.g., Cy3 and Cy5), hybridizing them simultaneously, and determining the fluorescence ratio for each spot on the array (Figure 1). On oligonucleotide arrays, multiple probes from the same gene, each with a corresponding mismatch probe that serves as internal control, together with labeled transcripts of known amounts for standard genes, make quantitative measurement of transcript abundance possible after hybridizing a single labeled sample (25). For both techniques, use of fluorescent labeling enhances sensitivity and the dynamic range of measurement. Gene expression array experiments can also be performed by hybridizing a single labeled mRNA sample to "macroarrays" of DNA elements on positively charged filters (10,11,32-34). Because this format does not require any special arraying or scanning equipment, specialty arrays can be made and analyzed relatively cheaply. Human, mouse, and microbial macroarrays are also commercially available (SigmaGenosys, The Woodlands, TX; Research Genetics, Huntsville, AL; Clontech Laboratories, Palo Alto, CA; Genome Systems, St. Louis, MO). The major disadvantages of this format are reduced sensitivity (32), a limited number of elements, and the need for higher concentrations of labeled cDNA.

Microarray Data Analysis

Microarrays are likely to become a standard tool of the microbiology laboratory. However, because genomewide datasets are large and comprehensive, analysis of an experiment can become daunting. Careful experimental design can simplify analysis and interpretation of the dataset by minimizing the number of variables that affect gene expression. For example, strain differences can be minimized by using isogenic
mutants, tissue complexity can be reduced by studying clonal cell lines, and complex regulatory pathways can be tamed by experimental modulation of transgene expression (6). Because microarray experiments result in such large amounts of data, false-positive results are likely. Analyzing multiple independent experiments may eliminate spurious results (32). Also important is validation of differentially expressed genes by independent methods. When checked by a number of methods including quantitative RT-PCR (6, 35), Northern blotting (33, 34, 36), and protein expression (33, 34), most differentially expressed genes have been confirmed. For example, 72 of 72 mRNAs found to be regulated in response to cytomegalovirus (CMV) infection were confirmed by either prior reports or Northern blotting (37). Future challenges for microarray researchers will include developing databases and algorithms to manage and analyze vast genomic-scale datasets.

Image Analysis Software

The first step after hybridization is capturing an image of the array and extracting from it numerical data for each element (Figure 1). Several software applications, including those packaged with most commercial scanners, can perform this task.
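The two-color ratio measurement and the use of independent replicates to weed out spurious results can be sketched in a few lines. This is only an illustration under stated assumptions, not a published protocol: the function names, the median-centering step (which assumes most genes do not change), the intensity floor, the twofold threshold, and the intensity values are all hypothetical.

```python
import math
from statistics import median

def log2_ratios(cy5, cy3, floor=1.0):
    """Median-centred log2(Cy5/Cy3) for each spot on one array.

    Intensities below `floor` are raised to `floor` so that the ratio
    is always defined (an assumed, simplistic background treatment).
    """
    raw = [math.log2(max(r, floor) / max(g, floor)) for r, g in zip(cy5, cy3)]
    centre = median(raw)  # assumption: the typical gene is unchanged
    return [value - centre for value in raw]

def consistent_calls(replicates, threshold=1.0):
    """Indices of spots that pass the threshold, in the same direction,
    in every replicate array."""
    calls = []
    for i in range(len(replicates[0])):
        values = [rep[i] for rep in replicates]
        induced = all(v >= threshold for v in values)
        repressed = all(v <= -threshold for v in values)
        if induced or repressed:
            calls.append(i)
    return calls

# Hypothetical intensities for five spots on two replicate arrays.
rep1 = log2_ratios([100, 400, 100, 25, 100], [100] * 5)
rep2 = log2_ratios([200, 800, 800, 50, 200], [100] * 5)
print(consistent_calls([rep1, rep2]))  # prints [1, 3]
```

Spot 2 changes in only one of the two arrays and is therefore dropped, which is the kind of spurious result that replicate experiments are meant to remove. The sketch takes the per-spot intensities as given, exactly as an image analysis program would report them.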
However, not all programs use the same algorithms to calculate signal intensity, and each of the programs exports a different constellation of signal quality measurements, complicating comparisons between data acquired with different applications (38). If gene expression datasets are to be compared, these measurements must be standardized. Furthermore, standard, robust statistical methods must be developed for assigning significance values to gene expression measurements.

Databases

Although many laboratories are now capable of collecting microarray data, few have access to a database that can effectively meet their data requirements. With considerable investment of resources, a few full-featured, relational gene expression databases have been developed, but these are not available for public deposition of data (e.g., http://genome-www4.stanford.edu/MicroArray/MDEV/index.html; http://www.nhgri.nih.gov/DIR/LCG/15K/HTML/dbase.html). Recently released, the freely available AMAD software package (http://www.microarrays.org/software.html) provides basic microarray data storage and retrieval capabilities to the average laboratory. A grander goal for the community is establishing a consolidated resource for public distribution of microarray data (39-41). Again, the lack of a standard format for microarray data interferes with creating such a resource (38,39). The European Bioinformatics Institute, recognizing this obstacle, has proposed defining a standard based upon XML,
a computer markup language that combines data and formatting in a single file for distribution over the World-Wide Web (40; http://www.ebi.ac.uk/arrayexpress/).

Algorithms

Inferring biologically meaningful information from microarray data requires sophisticated data exploration. Most global gene expression analyses have used some form of unsupervised clustering algorithm (16,42-44) to find genes coregulated across the dataset (Figure 1). A primary justification for this approach is that shared expression often implies shared function (38,43). In datasets containing many experiments, clustering can also group experiments on the basis of gene expression profiles, an approach that has been successful in classifying tumor-derived cell lines (19, 45) and tumor subtypes (12-17). When a coregulated class of genes is known, supervised clustering algorithms, which are trained to recognize known members of the class, can assign uncharacterized genes to that class. For example, a machine-learning method known as a support vector machine has been used to classify yeast genes by function on the basis of shared regulation (46). Robust determination of coregulated gene clusters may be achieved by using a tiered approach: unsupervised clustering to identify coregulated genes followed by testing and refinement with supervised algorithms (47). Although clustering algorithms will continue to be a mainstay in the analysis of gene expression datasets, a wealth of other data-mining techniques has yet to be applied (38,48). Preliminary reports indicate that many algorithms and visualization methods are being developed, but their ability to extract biologic insight has yet to be established (49-51).
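As a concrete illustration of unsupervised clustering, the sketch below groups genes whose expression profiles are correlated across experiments. It is a minimal example of one common choice (average-linkage agglomerative clustering on Pearson correlation) and is not claimed to reproduce the algorithm of any study cited above; the function names, the 0.8 correlation cutoff, the gene names, and the expression values are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def cluster_genes(profiles, min_corr=0.8):
    """Merge the two most correlated clusters repeatedly, stopping once
    the best average between-cluster correlation drops below min_corr."""
    clusters = [[name] for name in profiles]

    def average_corr(a, b):
        pairs = [pearson(profiles[g], profiles[h]) for g in a for h in b]
        return sum(pairs) / len(pairs)

    while len(clusters) > 1:
        best, i, j = max(
            (average_corr(clusters[i], clusters[j]), i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        )
        if best < min_corr:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return sorted(sorted(c) for c in clusters)

# Hypothetical log ratios for four genes across four experiments.
profiles = {
    "geneA": [0.1, 1.0, 2.0, 3.1],
    "geneB": [0.0, 0.9, 2.1, 2.9],
    "geneC": [3.0, 2.0, 1.1, 0.0],
    "geneD": [2.9, 2.1, 0.9, 0.2],
}
print(cluster_genes(profiles))
```

With these values, geneA and geneB rise together while geneC and geneD fall together, so the two pairs end up in separate clusters. The cutoff decides how many clusters are reported, and that choice is left to the analyst.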
The study of microbial pathogens, and prokaryotes in general, will require the development of some specialized analysis tools. First, the compact and modular structure of prokaryotic genomes, and in particular the presence of operons and pathogenicity islands, suggests that important insights may be gained by mapping gene expression information onto genomic structure. In addition, because gene expression will be measured in many different pathogens, often under the same environmental conditions, tools for cross-species comparison of gene expression data will permit the detection of conserved transcription responses.

Examining a Microorganism: Application of DNA Microarrays

Microarray technology promises to speed the study of uncharacterized or poorly characterized microbes by contributing to annotation of the microbial genome, enabling exploration of microbial physiology, and identifying candidate virulence factors.

Designing a Microbial Genome Microarray

Designing a whole-genome DNA microarray for a fully sequenced microbe is conceptually straightforward. Several sensitive microbial gene-finding programs can quickly and accurately predict most ORFs (52-57). DNA fragments representing each of the ORFs can be obtained by PCR amplification that uses ORF-specific oligonucleotides, the design of which can be automated with primer design software such as Primer3 (58). Homology-searching algorithms should be used to choose regions of genes that will not cross-hybridize with other regions of the genome. After a simple purification step, PCR fragments can be arrayed by a robotic arrayer (5). This basic approach has been used to construct a 4,290-ORF E. coli
microarray (10, 11) and a 3,834-ORF Mycobacterium tuberculosis microarray (30) as well as full-genome arrays for Helicobacter pylori (S. Falkow, pers. comm.) and Caulobacter crescentus (L. Shapiro, pers. comm.). Microarray fabrication based on photolithographic synthesis of oligonucleotides in situ is also a viable approach and has been successfully used for the production of an E. coli complete ORF chip (E. coli Genome Array, Affymetrix, Santa Clara, CA). The utility of microarrays is not restricted to fully sequenced organisms. A powerful screening tool can be obtained by arraying DNA libraries, as has been done for the eukaryotic pathogen Plasmodium falciparum (59). A DNA microarray of 3,648 random genomic clones was used to identify >50 genes for which expression differed significantly between the trophozoite and gametocyte stages. The major limitation of this approach is that the identity of any element of interest must be determined after the experiment.

Annotating the Function of a Microbial Genome

For many pathogens, the number of genes for which function information is available is usually low. Moreover, the relative insufficiency of genetic tools can make obtaining such information difficult. However, because >70% of bacterial proteins have orthologs in other organisms (60,61), one can leverage extensive knowledge of function from the model organisms to infer function for a pathogen's genome. Similarity searches alone will predict functions of many genes. We expect the study of genomewide expression patterns to contribute even further to annotation of function. The rationale for this belief follows from the observation that shared expression often implies shared function (38). As suggested by Brown and Botstein (21), the inclusion of a gene with a characterized ortholog in a coregulated gene cluster can predict the function of the remaining genes in that cluster, thus bootstrapping the function annotation of the pathogen's genome. This assertion is borne out in a study of global gene expression in Saccharomyces cerevisiae. Clustering of 2,467 gene expression profiles across a series of 78 experiments representing eight cellular processes demonstrated coregulation of genes that participated in shared cellular function (43).
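The bootstrapping idea described above, in which characterized members of a coregulated cluster suggest a function for the uncharacterized members, can be sketched as a simple majority vote. This is a deliberately naive illustration under stated assumptions: the function name, the 50% agreement rule, the ORF names, and the annotations are hypothetical, and any suggestion it produces is a hypothesis to be tested, not an assignment.

```python
from collections import Counter

def propagate_annotations(clusters, known_functions, min_fraction=0.5):
    """Suggest a function for unannotated genes from their cluster mates.

    clusters        -- list of lists of gene names (coregulated groups)
    known_functions -- dict mapping annotated genes to a function label
    min_fraction    -- share of the annotated members that must agree
    """
    suggestions = {}
    for members in clusters:
        labels = [known_functions[g] for g in members if g in known_functions]
        if not labels:
            continue  # nothing to transfer from
        function, count = Counter(labels).most_common(1)[0]
        if count / len(labels) < min_fraction:
            continue  # annotated members disagree too much
        for gene in members:
            if gene not in known_functions:
                suggestions[gene] = function
    return suggestions

# Hypothetical clusters and ortholog-derived annotations.
clusters = [["orf1", "orf2", "orf3"], ["orf4", "orf5"]]
known = {"orf1": "ribosomal protein", "orf2": "ribosomal protein"}
print(propagate_annotations(clusters, known))
# prints {'orf3': 'ribosomal protein'}
```

The second cluster contains no annotated gene, so nothing is inferred for it. That is the situation in which additional experimental conditions or sequence similarity searches are still needed.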
Therefore, the acquisition of a pathogen's gene expression data from even a modest number of experimental conditions may lead to testable hypotheses about function for a substantial number of genes, even those lacking sequence similarity to genes whose function has been characterized.

Probing a Microbe's Physiologic State

The assumption that genes are preferentially expressed when their function is required allows inference of gene function directly from physiologic gene response. For example, genes preferentially transcribed during the diauxic shift in yeast are predicted to contribute to the metabolic transition to respiration (9). Thus, gene expression studies will contribute to function annotation by identifying the specific environmental and physiologic conditions in which each gene is expressed. Furthermore, as annotation improves, the direction of this inference may be reversed, i.e., if information on function is known for many genes, genomic expression profiling may reveal the physiologic state of the organism. Two studies have used whole-genome DNA arrays to explore gene expression response to environmental stimuli in E. coli. First, treatment with isopropyl-β-D-thiogalactopyranoside (IPTG) was shown to induce only the lac operon and, to a lesser extent, the melibiose operon (11).
In a second study, comparison of strains grown in minimal versus rich media revealed 344 genes that were differentially expressed between the two conditions: preferential expression of the translation apparatus in rich media and the amino acid biosynthetic pathways in minimal media were entirely consistent with prior data (10). Finally, examination of gene expression during heat shock revealed 119 genes with altered expression levels, all but 35 of which were previously recognized as heat shock genes (11). These studies confirm that the physiologic state of bacteria can be inferred from gene expression data. In the first report of global gene expression monitoring in a bacterial pathogen, oligonucleotide microarrays were used to measure the relative transcript levels of 100 Streptococcus pneumoniae genes during the development of natural competence and during stationary phase (29). The results confirmed induction of the cin operon and identified 11 genes differentially regulated in stationary versus exponential phase. Of course, gene expression monitoring is not restricted to the study of bacterial pathogens. Transcription of the CMV genome was measured during infection by using an array of 75-mer oligonucleotides representing each of the 226 predicted CMV ORFs (62).
By blocking translation or DNA replication, the researchers revealed a detailed classification of CMV genes into four kinetic classes, in agreement with previous reports, and assigned many ORFs, for which expression data were not previously available, into these groups.

Identifying Candidate Virulence Factors

Because expression of virulence-associated genes is tightly regulated (4), measuring a pathogen's gene expression in microenvironments specific to the pathogen and germane to the disease process is critical. Exploration of pathogen gene expression in the host environment may be technically challenging because of the relatively small number of pathogens present in an infected animal (29). Until more sensitive detection protocols are developed, examining global gene expression will be more practical in environmental conditions that mimic aspects of the host environment, such as elevated temperature, iron limitation, and changes in pH (4, 63) and in cell culture models. In fact, a microarray has been used to monitor gene expression in M. tuberculosis while it infects cultured monocytes (64). Even after measurement of bacterial gene expression from infected hosts becomes feasible, the ex vivo datasets will facilitate deconstruction of the in vivo gene expression response into component responses, leading to detailed understanding of the pathways of virulence factor regulation.
Identifying candidate virulence factors through a global gene expression method relies on two assumptions. First, because virulence-associated genes are often coordinately regulated (4), new virulence factors are likely to be coregulated with known ones. By clustering gene expression profiles across a large number of conditions, we can precisely monitor coregulation, thus revealing subtleties of regulation and leading to the identification of bona fide regulons. Second, because virulence-associated genes are tightly regulated (4), genes that are specifically expressed during infection or under conditions mimicking infection are candidate virulence factors. This assumption has been justified by numerous studies using in vivo expression technology (IVET) and differential fluorescence induction (DFI), in which genes induced during infection are often required for virulence (4, 65). When RNA from in vivo microbial samples can be efficiently isolated and labeled, microarrays will provide substantial advantages over IVET and DFI technologies for identifying putative virulence factors, including immediate identification of differentially expressed genes and detection of temporal profiles of transcription induction and repression. As is demanded for candidate genes identified by any expression screening approach, a role in pathogenesis must be confirmed by mutation and subsequent assays of virulence. By identifying factors expressed in the host, microarray methods may also identify potential vaccine targets. Furthermore, one could identify candidate epitopes for vaccine development for intracellular pathogens by predicting whether genes that are preferentially expressed inside host leukocytes will encode promiscuous human leukocyte antigen
class II ligands (66). Gene expression studies may also reveal key regulatory differences that lead to differing virulence between closely related pathogen strains. For example, variations in virulence of Listeria monocytogenes serotypes have been correlated with differential transcription of PrfA-regulated virulence genes (67, 68). However, because microarrays cannot measure expression of genes that are absent from the reference strain, genotypic differences such as horizontal transfer of virulence factors will not be detectable by this method.

Pharmacogenomics

Yet another application for microarrays is the study of drug effects on microbial cellular physiology, as revealed by global gene expression patterns (69). This approach has been used to identify drug-specific gene expression signatures in yeast and human cells (18,19,70). Correlation of gene expression with drug activity may suggest molecular details of drug action, and correlation of transcription profiles in untreated cells with drug response may reveal mechanisms for sensitivity and resistance (19). This approach has recently been used to characterize gene expression response in M. tuberculosis exposed to known inhibitors of the mycolic acid biosynthesis pathway, isoniazid
Also known as isonicotinic acid hydrazide, isoniazid is the most effective antituberculosis drug currently available. and ethionamide (30). Both of these compounds elicited a similar gene expression response profile, characterized by pronounced transcription induction of five adjacent genes encoding fatty acid biosynthesis enzymes. Because a proven isoniazid target, KasA, was among these genes, the authors proposed that the adjacent, coregulated loci might be targets for new anti-tuberculosis drugs. Finally, these results suggested that the mode of action of a novel compound may be inferred from gene expression response to that compound. Using microarrays to detect microbial polymorphisms linked to known drug-resistance phenotypes will also influence diagnosis and subsequent drug treatment. For example, an oligonucleotide array was used to detect mutant alleles of the M. tuberculosis rpoB gene, which are known to confer resistance to rifampicin rifampicin /rif·am·pi·cin/ (rif´am-pi-sin) rifampin. rifampin, rifampicin a derivative of rifamycin; an antibacterial and antifungal agent used in the treatment of mycobacterial infections, actinomycosis and histoplasmosis. (71). Microbial Genotyping One microarray application that interrogates DNA rather than RNA is the identification of genomic deletions in mutant strains and environmental isolates by measuring the number of DNA copies at each locus, a technique termed array-based comparative genome hybridization (72). This technique was used to identify several large deletions in a number of BCG vaccine strains and reconstruct their phylogeny (73). Oligonucleotide arrays have also been used for fine-scale genotyping of polymorphisms in related pathogens. Accurate identification of Mycobacterium mycobacterium Any of the rod-shaped bacteria that make up the genus Mycobacterium. The two most important species cause tuberculosis and leprosy in humans; another species causes tuberculosis in both cattle and humans. 
species using a GeneChip containing a set of 82 polymorphic oligonucleotides from the 16S ribosomal RNA gene demonstrated the potential power of this approach for molecular diagnostics (71). As additional microbial genome ORF microarrays become available, molecular surveys of the genomic structure of multiple strains will become far more precise and feasible. Two caveats should be mentioned: the ability to characterize genome insertions relative to the reference sequence is lacking, and the degree to which sequence variability can be characterized on the basis of microarray hybridization is unknown.

Examining a Host: Application of DNA Microarrays

Designing Microarrays for Host Organisms

The currently described human DNA microarrays are largely composed of expressed sequence tags (ESTs). Culling ESTs from many different tissue sources and limiting representation of any single Unigene cluster (see http://www.ncbi.nlm.nih.gov/UniGene/Hs.stats.shtml) have resulted in better than 50% representation of the predicted 80,000-100,000 human coding regions (28). A variety of human DNA and oligonucleotide microarrays are available commercially (e.g., Incyte, Palo Alto, CA; Affymetrix; NEN Life Science Products, Boston, MA). For in vivo studies of host response, infection of animal models will often be necessary. If the animal is a primate, human DNA microarrays might be used to monitor host gene expression because of the high level of primary sequence similarity between species. Sequence similarity is too low to permit reliable cross-hybridization with nonprimate vertebrates, but microarrays composed of mouse and rat sequences have been described (74) and are available (e.g., Incyte, Affymetrix).

Understanding Pathogenesis

Microarrays promise to accelerate our understanding of the host side of the host-pathogen interaction.
A large fraction of the genome can be simultaneously interrogated, and clustering of the data may identify groups of genes that implicate activation or repression of key regulatory pathways. Microarrays also allow the temporal sequence of transcription induction and repression to be followed, a prerequisite for determining the order of events following an encounter. Finally, ascertainment of the host cell's physiologic state, particularly apoptosis and necrosis, by genomewide profiling will facilitate separation of primary and secondary effects.

One important caveat of studying transcription in any system is that post-transcription regulatory events cannot be detected. This is particularly important in the case of host response because many important host cell events, such as cytoskeletal rearrangements, occur after transcription (75). Therefore, some key aspects of the molecular program may not be easily characterized by gene expression profiling. Eventually, it may be possible to monitor simultaneously the levels, activities, and interactions of all proteins in the cell (76).

Although analyzing gene expression of infected tissues is feasible, cellular heterogeneity may make analysis of host response complicated. Examining the response in infected cultured cells by using cell types most likely to encounter the pathogen may reduce the complexity of the system being examined. Results obtained in cell culture systems will be instrumental in interpreting gene expression profiles of specific cell types from whole tissue datasets.
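
As an illustration only, the two analytic ideas described above (grouping genes whose expression profiles move together, and ordering genes by when their transcript levels first change) can be sketched in a few lines of Python. Everything specific in this sketch is an assumption made for the example: the gene names, the log2 ratios, the time points, the Pearson correlation cutoff of 0.9, and the fourfold (log2 ratio of 2) change cutoff are hypothetical and are not taken from any study cited here.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no coregulation signal
    return cov / (sx * sy)


def coregulated(seed, profiles, min_r=0.9):
    """Genes whose profile correlates with the seed gene at or above min_r."""
    ref = profiles[seed]
    return sorted(
        gene for gene, profile in profiles.items()
        if gene != seed and pearson(ref, profile) >= min_r
    )


def first_change(profile, times, min_abs_log2=2.0):
    """Earliest time at which |log2 ratio| reaches the cutoff, else None."""
    for t, value in zip(times, profile):
        if abs(value) >= min_abs_log2:
            return t
    return None


# Hypothetical log2(infected / uninfected) ratios at 1, 4, 8, and 24 hours.
times = [1, 4, 8, 24]
profiles = {
    "geneA": [0.1, 1.0, 2.5, 3.0],    # induced late
    "geneB": [0.0, 0.9, 2.2, 2.9],    # tracks geneA closely
    "geneC": [0.2, -2.1, -2.4, -0.3], # repressed early, then recovers
    "geneD": [0.1, 0.0, 0.1, 0.0],    # unchanged
}

print(coregulated("geneA", profiles))
for gene, profile in profiles.items():
    print(gene, first_change(profile, times))
```

In practice, published analyses of this kind typically rely on hierarchical or other clustering methods applied to thousands of genes rather than correlation to a single seed gene; the seed-based comparison is used here only because it keeps the example short.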
The first application of global gene expression methods to pathogenesis used oligonucleotide arrays to monitor gene expression in primary human fibroblasts infected by human CMV (37). The transcript abundance of 258 out of 6,600 human genes changed by more than fourfold compared to uninfected cells at either 8 or 24 hours after infection. Some of these changes, such as induction of cytokines, stress-inducible proteins, and many interferon-inducible genes, were consistent with induction of cellular immune responses.

A similar experimental design has been used to examine the global effects of HIV-1 infection on cultured CD4-positive T cells. One study concluded that HIV-1 infection resulted in differential expression of 20 of the 1,506 human genes monitored and that most of these changes occurred only after 3 days in culture (36). In contrast, the preliminary results of an independent study using a similar design indicated that substantial HIV-induced transcription changes began very early after inoculation (77). The latter study confirmed activation of nuclear factor-κB (NF-κB), p68 kinase, and RNase L.

DNA expression arrays have recently been used to examine the response of host cells to infection by bacterial pathogens. Transcription profiling of macrophages and epithelial cells infected by Salmonella confirmed increased expression of many proinflammatory cytokines and chemokines, signaling molecules, and transcription activators and identified several genes previously unrecognized to be regulated by infection (33,34).
The macrophage study demonstrated that exposure to purified Salmonella lipopolysaccharide resulted in a response profile very similar to that elicited by whole cells and that activation of macrophages with gamma interferon before infection modified the response (34). In epithelial cells, overexpression of IκB (an inhibitor of NF-κB) blocked induction of gene expression for a number of regulated genes, underscoring the importance of NF-κB in the proinflammatory response (33). Similarly, the transcription response of human promyelocytic cells to L. monocytogenes infection has been determined by both oligonucleotide arrays and filter-based arrays (32). Comparison of these data with the Salmonella infection data suggests that the proinflammatory response is grossly conserved: in both cases

Address for correspondence: Craig Cummings, VAPAHCS 154T, Building 100, Room D4-123, 3801 Miranda Ave., Palo Alto, CA 94304, USA; fax: 650-852-3291; e-mail: cummings@cmgm.stanford.edu.

Craig A. Cummings* and David A. Relman*†
Stanford University, Stanford, California, USA; VA Palo Alto Health Care System, Palo Alto, California, USA