
Zippers, scissors and xeroxes: from unravelling the double helix to reading the blueprint.


With the following exercise in understatement, James Watson and Francis Crick (1) ushered in a revolution in the biological, evolutionary and medical sciences: 'It has not escaped our notice that the specific pairing we have postulated immediately suggests a possible copying mechanism for the genetic material.' The determination of the structure of deoxyribonucleic acid (DNA) opened the door to whole new realms of research, methods of analysis and advances in technology. The development of various molecular markers in the fifty-five years since that rather short paper appeared in Nature has resulted in everything from genetic tests for cancer genes to forensic DNA fingerprinting techniques and finally to the 'blueprint' of the human genome.

In this paper, the term 'molecular marker' refers to small fragments of DNA sequence associated with a specific part of a genome (the whole hereditary information of an organism). It once took an entire doctoral degree to construct a restriction map of a plasmid (i.e. small, circular genomes such as the chloroplast in plants). A restriction map is a diagrammatic representation of known restriction sites (positions where the DNA can be 'cut' using enzymes) within a DNA sequence. Now, with advances in our understanding of the DNA molecule, the plethora of available molecular markers, improved sequencing methods, and increased computer sophistication and processing power, it takes just days to sequence entire, albeit relatively small, genomes (usually from bacteria or viruses).

Science is a constantly expanding enterprise, with current researchers continually building on, utilising, expanding or re-interpreting the discoveries made in the past. Evolutionary botany, particularly the discipline of phylogenetic systematics, the area of research in which my PhD project is based, has evolved from the age of Natural Philosophy in the 1700s and 1800s and the work of eminent botanists and naturalists such as Joseph Banks (2) (from Captain James Cook's 1768-1771 voyage aboard the HMS Endeavour), Robert Brown, (3,4) George Bentham (5) and others. (6-8) The discipline uses data obtained from many sources, including morphology (physical attributes of leaves, flowers, fruits, etc.), anatomy (cell-level structure) as well as from molecular-based analysis. The process of gathering morphological data has not changed drastically from the time of Banks, apart from the increased resolution of modern light microscopes and the invention of the electron microscope. In contrast, molecular biology (for example, the study of the biology of DNA and proteins) is a rather young discipline, which has existed for less than sixty years.

In order to understand how modern molecular-based analysis has become so prominent in evolutionary biology, it is first necessary to revisit the history and development of the discipline. Only then can we begin to imagine future possibilities of techniques, applications and directions for research. Four major discoveries stand out like signposts in the short history of molecular-based, evolutionary biology: the identification of the structure of DNA; (1) isolation of restriction endonucleases; (9,10) development of the Sanger method of DNA sequencing; (11) and, finally, the invention of the polymerase chain reaction (PCR). (12) The mapping of our own genome, the first draft of which was published in 2001, (13,14) within ten years of starting the endeavour, would have been infinitely more difficult without these four discoveries. Each revolutionised the field of molecular biology: some, such as PCR, increased the ease of analysis, while others increased the level of analytical resolution up to and including the level of nucleotide sequence (i.e. restriction endonucleases, Sanger sequencing).


Initially, DNA was considered to be too simple to be the genetic material, the inherited molecule that encodes the information required for constructing all organisms on this planet. This was because the DNA 'language' has a four-letter 'alphabet' of nucleotides (deoxynucleotide triphosphates, dNTPs) or bases: two pyrimidines, cytosine (C) and thymine (T), and two purines, adenine (A) and guanine (G). Proteins, which can be composed of up to twenty different amino acids, were thought to be more suitable for the role. (15,16) The structure of DNA was deduced through a combination of model-building, Chargaff's rules and X-ray diffraction data. (1,17) Chargaff's rules are that: (1) the total amount of pyrimidines (T/C) always equals the total amount of purines (A/G); and (2), that the amount of T always equals the amount of A and the amounts of C and G are also always equal. (18-21)

The double helical structure of DNA looks rather like an immensely long ladder twisted into a helix, or coil (Fig. 1). The sides of the 'ladder' are formed by a backbone of sugar and phosphate molecules (Fig. 1a), which are held together by phosphodiester bonds in which the phosphate group (PO₄³⁻) forms a bridge between the hydroxyl (-OH) groups on adjacent sugar residues. In essence, these bonds are the backbone of DNA. The 'rungs' of the ladder consist of pairs of nucleotide bases (A, T, C, G) joined weakly together by hydrogen bonds. Each base pair consists of one purine and one pyrimidine, always in the form of A with T and G with C; the G-C base pair has three hydrogen bonds, while the A-T base pair has two (Fig. 1b). This combination of purine and pyrimidine best accounted for both Chargaff's rules and the X-ray data. (1,21,22)
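The link between base pairing and Chargaff's rules can be illustrated with a short Python sketch. It builds the partner strand of an arbitrary example sequence and counts bases across both strands of the resulting duplex. The sequence and the function name are illustrative assumptions, not taken from the cited work.

```python
from collections import Counter

# Watson-Crick pairing: A with T, G with C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def duplex_base_counts(strand):
    """Count each base over both strands of the duplex implied by `strand`."""
    partner = "".join(PAIR[base] for base in strand)
    return Counter(strand + partner)

# Hypothetical example sequence, chosen only for illustration.
counts = duplex_base_counts("ATGCGGATTACA")
purines = counts["A"] + counts["G"]
pyrimidines = counts["C"] + counts["T"]

# Each comparison corresponds to one of Chargaff's rules.
print(counts["A"] == counts["T"], counts["G"] == counts["C"], purines == pyrimidines)
```

Because every A on one strand puts a T on the other (and likewise for G and C), the three comparisons hold for any input sequence, which is why strict pairing explains the rules.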


The discovery of the structure of DNA caused a lot of excitement for three simple reasons. First, the means of replicating the molecule is readily apparent from its structure, since each base can specify its complementary base on the opposite strand via hydrogen bonding--hence the observation by Watson and Crick referred to at the start of this paper. This essential property of the genetic material, the means of replicating itself in order to be passed from parent to offspring, had been a mystery until then. DNA replication is semi-conservative; that is, each 'daughter' DNA molecule contains one strand from the 'mother' molecule and one that is synthesised using the 'mother' strand as a template (Fig. 2). During replication, the two strands of the parental double helix unwind, like a zipper, and each specifies the order of nucleotides in the new strands by base-pairing rules (22) (Fig. 2). Second, the structure of DNA suggested that the order of nucleotides indicated the sequence of amino acids in the protein that ultimately resulted from that stretch of DNA, or gene. In essence, some sort of genetic code may be written in DNA as a sequence of nucleotides and then translated into the amino acid 'language' of proteins. (22) Third, another enigma solved by the structure of DNA was the nature of mutation and the source of variation. The discovery of the structure of DNA, and its method of replication, was the springboard for the development of the plethora of molecular techniques that would follow in the next five decades.


In spite of the structure of DNA being resolved in the early 1950s, methods that directly analysed variation in DNA sequences did not arise until the 1970s. During this interval, the molecular methods of choice were immunological assays, protein or enzymatic electrophoresis. Immunological assays rely on the specific immune response properties of a test subject, often a rabbit, to identify changes in amino acid sequence, whereas electrophoresis is the process of separating proteins on the basis of their size and electrical charge. Such analyses were not actually investigating variation in DNA sequence; rather, these methods utilised the product of the DNA sequence after it has been transcribed, translated and modified into the final product (i.e. protein, enzyme, etc.). For over ten years, the vast majority of molecular methods that analysed variation in DNA sequence relied on the special properties of the second major discovery, restriction endonucleases.


The discovery of restriction endonucleases, one class of deoxyribonucleases, in the late 1960s sparked a revolution in the study of variation at the molecular level. Although their existence had been hypothesised almost twenty years earlier, (23,24) the first restriction endonucleases were not isolated until 1968. (9,25) Deoxyribonucleases (DNases) are enzymes that hydrolyse, or break down, the phosphodiester bonds between the nucleotides in a strand of DNA. There are two types of DNases: exonucleases act on the end of a DNA chain, while endonucleases attack interior linkages. (26) Only the latter is discussed in this paper.


Restriction endonucleases, also called restriction enzymes, restrict (cleave, cut) DNA at specific recognition sites or motifs (typically 4-8 nucleotides) (27) and generate DNA fragments that differ in size when mutations destroy or create restriction sites (28) (Fig. 3). The term restriction comes from the restriction-modification (R-M) systems of phage infection in bacteria; phages, or bacteriophages, are viruses that infect bacteria. A primitive form of immune system, R-M systems consist of a restriction endonuclease and a matching modification enzyme, which recognises and modifies (generally by methylation) the DNA sequence recognised by the restriction endonuclease. (29) Methylation is the addition of a methyl group (-CH₃) to bases of the DNA. (27) Modification protects the bacterial DNA from degradation by its own enzymes; foreign or unmodified DNA that has gained entry to the cell is restricted (cut) by the restriction endonuclease and further degraded by other enzymes. (29) The detection of foreign DNA and its degradation by restriction endonucleases occurs in most prokaryotic cells (i.e. bacteria); thus far, restriction endonucleases have not been detected in eukaryotes (i.e. plants, animals, etc.). (26)

There are three classes or types of restriction endonucleases, (10) but it is the type II restriction endonucleases that are the basis of recombinant DNA technology and the 'scissors' of modern molecular methods. First isolated in 1970, (30,31) type II are ubiquitous in prokaryotes; they recognise and cleave DNA at, or very close to, a specific nucleotide sequence 4-8 base pairs (bp) long (26,28,32) (Fig. 3). The smaller the recognition sequence, the smaller the average size of the fragments produced and thus the greater the number of different fragments generated by the digestion. For example, 4-base cutters typically generate smaller fragments than 5-base, 6-base or 8-base cutters because a particular four base-pair sequence would occur more frequently in the genome. (28) The majority of type II restriction endonucleases cleave DNA at symmetrical recognition sequences called palindromes (Fig. 3). In a palindromic sequence there is a 'horizontal' complementary arrangement of nucleotides. That is, if one strand is read from left to right, then the other strand will give the same sequence if read right to left. (10) There are many common English words that are also palindromes, such as 'mum', 'dad', 'eye', 'noon', 'madam' and 'racecar'. This is in addition to the complementary base pairing of the residues (i.e. A-T and G-C). Simply put, palindromic sequences are 'mirror images' of each other.
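The behaviour of these 'scissors' can be sketched in a few lines of Python. The example below checks whether a recognition sequence is palindromic in the molecular sense (it reads the same as its own reverse complement), cuts a sequence at every occurrence of a site, and estimates how often a site of a given length is expected to occur. The recognition sequence GAATTC (that of the enzyme EcoRI, which cuts after the first G) is used purely as a familiar example; the test sequences, function names and the assumption of equal base frequencies are illustrative and not from the source.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the opposite strand, read in the same left-to-right direction."""
    return seq.translate(COMPLEMENT)[::-1]

def is_palindromic(site):
    """A site is palindromic if both strands read the same."""
    return site == reverse_complement(site)

def digest(seq, site, cut_offset):
    """Cut `seq` at each occurrence of `site`, `cut_offset` bases into the site."""
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

def expected_spacing(site_length):
    """Average distance (bp) between sites, assuming random sequence with equal base frequencies."""
    return 4 ** site_length

# Hypothetical sequences: the mutant differs by one base inside the second site.
wild_type = "AAGAATTCTTGAATTCGG"
mutant = "AAGAATTCTTGAGTTCGG"
print([len(f) for f in digest(wild_type, "GAATTC", 1)])
print([len(f) for f in digest(mutant, "GAATTC", 1)])
print(expected_spacing(4), expected_spacing(6))
```

The single-base change in the mutant removes one restriction site, so the digest yields fewer, longer fragments; this is the kind of fragment-length difference that restriction-based markers detect.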

In 1978, Werner Arber, Dan Nathans and Hamilton Smith were jointly awarded the Nobel Prize in Physiology/Medicine for the discovery of 'restriction enzymes and their application to problems of molecular genetics'. (33) Each recipient was responsible for a distinct aspect of the discovery and implication of these 'molecular scissors': Arber for hypothesising the existence of the restriction-modification system, (34) Smith for testing and verifying Arber's hypothesis, (30,31) and Nathans for pioneering the application of restriction enzymes to genetics. (33,35) In addition, these pioneers also recognised the potential for the 'general usefulness [of restriction enzymes] in the analysis of DNA' and the possibility of producing 'sets of overlapping fragments and, by appropriate sequential digestion, to obtain quite small, specific fragments useful for the determination of nucleotide sequences'. Arber, Smith and Nathans 'laid the groundwork that has led to the current addiction to restriction endonucleases as routine, but essential, tools for molecular biologists.' (35,36)

In the early to mid-1970s, there was almost an explosive rate of discovery of new site-specific restriction endonucleases and rapid application of these to the isolation of genes, physical mapping of chromosomes, DNA nucleotide sequence analysis and in the restructuring of DNA molecules. (29) It is the 'absolute sequence specificity for both binding and cleaving reactions' of type II restriction endonucleases (10) that has made them a key component of modern molecular analysis and recombinant DNA technology for over ten years. Restriction Fragment Length Polymorphisms (RFLPs) and Amplified Fragment Length Polymorphisms (AFLPs) are two commonly used modern molecular methods for examining genetic diversity within species that utilise restriction endonucleases in the preparation stages. Clearly, the type II restriction endonucleases were 'one of Nature's greatest gifts to Science', (26) without which the decoding of any DNA sequence for any organism, let alone its entire genome (or 'blueprint'), would be infinitely more complex and time-consuming.


The actual nucleotide sequence of a DNA region is considered by some to be the ultimate molecular marker; in the strictest sense, any assay method that stops short of obtaining DNA sequence data could be considered to provide an indirect and incomplete picture of genetic information at the loci under study. (37)

Nucleic acid sequencing did not become a common method of analysis until the 1980s; (38) up until the mid-1970s, only stretches of DNA 15-20bp long had been sequenced. The advent of cloning, the process of isolating a defined DNA sequence and obtaining multiple copies of it through the use of a vector, changed everything. Vectors are usually plasmids, which are small, circular, double-stranded DNA molecules occurring naturally in bacteria. Virtually overnight, it became possible to obtain pure samples of defined fragments of chromosomal DNA and, suddenly, the development of efficient DNA sequencing methods became of paramount importance. (39)

The two primary methods of DNA sequencing were invented at approximately the same time on different sides of the Atlantic Ocean. Maxam-Gilbert sequencing (40) was developed at Harvard in the United States of America, while the Sanger method (11) was developed at the MRC Laboratory for Molecular Biology at Cambridge in Britain. (39) However, it was the Sanger method that achieved widespread application for both small and large-scale DNA sequencing projects, and it was also the method used to sequence the human genome.

Sanger Sequencing, also called Sanger Dideoxy Sequencing, is the controlled interruption of enzymatic DNA replication through the use of dideoxynucleotides in primer-directed DNA extension to produce discrete DNA fragments (11,41) (Fig. 4). Dideoxynucleotides (dideoxynucleotide triphosphates, ddNTPs) are modified nucleotides that terminate DNA synthesis after their incorporation into a strand being extended. ddNTPs lack the hydroxyl (-OH) group on the sugar residue; this means that unmodified nucleotides (dNTPs) are unable to be added via the joining of the phosphate (PO₄³⁻) and -OH groups on the adjacent sugar residues of two nucleotides. The template DNA of unknown sequence is combined in a buffered solution with the four dNTPs (dATP, dCTP, dGTP, dTTP), the four ddNTPs (ddATP, ddCTP, ddGTP, ddTTP), primers and a DNA polymerase (Fig. 4a). Each of the ddNTPs is labelled with a different fluorescent-based dye (i.e. fluorescent tag or label). A primer is a short segment of DNA known to be complementary to a segment either within or immediately adjacent to the target sequence. The buffer solution provides suitable conditions for the optimum activity and stability of the DNA polymerase, an enzyme that adds nucleotides to new DNA strands during synthesis. The template DNA is then subjected to amplification of the target sequence by the polymerase chain reaction (PCR) to produce many millions of copies of the target DNA region of the template. The ddNTPs will be incorporated randomly into the new copies of the target sequence, therefore terminating DNA synthesis at different nucleotide positions and resulting in labelled DNA fragments of various lengths (Fig. 4b). (38,42-44)


The fluorescent dye-labelled DNA fragments then undergo capillary electrophoresis (CE): the labelled fragments pass through a capillary gel and are separated on the basis of size and electrical charge. Smaller fragments, where a ddNTP was incorporated early during the synthesis of a new strand, travel faster and reach the detection region first (Fig. 4c-d). As the DNA fragments travelling along the capillary pass into the detection region, a laser excites the fluorescent tags on the ddNTPs, which produce fluorescence emissions at different wavelengths of light (i.e. colours; Fig. 4c) that are then recorded by the detector. (38) The identity of the terminating nucleotide (ddNTP; A, G, C, T) is assigned based on the dye colour recorded by the detector. The order and assigned colour of the fluorescing fragments, displayed as a series of coloured peaks on an automated DNA sequencer trace (electropherogram), reveal the actual DNA nucleotide sequence of the region under study. (42,43)
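The logic of reading a sequence from chain-terminated fragments can be sketched as a toy Python simulation. It lists every possible terminated product of the newly synthesised strand as a (length, labelled terminal base) pair, shuffles them to mimic the mixed reaction tube, and then 'runs the gel' by sorting on length. This is a deliberate simplification: strand orientation, the primer itself and signal noise are ignored, and all names and the example template are hypothetical.

```python
import random

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def sanger_fragments(template):
    """Every possible chain-terminated product as (length, terminal ddNTP base)."""
    new_strand = "".join(PAIR[base] for base in template)
    return [(i + 1, new_strand[i]) for i in range(len(new_strand))]

def read_trace(fragments):
    """Shortest fragments arrive first; the dye on each one names one base."""
    return "".join(base for _, base in sorted(fragments))

template = "TACGGTCA"            # hypothetical template of 'unknown' sequence
fragments = sanger_fragments(template)
random.shuffle(fragments)        # fragments are mixed together in the tube
print(read_trace(fragments))     # the strand complementary to the template
```

Sorting by length stands in for capillary electrophoresis, and the second element of each pair stands in for the dye colour recorded by the detector.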

The development of fluorescent dye-based techniques, with their ability to multiplex samples and real-time data acquisition, (44) made Sanger sequencing amenable to automation. The ability to multiplex (i.e. the simultaneous analysis of multiple samples) in particular was crucial: instead of using four different lanes in a gel (one for each nucleotide), the DNA is placed into a single well with four different dyes. This means that for the same amount of gel space, four times the number of samples can be run. (44) Fluorescence-based DNA sequencing was used to obtain most of the primary sequence data generated as part of the Human Genome Project. (13,14)

Nucleotides are the basic unit of information encoded in organisms, and the potential size of informative data sets is immense. (38) The information obtained from DNA sequencing can be easily scored; automation of the sequencing process along with modern computer capabilities and processing power mean that large amounts of data can be generated, scored and analysed. (42) Comparisons between species are straightforward; the existence of 'universal primers' for some genes that all living organisms share means it is possible to sequence most species for some regions without any prior knowledge of DNA sequence. (42) The Sanger method is preferred for several reasons. It requires no prior sequence knowledge; the primers used during amplification via the polymerase chain reaction can also be used as primers during the sequencing stage; (38) and it is possible to read more DNA sequence information per gel due to better band resolution. Because of its simplicity, the Sanger method has proved to be the 'technique of choice' for DNA sequencing projects.

However, Sanger sequencing may one day be superseded by the new parallel sequencing techniques based on the 'sequence by synthesis' principle, where the sequence is determined based on the detection of nucleotide incorporation using a primer-directed polymerase extension. (45) In essence, the order of nucleotides is revealed during the synthesis of a new (and complementary) DNA strand. The presumed heir to the Sanger sequencing method is Pyrosequencing, (45-50) a four-enzyme real-time technique that monitors nucleotide incorporation during DNA synthesis via a detectable light signal (bioluminescence). (45,48) In contrast to Sanger sequencing, pyrosequencing detects the release of pyrophosphate (via bioluminescence) that accompanies the incorporation of a nucleotide by a DNA polymerase during synthesis of a new DNA strand. (48) Nucleotides are added sequentially to the sample (containing template DNA, primers, buffer, DNA polymerase and other enzymes) in a known order, with the detection of pyrophosphate release corresponding to the addition of the specific nucleotide (in the new strand) complementary to that nucleotide position in the template sequence. Bioluminescence is the production and emission of light by a living organism as the result of a chemical reaction during which chemical energy is converted to light energy (i.e. fireflies, anglerfish). This technology is still being refined, but already can sequence over 20 million bases in a single four-and-a-half hour run. (50) The development of pyrosequencing (45-49) and high sample-volume automated parallel sequencing techniques (i.e. 454 Life Sciences (50)), will again revolutionise the acquisition of nucleotide sequence information in the near future, to the effect that lack of sequence data would no longer be a limiting factor in analyses.
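The 'sequence by synthesis' idea can also be sketched as a toy Python model. Nucleotides are dispensed one at a time in a fixed, repeating order, and each dispensation records how many bases were incorporated. Treating the light signal as proportional to the number of bases incorporated in a run of identical nucleotides is an assumption of this sketch, as are the dispensing order, the function names and the example template.

```python
from itertools import cycle

PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}

def pyrogram(template, dispense_order="ACGT"):
    """Return (nucleotide, bases incorporated) for each dispensation.

    `dispense_order` must contain all four bases, otherwise synthesis
    could stall on a base that is never dispensed.
    """
    target = "".join(PAIR[base] for base in template)  # strand being synthesised
    pos, signals = 0, []
    for nucleotide in cycle(dispense_order):
        if pos == len(target):
            break
        run = 0
        while pos < len(target) and target[pos] == nucleotide:
            run += 1      # one pyrophosphate released per incorporated base
            pos += 1
        signals.append((nucleotide, run))
    return signals

def decode(signals):
    """Rebuild the synthesised strand from the recorded signals."""
    return "".join(nucleotide * run for nucleotide, run in signals)

signals = pyrogram("TTGCA")   # hypothetical template
print(signals)
print(decode(signals))
```

A dispensation that gives no signal simply means the dispensed nucleotide was not the next one required, so the known dispensing order plus the pattern of signals is enough to recover the sequence.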

However, DNA sequencing would not be anywhere near as commonplace, nor as relatively straightforward as it is today without the development of a means of obtaining assayable quantities of DNA in a short timeframe. The development of the polymerase chain reaction (PCR) provided this means and allowed for molecular-based analysis to be performed by scientists interested in the relationships, classification and diversity of organisms without the need for extensive training in genetic techniques.


The fourth, and most recent, revolutionary event to take place in molecular biology was the invention/discovery of the polymerase chain reaction, more commonly and simply called PCR. The elegant simplicity of its theoretical basis and the way the procedure has changed the face of molecular biological analysis elevate PCR into the 'why didn't I think of that?' category of inventions and discoveries. PCR is a technique that enables, in a test tube, the amplification of any desired specific sequence of DNA from almost any biological source to assayable quantities in a matter of hours. (37,51) The polymerase chain reaction was invented, or perhaps more appropriately was run into, by Kary Mullis one Friday evening in April 1983 somewhere between Cloverdale and the Anderson Valley in Mendocino County, California in an 'exhilarating, Californian buckeye-scented "Eureka!" moment'. (12,52) Suddenly, a process existed whereby 'all [of] the DNA one could want could be provided in the space of an afternoon'. (12)

PCR is the enzymatic amplification of a DNA fragment flanked by two primers that hybridise to opposite strands of the target DNA sequence (53) (Fig. 5). The technique involves three main steps. First, double-stranded DNA is denatured by heating the sample, separating it into single strands (Fig. 5a). Second, the primers are annealed to sites flanking the region to be amplified (Fig. 5b). These primers are orientated so that the DNA will be synthesised on both strands in opposite directions; (53) i.e. the DNA is 'read' and synthesised both 'left to right' and 'right to left'. Third, strands complementary to the region between the flanking primers (the template) are synthesised via the addition of nucleotides by a DNA polymerase (Fig. 5c). Repeated cycles of heat denaturation of the template, primer annealing and DNA polymerase extension result in the amplification of the segment defined by the 'outer' ends of the PCR primers; the extension product of each primer serves as a template to the other primer in the following cycle. (53) (Fig. 5 d-f)
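How two primers on opposite strands define the amplified segment can be shown with a minimal 'in silico PCR' sketch in Python. The forward primer is matched directly against the top strand, while the reverse primer is matched via its reverse complement, because it anneals to the opposite strand. The template, the primers and the function names are hypothetical, and only exact matches are considered.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def amplicon(template, forward_primer, reverse_primer):
    """Return the segment bounded by the outer ends of the two primers, or None."""
    start = template.find(forward_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the bottom strand, so on the top strand
    # its binding site appears as its reverse complement.
    downstream_site = reverse_complement(reverse_primer)
    end = template.find(downstream_site, start + len(forward_primer))
    if end == -1:
        return None
    return template[start:end + len(downstream_site)]

template = "GGGATGCCATTTTTGACCTAGGG"   # hypothetical top strand
product = amplicon(template, "ATGCC", "TAGGTC")
print(product, len(product))
```

The product runs from the first base of the forward primer to the last base of the reverse primer's binding site, which is the segment 'defined by the outer ends' of the primer pair.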


These repetitive cycles of denaturation, annealing and extension may seem, at first, to be 'a little boring until the realisation occurs that this procedure [PCR] is catalysing a doubling [of template DNA] with each cycle!' (54) (Fig. 5 d-f). PCR thus results in the exponential accumulation of the specific target fragment of DNA, up to several million-fold in a few hours; (53,55) assuming a doubling of the amount of target DNA with each PCR cycle, after n cycles there is 2^n times as much target DNA as was present initially. In theory, starting with two double-stranded copies of a single-copy gene in genomic DNA, after 20 cycles of PCR (denaturation, annealing, extension) there would be over two million copies of the target DNA fragment present (2 × 2^20 = 2,097,152). In practice, however, the initial sample subjected to PCR would contain many more starting copies of the target DNA sequence.
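The doubling arithmetic is easy to check in Python. The sketch below computes the theoretical copy number after a given number of cycles; the optional `efficiency` parameter (the fraction of templates copied in each cycle) is an assumption added for illustration and is not part of the idealised calculation in the text.

```python
def copies_after(initial_copies, cycles, efficiency=1.0):
    """Theoretical copy number after `cycles` rounds of PCR.

    efficiency=1.0 means perfect doubling each cycle; lower values model
    the assumption that only a fraction of templates is copied per cycle.
    """
    return initial_copies * (1 + efficiency) ** cycles

print(copies_after(2, 20))        # two starting copies, perfect doubling
print(copies_after(1, 20))        # a single starting copy
print(copies_after(2, 20, 0.9))   # the same reaction at an assumed 90% efficiency
```

With perfect doubling, a single copy gives 2^20 = 1,048,576 copies after 20 cycles, and two starting copies give twice that.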

During the development and initial application of PCR, the Klenow fragment of DNA Polymerase I from the bacterium Escherichia coli (E. coli) was used to amplify the target sequence; this was the main DNA polymerase used until the late 1980s. (11,56,57) However, the Klenow fragment is inactivated at the high temperatures necessary for successful denaturation of DNA, thus requiring the addition of polymerase after the denaturation step of each cycle. (57) From the beginning, it was recognised that a heat-stable DNA polymerase would be an almost invaluable asset; (55) the discovery of and the isolation of thermo-stable DNA polymerases, such as from Thermus aquaticus (Taq polymerase), eliminated this tedious and repetitive step. (58) Taq polymerase is now conveniently, and somewhat ironically, produced by genetically engineered E. coli bacteria. (12,59)

The use of Taq polymerase not only improved the overall performance of PCR by increasing the specificity, yield, sensitivity and length of target DNA that can be amplified; it also transformed the method by making PCR amenable to automation. (60) The use of heat-stable DNA polymerases, developed from T. aquaticus and other thermophilic bacteria, led to the development of simple automated thermal-cycling devices for carrying out PCR in a single tube containing all of the necessary reagents. (53,57) A thermocycler heats and cools the PCR sample to a defined series of temperature steps during the PCR cycle; these steps correspond to the denaturation, annealing and extension stages of PCR (Fig. 5a-c).

The invention of PCR in the mid to late 1980s revolutionised the fields of molecular, organismal and population biology. (12,55,56,60,61) Its discovery and development spurred at least three major breakthroughs in the acquisition of genetic markers. (51) First, when coupled with the further development of amplification primers and improved laboratory methods for the rapid sequencing of DNA fragments, PCR-based approaches afforded direct access to the evolutionary information held in nucleotide sequences for the first time. (62) Second, it was realised that PCR-based methods could be used to tap into the 'vast wellspring of genetic polymorphism' contained within microsatellites, a highly variable and abundant class of molecular marker. (37,63-65) Finally, because PCR can amplify DNA sequences from minuscule amounts of tissue, even from some fossils (e.g. Tyrannosaurus rex (66)), it has extended molecular applications to a much wider biological arena. (37) Many tissue collections, including those in museums and herbaria (repositories of plant specimens), thus contain samples that are potentially useful for long term studies involving species that cannot be collected or otherwise disturbed. (67)

The polymerase chain reaction is not a method of analysis in itself; it is a means of obtaining assayable quantities of DNA sequences from often limited source material and also speeds up the process of generating data. However, PCR is not merely 'a prelude to some form of DNA assay'. (37) Without the polymerase chain reaction, the modern molecular techniques employed in systematic studies today, including DNA sequencing and microsatellite analysis, would be almost nonexistent or very reduced in both scope and application.

Modern molecular protocols can enable a research student, from Honours to PhD level, to extract, amplify, sequence and analyse DNA regions from any species without the need for complex cloning strategies and prior sequence knowledge in the timeframe required. This allows all species, not just those of major commercial significance, to be investigated. My study of the holly grevilleas is just one example of evolution-related research currently being conducted in the Systematics Laboratory in the School of Botany at The University of Melbourne.


The 365 species of Grevillea make this the largest genus in the family Proteaceae and the third largest genus in the Australian vascular flora, after Eucalyptus and Acacia. (68,69) Detailed revisions have been published in recent years, but the limits of many species still require critical examination. (69,70) The advent of PCR and modern DNA sequencing techniques has opened the way for the application of molecular data-sets to the analysis of phylogenetic relationships and species variation in plants, but there is currently no published phylogeny for the genus. The holly grevilleas (also known as the 'G. aquifolium group') are a distinctive group of grevilleas from south-eastern Australia. They are a group of sixteen species of high conservation value with 'holly-like leaves and "toothbrush" inflorescences'. (70)

The main focus of my project, Grevillea aquifolium (Fig. 6), was named for its leaves, which resemble those of European Holly, Ilex aquifolium. The second-most widespread holly grevillea species, occurring in western Victoria and south-eastern South Australia, it is the most morphologically variable, and has often been confused with other holly grevilleas, including G. montis-cole, G. microstegia, and especially G. ilicifolia. (69,71) In South Australia, G. aquifolium is confined to several small populations in the lower southeast of the state, from near Robe to the coast southwest of Mt Gambier. (70) In Victoria it occurs mainly in the Stawell and Grampians area to the Little Desert, with outlying populations in the Portland district. (72) The species is of interest to the horticulture industry, with twelve horticultural forms currently recognised. The relatively broad geographical distribution of G. aquifolium is in contrast to many of the other species in the holly grevillea group, which are narrow endemics with very restricted distributions.

Over most of its geographical range, G. aquifolium is widespread and common, with many populations in State or Federally protected parks. However, several populations, especially at Cooack and Portland (Victoria), have been reduced by clearing for agriculture and are close to extinction. (71) In addition, if definable groups are found to be present within G. aquifolium and require formal recognition, then the conservation status and requirements of several populations may need to be reassessed.


My project combines fieldwork, morphological and molecular analyses to investigate relationships both within G. aquifolium and among the various members of the holly grevilleas. It will help to extend and refine the documentation of Australia's vascular flora and provide basic data that underpins assessment of conservation priorities. I am currently preparing samples for use in a microsatellite analysis to investigate the within-population variation of G. aquifolium across its geographical range.

Microsatellites consist of short sequences of DNA, usually 2-6bp long, that are repeated so as to give short arrays of 20-100bp at each locus and are randomly distributed throughout the nuclear, chloroplast and mitochondrial genomes. (73-75) Microsatellites may also be called Short Simple Repeats (SSRs), Simple Sequence Repeats (also SSRs) or Short Tandem Repeats (STRs). Microsatellites are usually highly polymorphic molecular markers, with many alleles at a particular locus; alleles are the different versions of a gene, or in this case, the different numbers of repeat units present. The variation detected in microsatellite analyses results from changes to the number of repeat units due to errors in DNA replication at the locus under study. (37,42,73)
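What distinguishes one microsatellite allele from another can be illustrated with a short Python sketch that counts the longest uninterrupted run of a repeat unit at a locus. The two alleles, the CA repeat unit, the flanking bases and the function name are all hypothetical examples, not data from this project.

```python
import re

def longest_repeat_run(seq, motif):
    """Number of repeat units in the longest uninterrupted tandem run of `motif`."""
    runs = re.findall(f"(?:{re.escape(motif)})+", seq)
    return max((len(run) // len(motif) for run in runs), default=0)

# Two hypothetical alleles at one locus, differing only in repeat number.
allele_a = "GGT" + "CA" * 12 + "TTG"
allele_b = "GGT" + "CA" * 15 + "TTG"

print(longest_repeat_run(allele_a, "CA"), longest_repeat_run(allele_b, "CA"))
print(len(allele_b) - len(allele_a))   # length difference in bp between the alleles
```

Because the alleles differ in length by a whole number of repeat units, they can be told apart by fragment size after amplification of the locus, without reading every base.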

Microsatellites are relatively abundant, highly reproducible markers that are thought to have a uniform coverage across the genome. They are considered to have high mutation rates compared with other DNA markers, making them useful for intra-population level studies of organisms. (42) Microsatellites have been applied to studies of population structure and estimations of gene diversity. They are ideally suited to analyses of gene flow because these markers show a high number of alleles per locus. Microsatellites have found wide application in wildlife and human forensics, parentage analysis and studies of population-level genetic variation in a broad range of species including other Grevillea species. (75-78)
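As a rough illustration of why a high number of alleles per locus is informative, the sketch below computes one common summary of gene diversity, the expected heterozygosity 1 - sum(p_i^2), where p_i is the frequency of allele i at a locus. This is an assumed, simplified statistic chosen for illustration (no small-sample correction) and not necessarily the one used in this project; the function name `gene_diversity` and the allele lists are hypothetical.

```python
from collections import Counter

def gene_diversity(alleles):
    """Expected heterozygosity at one locus: 1 - sum of squared allele frequencies.

    `alleles` is a list of allele labels (here, repeat counts), one entry
    per sampled gene copy. Simplified illustration only.
    """
    counts = Counter(alleles)
    total = sum(counts.values())
    return 1.0 - sum((c / total) ** 2 for c in counts.values())

# Hypothetical repeat counts scored at one locus in three populations.
uniform_pop = [8, 8, 8, 8]       # a single allele: no diversity
two_allele_pop = [8, 11, 8, 11]  # two equally common alleles
four_allele_pop = [8, 9, 10, 11] # four equally common alleles

for label, pop in (("uniform", uniform_pop),
                   ("two alleles", two_allele_pop),
                   ("four alleles", four_allele_pop)):
    print(label, gene_diversity(pop))
```

The value rises as more alleles occur at similar frequencies, which is why highly polymorphic markers can separate populations that less variable markers cannot.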


In the 1960s, the introduction of analytical methods based on protein electrophoresis triggered an explosion of interest in molecular techniques, largely because of the simplicity, high sample throughput, sensitivity and cost-effectiveness of these methods. Then, in the late 1970s, attention shifted to DNA analysis methods based on the use of restriction enzymes, while in the mid to late 1980s the invention of PCR revolutionised the world of molecular biology. DNA sequencing, first developed in the late 1970s, offered access to the 'ultimate genetic data' (actual nucleotide sequences), but this became viable only with the introduction of automated PCR machines. In the early 1990s, the Human Genome Project began and a low-resolution genetic linkage map of the human genome was published. Finally, in February 2001, the completed 'working draft' of the human genome was published jointly in Science and Nature as part of the Human Genome Project. (13,14) In contrast, and attracting far less mainstream media attention, the Arabidopsis Genome Initiative (79) published the complete sequence of the 'model plant system' Arabidopsis thaliana in December 2000, along with detailed analyses of the final three chromosomes. (80-82) The other two chromosomes of A. thaliana (83,84) had previously been published in 1999, three years after the Initiative was formed in 1996.

Today, the list of available marker systems for use in studies of evolutionary biology is both extensive and diverse; however, 'no single molecular marker [or technique] is ideally suited to all research endeavours'. (37) DNA sequencing, as predicted in the early 1990s, (85) has become the dominant source of data for evolutionary-related studies, especially in phylogenetic systematics, to the detriment of non-molecular data (e.g. morphology). However, DNA sequencing, despite all of its advantages, is not always the best method for obtaining suitable molecular data. Choice of marker is dependent on a number of variables, including the scope of the project, taxonomic level under study (e.g. species versus family, or one family versus all flowering plants), type of questions being asked, cost per specimen, etc.

To John Avise, (86) the 'ideal' genetic marker is one that represents non-coding DNA as opposed to a gene that is exposed to selective (evolutionary) pressures. However, for deep-level phylogenetic studies, functional or coding regions (e.g. the rbcL gene in plants) have many practical advantages over non-coding regions. 'Deep-level phylogenetic studies' focus on evolutionary relationships at the higher taxonomic levels such as Class (e.g. Mammalia, the mammals), Order (e.g. Order Proteales, of which Grevillea in Proteaceae is a member) and Kingdom (e.g. plants, animals, etc.). These are the opposite of 'shallow' phylogenetic studies, which focus on relationships at the generic (e.g. Grevillea) or Family (e.g. Proteaceae) level. Despite issues of selection, functional constraints mean that sequences from diverse groups can be relatively easily aligned and analysed. In addition, nonfunctional markers may not exhibit the required levels of variation needed for such deep-level studies. In general, the ideal molecular marker would be inexpensive and user-friendly; require no prior sequence knowledge; use very small amounts of sample; be capable of analysing DNA obtained from preserved or even fossil samples; preferably not be based on DNA separation (e.g. via electrophoresis); and be highly automated. (28,87) There is a need to move beyond the current technology (e.g. microsatellites) and to develop new methods that are ever more reliable, faster, cheaper and easier to use. Some believe that analyses based on Single Nucleotide Polymorphisms (SNPs) will overtake microsatellites as the preferred non-nucleotide sequence level marker in the future. (87,88) Pyrosequencing, (45-50) which is versatile enough to facilitate the collection of sequence data for analyses at both the nucleotide (i.e. SNP) and whole-genome level, may indeed become the preferred technology of the future. The aim now, as initiated by the National Human Genome Research Institute (USA) in 2004, is to reduce the cost of sequencing whole mammalian-sized genomes to approximately US$1000. (43)

Timing and context are all-important in scientific advance, and molecular approaches are just one of the many avenues toward understanding the natural history and evolutionary biology of life on our planet. (86) Studies of morphology (the physical form and structure of an organism as a whole), ecology (relations and interactions between organisms and their environment, including other organisms) and behaviour have undeniably shaped the great majority of scientific perceptions about the natural world. Molecular approaches are especially exciting at this particular time because they have opened new empirical windows and enabled novel insights into more traditional biological subjects. (86) However, a fact that is often overlooked is that most molecular-based analyses have proved to support, rather than contradict, earlier phylogenetic (i.e. evolutionary) hypotheses based on morphological or other sources of data. (86) Still, today 'it is difficult to imagine a time when our lab freezers were not well stocked with restriction enzymes, when DNA sequencing was not possible, or when genes were only accessible to geneticists and could not be simply cloned out by recombinant DNA technology'. (36) It only took forty-eight years to progress from the determination of the structure of DNA to the first draft of the human genome, our genetic blueprint. What science will be capable of in another forty-eight years is limited only by our ingenuity and imagination.


I would like to thank my supervisors, Dr Michael Bayly and Professor Pauline Ladiges (School of Botany, The University of Melbourne), as well as Ms Monique Hallett, for reading an earlier draft of the manuscript. I would also like to thank the two anonymous reviewers for their comments. Figures 1 and 2 were adapted from illustrations from the freely available 'Talking Glossary of Genetic Terms', developed by the National Human Genome Research Institute (NHGRI). Finally, I would like to acknowledge the support of the Australian Systematic Botany Society (ASBS) Hansjorg Eichler Scientific Research Fund and the Holsworth Wildlife Research Endowment (ANZ Charitable Trust Australia) in partially funding the microsatellite study.


(1) Watson, J D and F H C Crick, 'A structure for Deoxyribose Nucleic Acid', Nature, vol. 171, no. 4356, 1953, 737-38.

(2) Banks, J, Journal of the Right Hon. Sir Joseph Banks, Bart., K B, P R S: during Captain Cook's first voyage in H.M.S. Endeavour in 1768-1771 to Terra del Fuego, Otahite, New Zealand, Australia, the Dutch East Indies etc. J D Hooker (ed.), MacMillan, London, 1896.

(3) Brown, R, Prodromus Florae Novae Hollandiae Et Insulae Van Diemen. 1960 facsimile edition, vol. 1, Hafner, New York; H R Engelmann (J. Cramer) and Wheldon & Wesley Ltd, 1810.

(4) Brown, R, Supplementum Primum Prodromi Florae Novae Hollandiae: Exhibens Proteaceas Novas Quas in Australasia, J Johndon & Co, 1830.

(5) Bentham, G, 'Grevillea', Flora Australiensis: A Description of the Plants of the Australian Territory, vol. 5 (Myoporineae to Proteaceae), Reeve & Co, London, 1870, 417-89.

(6) Cunningham, A, 'A specimen of the indigenous botany of the mountainous country, between the colony round Port Jackson and the settlement of Bathurst; being a portion of the result of observations made in the months October, November and December, 1822', in B Field (ed.), Geographical Memoirs on New South Wales, by various hands, John Murray, London, 335-36.

(7) Lindley, J, 'Grevillea aquifolium', Three Expeditions into the Interior of Eastern Australia; With Descriptions of the Recently Explored Region of Australia Felix, and of the Present Colony of New South Wales, T & W Boone, London, 1838, 178.

(8) Mueller, F J H, 'Proteaceae', Fragmenta Phytographiae Australiae, vol. 6, Government Printer, Melbourne, 1868, 205-13.

(9) Linn, S and W Arber, 'Host specificity of DNA produced by Escherichia coli, X. In vitro restriction of phage fd replicative form', Proceedings of the National Academy of Sciences USA, vol. 59, no. 4, 1968, 1300-06.

(10) Kessler, C, 'Class II restriction endonucleases', in G Obe and A Basler (eds), Cytogenetics: Basic and Applied Aspects, Springer-Verlag, Berlin, 1987, 225-79.

(11) Sanger, F, S Nicklen and R Coulson, 'DNA sequencing with chain-terminating inhibitors', Proceedings of the National Academy of Sciences USA, vol. 74, no. 12, 1977, 5463-67.

(12) Mullis, K B, 'The unusual origin of the Polymerase Chain Reaction', Scientific American, vol. 267, no. 4, 1990, 36-43.

(13) Venter, J C, M D Adams, E W Myers, et al, 'The sequence of the human genome', Science, vol. 291, no. 5507, 2001, 1304-51.

(14) International Human Genome Sequencing Consortium (IHGSC), 'Initial sequencing and analysis of the human genome', Nature, vol. 409, 2001, 860-921.

(15) Kornberg, A and T A Baker, 'Chapter 1: DNA structure and function', DNA Replication, second edition, W H Freeman and Company, New York, 1992, 1-52.

(16) Knox, B, P Y Ladiges, B Evans and A Hardham, 'Chapter 10: Genes, chromosomes and DNA', in B Knox, P Y Ladiges and B Evans (eds), Biology, McGraw-Hill Australia, Sydney, 1997, 180-98.

(17) Franklin, R E and R G Gosling, 'Molecular configuration in sodium thymonucleate', Nature, vol. 171, no. 4356, 1953, 740-41.

(18) Chargaff, E, E Vischer, R Doniger, et al, 'The composition of the desoxypentose nucleic acids of thymus and spleen', Journal of Biological Chemistry, vol. 177, no. 1, 1949, 405-16.

(19) Vischer, E, S Zamenhof and E Chargaff, 'Microbial nucleic acids: the desoxypentose nucleic acids of avian tubercle bacilli and yeast', Journal of Biological Chemistry, vol. 177, no. 1, 1949, 429-38.

(20) Chargaff, E, 'Some recent studies on the composition and structure of nucleic acids', Journal of Cellular and Comparative Physiology, vol. 38 (supplement 1), 1951, 41-59.

(21) Chargaff, E, 'How genetics got a chemical education', Annals of the New York Academy of Sciences, vol. 325, 1979, 345-60.

(22) Griffiths, A J F, J H Miller, D T Suzuki, et al, 'Chapter 11: The structure of DNA', An Introduction to Genetic Analysis, W H Freeman and Company, New York, 1998, 313-40.

(23) Bertani, G and J J Weigle, 'Host controlled variation in bacterial viruses', Journal of Bacteriology, vol. 65, no. 2, 1953, 113-21.

(24) Luria, S E and M L Human, 'A nonhereditary, host-induced variation of bacterial viruses', Journal of Bacteriology, vol. 64, no. 4, 1952, 557-69.

(25) Meselson, M and R Yuan, 'DNA Restriction Enzyme from E. coli', Nature, vol. 217, no. 5134, 1968, 1110-14.

(26) Kornberg, A and T A Baker, 'Chapter 13: Deoxyribonucleases', DNA Replication, second edition, W H Freeman and Company, New York, 1992, 403-37.

(27) Weising, K, H Nybom, K Wolff and W Meyer, 'Chapter 2: Genetic variation at the DNA level', DNA Fingerprinting in Plants and Fungi, CRC Press, Boca Raton, 1995, 3-35.

(28) Parker, P G, A A Snow, M D Schug, et al, 'What molecules can tell us about populations: choosing and using a molecular marker', Ecology, vol. 79, no. 2, 1998, 361-82.

(29) Nathans, D and H O Smith, 'Restriction endonucleases in the analysis and restructuring of DNA molecules', Annual Review of Biochemistry, vol. 44, 1975, 273-93.

(30) Smith, H O and K W Wilcox, 'A restriction enzyme from Hemophilus influenzae', Journal of Molecular Biology, vol. 51, 1970, 379-91.

(31) Kelly, T J and H O Smith, 'A restriction enzyme from Hemophilus influenzae. II. Base sequence of the recognition site', Journal of Molecular Biology, vol. 51, 1970, 393-409.

(32) Roberts, R J, 'Restriction and modification enzymes and their recognition sequences', Nucleic Acids Research, vol. 12, 1984, r167-r204.

(33) Konforti, B, 'The servant with the scissors', Nature Structural and Molecular Biology, vol. 7, no. 2, 2000, 99-100.

(34) Arber, W and D Dussoix, 'Host specificity of DNA produced by Escherichia coli. I. Host controlled modification of bacteriophage λ', Journal of Molecular Biology, vol. 5, 1962, 18-36.

(35) Danna, K and D Nathans, 'Specific cleavage of simian virus 40 DNA by restriction endonuclease of Hemophilus influenzae', Proceedings of the National Academy of Sciences USA, vol. 68, no. 12, 1971, 2913-17.

(36) Roberts, R J, 'How restriction enzymes became the workhorses of molecular biology', Proceedings of the National Academy of Sciences USA, vol. 102, no. 17, 2005, 5905-08.

(37) Avise, J C, 'Chapter 3: Molecular techniques', Molecular Markers, Natural History, and Evolution, second edition, Sinauer Associates, Sunderland, 2004, 55-114.

(38) Hillis, D M, B K Mable, A Larson, et al, 'Chapter 9: Nucleic acids IV: Sequencing and cloning', in D M Hillis, C Moritz and B K Mable (eds), Molecular Systematics, second edition, Sinauer Associates, Sunderland, 1996, 321-84.

(39) Brown, T A, 'DNA sequencing: The basics', R Beynon, T A Brown and C Howe (eds), The Basics, Oxford University Press, Oxford, 1994.

(40) Maxam, A M and W Gilbert, 'A new method for sequencing DNA', Proceedings of the National Academy of Sciences USA, vol. 74, no. 2, 1977, 560-64.

(41) Sanger, F, 'Determination of nucleotide sequences in DNA', Science, vol. 214, no. 4526, 1981, 1205-10.

(42) Lowe, A, S Harris and P Ashton, 'Chapter 2: Markers and sampling in ecological genetics', Ecological Genetics: Design, Analysis and Application, Blackwell Science, Carlton, 2004, 6-51.

(43) Metzker, M, 'Emerging technologies in DNA sequencing', Genome Research, vol. 15, no. 12, 2005, 1767-76.

(44) Nunnally, B K, 'Chapter 1: Introduction to DNA sequencing: Sanger and beyond', in B K Nunnally (ed.), Analytical Techniques in DNA Sequencing, Taylor & Francis, Boca Raton, 2005, 1-12.

(45) Ronaghi, M, M Uhlen and P Nyren, 'A sequencing method based on real-time pyrophosphate', Science, vol. 281, 1998, 363-64.

(46) Ronaghi, M, S Karamohamed, B Pettersson, et al, 'Real-time DNA sequencing using detection of pyrophosphate release', Analytical Biochemistry, vol. 242, 1996, 84-89.

(47) Ronaghi, M, 'Pyrosequencing sheds light on DNA sequencing', Genome Research, vol. 11, 2001, 3-11.

(48) Ahmadian, A, M Ehn and S Hober, 'Pyrosequencing: history, biochemistry and future', Clinica Chimica Acta, vol. 363, 2006, 83-94.

(49) Nyren, P, 'The history of pyrosequencing', Methods in Molecular Biology, vol. 373, 2007, 1-13.

(50) 454 Life Sciences, '454 Life Sciences - Enabling Technology', http://, accessed 15 July 2008.

(51) Avise, J C, 'Chapter 2: The history of interest in genetic variation', Molecular Markers, Natural History, and Evolution, second edition, Sinauer Associates, Sunderland, 2004, 23-54.

(52) Mullis, K B, 'Chapter 35: PCR and scientific invention: The trial of DuPont vs. Cetus', in K B Mullis, F Ferre and R A Gibbs (eds), The Polymerase Chain Reaction, Birkhauser, Boston, 1994, 427-41.

(53) White, T J, N Arnheim and H A Erlich, 'The polymerase chain reaction', Trends in Genetics, vol. 5, no. 6, 1989, 185-89.

(54) Mullis, K B, F Faloona, S Scharf, et al, 'Specific enzymatic amplification of DNA in vitro: The polymerase chain reaction', Cold Spring Harbor Symposia on Quantitative Biology, vol. 51, 1986, 263-73.

(55) Mullis, K B and F A Faloona, 'Specific synthesis of DNA in vitro via a polymerase-catalyzed chain reaction', Methods in Enzymology, vol. 155, 1987, 335-50.

(56) Saiki, R K, S Scharf, F Faloona, et al, 'Enzymatic amplification of β-globin genomic sequences and restriction site analysis for diagnosis of sickle cell anemia', Science, vol. 230, no. 4732, 1985, 1350-54.

(57) Erlich, H A, D Gelfand and J J Sninsky, 'Recent advances in the polymerase chain reaction', Science, vol. 252, 1991, 1643-51.

(58) Innis, M A, K B Myambo, D H Gelfand and M A D Brow, 'DNA sequencing with Thermus aquaticus DNA polymerase and direct sequencing of polymerase chain reaction-amplified DNA', Proceedings of the National Academy of Sciences USA, vol. 85, no. 24, 1988, 9436-40.

(59) Lawyer, F C, S Stoffel, R K Saiki, et al, 'Isolation, characterization, and expression in Escherichia coli of the DNA polymerase gene from Thermus aquaticus', The Journal of Biological Chemistry, vol. 264, no. 11, 1989, 6427-37.

(60) Saiki, R K, D H Gelfand, S Stoffel, et al, 'Primer-directed enzymatic amplification of DNA with a thermostable DNA polymerase', Science, vol. 239, 1988, 487-91.

(61) Arnheim, N, T White and W E Rainey, 'Application of PCR: Organismal and population biology', Bioscience, vol. 40, no. 3, 1990, 174-82.

(62) Kocher, T D, W K Thomas, A Meyer, et al, 'Dynamics of mitochondrial evolution in animals: Amplification and sequencing with conserved primers', Proceedings of the National Academy of Sciences USA, vol. 86, 1989, 6196-200.

(63) Litt, M and J A Luty, 'A hypervariable microsatellite revealed by in vitro amplification of a dinucleotide repeat within the cardiac muscle actin gene', American Journal of Human Genetics, vol. 44, no. 3, 1989, 397-401.

(64) Tautz, D, 'Hypervariability of simple sequences as a general source for polymorphic DNA markers', Nucleic Acids Research, vol. 17, no. 16, 1989, 6463-71.

(65) Weber, J L and P E May, 'Abundant class of human DNA polymorphisms which can be typed using the polymerase chain reaction', American Journal of Human Genetics, vol. 44, no. 3, 1989, 388-96.

(66) Organ, C L, M H Schweitzer, W Zheng, et al, 'Molecular phylogenetics of mastodon and Tyrannosaurus rex', Science, vol. 320, 2008, 499.

(67) Birt, T P and A J Baker, 'Chapter 3: Polymerase chain reaction', in A J Baker (ed.), Molecular Methods in Ecology, Blackwell Science Ltd., Oxford, 2000, 50-64.

(68) Australian Plant Name Index (APNI), databases/apni-search-full.html, accessed 2007.

(69) Makinson, R O, 'Grevillea', Flora of Australia, CSIRO Publishing, Melbourne, 2000, 1-460.

(70) McGillivray, D J and R O Makinson, Grevillea, A Taxonomic Revision, Melbourne University Press, Melbourne, 1993.

(71) Olde, P and N Marriott, The Grevillea Book, vol. 2, Kangaroo Press, Kenthurst, 1995.

(72) Makinson, R O, 'Grevillea', in N G Walsh and T J Entwisle (eds), Flora of Victoria, vol. 3 (Dicotyledons: Winteraceae to Myrtaceae), Inkata Press, Melbourne, 1996.

(73) Armour, J A L, S A Alegre, S Miles, et al, 'Minisatellites and mutation processes in tandemly repetitive DNA', in D B Goldstein and C Schlotterer (eds), Microsatellites: Evolution and Applications, Oxford University Press, Oxford, 1999, 24-33.

(74) Wang, Z, J L Weber, G Zhong and S D Tanksley, 'Survey of plant short tandem DNA repeats', Theoretical and Applied Genetics, vol. 88, no. 1, 1994, 1-6.

(75) Goldstein, D B and C Schlotterer (eds), Microsatellites: Evolution and Applications, Oxford University Press, New York, 1999.

(76) England, P R, A V Usher, R J Whelan and D J Ayre, 'Microsatellite diversity and genetic structure of fragmented populations of the rare, fire-dependent shrub Grevillea macleayana', Molecular Ecology, vol. 11, 2002, 967-77.

(77) Whelan, R J, D G Roberts, P R England and D J Ayre, 'The potential for genetic contamination vs. augmentation by native plants in urban gardens', Biological Conservation, vol. 128, 2006, 496-500.

(78) Hoebee, S E, Conservation Genetics of the Endangered Shrub Grevillea iaspicula McGill. (Proteaceae), PhD Thesis, Australian National University, 2002.

(79) The Arabidopsis Genome Initiative, 'Analysis of the genome sequence of the flowering plant Arabidopsis thaliana', Nature, vol. 408, no. 6814, 2000, 796-815.

(80) Theologis, A, J R Ecker, C J Palm, et al, 'Sequence and analysis of chromosome 1 of the plant Arabidopsis thaliana', Nature, vol. 408, no. 6814, 2000, 816-20.

(81) Salanoubat, M, K Lemcke, M Rieger, et al, 'Sequence and analysis of chromosome 3 of the plant Arabidopsis thaliana', Nature, vol. 408, no. 6814, 2000, 820-22.

(82) Tabata, S, T Kaneko, Y Nakamura, et al, 'Sequence and analysis of chromosome 5 of the plant Arabidopsis thaliana', Nature, vol. 408, no. 6814, 2000, 823-26.

(83) Mayer, K, C Schuller, R Wambutt, et al, 'Sequence and analysis of chromosome 4 of the plant Arabidopsis thaliana', Nature, vol. 402, no. 6763, 1999, 769-77.

(84) Lin, X, S Kaul, S Rounsley, et al, 'Sequence and analysis of chromosome 2 of the plant Arabidopsis thaliana', Nature, vol. 402, no. 6763, 1999, 761-68.

(85) Miyamoto, M M and J Cracraft, 'Chapter 1: Phylogenetic inference, DNA sequence analysis, and the future of molecular systematics', in M M Miyamoto and J Cracraft (eds), Phylogenetic Analysis of DNA Sequences, Oxford University Press, New York, 3-17.

(86) Avise, J C, 'Preface to the first edition', Molecular Markers, Natural History and Evolution, second edition, Sinauer Associates, Sunderland, 2004.

(87) Bhattramakki, D and A Rafalski, 'Chapter 12: Discovery and application of single nucleotide polymorphism markers in plants', in R J Henry (ed.), Plant Genotyping: The DNA Fingerprinting of Plants, CABI, New York, 2001, 179-92.

(88) Kruglyak, L, 'The use of a genetic map of biallelic markers in linkage studies', Nature Genetics, vol. 17, 1997, 21-24.

Trisha Lee Downing

COPYRIGHT 2008 University of Melbourne Postgraduate Association
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2008 Gale, Cengage Learning. All rights reserved.

Author:Downing, Trisha Lee
Publication:Traffic (Parkville)
Article Type:Report
Geographic Code:1USA
Date:Jan 1, 2008