Does the high gene density in the sponge NK homeobox gene cluster reflect limited regulatory capacity?

Introduction

Our understanding of the evolutionary transition to metazoan multicellularity is currently being transformed by comparative analyses of whole genomes (e.g., Putnam et al., 2007; King et al., 2008). To understand the events leading to the emergence of the most recent common ancestor to all living animals, comparisons need to be made between metazoans and close relatives within the Holozoa, which include choanoflagellates and members of related opisthokont lineages (e.g., Cavalier-Smith et al., 1996; King and Carroll, 2001; Snell et al., 2001; Lang et al., 2002; Burger et al., 2003; Cavalier-Smith and Chao, 2003; King et al., 2003; Steenkamp et al., 2006). The Metazoa consists of at least two ancient lineages of extant animals, the Eumetazoa (cnidarians, placozoans, and bilaterian phyla) and phylum Porifera (sponges) (e.g., Cavalier-Smith et al., 1996; Borchiellini et al., 2001; Medina et al., 2001; Collins, 2002; Wallberg et al., 2004). The recent sequencing of the genomes of the choanoflagellate Monosiga brevicollis (King et al., 2008), the sponge Amphimedon queenslandica, the cnidarians Nematostella vectensis (Putnam et al., 2007) and Hydra magnipapillata, and the placozoan Trichoplax adhaerens is allowing us to decipher the genomic events underpinning the transition to metazoan multicellularity.

It appears that a raft of genomic innovations occurred in the stem lineage leading to the metazoan last common ancestor. Sponges possess metazoan-specific transcription factors and signaling pathways (Larroux et al., 2006; Nichols et al., 2006; Adamska et al., 2007a, b; Larroux et al., 2007, 2008) whose orthologs populate the developmental gene regulatory networks (GRNs) underlying bilaterian embryogenesis (Davidson, 2006, and references therein). In contrast, the Monosiga brevicollis genome contains only a very small subset of these gene classes, despite encoding many cell adhesion and communication protein domains that are otherwise restricted to metazoans (King and Carroll, 2001; King et al., 2003, 2008; King, 2004; Larroux et al., 2008). Amphimedon possesses a large majority of transcription factor gene classes that have previously been found only in eumetazoans, including homeobox genes belonging to ANTP, prd-like, Pax, POU, LIM-HD, Six, and TALE classes, as well as basic helix-loop-helix (bHLH), Rel, nuclear receptor, Mef2, Ets, Sox, T-box, and Fox genes (Larroux et al., 2006, 2007, 2008; Simionato et al., 2007; Gauthier and Degnan, 2008). Likewise, genes encoding components of all the major eumetazoan developmental signaling pathways are present in the Amphimedon genome (Adamska et al., 2007a, b; Adamska, Richards, Gauthier, and Degnan, unpubl.). Together, these observations indicate that the repertoire of gene classes comprising the developmental regulatory toolkit evolved before the divergence of sponge and eumetazoan lineages. The genesis of many of these metazoan-specific gene classes may have provided the molecular preadaptations that enabled the evolution of metazoan development and multicellularity.
On the basis of the number of developmental gene families represented in the sponge genome, it is not hard to envisage the last common ancestor to all extant metazoans being developmentally and morphologically complex (Adamska et al., 2007a; Degnan and Degnan, 2006; Degnan et al., 2005).

Despite the qualitative conservation of the metazoan developmental "regulome" between sponges and eumetazoans, there must be inherent differences in the genomes of the ancestors that gave rise to these lineages. Why is it that the eumetazoan lineage consists of a wide diversity of body plans while sponge morphologies represent modifications of a unique aquiferous body plan that lacks cellular and morphological features found elsewhere in the animal kingdom (Simpson, 1984)? Why has the sponge body plan remained relatively unchanged since well before the Cambrian (Li et al., 1998)? These fundamental differences in complexity and diversity should be manifested in extant genomes. One obvious difference is that representative eumetazoans have a much larger number of developmental genes compared to Amphimedon, with individual transcription factor and signaling pathway gene classes and families having expanded differentially early in the eumetazoan lineage (Kusserow et al., 2005; Magie et al., 2005; Miller et al., 2005; Chourrout et al., 2006; Kamm et al., 2006; Ryan et al., 2006; Larroux et al., 2007, 2008; Putnam et al., 2007; Yamada et al., 2007). This increase in gene repertoire through duplication and divergence may have allowed for (i) the expansion of ancient GRNs that originally would have underpinned the primary generation and patterning of differentiated cell types in the first metazoan embryos, and (ii) the co-option of duplicates into novel developmental roles. Cnidarians and bilaterians have strikingly similar gene memberships (Kusserow et al., 2005; Magie et al., 2005; Miller et al., 2005; Chourrout et al., 2006; Kamm et al., 2006; Ryan et al., 2006; Larroux et al., 2007; Putnam et al., 2007; Yamada et al., 2007, 2008), suggesting that body plan complexity cannot solely be attributed to the growth in gene class and family size. 
Another possibility is that individual genes expanded their regulatory systems, allowing for their expression in multiple developmental contexts (i.e., individual genes were co-opted into new developmental roles). This increase in regulatory information is likely to manifest as an increase in the length of DNA sequence responsible for the regulation of a particular gene.

Critical cis-regulatory DNA sequences tend to be located in the vicinity of the transcription start site, often just upstream, although other regulatory information can be localized at a great distance from the coding region (reviewed in Davidson, 2006). These sequences act as binding sites for sequence-specific transcription factors that directly control the activation and repression of transcription. Detailed experimental analyses of putative regulatory regions of particular genes in a handful of animals (i.e., sea urchin, ascidian, Caenorhabditis elegans, Drosophila, and model vertebrates) have defined the role of proximal and distant cis-regulatory elements in controlling gene expression. While such analyses currently are not possible in Amphimedon, we can use expression patterns as a proxy to predict the complexity of the regulatory information for a given gene. In bilaterians, most transcription factor genes are used in a number of developmental contexts, with each context requiring its own cis-regulatory module (Davidson, 2006). Here, we explore this concept using a set of NK homeobox genes that are clustered in the genome of Amphimedon (Larroux et al., 2007). NK genes are used at all levels of the bilaterian developmental program, from early germ layer formation to terminal differentiation, and are involved in cell fate determination and differentiation, patterning and morphogenetic processes (reviewed in Banerjee-Basu and Baxevanis, 2001; Jagla et al., 2001; Carroll et al., 2005; Garcia-Fernandez, 2005; Stanfel et al., 2005; Slack, 2006). They are expressed in all three germ layers, and their roles in mesodermal and nervous system development are well studied.

By comparing the upstream intergenic regions of NK genes in Amphimedon, the cnidarian Nematostella vectensis, and various bilaterians covering a range of genome sizes, we show that the Amphimedon genes have markedly smaller intergenic regions. To test whether this trend is specific to developmental genes, we also performed these analyses on five non-homeobox structural genes, which are clustered with the NK genes in Amphimedon. No significant differences were detected for these genes between different organisms, suggesting that the size difference in regulatory regions between sponges and eumetazoans applies specifically to regulatory genes that may belong to developmental GRNs. The relatively simple expression patterns of a selection of Amphimedon NK genes during embryogenesis support the contention that the small intergenic regions may reflect the limited cis-regulatory information associated with these genes. Genome-wide studies in Drosophila and C. elegans have made a similar correlation between gene regulatory complexity and intergenic sequence size (Nelson et al., 2004). Although this case study is restricted to a very small number of genes and thus may not reflect broad genomic principles, it does present a supposition that can be addressed with genome-wide comparative analyses.

Materials and Methods

Analysis of the Amphimedon NK cluster

The Amphimedon cluster of NK homeobox genes was initially assembled using an in-house assembly and scaffolding pipeline as described in Larroux et al. (2007). A draft genome assembly from the US Department of Energy Joint Genome Institute was later consulted to confirm the in-house assembly. We employed the AUGUSTUS gene prediction program to further systematically evaluate this assembly (Stanke et al., 2006) and manually modified models to incorporate regions of homology suggested by BLASTx alignments to sequences in the National Center for Biotechnology Information database. Where available, expressed sequence tags (ESTs) and 5' and 3' RACE sequences (homeobox genes only) were used to confirm coding sequences and to derive 5' and 3' untranslated regions.

Analysis of the expression of NK-related genes

Specimens of Amphimedon queenslandica Hooper and van Soest, 2006 (Porifera, Demospongiae, Haplosclerida, Niphatidae) were collected on Heron Island Reef, Great Barrier Reef, Australia, as described in Leys and Degnan (2002), and in situ hybridization was performed as described in Larroux et al. (2006). Detailed protocols and probe details are available upon request.

Comparative genomics

Homeodomains of proteins belonging to the NK2, NK3, NK4, NK5, NK6, NK7, Msx, Hex, and Tlx families were used as tBlastn queries against the Lottia gigantea, Branchiostoma floridae, and Ciona intestinalis genome contigs available at http://genome.jgi-psf.org/ and against the Caenorhabditis elegans and Drosophila melanogaster genomes at WormBase (http://www.wormbase.org/) and FlyBase (http://www.flybase.org/), respectively. Genes belonging to these NK families were previously characterized in the Nematostella vectensis genome by Ryan et al. (2006; see also the phylogenetic analysis in Larroux et al., 2007). Similarly, we searched these genomes (and the Nematostella vectensis genome) for orthologs of kinesin 2 KIF3B/C, tetracyclin resistance, Vacuolar Protein Sorting 8, Inositol Polyphosphate-5-Phosphatase A, and kinesin 5 KIF11. Identified sequences were classified into the different families using BlastP and/or a neighbor-joining phylogenetic tree (data not shown). The length of the upstream intergenic region and the orientation of the upstream gene were then determined through visual inspection of gene models in the genome browsers.
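
The measurement described above was made by visual inspection of genome browsers, but the same quantity can be sketched programmatically. The following minimal Python example is an illustration only, not the procedure used in this study: the Gene record, the coordinates, and the gene names are hypothetical, and the convention of scoring the upstream neighbor as "+" when it lies on the same strand as the focal gene and "-" otherwise is an assumption, since the scoring convention is not spelled out in the text.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Gene:
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive, end >= start
    strand: str  # "+" or "-"


def upstream_intergenic(gene: Gene, genes: List[Gene]) -> Optional[Tuple[int, str]]:
    """Return (distance in bp, relative orientation) to the nearest
    non-overlapping gene lying 5' of `gene` on the same scaffold,
    or None if no such neighbor exists."""
    others = [g for g in genes if g.name != gene.name]
    if gene.strand == "+":
        # Upstream of a forward-strand gene means lower coordinates.
        candidates = [g for g in others if g.end < gene.start]
        if not candidates:
            return None
        neighbor = max(candidates, key=lambda g: g.end)
        distance = gene.start - neighbor.end - 1
    else:
        # Upstream of a reverse-strand gene means higher coordinates.
        candidates = [g for g in others if g.start > gene.end]
        if not candidates:
            return None
        neighbor = min(candidates, key=lambda g: g.start)
        distance = neighbor.start - gene.end - 1
    # Assumed convention: "+" if both genes share a strand, "-" otherwise.
    orientation = "+" if neighbor.strand == gene.strand else "-"
    return distance, orientation


# Hypothetical scaffold used only to exercise the function.
scaffold = [
    Gene("geneA", 1000, 2000, "+"),
    Gene("geneB", 2501, 4000, "+"),
    Gene("geneC", 4200, 5000, "-"),
    Gene("geneD", 5301, 6000, "+"),
]

for g in scaffold:
    print(g.name, upstream_intergenic(g, scaffold))
```

Overlapping gene models are simply skipped in this sketch; a real annotation would also need a rule for nested genes and for genes at scaffold ends.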

Results and Discussion

Structure and composition of the Amphimedon NK cluster

The demosponge Amphimedon queenslandica has 8 NK genes in its genome, of which 6 are clustered within a 71-kb stretch of DNA (Fig. 1; Larroux et al., 2007). As this cluster is similar to that found in bilaterians, we infer that this is a metazoan synapomorphy. There is no evidence of other ANTP class homeobox genes--Hox, ParaHox and EHG-box--in the sponge genome, thus we also hypothesize that the eumetazoan ProtoHox gene originated from within the ancestral NK cluster after the divergence of sponge and eumetazoan lineages. The Amphimedon NK cluster consists of NK2/3/4-, Msx-, Hex-, Tlx-, and two NK5/6/7-related genes, one of which encodes two homeodomains and the other one of which has two different splice forms (Fig. 1). This cluster is markedly smaller than the inferred ancestral bilaterian cluster (Luke et al., 2003; Garcia-Fernandez, 2005; Larroux et al., 2007) and has a number of non-homeobox genes located within it. To illustrate this difference in size, we mapped the genomic regions corresponding to two conserved gene linkages, Tlx and NK5/6/7 (Fig. 2A) and Msx and Hex (Fig. 2B) in Amphimedon and in the chordate Branchiostoma floridae. Between Tlx and NK5/6/7, there are more genes in Amphimedon than in Branchiostoma (5 vs. 2) and the distance between the two genes is shorter in Amphimedon than in Branchiostoma (26 kb vs. 86 kb). Branchiostoma has two Hex-Msx clusters probably resulting from duplication of the genomic region. Not only is the genomic region between the two genes larger in this chordate than in Amphimedon (195-220 kb vs. 25 kb), but it also contains many more genes (11-15 vs. 4). A striking feature of the Amphimedon cluster is the high density of genes and the limited amount of intergenic DNA. In terms of the homeobox genes AmqNK2/3/4, AmqMsx, AmqHex, AmqTlx, AmqNK5/6/7A, and AmqNK5/6/7B, the estimated upstream intergenic regions range in size from 33 to 973 bp (Fig. 1; Appendix Table A1). 
This size range is similar for the non-homeobox genes in and flanking the Amphimedon NK cluster (227 to 1685 bp; Fig. 1; Appendix Table A2).
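
The difference in gene density between the two species can be made concrete with a little arithmetic on the figures quoted above. The short Python sketch below is illustrative only; the function and variable names are assumptions. Because the two Branchiostoma Hex-Msx regions are reported as ranges, only lower and upper bounds are computed for that interval.

```python
def genes_per_kb(n_genes: int, span_kb: float) -> float:
    """Intervening genes per kilobase of the interval between two linked genes."""
    return n_genes / span_kb


# Tlx to NK5/6/7 interval: 5 genes in 26 kb (Amphimedon) vs. 2 genes in 86 kb (Branchiostoma).
amq_tlx = genes_per_kb(5, 26)
bf_tlx = genes_per_kb(2, 86)
tlx_fold = amq_tlx / bf_tlx

# Msx to Hex interval: 4 genes in 25 kb (Amphimedon) vs. 11-15 genes in
# 195-220 kb (Branchiostoma); bounds only, since the pairing of counts
# and lengths for the two Branchiostoma regions is not assumed here.
amq_msx = genes_per_kb(4, 25)
bf_msx_low = genes_per_kb(11, 220)
bf_msx_high = genes_per_kb(15, 195)

print(f"Tlx-NK5/6/7: {amq_tlx:.3f} vs. {bf_tlx:.3f} genes/kb ({tlx_fold:.1f}-fold)")
print(f"Msx-Hex: {amq_msx:.3f} vs. {bf_msx_low:.3f}-{bf_msx_high:.3f} genes/kb")
```

On these numbers the Amphimedon intervals are denser in both comparisons, even though Branchiostoma has more genes in absolute terms between Msx and Hex.
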
Appendix

Table A1

Orientation and length of upstream intergenic regions of NK genes in
Amphimedon (Amq), Nematostella (Nv), Lottia (Lg), Drosophila (Dm),
C. elegans (Ce), Branchiostoma (Bf), and Ciona (Ci)

Gene name         JGI gene model                      Upstream gene   Upstream intergenic
                                                      orientation     distance (bp)

AmqNK2-4                                              +                  190
AmqMsx                                                +                  758
AmqHex                                                +                  973
AmqTlx                                                +                  187
AmqNK5-7a1                                            +                   33
AmqNK5-7a2                                            +                  558
AmqNK5-7b                                             +                  317

NvNK2a            gw.243.78.1                         -                7,000
NvNK2b            estExt_GenewiseH_1.C_2430020        +               10,743
NvNK2c            e_gw.243.57.1                       +               30,984
NvNK2d            e_gw.243.50.1                       -                7,000
NvNK3             fgenesh1_pg.scaffold_87000063       +                3,512
NvNK4             fgenesh1_pg.scaffold_87000064       -               20,976
NvMsx             e_gw.6.245.1                        -               17,077
NvHhex            gw.12.267.1                         +                6,441
NvHD017 (Tlx)     gw.105.4.1                          -                2,562
NvHD023 (Tlx)     gw.124.34.1                         +               10,983
NvHD032 (Tlx)     e_gw.124.100.1                      +               25,454
NvHD042 (Tlx)     gw.124.101.1                        +                3,220
NvHD043 (Tlx)     fgenesh1_pg.scaffold_91000054       +                  170
NvHD071 (Tlx)     gw.124.99.1                         +               12,033
NvHD076 (Tlx)     gw.124.32.1                         +                5,447
NvHD102 (Tlx)     e_gw.91.2.1                         +                2,876
NvHD147 (Tlx)     gw.17.372.1                         +                6,716
NvHmx (NK5)       e_gw.6.279.1                        -               18,209
NvNK6             e_gw.464.8.1                        -                3,631
NvNK7             gw.464.14.1                         -               11,289

LgNK2a            gw1.45.125.1                        -               33,879
LgNK2b            gw1.29.306.1                        +               22,920
LgNK3             gw1.21.46.1                         +               47,718
LgNK4             e_gw1.21.263.1                      +               22,579
LgMsxa            gw1.122.88.1                        +               18,352
LgMsxb            e_gw1.122.24.1                      -               22,282
LgMsxc            gw1.122.87.1                        +               19,195
LgHex             e_gw1.63.126.1                      -                  996
LgTlxa            fgenesh2_pg.C_sca_40000128          +               14,861
LgTlxb            e_gw1.40.172.1                      +               12,770
LgNK5             gw1.263.3.1                         +               31,538
LgNK6             e_gw1.19.20.1                       +               21,437
LgNK7             gw1.88.121.1                        -               10,222

Dm Dr (Msx)                                           +               39,903
Dm CG7056 (Hex)                                       +                2,353
Dm tin (NK4)                                          -                1,579
Dm bap (NK3)                                          +                6,912
Dm vnd (NK2)                                          +                4,820
Dm scro (NK2)                                         +                  471
Dm C15 (Tlx)                                          +                  164
Dm Hmx (NK5)                                          -               12,421
Dm HGTX (NK6)                                         -               19,474
Dm NK7.1                                              +                7,432

Ce ceh-22 (NK2)                                       +                1,859
Ce ceh-24 (NK2)                                       -                6,070
Ce ceh-27 (NK2)                                       -                3,378
Ce ceh-28 (NK4)                                       -                6,344
Ce pha-2 (Hex)                                        +                5,526
Ce vab-15 (Msx)                                       -                9,258
Ce mls-2 (NK5)                                        +                5,652
Ce cog-1 (NK6)                                        +               19,763
Ce ceh-9 (NK7)                                        +                6,046

BfNK2a            estExt_fgenesh2 Jcg.C_ 1360002      -               23,622
BfNK2b            fgenesh2_pg.scaffold_136000026      +                2,474
BfNK2c            fgenesh2_kg.scaffold_52000001       +               32,765
BfNK2d            estExt_fgenesh2_pg.C_520073         -               17,035
BfNK3             estExt_fgenesh2_pg.C_1430013        +                9,209
BfNK4             estExt_fgenesh2_pg.C_1430014        -               12,523
BfMsxa            estExt_gwp.C_540071                 -                4,186
BfMsxb            fgenesh2_pg.scaffold_56000154       -               17,323
BfMsxc            fgenesh2_pg.scaffold_56000155       +                  382
BfHexa            e_gw.54.243.1                       +               30,834
BfHexb            e_gw.56.215.1                       -               67,144
BfTlx             fgenesh2_pg.scaffold_294000030      -               55,425
BfNK5             e_gw.406.46.1                       +               12,910
BfNK6             estExt_GenewiseH_1.C_2940047        -               21,848
BfNK7             gw.294.35.1                         -               21,848

Ci-TTF1 (NK2)     estExt_genewise1.C_chr_10q0483      +                7,166
Ci-NK4            estExt_fgenesh3_kg.C_chr_08q0171    -                9,808
Ci-msxb           estExt_genewise1.C_chr_02q1758      +                2,376
Ci-Hex            TC58909 (scaff_88)                  -                7,094
Ci-Tlx            ci0100148267 (chr_01q)              +                3,709
Ci-NK5            ci0100137765 (chr_08q)              +                3,439
Ci-NK6            ci0100136284 (chr_05q)              -                4,538


Table A2

Orientation and length of upstream intergenic regions of structural
genes in Amphimedon (Amq), Nematostella (Nv), Lottia (Lg), Drosophila
(Dm), C. elegans (Ce), Branchiostoma (Bf), and Ciona (Ci)

Gene name *               JGI gene model                        Upstream gene   Upstream intergenic
                                                                orientation     distance (bp)

AmqKIF3B/C                                                      +                  366
AmqTetrRes                                                      +                  339
AmqVPS8                                                         +                  331
AmqIpp                                                          -                  228
AmqKIF11                                                        -                 1685

NvKIF3B/C                 estExt_gwp.C_790037                   +                 1442
NvTetrRes                 estExt_GenewiseH_1.C_40135            -                  379
NvVPS8                    fgenesh1_pg.scaffold_439000001        +                  403
NvIppA                    fgenesh1_pg.scaffold_4000062          -                  379
NvKIF11                   estExt_fgenesh1_pg.C_40115            -                  341

LgKIF3B/C                 estExt_Genewise1Plus.C_sca_1900046    -                 2499
LgTetrRes                 gw1.21.72.1                           +                10232
LgVPS8                    fgenesh2_pg.C_sca_30000085            -                  928
LgIppA                    estExt_Genewise1Plus.C_sca_190206     -                 5964
LgKIF11                   estExt_fgenesh2_pg.C_sca_320118       +                 2486

Dm Klp68D-PA (KIF3B/C)                                          +                  189
Dm TetrRes                                                      -                  484
Dm CG10144 (VPS8)                                               -                  358
Dm CG31110 (IPP)                                                -                 1190
Dm Klp61F-PA (KIF11)                                            +                 1127

Ce F20C5.2a (KIF3B/C)                                           -                 2823
Ce F10D7.2 (TetrRes)                                            +                 3693
Ce C42C1.4a                                                     +                  103
Ce ipp-5 (IPP)                                                  -                11533
Ce C01B10.3 (IPP)                                               +                 3019
Ce bmk-1 (KIF11)                                                -                  585

BfKIF3B/Ca                estExt_gwp.C_5080063                  -                  590
BfKIF3B/Cb                estExt_fgenesh2_pg.CJ960003           -                  610
BfTetrResa                estExt_fgenesh2_pg.C_1430029          -                25846
BfTetrResb                estExt_gwp.C_4060030                  +                21359
BfVPS8a                   jgi|Brafl1|253940|e_gw.664.1.1        +                 4869
BfVPS8b                                                         -                20941
BfIppAa                   fgenesh2jjg.scaffold_294000035        -                  797
BfIppAb                   fgenesh2_pg.scaffold_205000005        -                  805
BfKIF11a                  estExt_fgenesh2_pg.C_1850009          -                  142
BfKIF11b                  estExt_fgenesh2_pg.C_1670112          -                  145

Ci KIF3B                  estExt_fgenesh3_kg.C_chr_01q0056      +                21809
Ci TetrRes                fgenesh3_kg.C_chr_03q000092           -                 1647
Ci VPS8                   fgenesh3_pg.C_chr_02q001219           -                  585
Ci IppA                   fgenesh3_pm.C_chr_05q000005           +                12889
Ci KIF11                  e_gw1.59.1.1                          +                17516

* TetrRes, tetracyclin resistance; Ipp, inositol phosphatase.


A similar high gene density has been observed by Breter et al. (2003) for other clusters of non-homeobox genes in demosponges. These authors deduced that this high gene density was typical of the genome of the sponges they were studying--Suberites domuncula and Geodia cydonium--and inferred that these species' genomes consisted of about 300,000 genes. While we estimate that the Amphimedon genome has at least an order of magnitude fewer genes, we do find that in general it consists of clusters of tightly packed genes (unpubl. data), suggesting that this may be a common feature of demosponge genomes.

[FIGURE 2 OMITTED]

Representative NK-related genes are expressed in simple patterns during Amphimedon embryogenesis

The small intergenic regions of the Amphimedon NK genes suggest that there might be limited regulatory information available to drive complex patterns of developmental gene expression. We addressed this supposition by analyzing the embryonic expression patterns of three representative genes: AmqNK2/3/4, AmqTlx, and AmqNK5/6/7B (Fig. 3). Although patterning processes appear to be less complex in this sponge than in bilaterians (Adamska et al., 2007a, b), gastrulation, patterning, and differentiation events in Amphimedon give rise to a radially symmetrical parenchymella larva with 3 cell layers, at least 11 differentiated cell types, and a tissue-like structure, the photoreceptor pigment ring (Leys and Degnan, 2002). AmqNK2/3/4 transcripts are first detected during cleavage in a small number of micromeres per embryo (Fig. 3A, B). At gastrulation these localize to the inner cell mass, where they remain (Fig. 3C, D). Throughout development, the number of cells that highly express this gene seems to be limited. AmqTlx expression is not detected before the spot stage (late gastrulation) by whole-mount in situ hybridization (Fig. 3E-H). At this stage, expression is predominantly in cells surrounding the ring (Fig. 3E, F). Later in development, there is no evidence of expression around the ring. Instead, AmqTlx transcripts are restricted to cells of the inner cell mass (Fig. 3G, H). AmqNK5/6/7B appears to be expressed in a slightly more complex pattern, with two populations of micromeres expressing this gene through cleavage, gastrulation, and larva formation (Fig. 3I-M). These cells are fated to the outer layer, which contains a limited number of cell types (Leys and Degnan, 2002). During ring formation, AmqNK5/6/7B is activated in cells underlying the ring (Fig. 3J-M) that are clearly distinct from the micromeric expression. From these in situ hybridization data, we estimate that these three NK genes are expressed in 2-4 cell types during embryogenesis.
The expression of AmqNK5/6/7B and AmqTlx in the vicinity of the pigment spot and ring during the formation of the ring suggests that these genes are playing a role in patterning this structure.

[FIGURE 3 OMITTED]

Bilaterian NK genes are expressed in all three germ layers and play a major role in mesoderm and nervous system development (reviewed in Banerjee-Basu and Baxevanis, 2001; Jagla et al., 2001; Carroll et al., 2005; Garcia-Fernandez, 2005; Stanfel et al., 2005; Slack, 2006). Given that Amphimedon NK genes are expressed in cells that do not have clear eumetazoan homologs, it is difficult to infer what the role of these homeobox genes in the last common ancestor to sponges and eumetazoans might have been. Nonetheless, it does appear that the sponge NK genes are expressed in a limited number of cell types and developmental stages, in contrast to what has been observed in representative bilaterians. More generally, expression patterns of multiple Amphimedon transcription factors and signaling molecules suggest that these genes do not have a multitude of roles during Amphimedon development (e.g., they are expressed in what appear to be a limited number of cell lineages throughout embryogenesis; Larroux et al., 2006; Adamska et al., 2007a, b; unpubl. data), indicating that regulatory genes may be less pleiotropic in sponges than in bilaterians.

Although we have suggested that whole-mount in situ hybridization data may be used as a proxy to deduce the complexity of the regulatory information controlling the expression of a given gene, a number of important caveats are associated with this approach. First, we only investigated embryogenesis and thus do not know how these genes are expressed during metamorphosis, in the juvenile, or in the adult. New expression patterns at any of these stages of the Amphimedon life cycle will require additional cis-regulatory modules. Second, without accurate cell lineage data for Amphimedon embryos, we cannot exclude the possibility that the cell type we are inferring to be constant between stages is actually changing. Evidence from the developmental expression of another NK-related gene, AmqBsh (formerly RenBsh), indicates that a single cell type does express the gene throughout embryogenesis (Larroux et al., 2006). In this case, AmqBsh is expressed in cells fated to become sclerocytes, which produce spicules and have a distinctive morphology (Leys and Degnan, 2002) and thus can be easily traced.

Comparative genomics of genes in the sponge NK cluster

To test whether there is a significant difference in the sizes of the upstream intergenic region of the eumetazoan orthologs of homeobox and non-homeobox genes in the Amphimedon NK cluster, we analyzed publicly available gene models for the cnidarian Nematostella vectensis and representative bilaterians: the lophotrochozoan Lottia gigantea, the ecdysozoans Caenorhabditis elegans and Drosophila melanogaster, and the deuterostomes Branchiostoma floridae and Ciona intestinalis (Fig. 4; Appendix Tables A1 and A2). These bilaterian organisms were chosen to span the three major superphyletic lineages: lophotrochozoan, ecdysozoan, and deuterostome. Whereas the genomes of Lottia (360 Mbp), Nematostella (450 Mbp), and Branchiostoma (600 Mbp) are at least twice the size of the Amphimedon genome (~167 Mbp), the Drosophila and Ciona genomes are of a similar size to the sponge genome (170-180 Mbp) and C. elegans has a smaller genome (97 Mbp). We chose these latter three genomes to test whether smaller intergenic regions were merely an effect of smaller genome sizes. Gene duplication in the eumetazoan lineage meant that there were often a number of orthologs in each species for a given Amphimedon NK gene (Appendix Table A1).
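
As a rough illustration of the genome-size control described above, the following Python sketch sets the median upstream intergenic distance of the NK genes (values copied from Appendix Table A1) against the genome sizes quoted in the text for three of the species. The dictionary layout and variable names are illustrative assumptions, and the median is used only as a convenient summary; it is not a statistic reported in this study.

```python
from statistics import median

# Upstream intergenic distances (bp) of NK genes, copied from Appendix Table A1.
nk_upstream_bp = {
    "Amphimedon": [190, 758, 973, 187, 33, 558, 317],
    "C. elegans": [1859, 6070, 3378, 6344, 5526, 9258, 5652, 19763, 6046],
    "Nematostella": [7000, 10743, 30984, 7000, 3512, 20976, 17077, 6441,
                     2562, 10983, 25454, 3220, 170, 12033, 5447, 2876,
                     6716, 18209, 3631, 11289],
}

# Approximate genome sizes (Mbp) as quoted in the text.
genome_mbp = {"Amphimedon": 167, "C. elegans": 97, "Nematostella": 450}

# Pair each genome size with the median NK upstream distance.
summary = {sp: (genome_mbp[sp], median(vals)) for sp, vals in nk_upstream_bp.items()}

# Print in order of increasing genome size.
for sp, (size, med) in sorted(summary.items(), key=lambda kv: kv[1][0]):
    print(f"{sp:13s} genome ~{size} Mbp, median NK upstream distance {med:,.0f} bp")
```

In this subset, C. elegans has the smallest genome of the three yet a median distance well above that of Amphimedon, which is the pattern the comparison in the text relies on.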

We compared the intergenic lengths for each of the NK gene families. Where there is more than one ortholog, averages and standard deviations are shown (Fig. 4A). Comparison of these regions reveals that the Amphimedon NK genes have consistently smaller upstream intergenic regions than the orthologous genes from representative bilaterians. In total, the Amphimedon NK genes have significantly smaller intergenic regions compared to those from eumetazoans (see paired values for the Student's t-test in Table 1). As this is also the case for C. elegans, Drosophila, and Ciona, which have genomes that are smaller than or similar in size to that of Amphimedon, this trend does not appear to be a function of genome size but rather to reflect a fundamental difference between Amphimedon and these eumetazoan representatives. Although the Nematostella NK genes have generally smaller intergenic regions than those of their bilaterian orthologs, in total there is no significant difference (Fig. 4A; see paired t-test values in Table 1).
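
To make the form of such a comparison concrete, the Python sketch below computes a pooled-variance two-sample Student's t statistic for the Amphimedon and Nematostella NK distances listed in Appendix Table A1. This is an assumption about the test variant rather than a description of the analysis actually performed: it is chosen because n1 + n2 - 2 (7 + 20 - 2 = 25 for this pair) agrees with the degrees of freedom listed in Table 1. The function name is hypothetical, and because the exact procedure and any data transformation are not described in the text, the resulting t value should not be expected to reproduce the published one.

```python
import math
from statistics import mean, variance


def pooled_t(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Returns (t, degrees of freedom), where df = len(a) + len(b) - 2.
    """
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled sample variance, weighting each group by its own df.
    sp2 = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    se = math.sqrt(sp2 * (1 / na + 1 / nb))
    return (mean(b) - mean(a)) / se, df


# Upstream intergenic distances (bp), copied from Appendix Table A1.
amq = [190, 758, 973, 187, 33, 558, 317]
nv = [7000, 10743, 30984, 7000, 3512, 20976, 17077, 6441, 2562, 10983,
      25454, 3220, 170, 12033, 5447, 2876, 6716, 18209, 3631, 11289]

t_value, dof = pooled_t(amq, nv)
print(f"t = {t_value:.2f}, df = {dof}")
```

The variances of the two groups differ greatly here, so an unequal-variance (Welch) test or a log transformation of the distances might be a more appropriate choice for a reanalysis.
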
Table 1

Paired t-test values and significance levels: Amphimedon (Amq),
Nematostella (Nv), Lottia (Lg), Drosophila (Dm), C. elegans (Ce),
Branchiostoma (Bf), and Ciona (Ci)

                           Homeodomain

Comparison     t     degrees of freedom  significance level

Amq vs. Nv    4.24         25                 0.01
Amq vs. Dm    2.36         15               0.05 *
Amq vs. Ce    3.86         14                 0.01
Amq vs. Lg    5.85          8                 0.01
Amq vs. Bf    4.42         20                 0.01
Amq vs. Ci    4.98         12                 0.01
Nv vs. Dm     0.91         28                   NS
Nv vs. Ce     0.59         27                   NS
Nv vs. Lg     3.82         31                 0.01
Nv vs. Bf     3.19         33                 0.01
Nv vs. Ci     0.24         25                   NS
Dm vs. Ce     0.58         17                   NS
Dm vs. Lg     1.84         21                 0.1*
Dm vs. Bf     1.97         23                 0.1*
Dm vs. Ci     1.03         15                   NS
Ce vs. Lg     3.26         20                 0.01
Ce vs. Bf     2.86         22                 0.01
Ce vs. Ci     0.83         14                   NS
Lg vs. Bf     0.53         26                   NS
Lg vs. Ci     4.04         18                 0.01
Bf vs. Ci     3.31         20                 0.01

                        Non-Homeodomain

Comparison    t     degrees of freedom  significance level

Amq vs. Nv   0.003          8                   NS
Amq vs. Dm    0.23          8                   NS
Amq vs. Ce    1.78          9                   NS
Amq vs. Lg    2.26          8                  0.1
Amq vs. Bf    2.09         13                  0.1
Amq vs. Ci    1.78          8                   NS
Nv vs. Dm     0.27          8                   NS
Nv vs. Ce     1.79          9                   NS
Nv vs. Lg     2.28          8                  0.1
Nv vs. Bf     2.09         13                  0.1
Nv vs. Ci     1.78          8                   NS
Dm vs. Ce     1.74          9                   NS
Dm vs. Lg     2.23          8                  0.1
Dm vs. Bf     2.07         13                  0.1
Dm vs. Ci     1.77          8                   NS
Ce vs. Lg     0.34          9                   NS
Ce vs. Bf     1.06         14                   NS
Ce vs. Ci     1.49          9                   NS
Lg vs. Bf     0.85         13                   NS
Lg vs. Ci     1.42          8                   NS
Bf vs. Ci     1.10         13                   NS

NS, not significant.

* Due to high variance in Dm dataset.
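
The t statistics in Table 1 come from paired comparisons of intergenic lengths between species. As an illustration only, the sketch below computes a paired Student's t statistic and its degrees of freedom from two matched lists of values. The pairing unit (for example, one value per gene family, averaged over orthologs) and all numbers are hypothetical assumptions; they are not the data behind Table 1.

```python
import math
from statistics import mean, stdev
from typing import Sequence, Tuple


def paired_t(x: Sequence[float], y: Sequence[float]) -> Tuple[float, int]:
    """Paired Student's t statistic for matched samples x and y.

    t = mean(d) / (sd(d) / sqrt(n)), where d = y - x for each pair,
    with n - 1 degrees of freedom.
    """
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    if len(x) < 2:
        raise ValueError("at least two pairs are required")
    diffs = [b - a for a, b in zip(x, y)]
    sd = stdev(diffs)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        raise ValueError("all differences are identical; t is undefined")
    t = mean(diffs) / (sd / math.sqrt(len(diffs)))
    return t, len(diffs) - 1


# Hypothetical upstream intergenic lengths (kb), one value per gene family.
species_a = [1.0, 2.0, 3.0, 4.0]
species_b = [2.0, 4.0, 5.0, 9.0]

t_value, dof = paired_t(species_a, species_b)
print(f"t = {t_value:.3f}, degrees of freedom = {dof}")
```

The resulting t value would then be compared against the critical value of the t distribution for the given degrees of freedom to obtain a significance level of the kind reported in the table.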


On the whole, eumetazoan orthologs of the five structural genes linked to the Amphimedon NK cluster have smaller upstream intergenic distances than the NK orthologs (Fig. 4B; Appendix Table A2). Taken together, the distances of Amphimedon structural genes are not significantly different from those of eumetazoans (Fig. 4; see paired t-test values in Table 1). These data are compatible with a difference in proximal regulatory capacity between sponges and eumetazoans occurring specifically in genes belonging to developmental GRNs rather than being a genome-wide trend. Sponge regulatory genes thus appear to possess simpler proximal cis-regulatory control sequences upstream of their transcript start site than their eumetazoan counterparts. Clearly, the supposition that the Amphimedon genome houses less regulatory information per developmental gene than eumetazoan genomes needs to be investigated on a much larger scale, with the inclusion of data from a wider sample of gene classes and families. At this stage, equally parsimonious interpretations of the data presented here are that (i) there has been a compaction and simplification of upstream regulatory region DNA in the sponge lineage, and (ii) the bulk of cis-regulatory DNA is housed elsewhere in the genome. Whether there has been expansion of intergenic regions in the eumetazoan lineage or reduction in the sponge lineage, there appears to be an interesting correlation between the size of intergenic DNA and morphological diversity and complexity. This observation is consistent with genome-wide studies undertaken in Drosophila and C. elegans that reveal that the regulatory complexity of a given gene correlates positively with the amount of surrounding intergenic DNA (Nelson et al., 2004).

[FIGURE 4 OMITTED]

Conclusions

The evolution of a developmental gene regulatory network (GRN) may have been one of the prerequisites for the evolution of multicellularity and embryogenesis. Indeed, comparison of early-branching metazoan genomes reveals that the components required to assemble basic GRNs--that is, conserved transcription factors and signaling pathways--evolved prior to metazoan cladogenesis. The duplication and divergence of genes encoding signaling pathway components and transcription factors early in eumetazoan evolution apparently allowed for ancient networks to expand and evolve, and for new networks to form. In this study, we have emphasized another mechanism by which simple GRNs may have increased in complexity during the transition from an ancestral metazoan to a eumetazoan grade of organization. Increases in the size and complexity of the cis-regulatory regions required to direct expression of GRN components may have allowed for the evolution of developmental gene pleiotropy by increasing the number of ontogenetic contexts in which a given gene is employed. Using a handful of NK genes clustered in the genome of the sponge Amphimedon, we provide support for the proposal that an increase in proximal, upstream cis-regulatory information of genes belonging to developmental GRNs occurred in the stem leading to the eumetazoan last common ancestor. Because this study provides no measure of cis-regulatory information housed elsewhere in the genome, we recognise that direct measures of intergenic region size may be at best a general indicator of regulatory complexity (cf. Nelson et al., 2004). Nonetheless, the relatively simple expression patterns of the Amphimedon NK genes during embryogenesis are compatible with the idea that these genes have limited regulatory information.
Given that this is a single case study that is based on a simple comparison of upstream intergenic sizes, further comparisons and experiments are required before we can include cis-regulatory expansion as a primary force in the early evolution of metazoan morphological complexity.

Acknowledgments

We gratefully acknowledge the significant contribution and support of The US Department of Energy Joint Genome Institute in the production of Amphimedon (Reniera) genomic and EST sequences used in this study through the Community Sequencing Program. The research was supported by grants from the Australian Research Council to B.M.D.

Bryony Fahey *, Claire Larroux *, Ben J. Woodcroft *, and Bernard M. Degnan [dagger]

School of Integrative Biology, University of Queensland, Brisbane QLD 4072, Australia

Received 7 January 2008; accepted 11 March 2008.

* These authors contributed equally to this paper.

[dagger] To whom correspondence should be addressed. E-mail: b.degnan@uq.edu.au

Abbreviations: GRN, gene regulatory network.

Literature Cited

Abril, J. F., and R. Guigo. 2000. gff2ps: visualizing genomic annotations. Bioinformatics 16: 743-744.

Adamska, M., S. M. Degnan, K. M. Green, M. Adamski, A. Craigie, C. Larroux, and B. M. Degnan. 2007a. Wnt and TGF-[beta] expression in the sponge Amphimedon queenslandica and the origin of metazoan embryonic patterning. PLoS ONE 2: e1031.

Adamska, M., D. Q. Matus, M. Adamski, K. M. Green, M. Q. Martindale, and B. M. Degnan. 2007b. The evolutionary origin of hedgehog proteins. Curr. Biol. 17: R836-837.

Banerjee-Basu, S., and A. D. Baxevanis. 2001. Molecular evolution of the homeodomain family of transcription factors. Nucleic Acids Res. 29: 3258-3269.

Borchiellini, C., M. Manuel, E. Alivon, N. Boury-Esnault, J. Vacelet, and Y. Le Parco. 2001. Sponge paraphyly and the origin of Metazoa. J. Evol. Biol. 14: 171-179.

Breter, H. J., V. A. Grebenjuk, A. Skorokhod, and W. E. G. Muller. 2003. Approaches for a sustainable use of the bioactive potential in sponges: analysis of gene clusters, differential display of mRNA and DNA chips. Pp. 199-230 in Sponges (Porifera), W. E. G. Muller, ed. Springer, Berlin.

Burger, G., L. Forget, Y. Zhu, M. W. Gray, and B. F. Lang. 2003. Unique mitochondrial genome architecture in unicellular relatives of animals. Proc. Natl. Acad. Sci. USA 100: 892-897.

Carroll, S. B., J. K. Grenier, and S. D. Weatherbee. 2005. From DNA to Diversity: Molecular Genetics and the Evolution of Animal Design. Blackwell Science, Malden, MA.

Cavalier-Smith, T., and E. E. Y. Chao. 2003. Phylogeny of Choanozoa, Apusozoa, and other Protozoa and early eukaryote megaevolution. J. Mol. Evol. 56: 540-563.

Cavalier-Smith, T., E. E. Chao, N. Boury-Esnault, and J. Vacelet. 1996. Sponge phylogeny, animal monophyly, and the origin of the nervous system: 18S rRNA evidence. Can. J. Zool. 74: 2031-2045.

Chourrout, D., F. Delsuc, P. Chourrout, R. B. Edvardsen, F. Rentzsch, E. Renfer, M. F. Jensen, B. Zhu, P. de Jong, R. E. Steele, and U. Technau. 2006. Minimal ProtoHox cluster inferred from bilaterian and cnidarian Hox complements. Nature 442: 684-687.

Collins, A. G. 2002. Phylogeny of Medusozoa and the evolution of cnidarian life cycles. J. Evol. Biol. 15: 418-432.

Davidson, E. H. 2006. The Regulatory Genome: Gene Regulatory Networks in Development and Evolution. Academic Press, New York.

Degnan, S. M., and B. M. Degnan. 2006. The origin of the pelagobenthic metazoan life cycle: what's sex got to do with it? Integr. Comp. Biol. 46: 683-690.

Degnan, B. M., S. P. Leys, and C. Larroux. 2005. Sponge development and antiquity of animal pattern formation. Integr. Comp. Biol. 45: 335-341.

Garcia-Fernandez, J. 2005. The genesis and evolution of homeobox gene clusters. Nat. Rev. Genet. 6: 881-892.

Gauthier, M., and B. M. Degnan. 2008. The transcription factor NF-kB in the demosponge Amphimedon queenslandica: insights on the evolutionary origin of the Rel homology domain. Dev. Genes Evol. 218: 23-32.

Jagla, K., M. Bellard, and M. Frasch. 2001. A cluster of Drosophila homeobox genes involved in mesoderm differentiation programs. BioEssays 23: 125-133.

Kamm, K., B. Schierwater, W. Jakob, S. L. Dellaporta, and D. J. Miller. 2006. Axial patterning and diversification in the Cnidaria predate the Hox system. Curr. Biol. 16: 1-7.

King, N. 2004. The unicellular ancestry of animal development. Dev. Cell 7: 313-325.

King, N., and S. B. Carroll. 2001. A receptor tyrosine kinase from choanoflagellates: molecular insights into early animal evolution. Proc. Natl. Acad. Sci. USA 98: 15032-15037.

King, N., C. T. Hittinger, and S. B. Carroll. 2003. Evolution of key cell signaling and adhesion protein families predates animal origins. Science 301: 361-363.

King, N., M. J. Westbrook, S. L. Young, A. Kuo, M. Abedin, J. Chapman, S. Fairclough, U. Hellsten, Y. Isogai, I. Letunic, et al. 2008. The genome of the choanoflagellate Monosiga brevicollis and the origin of metazoans. Nature 451: 783-788.

Kusserow, A., K. Pang, C. Sturm, M. Hrouda, J. Lentfer, H. A. Schmidt, U. Technau, A. von Haeseler, B. Hobmayer, M. Q. Martindale, and T. W. Holstein. 2005. Unexpected complexity of the Wnt gene family in a sea anemone. Nature 433: 156-160.

Lang, B. F., C. O'Kelly, T. Nerad, M. W. Gray, and G. Burger. 2002. The closest unicellular relatives of animals. Curr. Biol. 12: 1773-1778.

Larroux, C., B. Fahey, D. Liubicich, V. F. Hinman, M. Gauthier, M. Gongora, K. Green, G. Worheide, S. P. Leys, and B. M. Degnan. 2006. Developmental expression of transcription factor genes in a demosponge: insights into the origin of metazoan multicellularity. Evol. Dev. 8: 150-173.

Larroux, C., B. Fahey, S. M. Degnan, M. Adamski, D. S. Rokhsar, and B. M. Degnan. 2007. The NK homeobox gene cluster predates the origin of Hox genes. Curr. Biol. 17: 706-710.

Larroux, C., G. N. Luke, P. Koopman, D. S. Rokhsar, S. M. Shimeld, and B. M. Degnan. 2008. Genesis and expansion of metazoan transcription factor gene classes. Mol. Biol. Evol. 25: 980-996.

Leys, S. P., and B. M. Degnan. 2002. Embryogenesis and metamorphosis in a haplosclerid demosponge: gastrulation and transdifferentiation of larval ciliated cells to choanocytes. Invertebr. Biol. 121: 171-189.

Li, C. W., J. Y. Chen, and T. E. Hua. 1998. Precambrian sponges with cellular structures. Science 279: 879-882.

Luke, G. N., L. F. C. Castro, K. McLay, C. Bird, A. Coulson, and P. W. H. Holland. 2003. Dispersal of NK homeobox gene clusters in amphioxus and humans. Proc. Natl. Acad. Sci. USA 100: 5292-5295.

Magie, C. R., K. Pang, and M. Q. Martindale. 2005. Genomic inventory and expression of Sox and Fox genes in the cnidarian Nematostella vectensis. Dev. Genes Evol. 215: 618-630.

Medina, M., A. G. Collins, J. D. Silberman, and M. L. Sogin. 2001. Evaluating hypotheses of basal animal phylogeny using complete sequences of large and small subunit rRNA. Proc. Natl. Acad. Sci. USA 98: 9707-9712.

Miller, D. J., E. E. Ball, and U. Technau. 2005. Cnidarians and ancestral genetic complexity in the animal kingdom. Trends Genet. 21: 536-539.

Nelson, C. E., B. M. Hersh, and S. B. Carroll. 2004. The regulatory content of intergenic DNA shapes genome architecture. Genome Biol. 5: R25.

Nichols, S. A., W. Dirks, J. S. Pearse, and N. King. 2006. Early evolution of animal cell signaling and adhesion genes. Proc. Natl. Acad. Sci. USA 103: 12451-12456.

Putnam, N., M. Srivastava, U. Hellsten, B. Dirks, J. Chapman, A. Salamov, A. Terry, H. Shapiro, E. Lindquist, V. V. Kapitonov, et al. 2007. Sea anemone genome reveals ancestral eumetazoan gene repertoire and genomic organization. Science 317: 85-94.

Ryan, J. F., P. M. Burton, M. E. Mazza, G. K. Kwong, J. C. Mullikin, and J. R. Finnerty. 2006. The cnidarian-bilaterian ancestor possessed at least 56 homeoboxes. Evidence from the starlet sea anemone, Nematostella vectensis. Genome Biol. 7: R64.

Simionato, E., V. Ledent, G. Richards, M. Thomas-Chollier, P. Kerner, D. Coornaert, B. M. Degnan, and M. Vervoort. 2007. Origin and diversification of the basic helix-loop-helix gene family in metazoans: insights from comparative genomics. BMC Evol. Biol. 7: 33.

Simpson, T. L. 1984. The Cell Biology of Sponges. Springer-Verlag, New York.

Slack, J. M. W. 2006. Essential Developmental Biology. Blackwell Publishing, Malden, MA.

Snell, E. A., R. F. Furlong, and P. W. H. Holland. 2001. Hsp70 sequences indicate that choanoflagellates are closely related to animals. Curr. Biol. 11: 967-970.

Stanfel, M. N., K. A. Moses, R. J. Schwartz, and W. E. Zimmer. 2005. Regulation of organ development by the Nkx-homeodomain factors: an Nkx code. Cell. Mol. Biol. 51: OL785-OL799.

Stanke, M., O. Schoffmann, B. Morgenstern, and S. Waack. 2006. Gene prediction in eukaryotes with a generalized hidden Markov model that uses hints from external sources. BMC Bioinformatics 7: 62.

Steenkamp, E. T., J. Wright, and S. L. Baldauf. 2006. The protistan origins of animals and fungi. Mol. Biol. Evol. 23: 93-106.

Wallberg, A., M. Thollesson, J. S. Farris, and U. Jondelius. 2004. The phylogenetic position of the comb jellies (Ctenophora) and the importance of taxonomic sampling. Cladistics 20: 558-578.

Yamada, A., K. Pang, M. Q. Martindale, and S. Tochinai. 2007. Surprisingly complex T-box gene complement in diploblastic metazoans. Evol. Dev. 9: 220-230.
COPYRIGHT 2008 University of Chicago Press
No portion of this article can be reproduced without the express written permission from the copyright holder.

Author:Fahey, Bryony; Larroux, Claire; Woodcroft, Ben J.; Degnan, Bernard M.
Publication:The Biological Bulletin
Date:Jun 1, 2008