A major protein precursor of zebra mussel (Dreissena polymorpha) byssus: deduced sequence and significance.
The zebra mussel, Dreissena polymorpha (Pallas), is a freshwater bivalve indigenous to the river basins of the Black, Baltic, and Caspian seas. Recently, it was accidentally introduced into one of the Great Lakes, and in less than 10 years, its distribution has expanded into the lakes and rivers of at least a third of the North American continent (Johnson and Padilla, 1996). The economic impact of this expansion has been profound and is due, in large part, to fouling (Roberts, 1990). Zebra mussels foul by attaching opportunistically and in large numbers to a wide variety of surfaces by means of a thread-like structure known as a byssus (Ackerman et al., 1992). In this respect, they resemble marine mussels (Mytilidae), which have adopted a similar strategy.
Zebra mussel byssal threads are fibrous extracellular structures composed largely of proteins, many of which contain the post-translationally modified amino acid 3,4-dihydroxyphenylalanine (Dopa) (Rzepecki and Waite, 1993). Peptidyl Dopa is thus a convenient marker of byssal precursor proteins and is thought to play an important role in adhesion and the maturational cross-linking of byssal threads (Waite, 1990). Three polymorphic Dopa-containing protein families have previously been isolated and partially characterized from zebra mussel foot tissue, the site of byssal protein synthesis and storage. The largest of these proteins, Dreissena polymorpha foot protein 1 (Dpfp1), has an apparent molecular weight of 76 kDa and Dopa at levels up to 6.6 mole % (Rzepecki and Waite, 1993). Like many byssal precursors from marine mussels, Dpfp1 features Dopa residues in repeating consensus motifs. Despite this similarity, Dpfp1 is markedly different from the marine proteins in two respects. First, members of the Dpfp1 family have acidic isoelectric points ranging from 5.3 to 6.5; marine byssal precursors, in contrast, are highly basic - many with pIs exceeding the effective resolving range of available ampholytes. Second, dreissenid byssal precursors, including Dpfp1, are glycosylated with N-acetylgalactosamine O-linked to serine and threonine residues; there is, however, no evidence for glycosylation in byssal proteins from any marine taxa. It is not known whether these differences reflect two generally valid solutions to the problem of adhesion underwater or represent genuine differences in the requirements for adhesive bond formation in freshwater and marine systems.
Our efforts to determine the complete primary sequence of Dpfp1 by traditional peptide mapping have been thwarted by the repetitive structure and protease-resistance of large regions of the protein (Rzepecki and Waite, 1993). In this study, we report on the complete primary sequence of Dpfp1 deduced using molecular techniques. cDNA sequence data reveal that Dpfp1 is a tandemly repetitive protein composed of two motifs: a novel heptapeptide sequence and a tridecapeptide consensus sequence. Unusually, these motifs are segregated to distinct regions of the protein, a fact which almost certainly has important consequences to the self-assembly of the zebra mussel byssus.
Materials and Methods
All tissues used in these experiments were excised, immediately frozen in liquid nitrogen, and ground in a mortar chilled to -80°C. Tissue was homogenized in a hand-held glass homogenizer (Kontes, Vineland, NJ), and total RNA was extracted according to the methods of Chomczynski and Sacchi (1987).
Reverse transcriptase (RT)-polymerase chain reaction (PCR) and 5′ rapid amplification of cDNA ends (RACE)
mRNA was purified from total RNA using the Oligotex mRNA spin column kit (Qiagen, Chatsworth, CA). After purification, 1 µg mRNA was reverse transcribed using 20 pmoles of a primer specific to polyA tracts (polyT-LD AGAGAGATTTTTTTTTTTTTTTTTVN) with 200 units of M-MLV reverse transcriptase (Superscript II, Gibco-BRL) for 2 h at 37°C in buffer supplied by the manufacturer. The reaction was quenched with 1 ml of 1x TE, pH 7.5. One percent (v/v) of the resulting first-strand cDNA was amplified by PCR using degenerate oligonucleotide primers based on the previously determined (Rzepecki and Waite, 1993) amino acid sequence of the N-terminus of Dpfp1 (Dp1.N(+) GGIACITAYGAYTGGACNGA) and an internal peptide (Dp1.A(-) TTRTCRTAIGGICCRTCRTA). Each 50-µl reaction contained 0.25 mM of each dNTP, 100 pmoles of each primer, and 2.5 units of Taq2000 polymerase (Stratagene, La Jolla, CA), in a buffer containing 10 mM Tris-Cl, 1.5 mM MgCl2, 75 mM KCl, and 15 mM (NH4)2SO4. Samples were initially denatured at 95°C for 4 min 30 s followed by 30 cycles of amplification as follows: 95°C for 30 s, 50°C for 30 s, and 72°C for 2 min. A final extension for 5 min at 72°C was carried out to ensure addition of 3′ A overhangs. The resulting amplification product was ligated into the pCRII vector (Invitrogen, San Diego, CA) according to manufacturer's instructions. The insert from the newly constructed plasmid, pDP1.NA, was sequenced on both strands using vector-specific and degenerate oligonucleotide primers.
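The degenerate primers above are written with IUPAC ambiguity codes, so each one stands for a small pool of oligonucleotides. The following is a minimal sketch of how the pool size (degeneracy) could be counted; it is not part of the original methods. The function name `degeneracy` and the decision to count inosine ("I") as a single base are assumptions made for illustration.

```python
from math import prod

# Number of standard bases each IUPAC symbol can stand for.
# Assumption: "I" (inosine) is one physical base, so it contributes a factor of 1.
CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1, "I": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}

def degeneracy(primer: str) -> int:
    """Return how many distinct oligonucleotides a degenerate primer represents."""
    return prod(CHOICES[base] for base in primer.upper())

# Primer sequences as given in the text.
PRIMERS = {
    "polyT-LD": "AGAGAGATTTTTTTTTTTTTTTTTVN",
    "Dp1.N(+)": "GGIACITAYGAYTGGACNGA",
    "Dp1.A(-)": "TTRTCRTAIGGICCRTCRTA",
}

for name, seq in PRIMERS.items():
    print(f"{name}: {len(seq)} nt, degeneracy {degeneracy(seq)}")
```

Under these assumptions both Dpfp1 primers represent 16 distinct sequences each, which is a modest pool for a degenerate PCR.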
5′ RACE was performed to obtain cDNA sequence data upstream of the region coding for the N-terminus (Frohman et al., 1988) and to independently establish the cDNA sequence of the N-terminus. All reactions were performed using reagents contained in the 5′ RACE System V2.0 (Life Technologies, Bethesda, MD) according to manufacturer's instructions. Briefly, 1 µg of D. polymorpha foot tissue total RNA was reverse-transcribed using a gene-specific primer (Dp1.GSP1(-) TATTTTGTAGGAGTGGG). The purified first-strand cDNA was tailed with dCTP, and PCR was performed using the supplied abridged anchor primer (AAP; GGCCACGCGTCGACTAGTACGGGIIGGGIIGGGIIG) and Dp1.GSP1(-). Each 50-µl reaction contained 0.25 mM of each dNTP and 20 pmoles of each primer in 1x PCR buffer (Life Technologies, Bethesda, MD) supplemented with 2 mM MgCl2. Samples were denatured at 95°C for 4 min 30 s and equilibrated to 72°C. Two-and-one-half units of Taq2000 polymerase were added, and amplification for 25 cycles was performed under the following conditions: 95°C for 30 s, 42°C for 30 s, and 72°C for 30 s. A final 5-min extension was performed at 72°C. A second round of PCR was performed using AAP and a nested gene-specific primer (Dp1.GSP2(-) TTGTTGTATAGTTCGGAATTTTAG). The reaction volume and component concentrations were as outlined for the previous reaction. Samples were initially denatured at 95°C for 4 min 30 s followed by 30 cycles of amplification as follows: 95°C for 30 s, 42°C for 30 s, and 72°C for 60 s. A final extension for 5 min at 72°C was carried out to ensure addition of 3′ A overhangs. The resulting amplification products were cloned into the pGEM-T vector (Promega, Madison, WI) according to manufacturer's instructions. The insert from the newly constructed plasmid, pDP1.5′UTA, was sequenced on both strands using gene-specific primers.
Probe synthesis and cDNA library screening
Two probes were created in this experiment to screen a D. polymorpha foot tissue cDNA library (Eddington, 1996). A digoxigenin (DIG)-labeled antisense RNA probe (probe #1) was generated from DdeI-digested pDP1.NA using T7 polymerase and the DIG-RNA labeling kit (Boehringer-Mannheim) according to manufacturer's instructions. A DIG-labeled double-stranded DNA probe (probe #2) spanning the 5′ untranslated region of Dpfp1 and the first 172 nt coding for the mature protein was generated using the PCR DIG probe synthesis kit (Boehringer-Mannheim) according to manufacturer's instructions. pDP1.5′UTA was used as a template for this reaction, and a primer specific to the 5′ untranslated region of Dpfp1 (Dp1.5′UT(+) ATACTTCAGAGCATCAACCAA) and Dp1.GSP1(-) were used as primers. Both probes were individually incorporated at a concentration of 100 ng/ml into standard hybridization buffer + 50% formamide (5x SSC, 1% Blocking buffer (Boehringer-Mannheim), 0.1% (w/v) sarcosyl, 0.02% (w/v) SDS, 50% formamide (v/v)). Hybridizations were carried out at 60°C (probe #1) or 42°C (probe #2). Stringency washes for both probes were conducted with 0.1x SSC/0.2% (w/v) SDS at 68°C.
One million plaques generated from a λZAP-Express cDNA library (Stratagene, La Jolla, CA) were doubly screened with probes #1 and #2. No plaques positive for probe #2 were detected, suggesting that a full-length clone of Dpfp1 was not present in this library. Forty plaques positive for probe #1 were cored, eluted in SM buffer (100 mM NaCl, 50 mM Tris-Cl pH 7.5, 8 mM MgSO4, 0.1% gelatin), and tested for insert size by PCR using vector-specific primers flanking the cDNA insert. After secondary screening, cDNA from the plaque bearing the largest insert was rescued as a phagemid using the ExAssist interference-resistant helper phage kit (Stratagene, La Jolla, CA) and sequenced using the nested deletion technique (see below).
Nested deletions were performed using the double-stranded nested deletion kit (Pharmacia Biotech, Piscataway, NJ). In each case, 5 µg of template was doubly digested with EcoRI and PstI, and the restriction enzymes were heat inactivated. Digested clones were precipitated in ethanol and resuspended in a buffer containing 1.5 M potassium acetate, 37.5 mM Tris-acetate pH 7.6, 15 mM magnesium acetate, 750 µM β-mercaptoethanol, and 15 µg/ml bovine serum albumin (BSA). A 2-µg sample of each digest was used for digestion with Exonuclease III. The reactions were carried out at 23°C and aliquots taken every 5 min. All clones yielding deletions larger than the size of the empty vector were ligated, transformed into XL1-Blue MRF′ cells (Stratagene, La Jolla, CA), purified, and sequenced using a vector-specific primer.
RNA dot blots
Ten micrograms of total RNA separately extracted from D. polymorpha foot, adductor muscle, mantle, and gill tissue were diluted in an equal volume of RNA dilution buffer (water: 20x SSC: formaldehyde; 5:3:2) and spotted onto a positively charged nylon membrane (MSI, Westboro, MA). The membrane was hybridized to either probe #1 as described above or to an actin-specific double-stranded DIG-labeled DNA probe (Patwary et al., 1996). Hybridization with the actin-specific probe was performed at 37°C with a stringency wash using 0.5x SSC/0.1% (w/v) SDS at 68°C.
Three micrograms of foot tissue mRNA were subjected to formaldehyde/agarose gel electrophoresis according to Sambrook et al. (1989). RNA was transferred onto a positively charged nylon membrane and hybridized overnight with probe #1.
Mass analysis of native Dpfp1
Native Dpfp1 was purified from the foot of adult zebra mussels according to Rzepecki and Waite (1993). The mass of the native protein was determined by matrix-assisted laser desorption-ionization mass spectrometry with time-of-flight (MALDI-TOF) using a PerSeptive Biosystems Voyager model in the positive ion mode and delayed extraction. A 20-µM solution of Dpfp1 in 0.1% acetic acid was mixed with three volumes of a saturated sinapinic acid solution (40% acetonitrile/0.1% TFA); 2 µl of the resulting mixture (10 pmoles Dpfp1) was placed on a sample plate and allowed to air dry. The sample was inserted into a vacuum chamber (1 × 10^-7 torr) and the spectra generated from 256 pulses of a 337-nm laser were averaged. The acceleration voltage was 25,000 V with a 90% grid voltage and a guidewire setting of 0.1%.
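The amount of protein loaded on the sample plate follows directly from the dilution described above. A short check of that arithmetic is sketched below; the helper name `pmol_on_target` is hypothetical, and the calculation assumes the volumes are simply additive on mixing.

```python
def pmol_on_target(stock_uM: float, matrix_volumes: float, spotted_uL: float) -> float:
    """Amount of protein (pmol) in the spotted aliquot.

    One volume of protein stock is mixed with `matrix_volumes` volumes of matrix,
    so the stock is diluted (1 + matrix_volumes)-fold. Since 1 uM = 1 pmol/uL,
    multiplying the diluted concentration by the spotted volume gives pmol.
    """
    diluted_uM = stock_uM / (1 + matrix_volumes)
    return diluted_uM * spotted_uL

# Values from the text: 20 uM stock, three volumes of matrix, 2 uL spotted.
amount = pmol_on_target(20.0, 3, 2.0)
print(f"{amount} pmol Dpfp1 on the plate")
```

The result agrees with the 10 pmoles quoted in the text.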
Results
RNA dot blots and Northern hybridizations
The tissue specificity of Dpfp1 is demonstrated in Figure 1. RNA dot blots show that Dpfp1 mRNA transcripts were detected only in total RNA extracts from foot tissue and not in extracts from gill, adductor muscle, or mantle tissue. Identical dot blots hybridized to an actin-specific probe were positive for all tissue types, although the strength of the signal varied considerably between tissue types (data not shown). These results are consistent with data obtained from marine byssal precursor proteins (Inoue et al., 1995, 1996a; Coyne et al., 1997; Qin et al., 1997) and support the hypothesis that Dpfp1 plays a role as a byssal structural protein. Northern blots of foot tissue mRNA indicated that Dpfp1 transcripts range in size from 1200 b to 1500 b, suggesting the presence of size variants [ILLUSTRATION FOR FIGURE 2 OMITTED].
Dpfp1 cDNA sequence
In Figure 3 the aligned nucleotide sequence data obtained from 5′ RACE, RT-PCR with degenerate oligonucleotide primers, and from the largest cDNA clone isolated are presented. Each sequence differs slightly from the others, and therefore the consensus sequence generated from this alignment does not represent any single Dpfp1 sequence. It is likely that differences in the data sets reflect the existence of Dpfp1 variants rather than errors introduced during amplification, because each set of PCR sequence data was determined from at least two independently amplified samples. The combined transcript is 1481 bp in length and contains an open reading frame of 1332 bp coding for a protein of 443 amino acids. Included in the transcript is a start codon at nucleotide position 36 and two overlapping canonical polyadenylation signals (Kozak, 1986) at nucleotide positions 1464 and 1468. The calculated molecular weight of the deduced primary sequence is 49 kDa, with a predicted isoelectric point of 5.29.
The first 19 amino acids constitute a putative signal peptide that conforms to the rule of von Heijne (1985). Computer-based modeling of signal peptide cleavage (Nielsen et al., 1997) correctly predicts cleavage of the signal peptide preceding the previously determined N-terminal glycine residue of the mature protein (Rzepecki and Waite, 1993). The N-terminus of Dpfp1, as coded for by sequences generated using 5′ RACE, differs from the previously reported N-terminal sequence (Rzepecki and Waite, 1993) in that it substitutes serine residues for threonine at position #2, tyrosine at position #3, and aspartic acid at position #10. None of the three independently generated 5′ RACE clones exactly coded for the previously reported N-terminus of Dpfp1. N-terminal sequence data generated with degenerate oligonucleotide primers more closely resemble the previously reported N-terminal sequence but also substitute serine for aspartic acid at position #10. It is not possible to determine from these data if the N-terminal sequence deduced from cDNAs generated via degenerate oligonucleotide primers reflects a genuinely different N-terminus or is simply an artifact forced by the primers used during amplification.
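The transcript coordinates quoted in the paragraph on the combined transcript can be cross-checked with simple arithmetic. The sketch below assumes that the 1332-bp open reading frame includes the stop codon (an assumption, although it is the only reading under which 1332 bp yields 443 residues) and that nucleotide positions are 1-based; the variable names are illustrative only.

```python
TRANSCRIPT_LEN = 1481   # combined transcript length (bp), from the text
START_POS = 36          # position of the first base of the start codon (1-based)
ORF_LEN = 1332          # open reading frame length (bp), assumed to include the stop codon
SIGNAL_PEPTIDE = 19     # residues in the putative signal peptide

codons = ORF_LEN // 3                  # total codons, including the stop
residues = codons - 1                  # translated residues
orf_end = START_POS + ORF_LEN - 1      # last base of the stop codon
utr5_len = START_POS - 1               # 5' untranslated bases
utr3_len = TRANSCRIPT_LEN - orf_end    # 3' untranslated bases
mature_residues = residues - SIGNAL_PEPTIDE

print(f"{residues} residues; ORF spans {START_POS}-{orf_end}")
print(f"5' UTR {utr5_len} nt, 3' UTR {utr3_len} nt, mature protein {mature_residues} residues")
```

Under these assumptions the open reading frame ends at position 1367, so the polyadenylation signals reported at positions 1464 and 1468 fall within the 3′ untranslated region, as expected.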
The N-terminal 38 amino acids of the mature protein are relatively enriched in threonine and serine residues and quickly give way to a tandemly repeating heptapeptide. This generally basic motif (P-[V/E]-Y-P-[T/S]-[K/Q]-X) is repeated 22 times in the N-terminal half of Dpfp1 with some variation, particularly at position #7 of the consensus sequence; however, proline residues at positions #1 and #4 and tyrosine residues at position #3 are highly conserved [ILLUSTRATION FOR FIGURE 4 OMITTED]. RT-PCR data differ from cDNA clone data in this region of the transcript by omission of threonine 175 and by a G190E substitution resulting from a transversion at nucleotide position 661.
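The heptapeptide consensus P-[V/E]-Y-P-[T/S]-[K/Q]-X translates directly into a regular expression, which gives one way of locating repeats in a protein sequence. The sketch below is illustrative only: `TOY_SEQUENCE` is a made-up string built from consensus-conforming repeats, not the Dpfp1 sequence, and a strict pattern like this would miss the variant repeats the text mentions.

```python
import re

# P-[V/E]-Y-P-[T/S]-[K/Q]-X, with X (any residue) written as "."
HEPTAD = re.compile(r"P[VE]YP[TS][KQ].")

def find_heptads(seq: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched text) for each non-overlapping consensus match."""
    return [(m.start(), m.group()) for m in HEPTAD.finditer(seq)]

# Hypothetical test sequence: three consensus repeats with short flanking residues.
TOY_SEQUENCE = "GTS" + "PVYPTKA" + "PEYPSQG" + "PVYPTKS" + "DD"

hits = find_heptads(TOY_SEQUENCE)
for start, motif in hits:
    print(start, motif)
```

Consecutive hits whose start positions differ by exactly seven residues are tandem repeats; relaxing individual positions of the pattern would be one way to capture the variation noted at position #7 and elsewhere.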
The C-terminal half of Dpfp1 is dominated by the previously reported 13 amino acid consensus sequence: K-P-G-P-Y-D-Y-D-G-P-Y-D-K (Rzepecki and Waite, 1993). This acidic sequence is found tandemly repeated 16 times with only slight variations from the consensus [ILLUSTRATION FOR FIGURE 4 OMITTED]. The deduced amino acid composition of the composite Dpfp1 sequence, excluding the signal peptide, agrees well with that of native Dpfp1 (Table I), suggesting that the composite sequence described above is representative of Dpfp1 mRNAs present in zebra mussel foot tissue. Examination of codon usage for Dpfp1 (Table II) reveals a significant degree of codon bias in amino acids that occur in conserved positions of the above-mentioned consensus sequences (e.g., P, Y, D, K, T, G).
Mass analysis of native Dpfp1
MALDI-TOF analysis of native Dpfp1 indicates that the purified protein is represented by two major mass variants. The lighter of the two variants has a mass [M + H]+ = 48.6 kDa, whereas in the heavier variant, [M + H]+ = 54.5 kDa. No peaks were detected in the 60-80 kDa range.
Discussion
The primary structure of Dpfp1, deduced from overlapping cDNAs, represents the first complete sequence for a dreissenid byssal protein and an important advance in understanding the attachment strategy of the zebra mussel. Two observations suggest that the composite sequence generated from these data sets is likely to resemble full-length transcripts for Dpfp1. First, the size of the composite sequence (1481 bases) closely matches the size of the largest Dpfp1 transcript as determined by Northern blots of zebra mussel foot tissue mRNA hybridized to a Dpfp1-specific probe. Second, the deduced amino acid composition of the composite sequence, excluding the signal peptide, closely matches the composition of native Dpfp1 as reported in Rzepecki and Waite (1993).
Table I
Amino acid composition of deduced and native Dpfp1

Amino acid    Native    Deduced
Asx            136.7      134.8
Thr             75.0       82.7
Ser             34.4       33.1
Glx             70.1       52.0
Pro            238.6      234.0
Gly             76.5       68.6
Ala              7.9        2.4
Val             50.4       52.0
Met              0.7        0.0
Ile              9.9        9.5
Leu             20.4       18.9
Dopa            66.6       N.D.
Tyr             84.5      165.5
Phe              9.6       14.2
His              5.1        7.1
Lys             94.8       99.3
Arg             17.0       14.2
Trp              1.8       11.8
Total:        1000.0     1000.0

The amino acid composition of deduced Dpfp1 is determined excluding signal peptide residues, and that of native Dpfp1 is from Rzepecki and Waite (1993). All values are in residues per thousand residues.
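Because Dopa arises from post-translational hydroxylation of tyrosine, the native Dopa and Tyr values in Table I are most sensibly compared with the deduced Tyr value after being summed. The sketch below makes that comparison using the Table I figures; the helper `merge_dopa` and the choice to report simple differences (deduced minus native) are illustrative assumptions rather than the authors' analysis.

```python
# Table I values, in residues per thousand residues.
NATIVE = {
    "Asx": 136.7, "Thr": 75.0, "Ser": 34.4, "Glx": 70.1, "Pro": 238.6,
    "Gly": 76.5, "Ala": 7.9, "Val": 50.4, "Met": 0.7, "Ile": 9.9,
    "Leu": 20.4, "Dopa": 66.6, "Tyr": 84.5, "Phe": 9.6, "His": 5.1,
    "Lys": 94.8, "Arg": 17.0, "Trp": 1.8,
}
DEDUCED = {
    "Asx": 134.8, "Thr": 82.7, "Ser": 33.1, "Glx": 52.0, "Pro": 234.0,
    "Gly": 68.6, "Ala": 2.4, "Val": 52.0, "Met": 0.0, "Ile": 9.5,
    "Leu": 18.9, "Tyr": 165.5, "Phe": 14.2, "His": 7.1,
    "Lys": 99.3, "Arg": 14.2, "Trp": 11.8,
}

def merge_dopa(composition: dict[str, float]) -> dict[str, float]:
    """Fold Dopa into Tyr, since a cDNA-deduced sequence can only report Tyr."""
    merged = dict(composition)
    merged["Tyr"] = merged.get("Tyr", 0.0) + merged.pop("Dopa", 0.0)
    return merged

native = merge_dopa(NATIVE)
diffs = {aa: round(DEDUCED[aa] - native[aa], 1) for aa in DEDUCED}
largest = max(diffs, key=lambda aa: abs(diffs[aa]))

print(f"native Tyr + Dopa = {native['Tyr']:.1f}, deduced Tyr = {DEDUCED['Tyr']:.1f}")
print(f"largest discrepancy: {largest} ({diffs[largest]:+.1f})")
```

With Dopa folded into tyrosine, the native value (151.1) comes much closer to the deduced tyrosine content (165.5) than the native Tyr figure alone would suggest.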
Purified native Dpfp1 was subjected to MALDI-TOF analysis to resolve the conflict between the apparent and cDNA-deduced mass estimates [TABULAR DATA FOR TABLE II OMITTED]. SDS-PAGE of native Dpfp1 established that the purified protein migrates as a doublet with apparent molecular masses of 65 and 76 kDa (Rzepecki and Waite, 1993). However, the deduced mass of Dpfp1 of 49 kDa (this work), even allowing for an additional 6.5 kDa contributed by post-translational glycosylation and hydroxylation (Rzepecki and Waite, 1993), is difficult to reconcile with the empirically determined apparent masses. According to MALDI-TOF mass spectrometric analysis, Dpfp1 exists primarily as a doublet (48.6 and 54.5 kDa) with no visible components above 60 kDa. The mass of the larger variant is in excellent agreement with the deduced mass of Dpfp1 after addition of post-translational modifications. The smaller variant may represent unmodified Dpfp1 or possibly a fully modified variant coded for by one of the smaller Dpfp1 transcripts detected during Northern blot analysis of mRNA from zebra mussel foot tissue [ILLUSTRATION FOR FIGURE 2 OMITTED]. This observation confirms that Dpfp1, like many other byssal precursor proteins (see Coyne et al., 1997; Qin et al., 1997; Taylor et al., 1996; Papov et al., 1995), migrates anomalously during SDS-PAGE.
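The mass comparison in this paragraph reduces to a few additions, sketched below with the values quoted in the text. The variable names are illustrative, and the 6.5-kDa allowance for glycosylation and hydroxylation is taken at face value from the cited estimate.

```python
DEDUCED_KDA = 49.0            # mass calculated from the deduced sequence
PTM_ALLOWANCE_KDA = 6.5       # estimated glycosylation + hydroxylation
MALDI_KDA = (48.6, 54.5)      # MALDI-TOF doublet
SDS_PAGE_KDA = (65.0, 76.0)   # apparent masses by SDS-PAGE

expected_modified = DEDUCED_KDA + PTM_ALLOWANCE_KDA

for label, observed in [("MALDI", m) for m in MALDI_KDA] + [("SDS-PAGE", s) for s in SDS_PAGE_KDA]:
    delta = round(observed - expected_modified, 1)
    print(f"{label:8s} {observed:5.1f} kDa  difference from expected modified mass: {delta:+.1f} kDa")
```

The heavier MALDI-TOF variant lies within about 1 kDa of the expected modified mass, whereas the SDS-PAGE estimates exceed it by roughly 10 to 20 kDa.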
In previous studies, isoelectric focusing of purified Dpfp1 suggested the presence of at least 10 electrophoretic variants in the polymorphic family (Rzepecki and Waite, 1993). These multiple bands may reflect differences in the primary structure of Dpfp1 variants, nonuniform post-translational modification of one or more forms of the protein, or both. At least some of the variation must arise from differences in primary structure since the N-terminus of Dpfp1 exhibited heterogeneity at two positions (#2 and #8, [ILLUSTRATION FOR FIGURE 3 OMITTED]) (Rzepecki and Waite, 1993). The nucleotide sequences presented in Figure 3 suggest the existence of at least two of these variants. Differences between these variants in regions of cDNA overlap are limited to the deletion of a single codon in the RT-PCR data and a single transversion resulting in an amino acid substitution in one of the heptapeptide sequences.
An examination of the codon usage data (Table II) indicates that compositionally dominant amino acids are predominantly coded for by half of the potentially available codons for these residues. This is especially true of proline, tyrosine, aspartic acid, lysine, threonine, and glycine residues, which together account for almost 75% of the amino acid composition of Dpfp1. The pattern of codon bias in compositionally dominant residues has also been noted in marine byssal precursor proteins - notably Mefp1 (Filpula et al., 1990), Mgfp1 (Inoue and Odo, 1994), Mcfp1 (Inoue et al., 1996b), and, to a lesser extent, Mgfp2 (Inoue et al., 1995) - and may reflect a need to express byssal structural proteins rapidly in response to developmental cues and changing environmental conditions. It is well established that in bacterial systems, codon bias is positively correlated with the rates of gene expression (Robinson et al., 1984; Varenne et al., 1984; Sorensen et al., 1989), presumably through selection of codons that recognize the most abundant isoaccepting tRNAs for a given amino acid. Precedent for this hypothesis can also be found among highly expressed genes in multicellular organisms such as Drosophila melanogaster, whose chorion genes, important eggshell components known to be highly expressed during egg development (Kafatos et al., 1987), also exhibit significant codon bias (Akashi, 1994). Such a hypothesis has also been advanced to explain observed codon bias in the highly expressed silk fibroin heavy chain of the silk moth, Bombyx mori (Mita et al., 1994).
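Codon bias of the kind described here is typically assessed by tabulating how often each synonymous codon is used for a given amino acid. The following is a minimal sketch of such a tally using the standard genetic code; `TOY_CDS` is an invented coding sequence for illustration and is not taken from the Dpfp1 cDNA, and the function name `codon_usage` is hypothetical.

```python
from collections import Counter, defaultdict

# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def codon_usage(cds: str) -> dict[str, Counter]:
    """Count codons per amino acid for an in-frame coding sequence (DNA alphabet)."""
    usage: dict[str, Counter] = defaultdict(Counter)
    for pos in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[pos:pos + 3].upper()
        usage[CODON_TABLE[codon]][codon] += 1
    return usage

# Hypothetical in-frame sequence (translates to M-P-V-Y-P-T-K-P-E-Y-P-S-K).
TOY_CDS = "".join(["ATG", "CCA", "GTT", "TAT", "CCA", "ACT", "AAA",
                   "CCA", "GAA", "TAT", "CCT", "TCA", "AAA"])

usage = codon_usage(TOY_CDS)
for aa in "PYK":
    top_codon, top_count = usage[aa].most_common(1)[0]
    total = sum(usage[aa].values())
    print(f"{aa}: {top_codon} used {top_count}/{total} times")
```

Applied to a real coding sequence, a strongly skewed count for residues such as proline or tyrosine would correspond to the bias summarized in Table II.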
More than 80% of the deduced primary amino acid sequence of Dpfp1 is composed of tandemly repeated and segregated motifs: one is a heptapeptide; the other, a tridecapeptide consensus motif that coincides with peptides sequenced previously (Rzepecki and Waite, 1993). The occurrence of two relatively short tandemly repeating motifs in Dpfp1 is consistent with its proposed role as a byssal structural protein. However, the absence of data on the distribution of Dpfp1 within the byssus makes it difficult to assign a specific role at this time. The repetitive nature of Dpfp1 is shared by many of the structural proteins of marine byssi. Two of three characterized Dopa-containing byssal proteins in Mytilus are known to be composed almost entirely of tandem repeats. Mefp1, a 110-kDa protein thought to play a role as a cuticular lacquer in the byssus of M. edulis, is dominated by nonsegregated hexa- and decapeptide repeats (Filpula et al., 1990; Waite et al., 1985; Laursen, 1992). Mgfp2, a 49-kDa plaque-specific protein of M. galloprovincialis, is largely composed of larger, epidermal growth factor-like repeats (Inoue et al., 1995).
The N-terminal half of Dpfp1 is dominated by a heptapeptide motif that is repeated 22 times with some variation, particularly at position #7 of the consensus sequence. Variability notwithstanding, the spacing of proline and tyrosine residues is well conserved, suggesting that these amino acids play an important functional role in the motif. No tryptic peptides exactly matching the deduced primary sequence could be mapped to this part of the protein; however, a fragment of one tryptic peptide (tryptic peptide #13 in fig. 6 of Rzepecki and Waite, 1993) containing the subsequence S-P-L-Y-G-W . . . is found to bridge two of the heptapeptide repeats. Although the tyrosine in this sequence is efficiently converted to Dopa, the amino acid composition of residual undigested Dpfp1 suggests that, as a whole, this region contains relatively little Dopa (Rzepecki, pers. comm).
Given the frequency of lysine and arginine in the heptapeptide repeat region, the resistance of the repeat to cleavage by trypsin is intriguing. An examination of the deduced primary sequence indicates that K-P or R-P sequences cannot be the basis for this resistance. Interestingly, lysine and arginine residues in this domain frequently occur adjacent to threonine and serine residues. That observation, coupled with the detection of high levels of threonine and N-acetylgalactosamine in partially digested tryptic peptides (Rzepecki and Waite, 1993), leads to the hypothesis that Arg and Lys are protected from trypsin cleavage by adjacent glycosylated amino acids. A similar protection appears to be imparted by glycosylated residues in an extensin-like glycoprotein from Volvox carteri (Ertl et al., 1992).
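The reasoning about trypsin resistance rests on the usual specificity rule: trypsin cleaves on the C-terminal side of lysine or arginine, except when the next residue is proline. A sketch of how candidate sites, and their proximity to serine or threonine, might be listed is given below. `TOY_HEPTADS` is a made-up string of consensus-conforming repeats rather than the Dpfp1 sequence, and both function names are hypothetical.

```python
def tryptic_sites(seq: str) -> list[int]:
    """0-based indices of K/R residues after which trypsin would be expected to cut.

    Applies the simple rule "after K or R, unless followed by P"; the final
    residue is skipped because there is no peptide bond after it.
    """
    return [i for i in range(len(seq) - 1) if seq[i] in "KR" and seq[i + 1] != "P"]

def next_to_ser_thr(seq: str, i: int) -> bool:
    """True if the residue immediately before or after position i is Ser or Thr."""
    neighbours = seq[max(i - 1, 0):i] + seq[i + 1:i + 2]
    return any(res in "ST" for res in neighbours)

# Hypothetical heptapeptide-region fragment built from consensus-like repeats.
TOY_HEPTADS = "PVYPTKA" + "PVYPSKT" + "PEYPTQS"
# Two copies of the tridecapeptide consensus reported in the text.
TRIDECA_X2 = "KPGPYDYDGPYDK" * 2

for name, seq in [("heptad toy", TOY_HEPTADS), ("tridecapeptide x2", TRIDECA_X2)]:
    sites = tryptic_sites(seq)
    flagged = [i for i in sites if next_to_ser_thr(seq, i)]
    print(f"{name}: sites {sites}, adjacent to Ser/Thr {flagged}")
```

In the toy heptapeptide fragment every predicted site sits next to a serine or threonine, which is the arrangement the glycosylation hypothesis relies on; none of the lysines there is followed by proline.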
The N-terminal half of Dpfp1 differs significantly from the C-terminal domain with its repeated 13 amino acid motif [ILLUSTRATION FOR FIGURE 4 OMITTED]. Previous peptide data (Rzepecki and Waite, 1993) and the deduced sequence of Dpfp1 are consistent with the hypothesis that glycosylation is more extensive in the N-terminal region of the protein, whereas hydroxylation of tyrosine to Dopa occurs more frequently in the remaining C-terminal portion. Additionally, the average isoelectric point of Dpfp1 in the region occupied by the heptapeptide is moderately basic (pI = 8.7), whereas the C-terminal domain is quite acidic (pI = 4.7). These divergent characteristics suggest that the segregation of motifs plays a significant role in the architectural design of the zebra mussel byssus. Recently, two byssal structural proteins from M. edulis have also been shown to be composed of "block copolymer"-like domains. Both proteins have a central collagenous core flanked by sequences resembling either elastin (Coyne et al., 1997) or silk fibroin (Qin et al., 1997). The distribution of these proteins can be used to account for the heterogeneous mechanical properties of byssus in M. edulis (Qin and Waite, 1995).
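The contrast in charge between the two domains can be seen, at least qualitatively, by counting basic and acidic side chains in the consensus motifs. The sketch below is a crude tally, not an isoelectric point calculation: it ignores histidine, chain termini, Dopa, and glycosylation, and it uses only the six defined positions of the heptapeptide with valine at position #2 (an assumption, since that position can also be glutamic acid).

```python
BASIC = set("KR")    # side chains counted as +1 near neutral pH
ACIDIC = set("DE")   # side chains counted as -1 near neutral pH

def net_side_chain_charge(seq: str) -> int:
    """Basic minus acidic residues; a rough indicator of charge, not a pI."""
    return sum(res in BASIC for res in seq) - sum(res in ACIDIC for res in seq)

HEPTAD_CORE = "PVYPTK"           # consensus positions #1-#6, Val variant (assumed)
TRIDECAPEPTIDE = "KPGPYDYDGPYDK"  # consensus reported in the text

for name, motif in [("heptapeptide core", HEPTAD_CORE), ("tridecapeptide", TRIDECAPEPTIDE)]:
    print(f"{name}: net side-chain charge {net_side_chain_charge(motif):+d}")
```

The tally gives +1 per heptapeptide core and -1 per tridecapeptide, in line with the basic and acidic character reported for the two regions.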
Although the consensus motifs of Dpfp1 do not have strong homologies with any known structural proteins, they do share some features with other proteins containing tandem repeats - i.e., marine adhesives (Laursen, 1992), extensin-like proteins from plants (Kieliszewski and Lamport, 1994), and a trematode eggshell protein (Wells and Cordingley, 1992) [ILLUSTRATION FOR FIGURE 5 OMITTED]. The β-turn (Pro Val) and lysine of the heptapeptide are prominent in extensin (soybean PRP) (Hong et al., 1987) and adhesive protein (Waite et al., 1985). In addition, although not a repeating sequence, the PEVK domain of titin, a protein of skeletal muscle, contains at least 27 occurrences of the motif PVPXnK, in which Xn can be from one to three amino acids long (Labeit and Kolmerer, 1995). The tridecapeptide of Dpfp1, in contrast, shares the repeated proximity of YD with a trematode eggshell protein (Wells and Cordingley, 1992), although the latter notably lacks proline [ILLUSTRATION FOR FIGURE 5 OMITTED]. Curiously, all these proteins have one thing in common: they are significant components of structures that function in tension.
Acknowledgments
We thank Alan Jordan for year-round collections of D. polymorpha and the National Sea Grant Program of NOAA for support. Drs. John McDonald and Alison Hunt provided generous assistance with DNA sequencing. Dr. Lesz Rzepecki generously provided samples of native Dpfp1, and Luis Burzio provided assistance and advice regarding analysis of Dpfp1 by mass spectrometry. Research was supported by grants from the National Oceanic and Atmospheric Administration (94-3501-0115) and the Office of Naval Research (N00014-96-1-1205) to JHW. KEA was supported in part by USPHS Grant T32-GM08550. cDNA sequences have been submitted to GenBank (Accession # AF043221; AF043222; AF043223).
Literature Cited
Ackerman, J. D., C. R. Ethier, D. G. Allen, and J. K. Spelt. 1992. Investigation of zebra mussel adhesion strength using rotating disks. J. Environ. Eng. 118: 708-724.
Akashi, H. 1994. Synonymous codon usage in Drosophila melanogaster: natural selection and translational accuracy. Genetics 135: 927-935.
Chomczynski, P., and N. Sacchi. 1987. Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal. Biochem. 162: 156-159.
Coyne, K. J., X. X. Qin, and J. H. Waite. 1997. Extensible collagen in mussel byssus - a natural block-copolymer. Science 277: 1830-1832.
Eddington, N. D. 1996. Partial oligonucleotide sequence of a mussel byssal precursor protein, Dreissena polymorpha foot protein 2. M. S. thesis, University of Delaware, Lewes.
Ertl, H., A. Hallmann, S. Wenz, and M. Sumper. 1992. A novel extensin that may organize extracellular matrix biogenesis in Volvox carteri. EMBO J. 11: 2055-2062.
Filpula, D. R., S. M. Lee, R. P. Link, S. L. Strausberg, and R. L. Strausberg. 1990. Structural and functional repetition in a marine mussel adhesive protein. Biotechnol. Prog. 6: 171-177.
Frohman, M. A., M. K. Dush, and G. R. Martin. 1988. Rapid production of full-length cDNAs from rare transcripts: amplification using a single gene-specific oligonucleotide primer. Proc. Natl. Acad. Sci. USA 85: 8998-9002.
Hong, J. C., R. T. Nagao, and J. L. Key. 1987. Characterization and sequence analysis of a developmentally regulated putative cell wall protein gene isolated from soybean. J. Biol. Chem. 262: 8367-8376.
Inoue, K., and S. Odo. 1994. The adhesive protein cDNA of Mytilus galloprovincialis encodes decapeptide repeats but no hexapeptide motif. Biol. Bull. 186: 349-355.
Inoue, K., Y. Takeuchi, D. Miki, and S. Odo. 1995. Mussel adhesive plaque protein gene is a novel member of epidermal growth factor- like gene family. J. Biol. Chem. 270: 6698-6701.
Inoue, K., Y. Takeuchi, D. Miki, S. Odo, S. Harayama, and J. H. Waite. 1996a. Cloning, sequencing and sites of expression of genes for the hydroxyarginine-containing adhesive-plaque protein of the mussel Mytilus galloprovincialis. Eur. J. Biochem. 239: 172-176.
Inoue, K., Y. Takeuchi, S. Takeyama, E. Yamaha, F. Yamazaki, S. Odo, and S. Harayama. 1996b. Adhesive protein cDNA sequence of the mussel Mytilus coruscus and its evolutionary implications. J. Mol. Evol. 43: 348-356.
Johnson, L. E., and D. K. Padilla. 1996. Geographic spread of exotic species: ecological lessons and opportunities from the invasion of the zebra mussel Dreissena polymorpha. Biol. Conserv. 78: 23-33.
Kafatos, F. C., N. Spoerel, S. A. Mitsialis, H. T. Nguyen, C. Ramano, J. R. Lingappa, B. D. Mariani, G. C. Rodakis, R. Leganidou, and S. G. Tsitilou. 1987. Developmental control and evolution in the chorion gene families of insects. Adv. Genet. 24: 223-242.
Kieliszewski, M. J., and D. T. A. Lamport. 1994. Extensin: repetitive motifs, functional sites, post-translational codes, and phylogeny. Plant J. 5: 157-172.
Kozak, M. 1986. Point mutations define a sequence flanking the AUG initiator codon that modulates translation by eukaryotic ribosomes. Cell 44: 283-292.
Labeit, S., and B. Kolmerer. 1995. Titins: giant proteins in charge of muscle ultrastructure and elasticity. Science 270: 293-296.
Laursen, R. A. 1992. Reflections on the structure of mussel adhesive proteins. Pp. 55-74 in Structure, Cellular Synthesis and Assembly of Biopolymers, S. T. Case, ed. Springer Verlag, Berlin.
Mita, K., S. Ichimura, and T. C. James. 1994. Highly repetitive structure and its organization of the silk fibroin gene. J. Mol. Evol. 38: 583-592.
Nielsen, H., J. Engelbrecht, S. Brunak, and G. von Heijne. 1997. Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Eng. 10: 1-6.
Papov, V. V., T. V. Diamond, K. Biemann, and J. H. Waite. 1995. Hydroxyarginine-containing polyphenolic proteins in the adhesive plaques of the marine mussel Mytilus edulis. J. Biol. Chem. 270: 20183-20192.
Patwary, M. U., M. E. Reith, and E. L. Kenchington. 1996. Isolation and characterization of a cDNA encoding an actin gene from sea scallop (Placopecten magellanicus). J. Shellfish Res. 15: 265-271.
Qin, X. X., and J. H. Waite. 1995. Exotic collagen gradients in the byssus of the mussel Mytilus edulis. J. Exp. Biol. 198: 633-644.
Qin, X. X., K. J. Coyne, and J. H. Waite. 1997. Tough tendons: mussel byssus has collagen with silk-like domains. J. Biol. Chem. 272: 32623-32627.
Roberts, L. 1990. Zebra mussel invasion threatens U.S. waters. Science 249: 1370-1372.
Robinson, M., R. Lilley, S. Little, J. S. Emtage, G. Yamamoto, P. Stephens, A. Millican, M. Eaton, and G. Humphreys. 1984. Codon usage can affect efficiency of translation of genes in Escherichia coli. Nucleic Acids Res. 12: 6663-6671.
Rzepecki, L. M., and J. H. Waite. 1993. The byssus of the zebra mussel, Dreissena polymorpha. II: structure and polymorphism of byssal polyphenolic protein families. Mol. Mar. Biol. Biotechnol. 2: 267-279.
Sambrook, J., E. F. Fritsch, and T. Maniatis. 1989. Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
Sorensen, M. A., C. G. Kurland, and S. Pedersen. 1989. Codon usage determines translation rate in Escherichia coli. J. Mol. Biol. 207: 365-377.
Taylor, S. W., D. B. Chase, M. H. Emptage, M. J. Nelson, and J. H. Waite. 1996. Ferric ion complexes of DOPA-containing adhesive protein from Mytilus edulis. Inorg. Chem. 35: 7572-7577.
Varenne, S., J. Buc, R. Lloubes, and C. Lazdunski. 1984. Translation is a non-uniform process. Effect of tRNA availability on the rate of elongation of nascent polypeptide chains. J. Mol. Biol. 180: 549-576.
von Heijne, G. 1985. Signal sequences: the limits of variation. J. Mol. Biol. 184: 99-105.
Waite, J. H. 1990. The phylogeny and chemical diversity of quinone-tanned glues and varnishes. Comp. Biochem. Physiol. 97B: 19-29.
Waite, J. H., T. J. Housley, and M. L. Tanzer. 1985. Peptide repeats in a mussel glue protein: theme and variation. Biochemistry 24: 5010-5014.
Wells, K. E., and J. S. Cordingley. 1992. The cell and molecular biology of eggshell formation in Schistosoma mansoni. Pp. 97-114 in Structure, Cellular Synthesis and Assembly of Biopolymers, S. T. Case, ed. Springer Verlag, Berlin.
Author: Anderson, Kevin E.; Waite, J. Herbert
Publication: The Biological Bulletin
Date: Apr 1, 1998