BIOLOGICAL EFFECTS OF THE GROWTH HORMONE GENE POLYMORPHISM IN THE AMERICAN MINK (NEOVISON VISON SCHREB., 1777) - DOES SYNONYMOUS ALWAYS MEAN SILENT?
Most often, genetic variation causing phenotypic changes is assessed exclusively on the basis of non-synonymous variation. However, it is increasingly recognised that synonymous mutations and polymorphisms, as well as nucleotide variation present in introns, can have serious biological effects. In this work, the potential biological effect (including molecular phenotype) of nucleotide variation within the growth hormone gene in American mink (Neovison vison Schreb., 1777) was evaluated, based on multilocus genotypes of 389 individuals (wild mink from Canada, six colour types of ranch mink, and mink obtained from the natural environment in Poland and Iceland), identified by direct sequencing. The focus is on possible changes in splicing regulatory sequences and sequence motifs, on differences in codon reading and on the influence on mRNA secondary structure.
The results obtained confirm the hypothesis that synonymity of single-nucleotide variation does not always signify its neutrality.
Key words: American mink, synonymous mutation, silent mutation, biological effect, growth hormone gene
The primary measure of nucleotide variation in a gene is the occurrence of non-synonymous variation (Loewe et al., 2006). However, it is increasingly emphasized that synonymous mutations can have serious biological effects, as synonymity is not always equivalent to phenotypic neutrality, especially at the molecular level (Nackley et al., 2006, Parmley and Hurst, 2007). Moreover, a relevant biological effect can be linked to nucleotide alterations located in introns (Chorev and Carmel, 2012). The consequences of such genetic variation may include changes in splicing regulatory sequences and sequence motifs, differences in codon reading and an influence on mRNA secondary structure (Shabalina et al., 2006, Parmley and Hurst, 2007, Hsu et al., 2010, Gingold and Pilpel, 2011).
In this work, the potential biological effect of nucleotide variation within the growth hormone gene (mGH) in American mink (Neovison vison Schreb., 1777) is evaluated. The focus is primarily on the macromolecular phenotype, including gene expression heterogeneity and RNA phenotypes (Ferrada and Wagner, 2012, Wagner, 2014).
MATERIALS AND METHODS
The study involved 389 animals: 26 Canadian wild mink, 295 animals representing six farm colour types (Brown, Scanblack, Sapphire, Pearl, Black-Cross and Sapphire-Cross), 28 feral animals from north-west Poland and 40 from Iceland. Samples from farm animals came from slaughter waste, and samples from feral animals from Poland came from carcasses of animals killed on roads. For the wild animals from Canada and the feral animals from Iceland, the authors received isolated, ready-to-use DNA samples, made available by courtesy of Prof. H. Farid from Dalhousie University and R. A. Stefansson from the West Iceland Centre of Natural History, respectively. Genomic DNA was isolated from muscle tissue (High Pure PCR Template Preparation Kit, Roche). The quality of the extracted DNA was determined by agarose gel electrophoresis (AGE) on 1.0% w/v agarose. Standard PCR and nested PCR were used to amplify the growth hormone gene.
In order to obtain amplicons of optimal length for sequencing, two sets of internal and external primers were designed for two separate nested PCRs, and one set of primers for standard PCR (Tab. 1), based on the mGH sequence (GenBank: JX489617.2). DNA amplification was performed in a reaction volume of 15 µl, using a ready-to-use 2x PCR mixture (A&A Biotechnology). All PCRs consisted of initial denaturation at 94°C for 5 min, 35 cycles of denaturation at 94°C for 40 s, annealing at 52°C (nested PCR) or 55°C (standard PCR) for 40 s and polynucleotide chain elongation at 72°C for 40 s, followed by a final extension at 72°C for 5 min. Products of standard PCR and of nested PCR with internal primers were separated by agarose gel electrophoresis (1.5% gel, 7 V/cm). To assess the size of the DNA fragments, the pUC19/MspI DNA marker (A&A Biotechnology) and the GeneRuler™ 1 kb DNA Ladder (Fermentas) were used.
In order to determine the nucleotide sequence, the amplicons obtained from all 389 tested animals were subjected to Sanger sequencing. Cycle sequencing was performed using a BigDye® Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems). The sequencing reaction consisted of initial denaturation at 96°C for 1 min, followed by 25 cycles of denaturation at 96°C for 10 s, annealing at 50-52°C for 5 s and polynucleotide chain synthesis at 60°C for 4 min. The sequencing reaction products were then cleaned using a ready-to-use ExTerminator kit (A&A Biotechnology). The purified products were separated and read in a capillary sequencer, a 3730xl DNA Analyser (Applied Biosystems). The chromatograms were read in FinchTV v.1.4 software (Geospiza).
To examine possible changes in splicing regulatory sequences and sequence motifs relative to the wild genotype, and the influence of the identified genetic variation on the splicing of the mGH gene, the consensus haplotype (5'-G-G-G-T-G-G-G-A-C-C-A-(C-TCTTGCAGGGGCAGGGG)-T-3') was compared with a hypothetical haplotype that carried changes at all variable loci (5'-C-A-A-C-A-A-C-G-del-T-G-(del)-G-C-3'). The comparison was made with Human Splicing Finder v.2.4.1 (Desmet et al., 2009), and the presence of cryptic splicing sites (css) was tested with Alternative Splice Site Predictor (Wang and Marin, 2006). Based on data from the Codon Usage Database (Nakamura, 2007), the effect of changes within triplets in the third exon on the frequency of synonymous codon usage was compared.
This was done by means of the codon fraction (cf) parameter, which expresses the relative share of a given synonymous codon in coding its respective amino acid. The higher the parameter value, the more frequently a given synonymous codon is used for coding a certain amino acid and, potentially, the more efficient the expression with its usage is (Gingold and Pilpel, 2011). Codon fraction values were compared not only for each codon with identified variability, but also as totals over all changed codons for the possible multilocus combinations in the third exon of the mGH gene. Models of the secondary structure of mRNA were generated with the CentroidFold software (Hamada et al., 2009). The influence of the identified nucleotide variation on the secondary structure of the mature mRNA was then evaluated.
To predict the secondary structure in silico, the generalized centroid estimator method was employed (Hamada et al., 2009). The analysis was performed for possible variants of the structure of mature mRNA, considering seven possible multilocus combinations for the third exon. The Gibbs free energy (G) of the predicted structure served as a comparative parameter. The Gibbs function represented its thermodynamic stability and reactivity (Chen and Dill, 2000).
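The codon fraction described above can be illustrated with a minimal sketch. The counts below are hypothetical values chosen only for illustration; they are not taken from the Codon Usage Database or from the mGH data, and the function names are assumptions introduced here.

```python
from collections import defaultdict

# Hypothetical codon counts, for illustration only (not database values).
EXAMPLE_COUNTS = {
    "GCT": 20, "GCC": 40, "GCA": 15, "GCG": 25,  # alanine codons
    "AAA": 30, "AAG": 70,                        # lysine codons
}

# Amino acid encoded by each example codon (standard genetic code).
CODON_TO_AA = {
    "GCT": "Ala", "GCC": "Ala", "GCA": "Ala", "GCG": "Ala",
    "AAA": "Lys", "AAG": "Lys",
}


def codon_fractions(counts, codon_to_aa):
    """Return, for each codon, its share among all codons of the same amino acid."""
    totals = defaultdict(int)
    for codon, n in counts.items():
        totals[codon_to_aa[codon]] += n
    return {codon: n / totals[codon_to_aa[codon]] for codon, n in counts.items()}


def cf_change(fractions, consensus_codon, changed_codon):
    """Difference in codon fraction caused by a synonymous substitution."""
    return fractions[changed_codon] - fractions[consensus_codon]


fractions = codon_fractions(EXAMPLE_COUNTS, CODON_TO_AA)
# A positive value means the changed codon is used more often than the consensus one.
delta = cf_change(fractions, "AAA", "AAG")
```

With real data, the counts would be replaced by the codon usage table of the species in question; a positive `delta` corresponds to the situation in which a less frequently used codon is replaced by a more frequently used one.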
The results of sequencing revealed the presence of 14 polymorphic loci: g.703G>A, g.742G>A, g.748T>C, g.775G>A, g.778G>A, g.846A>G, g.931C>T, g.1156A>G, g.1329T>C, g.616G>C, g.837G>C, g.1219C>G, g.885delC, g.1219_1236delCTCTTGCAGGGGCAGGGG. Compared with the consensus haplotype, the haplotype with changes in 14 loci showed 4 new branch points and the loss of 2, 11 new exonic splicing enhancers (ESE) and the loss of 8, 4 new exonic splicing silencers (ESS) and the loss of 4, as well as the presence of 2 donor css and the loss of 1 acceptor css. The most important differences involved the following regions: second intron-third exon, fourth exon-fourth intron and fifth exon-fifth intron. Each substitution in the analysed sequence is connected with an increase in the cf parameter value. This, in turn, means that a codon with a lower frequency of usage is replaced with a codon with a higher frequency.
It can signify a selective promotion and establishment of this type of single-nucleotide variation and explains the shortage of examples of substitutions that would decrease the frequency of usage of codons for the mGH gene. The average difference between the usage of codons from the consensus sequence and the changed codons is 0.13 when described by the cf value. Considering the efficiency of translation, the least favourable is the multilocus consensus combination 5'-G-G-T-G-G-3', which has a total cf value of 0.69; in theory, the most efficient are the combinations 5'-G-G-T-A-A-3' and 5'-A-G-T-A-G-3', for which the total cf value is 0.92. The possible variants of the secondary structure of the mature mRNA were analysed for the 7 possible multilocus combinations of the third exon.
As a result of this analysis, two models of the secondary structure were found: a regular Y-shaped model that is characteristic of the sequences 5'-G-G-T-G-G-3', 5'-G-G-T-G-A-3', 5'-G-G-T-A-A-3', 5'-G-G-C-G-G-3', 5'-A-G-T-G-G-3' and 5'-A-G-T-A-G-3' (Fig. 1a), and a less regular one that is characteristic of the sequence 5'-G-A-T-G-G-3' (Fig. 1b). An influence of nucleotide variation on the thermodynamic parameters of the mature mRNA molecule was also found. The lowest free energy in the mature RNA was found for the sequence 5'-G-G-C-G-G-3' (-239.84 kcal/mol) and the highest for 5'-A-G-T-A-G-3' (-220.83 kcal/mol). The difference between the highest and lowest free energy values was a little more than 19.0 kcal/mol.
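The comparison of free energies reported above can be written out as a short worked example. Only the two values given in the text are used; the variable names are assumptions introduced here, and the sketch does not reproduce the CentroidFold prediction itself.

```python
# Free energies (kcal/mol) reported in the text for two third-exon combinations.
reported_free_energy = {
    "5'-G-G-C-G-G-3'": -239.84,
    "5'-A-G-T-A-G-3'": -220.83,
}

# The more negative the free energy, the more thermodynamically stable the structure.
most_stable = min(reported_free_energy, key=reported_free_energy.get)
least_stable = max(reported_free_energy, key=reported_free_energy.get)

# Spread between the highest and the lowest reported value.
spread = reported_free_energy[least_stable] - reported_free_energy[most_stable]
print(f"{most_stable} is more stable than {least_stable} by {spread:.2f} kcal/mol")
```

The spread evaluates to 19.01 kcal/mol, in line with the difference of a little more than 19.0 kcal/mol stated in the text.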
Although all substitutions identified within the cDNA of the mGH gene are silent, and thus the nucleotide sequence variation of this gene had no effect on the amino acid sequence of its protein product, the described genetic diversity was found to be associated with changes in splicing regulatory sequences and sequence motifs, with codon usage bias and with an influence on the secondary structure of mRNA for the mGH gene. Five of the detected substitutions are located within the third exon, and they are all silent, as assumed for the genetic code of mammals. In each case, the variation takes place in the third nucleotide of a codon. This is relevant because, according to the 'wobble hypothesis', some deviations from the standard complementarity of nucleobases are possible between the last nucleotide of the mRNA codon and the first nucleotide of the tRNA anticodon. The substitutions within the third exon of the mGH gene are of the types G>A and T>C.
In both cases, the consensus nucleotide and the changed nucleotide are recognised by the same nucleotide in the anticodon, i.e. A or G in the third position of the codon is recognised by U in the first position of the anticodon and, respectively, C and U are recognised by G (Gabryelska and Barciszewski, 2011). The occurrence of more than 800 SNPs within splicing motifs has been found in the genome of domestic cattle (Kawahara-Miki et al., 2011). The presence of that many differences may suggest that genetic variability influences not only the process of regular gene splicing, but also alternative splicing (Modrek et al., 2001). The process of alternative splicing of the growth hormone gene has been described especially meticulously in humans (Solis et al., 2008).
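Whether a third-position substitution of a given type is silent can be checked against the standard genetic code. The following is a minimal sketch under that assumption; it does not use the mGH codons themselves, which are not listed in the text, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def third_position_exceptions(ref_base, alt_base):
    """Codons ending in ref_base whose meaning changes when the last base becomes alt_base."""
    exceptions = {}
    for first, second in product(BASES, repeat=2):
        ref_codon = first + second + ref_base
        alt_codon = first + second + alt_base
        if CODON_TABLE[ref_codon] != CODON_TABLE[alt_codon]:
            exceptions[ref_codon] = (CODON_TABLE[ref_codon], CODON_TABLE[alt_codon])
    return exceptions


g_to_a = third_position_exceptions("G", "A")
t_to_c = third_position_exceptions("T", "C")
print("non-silent third-position G>A codons:", g_to_a)
print("non-silent third-position T>C codons:", t_to_c)
```

Under the standard code, a third-position T>C change never alters the amino acid, and a third-position G>A change does so only in the ATG and TGG codons.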
Of vital importance in this process are enhancing regulatory sequences and alternative splicing sites within the third exon and the third intron, such as the ESE1 sequence (located in nucleotides 1-7 of this exon), ESE3 (in nucleotides 83-89) and many intronic splicing enhancer motifs (ISE) G2X(1-4)G3 (McCarthy and Phillips III, 1998, Solis et al., 2008, Babu et al., 2012). In the third exon of the mGH gene (this region corresponds to all the exonic splicing enhancers present in humans and described above), the SNP g.702G>A (EESE1), the mutation g.774G>A and the SNP g.777G>A (EESE3) were identified. In the third and fourth exons of the human growth hormone gene (hGH1), considerable single-nucleotide polymorphism linked to conventional or alternative gene splicing has been found (Millar et al., 2010).
Therefore, a high degree of variation within splicing sequences identified in matching locations of the mGH gene fragments may be meaningful when addressing the process of growth hormone gene splicing in American mink. According to the codon usage bias, certain synonymous codons occur more often in some phylogenetic lines (Behura and Severson, 2013). As a consequence, there are differences in the frequency of different tRNA types for a given amino acid and, accordingly, they are variably available in a cell, and the efficiency of translation by means of a given codon is also variable (Cannarozzi et al., 2010). In vertebrates, a strong positive correlation has been observed between GC content and the presence of the nucleotide C or G in the third position of codons (Palidwor et al., 2010). The mGH consensus sequence is in line with this model, while deviations concern mainly single-nucleotide variation.
This agrees with the assumption of the 'mutational bias' hypothesis for transitions that substitutions of the type GC→AT are more numerous than those of the type AT→GC (Smith and Eyre-Walker, 2001). This clearly corresponds to the higher value of the cf parameter for codons with an identified mutation/SNP. A significant consequence of synonymous nucleotide variation is the potential influence on the secondary structure of mRNA (Ritz et al., 2012). Modelling the secondary structure of the mature RNA indicates that even small variation in the primary structure of RNA results in either the introduction or the loss of whole structural elements. Other studies show that even a single SNP can have considerable functional consequences for the secondary structure of different types of RNA (Glinsky, 2008). A more serious biological effect, however, can be caused by the combined effect of numerous single-nucleotide polymorphisms (Ritz et al., 2012).
Natural selection favouring energetically stable variants of the mRNA structure was described in, among others, Chamary and Hurst (2005). They pointed to the lack of neutral (when it comes to phenotypic manifestation) character of synonymous substitutions and to an evolutionary trend in mammals which is manifested by selective favouring of the nucleotide variation that conditions more energetic stability of mRNA molecules. An important conclusion drawn from these results is the actual non-neutrality of the described synonymous variation. At the same time it has to be remembered that the directional selection mentioned should not be favourable to ultra-stable structures (Chamary and Hurst, 2005).
This is because, as opposed to thermodynamically unstable mRNA structures, which are characterised by a shorter cytoplasmic half-life, ultra-stable molecules demonstrate decreased reactivity, lowered susceptibility to interaction with regulatory factors and a reduced ability to bind ribosomes (Kozak, 2002, Duan and Antezana, 2003, Nackley et al., 2006). Researchers now emphasize the need for more detailed studies assessing the influence of synonymous nucleotide variation on mRNA, as the regulation of the expression of genetic information at the RNA level may be critically important, and the knowledge available on this topic is not satisfactory when compared to its potential significance (Johnson et al., 2011).
A crucial conclusion from the debate on the biological significance of synonymous nucleotide variation, and not only in the context of the growth hormone gene in American mink, is the potentially high importance of variation identified in non-coding sequences (Chamary and Hurst, 2004). Many of the potential mechanisms described above, by which the identified single-nucleotide variation may cause a phenotypic effect, are connected with genetic polymorphisms within introns. The results obtained lead to a conclusion that is compatible with an increasingly popular and better documented trend in genomics, one that emphasizes the high but neglected importance of synonymous variation in the first place, but also of nucleotide variation occurring in non-coding sequences (Sauna and Kimchi-Sarfaty, 2011).
Table 1. Summary of nucleotide sequences of primers used for the amplification of the mGH gene fragments.
Acknowledgments: This research was funded by the Polish National Science Centre and would not have been possible without the kind support of Prof. Hossain Farid from Dalhousie University, Canada and Mr Robert Arnar Stefansson and Ms Menja von Schmalensee from West Iceland Centre of Natural History, Iceland.
Babu, D., I. Fusco, S. Mellone, M. Godi, A. Petri, F. Prodam, S. Bellone, P. Momigliano-Richiardi, G. Bona and M. Giordano (2012). Identification of novel Exon Splice Enhancers (ESES) in the growth hormone gene (GH1) mutated in isolated GH Deficiency (IGHD). American Society of Human Genetics 62nd Annual Meeting. San Francisco: 717.
Behura, S.K. and D.W. Severson (2013). Codon usage bias: causative factors, quantification methods and genome-wide patterns: with emphasis on insect genomes. Biol. Rev. 88:49-61.
Cannarozzi, G., N.N. Schraudolph, M. Faty, P. von Rohr, M.T. Friberg, A.C. Roth, P. Gonnet, G. Gonnet and Y. Barral (2010). A role for codon order in translation dynamics. Cell 141:355-367.
Chamary, J.V. and L.D. Hurst (2004). Similar rates but different modes of sequence evolution in introns and at exonic silent sites in rodents: evidence for selectively-driven codon usage. Mol. Biol. Evol. 21: 1014-1023.
Chamary, J.V. and L.D. Hurst (2005). Evidence for selection on synonymous mutations affecting stability of mRNA secondary structure in mammals. Genome Biol. 6:R75.
Chen, S.-J. and K.A. Dill (2000). RNA folding energy landscapes. PNAS 97:646-651.
Chorev, M. and L. Carmel (2012). The function of introns. Front. Genet. 3:doi:10.3389/fgene.2012.00055.
Desmet, F.-O., D. Hamroun, M. Lalande, G. Collod-Béroud, M. Claustres and Ch. Béroud (2009). Human Splicing Finder: an online bioinformatics tool to predict splicing signals. Nucleic Acids Res. 37:doi:10.1093/nar/gkp215.
Duan, J., M.S. Wainwright, J.M. Comeron, N. Saitou, A.R. Sanders, J. Gelernter and P.V. Gejman (2003). Synonymous mutations in the human dopamine receptor D2 (DRD2) affect mRNA stability and synthesis of the receptor. Hum. Mol. Genet. 12:205-216.
Ferrada, E. and A. Wagner (2012). A comparison of genotype-phenotype maps for RNA and proteins. Biophys. J. 102:1916-1925.
Gabryelska, M.M. and J. Barciszewski (2011). Odyseja 1961: 50 Lat Kodu Genetycznego [The Odyssey 1961: 50 years of the genetic code]. Nauka 3:77-88.
Gingold, H. and Y. Pilpel (2011). Determinants of translation efficiency and accuracy. Mol. Syst. Biol. 7:481.
Glinsky, G.V. (2008). SNP-guided microRNA maps (MirMaps) of 16 common human disorders identify a clinically accessible therapy reversing transcriptional aberrations of nuclear import and inflammasome pathways. Cell Cycle 22:3564-3576.
Hamada, M., H. Kiryu, K. Sato, T. Mituyama and K. Asai (2009). Predictions of RNA secondary structure using generalized centroid estimators. Bioinformatics 25:465-473.
Hsu, F. R., W. Ch. Shia, W. J. Lo, H.Ch. Lin and H.-Y. Chang (2010). Discovering the relationship between single nucleotide polymorphisms and alternative splicing events. Proc. of the 10th WSEAS international conference on applied informatics and communications, and 3rd WSEAS international conference on Biomedical electronics and biomedical informatics. World Scientific and Engineering Academy and Society (WSEAS). Stevens Point:333-340.
Johnson, A. D., H. Trumbower and W. Sadee (2011). RNA structures affected by single nucleotide polymorphisms in transcribed regions of the human genome. Webmed Central BIOINFORMATICS 2: WMC001600.
Kawahara-Miki, R., K. Tsuda, Y. Shiwa, Y. Arai-Kichise, T. Matsumoto, Y. Kanesaki, S. Oda, S. Ebihara, S. Yajima, H. Yoshikawa and T. Kono (2011). Whole-genome resequencing shows numerous genes with nonsynonymous SNPs in the Japanese native cattle Kuchinoshima-Ushi. BMC Genomics 12:103 (1-8).
Kozak, M. (2002). Pushing the limits of the scanning mechanism for initiation of translation. Gene 299:1-34.
Loewe, L., B. Charlesworth, C. Bartolomé and V. Nöel (2006). Estimating selection on nonsynonymous mutations. Genetics 172:1079-1092.
McCarthy, E. M. S. and J. A. Phillips III (1998). Characterization of an intron splice enhancer that regulates alternative splicing of human GH pre-mRNA. Human Molecular Genetics 7:1491-1496.
Millar, D.S., M. Horan, N.A. Chuzhanova and D.N. Cooper (2010). Characterisation of a functional intronic polymorphism in the human growth hormone (GH1) gene. Hum. Gen. 5:289-301.
Modrek, B., A. Resch, C. Grasso and C. Lee (2001). Genome-wide analysis of alternative splicing using human expressed sequence data. Nucleic Acids Res. 29:2850-2859.
Nackley, A.G., S.A. Shabalina, I.E. Tchivileva, K. Satterfield, O. Korchynskyi, S.S. Makarov, W. Maixner and L. Diatchenko (2006). Human catechol-O-methyltransferase haplotypes modulate protein expression by altering mRNA secondary structure. Science 5807:1930-1933.
Nakamura, Y. (2007). Codon Usage Database: http://www.kazusa.or.jp/codon/.
Palidwor, G.A., T.J. Perkins and X. Xia (2010). A general model of codon bias due to GC mutational bias. PLoS One 27:e13431.
Parmley, J. L. and L. D. Hurst (2007). How do synonymous mutations affect fitness? Bioessays 29:515-519.
Ritz, J., J.S. Martin and A. Laederach (2012). Evaluating our ability to predict the structural disruption of RNA by SNPs. BMC Genomics 13, Suppl. 4:S6 1-11.
Sauna, Z.E. and C. Kimchi-Sarfaty (2011). Understanding the contribution of synonymous mutations to human disease. Nat. Rev. Genet. 12:683-691.
Shabalina, S.A., A.Y. Ogurtsov and N.A. Spiridonov (2006). A periodic pattern of mRNA secondary structure created by the genetic code. Nucleic Acids Res. 34:2428-2437.
Smith, N.G.C. and A. Eyre-Walker (2001). Synonymous codon bias is not caused by mutation bias in G+C-rich genes in humans. Mol. Biol. Evol. 18:982-986.
Solis, A.S., R. Peng, J.B. Crawford, J.A. Phillips III and J.G. Patton (2008). Growth hormone deficiency and splicing fidelity: two serine/arginine-rich proteins, ASF/SF2 and SC35, act antagonistically. J. Biol. Chem. 283:23619-23626.
Wagner, A. (2014). Mutational robustness accelerates the origin of novel RNA phenotypes through phenotypic plasticity. Biophys. J. 106:955-965.
Wang, M. and A. Marin (2006). Characterization and prediction of alternative splice sites. Gene 366:219-227.