Bioinformatics Characterization of Growth Differentiation Factor 11 of Oryctolagus cuniculus.
Summary: Growth differentiation factor 11 (GDF11) is a member of the transforming growth factor-[beta] (TGF-[beta]) superfamily and acts as a regulator of aging in multiple tissues. The mature forms of human GDF11 and GDF8 show more than 90% sequence identity, but the role of GDF11 is not well understood. For detailed annotation of GDF11 from Oryctolagus cuniculus, the mature segment of GDF11 was analyzed with different online bioinformatics tools for its physicochemical properties and secondary structure. Multiple secondary structure prediction tools were used and the information they gave about rabbit GDF11 was compared. All tools gave different information about the secondary structure of GDF11, and it was concluded that no tool is perfect for prediction and that multiple tools should be tried for protein secondary structure prediction.
The final prediction should be selected manually and could be supported by tertiary structure prediction of the protein. For phylogeny, four closely related homologous proteins of the TGF-[beta] superfamily were selected, i.e. GDF11, GDF8, inhibin and Bone Morphogenetic Protein 7 (BMP7). Ten sequences of each protein from the same mammals were retrieved from the GenPept database and included in the phylogenetic analysis to show evolutionary relatedness among these proteins. All selected proteins were resolved in the tree and appeared in their respective clusters. The phylogenetic tree confirmed that GDF11 and GDF8 are paralogs, as both proteins appear to be derived from a common ancestor in the tree. The findings of this study will help to understand the interactions of rabbit GDF11 with different ligands and the molecular details of how these interactions occur.
Keywords: GDF11, GDF8, Rabbit, Secondary structure prediction, Phylogeny.
The transforming growth factor-[beta] (TGF-[beta]) superfamily is a large group of secreted proteins with different roles in tissue homeostasis and embryonic development. On the basis of structural similarities and downstream signaling pathways, the TGF-[beta] superfamily has been divided into two groups, i.e. the TGF-[beta] or activin group and the Dpp or BMP group. The TGF-[beta] superfamily consists of more than 50 structurally related secreted proteins, and most of these have been categorized into three major subfamilies, i.e. TGF-[beta], BMP and activin or inhibin. GDF11 is a secreted protein with 90% amino acid sequence identity to GDF8, which is also a growth factor and is expressed in mammalian muscle. GDF8 inhibits myoblast proliferation in culture, and knockout of GDF8 in mammals causes increased skeletal muscle mass. The TGF-[beta] superfamily includes a variety of signaling polypeptides that regulate various cellular functions.
All proteins of the TGF-[beta] family are synthesized as larger precursors that are processed by proteolysis to produce a biologically active carboxy-terminal domain of 110-140 amino acid residues. This processed mature region is highly conserved among members of the same family and contains nine cysteine residues.
Proteins of the TGF-[beta] family are involved in signaling pathways that control various cellular proliferation and differentiation processes, particularly muscle differentiation. One member of the TGF-[beta] superfamily is GDF11, which is derived together with GDF8 (also known as myostatin) from a common ancestor. Many recent reports have addressed the biological role of GDF11, which remains a topic of intense debate [6,7]. GDF11 has been found to be closely related to GDF8 in rat, mouse and human. Studies have shown that GDF11 is expressed at low levels in brain, eye, muscle and many other tissues. In developing vertebrates, GDF11 has also been proposed as an important regulator of axial skeleton patterning, because deletion of GDF11 in homozygous mice resulted in defects in axial skeletal patterning along with renal and palate defects.
These GDF11 knockout mice also exhibited an extended trunk with a reduced or absent tail, which resulted from the formation of additional thoracic and lumbar vertebrae. It was also shown that GDF11 inhibits olfactory epithelial neurogenesis in vitro through induction of p27 (Kip1) and reversible cell cycle arrest in precursor cells. This role of GDF11 as a negative autoregulator of cell proliferation is in accordance with the role of GDF8 as a negative regulator of muscle cell proliferation.
GDF11 has more than 90% sequence identity with GDF8, and other members of the TGF-[beta] superfamily also have high sequence identities. No study has shown the evolutionary relationships among different members of the TGF-[beta] superfamily from different mammals. To fill this gap, the current study was planned to infer the evolutionary closeness or divergence of selected members of the TGF-[beta] superfamily, i.e. GDF11, GDF8, inhibin and BMP7. Because a very high protein sequence identity has been found between the GDF11 proteins of human and Oryctolagus cuniculus (rabbit), the mature segment of rabbit GDF11 was also characterized in detail.
Characterization of rabbit GDF11
The protein sequence of O. cuniculus GDF11 was retrieved from the GenPept database using the primary accession number [GenBank: XP_002711212.1]. The physicochemical properties of the mature region of the GDF11 protein (109 residues), such as molecular mass, instability index, theoretical pI, aliphatic index, extinction coefficient and grand average of hydropathicity (GRAVY), were studied using ProtParam from the ExPASy server. The PredictProtein server was also used to predict secondary structure, solvent accessibility, disordered regions and protein-protein binding sites.
Secondary Structure Prediction
The secondary structure of rabbit GDF11 was predicted with the SOPMA, CFSSP, PSIPRED, PHD and GOR4 servers.
The GDF11 protein sequence of O. cuniculus was retrieved from the NCBI protein database and analyzed using protein BLAST (Basic Local Alignment Search Tool) available on the NCBI website (http://www.ncbi.nlm.nih.gov/). Along with the GDF11 protein sequences of O. cuniculus and nine other mammals, ten sequences each of GDF8, inhibin and BMP7 from the same animals were retrieved from GenPept for phylogenetic systematics. All sequences were aligned using ClustalX and imported into the MEGA5 program for manual alignment. A Maximum Likelihood phylogenetic tree was constructed using MEGA5 with 100 bootstrap replicates.
Characterization of rabbit GDF11
The physicochemical properties of GDF11 predicted by the ProtParam tool are given in Table-1. GDF11 has a molecular weight of 12457.2 Daltons and a pI of 7.67. The calculated pI indicates that GDF11 is close to neutral. The isoelectric point is useful for separating proteins on a polyacrylamide gel by isoelectric focusing, and the concentration of a protein in solution can be calculated from its extinction coefficient. The aliphatic index, instability index and grand average of hydropathicity (GRAVY) give an idea of the stability of GDF11. The instability index was 36.58, below the ProtParam threshold of 40, so GDF11 was predicted to be a stable protein. The aliphatic index is the relative volume of a protein occupied by aliphatic side chains and is important for increased thermostability. The aliphatic index of GDF11 was 54.59.
The GRAVY index reflects the solubility of a protein; the GRAVY index of GDF11 was -0.452. This negative value indicates that GDF11 is hydrophilic in nature.
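As a rough illustration of how two of these indices are defined, the sketch below computes GRAVY as the mean Kyte-Doolittle hydropathy of the residues and the aliphatic index from the mole percentages of Ala, Val, Ile and Leu (Ikai's formula). The function names and the toy peptide are hypothetical and are not the rabbit GDF11 sequence, and ProtParam's own implementation may differ in detail. The stability check assumes the usual ProtParam convention that an instability index below 40 is predicted as stable.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    sequence = sequence.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in sequence) / len(sequence)


def aliphatic_index(sequence):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percentage of each residue."""
    sequence = sequence.upper()
    n = len(sequence)

    def mole_percent(aa):
        return 100.0 * sequence.count(aa) / n

    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def is_predicted_stable(instability_index, threshold=40.0):
    """Assumed convention: values below the threshold are classed as stable."""
    return instability_index < threshold


if __name__ == "__main__":
    toy_peptide = "MKTAYIAKQR"  # hypothetical peptide, not GDF11
    print("GRAVY:", round(gravy(toy_peptide), 3))
    print("Aliphatic index:", round(aliphatic_index(toy_peptide), 2))
    print("Stable at 36.58:", is_predicted_stable(36.58))
```

A negative GRAVY from `gravy` corresponds to a net hydrophilic sequence, which is how the value of -0.452 is read in the text.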
Table-1: Physicochemical properties of rabbit GDF11 predicted by ProtParam.
Molecular weight: 12457.2 Dalton
Theoretical pI: 7.67
Extinction coefficient*: 21930 (Abs 0.1% = 1.760), assuming all pairs of Cys residues form cystines
Instability index: 36.58
Aliphatic index: 54.59
Grand average of hydropathicity (GRAVY): -0.452
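The extinction coefficient entry in Table-1 can be cross-checked with a short calculation. The sketch below assumes the ProtParam convention that the extinction coefficient is molar (per M per cm, at 280 nm) and that "Abs 0.1%" is the absorbance of a 1 g/L solution, so that dividing the coefficient by the molecular weight should reproduce the tabulated 1.760. The function names and the example absorbance reading are hypothetical.

```python
def abs_01_percent(ext_coeff, mol_weight):
    """Absorbance of a 1 g/L (0.1%) solution in a 1 cm path."""
    return ext_coeff / mol_weight


def molar_concentration(absorbance, ext_coeff, path_cm=1.0):
    """Beer-Lambert law, A = e * c * l, solved for c (mol/L)."""
    return absorbance / (ext_coeff * path_cm)


if __name__ == "__main__":
    EXT_COEFF = 21930.0    # from Table-1
    MOL_WEIGHT = 12457.2   # Dalton, from Table-1

    print("Abs 0.1%:", round(abs_01_percent(EXT_COEFF, MOL_WEIGHT), 3))

    a280 = 0.5  # hypothetical absorbance reading
    molar = molar_concentration(a280, EXT_COEFF)
    print("Concentration (mol/L):", molar)
    print("Concentration (g/L):", molar * MOL_WEIGHT)
```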
Some other important features of rabbit GDF11 were also predicted by PredictProtein, a meta-service for sequence analysis. PredictProtein has been used to predict structural and functional features of proteins since 1992. The detailed annotation and predicted features of the rabbit GDF11 protein are shown in Fig. 1 and details of the residues are given in Table-2.
Table-2: Description of residues of rabbit GDF11.
Protein binding regions: 2, 9-10, 13-14, 16-19, 37, 41-42, 46, 49, 51-55, 66-68, 90-91, 104
Strands: 15-25, 30-33, 60-62, 82-88, 93-97, 102-
Exposed regions: 6-13, 16, 18, 24-25, 27, 35-36, 38, 43, 45, 47, 52-53, 55, 64-67, 69, 77, 79-80, 88-91, 96, 98-99, 103-104, 106
Buried regions: 14, 19, 21, 23, 26, 28, 30-34, 37, 39, 41-42, 44, 46, 48, 50, 54, 56, 59-60, 72-75, 78, 81-87, 92, 94, 97, 100-102, 105,
Intermediate regions: 15, 17, 20, 22, 29, 40, 49, 51, 57-58, 61-63, 68, 70-71, 76, 93, 95
Disordered regions: 1-15, 47-70, 72-73, 75-77, 98-109
Table-3: Secondary structure elements of rabbit GDF11 predicted by different tools.
PredictProtein predicted twelve protein binding regions and six strands in the mature GDF11 protein. Five disordered regions were also found. Solvent accessibility was also predicted, classifying the residues of GDF11 as exposed, buried or intermediate (Table-2).
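The region counts quoted here can be checked directly against the residue lists in Table-2. The sketch below parses a comma-separated list of positions and ranges into segments and counts them. The helper names are hypothetical, and treating the final, unlabeled row of Table-2 as the disordered regions is an assumption based on its five segments matching the count in the text.

```python
def parse_segments(text):
    """Turn '2, 9-10, 13-14' into [(2, 2), (9, 10), (13, 14)]."""
    segments = []
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-")
            segments.append((int(start), int(end)))
        else:
            segments.append((int(part), int(part)))
    return segments


def residue_count(segments):
    """Total number of residues covered by inclusive (start, end) segments."""
    return sum(end - start + 1 for start, end in segments)


# Residue lists copied from Table-2.
PROTEIN_BINDING = ("2, 9-10, 13-14, 16-19, 37, 41-42, 46, 49, "
                   "51-55, 66-68, 90-91, 104")
# Assumed to be the disordered-region row (the label did not survive).
DISORDERED = "1-15, 47-70, 72-73, 75-77, 98-109"

if __name__ == "__main__":
    for name, text in [("binding", PROTEIN_BINDING), ("disordered", DISORDERED)]:
        segs = parse_segments(text)
        print(name, "regions:", len(segs), "residues:", residue_count(segs))
```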
Secondary Structure Prediction of GDF11
To evaluate online tools for protein secondary structure prediction, the secondary structure elements of the mature GDF11 protein from rabbit were predicted with different tools. The results given by the selected tools are shown in Table-3. Interestingly, every tool gave a different result. SOPMA showed that all four elements (i.e. [alpha]-helix, extended strands, [beta]-turns and random coils) were present in GDF11, whereas CFSSP (the Chou and Fasman Secondary Structure Prediction Server) did not predict random coils. Similarly, PSIPRED did not predict [beta]-turns in GDF11. PHD and GOR4, however, gave broadly similar results, assigning 42.20% and 35.78% of residues, respectively, to extended strands. Both tools predicted no [alpha]-helix or [beta]-turn in GDF11 and assigned 57.8% and 64.22% of residues, respectively, to random coils.
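The percentages reported for PHD and GOR4 are consistent with whole-residue counts over the 109-residue mature segment. The sketch below computes a percentage composition from a per-residue state string. The one-letter state codes, the function name and the two example strings are hypothetical; the strings only reproduce residue counts inferred from the reported percentages (46 and 39 strand residues) and do not represent the actual positions predicted by either server.

```python
from collections import Counter

# Assumed one-letter codes for the four secondary structure states.
STATES = {
    "H": "alpha-helix",
    "E": "extended strand",
    "T": "beta-turn",
    "C": "random coil",
}


def composition(state_string):
    """Percentage of residues in each state, rounded to two decimals."""
    counts = Counter(state_string)
    total = len(state_string)
    return {
        name: round(100.0 * counts.get(code, 0) / total, 2)
        for code, name in STATES.items()
    }


# Illustrative strings with the inferred counts, not real predictions.
phd_like = "E" * 46 + "C" * 63   # 109 residues
gor4_like = "E" * 39 + "C" * 70  # 109 residues

if __name__ == "__main__":
    print("PHD-like: ", composition(phd_like))
    print("GOR4-like:", composition(gor4_like))
```

Comparing the dictionaries returned for several tools side by side is one simple way to see where the predictions agree before choosing a final assignment manually.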
Phylogenetic analysis of GDF11
BLASTp was used to identify local regions of similarity and their statistical significance among GDF11, GDF8, inhibin and BMP7 protein sequences from ten selected organisms including O. cuniculus. Multiple sequence alignment was also performed with Geneious. Truncated sequences were deleted and longer sequences were trimmed in the multiple sequence alignment to make them all equal in length. Phylogenetic analysis was performed to show evolutionary closeness among the selected members of the TGF-[beta] family (Fig. 2). Actin from Homo sapiens was used to root the tree.
The Maximum Likelihood (ML) method, based on the Poisson correction model, was used to infer the evolutionary history. The bootstrap consensus tree inferred from 100 replicates was taken to represent the evolutionary history of the selected taxa. Branches corresponding to partitions reproduced in fewer than 50% of bootstrap replicates were collapsed. A total of 41 amino acid sequences were included in the analysis, and there were 73 positions in the final dataset. Evolutionary analysis was conducted in MEGA5.
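For readers unfamiliar with the Poisson correction, the sketch below shows it in its simplest pairwise form, d = -ln(1 - p), where p is the proportion of differing sites between two aligned sequences, together with the 50% bootstrap cutoff described above. This is only an illustration of the correction and the cutoff, not MEGA5's Maximum Likelihood procedure; the function names and the toy aligned sequences are hypothetical.

```python
import math


def p_distance(seq_a, seq_b):
    """Proportion of differing sites, ignoring columns with a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    differences = sum(1 for a, b in pairs if a != b)
    return differences / len(pairs)


def poisson_distance(seq_a, seq_b):
    """Poisson-corrected distance: d = -ln(1 - p)."""
    p = p_distance(seq_a, seq_b)
    if p >= 1.0:
        return math.inf  # every site differs; the correction is undefined
    return -math.log(1.0 - p)


def keep_branch(times_recovered, replicates=100, cutoff=0.5):
    """A branch is kept if it appears in at least `cutoff` of the replicates."""
    return times_recovered / replicates >= cutoff


if __name__ == "__main__":
    a = "ACDEFGHIKL"  # hypothetical aligned fragments
    b = "ACDEFGHIKV"
    print("p-distance:", p_distance(a, b))
    print("Poisson distance:", poisson_distance(a, b))
    print("Keep branch seen in 49 of 100 replicates:", keep_branch(49))
```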
GDF11 and GDF8 have more than 90% sequence identity in their mature signaling domain, and the other TGF-[beta] family members also have high sequence identities. A phylogenetic analysis was therefore conducted to show the evolutionary relationships among different members of the TGF-[beta] family. For phylogeny, four members of the TGF-[beta] family, i.e. GDF11, GDF8, inhibin and BMP7, were selected and sequences from the same ten animals were retrieved from GenPept for each protein. The tree clearly has four separate clades, each labeled with the name of the protein to which its members belong. All taxa in the GDF11 clade display a very high percentage of identity, as they show the same pattern. Ovis aries in the GDF8 clade is evolutionarily divergent from the other sequences. As expected, GDF11 and GDF8 have a strong evolutionary relationship and fall under a common ancestor. Inhibin and BMP7 are distantly related to GDF11.
Gorilla gorilla gorilla is notable in both the inhibin and BMP7 groups, as it branches away from the rest of its respective clade, indicating a distant evolutionary relationship with the other proteins of its class. In the phylogenetic tree, no sequence of one group appeared in any other clade. GDF11 is highly conserved and highly identical among different mammals, whereas inhibin and BMP7 are less conserved and less closely related among different mammals. High bootstrap values at the nodes support the phylogenetic analysis.
Proteins are important molecules in living organisms and perform different functions. They are specific in their shapes and functions. The three-dimensional structure of a protein, in which the linear chain is folded, is a key characteristic and is referred to as its tertiary structure. This structure enables proteins to interact with other proteins and ligands to perform specific functions and therefore results in domains of electrochemical interactions. Transmembrane proteins are very important as they play different roles in the life activities of cells; knowledge about membrane-buried residues in transmembrane proteins is therefore of great importance. Membrane proteins are embedded in cellular membranes and provide targets for more than 60% of the drugs available on the market. According to Cordes et al., 20-30% of all proteins in any organism are membrane proteins.
The structure of a protein is important for understanding its role, and knowledge about transmembrane segments, the bends in their helices and membrane-buried regions helps in studying the tertiary structures of proteins. It is widely accepted that all the information required to form a 3D structure is carried by the amino acid sequence of a protein; protein structures (2D and 3D) can therefore be predicted theoretically from their amino acid sequences. Different tools for protein secondary structure prediction report different properties, i.e. the presence or absence of [alpha]-helices, extended strands, [beta]-turns and random coils and the percentage of residues belonging to these regions. Therefore, one should not rely on a single tool for protein secondary structure prediction but should try as many tools as possible and then manually choose the best prediction.
Protein tertiary structure should also be predicted to get a better idea of the presence and locations of helices, strands, turns and coils.
Assigning a physicochemical index to the amino acid sequence of a protein can help to derive structural or biological information, including secondary structure, kinks and hydrophobic regions. Information about hydrophobic regions and their prediction is of great importance for understanding the structure and function of proteins. Protein folding is driven mainly by hydrophobicity, which gives a high degree of stability to protein structures. In bioinformatics, the determination of membrane-buried regions in proteins is computationally exhaustive; various prediction methods and tools have been described, but they have some drawbacks in prediction accuracy and compliance.
Protein-protein interactions are very important in various biological processes including cellular communication, gene expression, metabolism and immune responses. Binding regions are the unique parts of protein structures that provide points for protein interactions. Prediction and identification of binding regions help to understand protein functions and to design drugs that effectively target binding regions involved in diseases, e.g. cancer. A number of disordered proteins act by binding to structured partner proteins and undergoing a disorder-to-order transition. Such coupled binding and folding can confer various functional advantages, including accurate control of binding specificity without an increase in affinity. Furthermore, the inherent flexibility of the binding site allows it to adopt different conformations and bind multiple partner proteins.
These characteristics depict the importance of these binding regions in signaling and regulation processes . The principles of protein recognition mechanisms are dependent on detailed knowledge about protein interactions. The classification of protein binding regions might be useful to decipher protein interaction networks, understand protein functions and design .
The phylogenetic analysis revealed that all four selected members of the TGF-[beta] superfamily were clearly resolved in the tree and all sequences appeared in their respective protein groups. In our analysis all the mammalian GDF8 sequences appeared in a single clade, whereas GDF8 of fishes has been reported to differ significantly in biology from that of mammals, with the fishes appearing in two distinct clades. In another study, the GDF11 cluster was clearly separated from that of GDF8 when a phylogenetic tree of both proteins in teleosts and representative tetrapods was generated. Xing et al. reported that vertebrate GDF8 genes are paralogs of GDF11 and that they all emerged from a common ancestral GDF8/GDF11 gene in chordates. They suggested that the amphioxus GDF8/11 could be an ancestral GDF8/GDF11 homologue of vertebrates.
The event of gene duplication that generated GDF8 and GDF11 might have occurred before the divergence of vertebrates and after or at the point of divergence of amphioxus from vertebrates.
In this study we have annotated the mature segment of GDF11 from O. cuniculus and revealed information about its different regions, such as protein-protein binding regions, strands, and exposed, buried, intermediate and disordered regions. The study has also explored the physicochemical nature of rabbit GDF11, predicted its secondary structure with multiple tools and compared the information given by these tools. A phylogenetic analysis of four selected members of the TGF-[beta] superfamily was also conducted to reveal the evolutionary relatedness or divergence among them. The phylogeny indicated that GDF11 and GDF8 of different mammals are paralogs and have evolved from a common ancestor.
The Bioinformatics section of Molecular Care, University of Agriculture, Faisalabad, Pakistan is gratefully acknowledged for providing space and expertise to accomplish this study.
1. S. J. Newfeld, R. G. Wisotzky and S. Kumar, Molecular evolution of a developmental pathway: phylogenetic analyses of transforming growth factor-beta family ligands, receptors and Smad signal transducers. Genetics., 152, 783 (1999).
2. B. Funkenstein and E. Olekh, Growth/differentiation factor-11: an evolutionary conserved growth factor in vertebrates, Dev. Genes Evol., 220, 129 (2010).
3. W. E. Taylor, S. Bhasin, J. Artaza, F. Byhower, M. Azam, D. H. Willard Jr., F. C. Kull Jr. and N. Gonzalez-Cadavid, Myostatin inhibits cell proliferation and protein synthesis in C2C12 muscle cells, Am. J. Physiol. Endocrinol. Metab., 280, E221 (2001).
4. A. Herpin, C. Lelong and P. Favrel, Transforming growth factor-[beta]-related protein, an ancestral and widespread superfamily of cytokines in metazoans, Dev. Comp. Immunol., 28, 461 (2004).
5. F. Xing, X. Tan, P. J. Zhang, J. Ma, Y. Zhang, P. Xu and Y. Xu, Characterization of amphioxus GDF8/GDF11 gene, an archetype of vertebrate MSTN and GDF11, Dev. Genes Evol., 217, 549 (2007).
6. C. E. Brun and M. A. Rudnicki, GDF11 and the mythical fountain of youth, Cell Metab., 22, 54 (2015).
7. M. A. Egerman, S. M. Cadena, J. A. Gilbert, A. Meyer, H. N. Nelson, S. E. Swalley, C. Mallozzi, C. Jacobi, L. L. Jennings, I. Clay and G. Laurent, GDF11 increases with age and inhibits skeletal muscle regeneration, Cell Metab., 22, 164 (2015).
8. M. Nakashima, T. Toyono, A. Akamine and A. Joyner, Expression of growth/differentiation factor 11, a new member of the BMP/TGFb superfamily during mouse embryogenesis, Mech. Dev., 80, 185 (1999).
9. H. H. Wu, S. Ivkovic, R. C. Murray, S. Jaramillo, K. M. Lyons, J. E. Johnson and A. L. Calof, Autoregulation of neurogenesis by GDF11, Neuron., 37, 197 (2003).
10. E. Gasteiger, A. Gattiker, C. Hoogland, I. Ivanyi, R. D. Appel and A. Bairoch, ExPASy: the proteomics server for in-depth protein knowledge and analysis, Nucleic Acids Res., 31, 3784 (2003).
11. G. Yachdav, E. Kloppmann, L. Kajan, M. Hecht, T. Goldberg, T. Hamp, P. Honigschmid, A. Schafferhans, M. Roos, M. Bernhofer and L. Richter, PredictProtein-an open resource for online prediction of protein structural and functional features, Nucleic Acids Res., gku366 (2014).
12. C. Geourjon and G. Deleage, SOPMA: significant improvements in protein secondary structure prediction by consensus prediction from multiple alignments, CABIOS, 11, 681 (1995).
13. P. Y. Chou and G. D. Fasman, Prediction of protein conformation. Biochemistry., 13, 222 (1974).
14. D. W. A. Buchan, F. Minneci, T. C. O. Nugent, K. Bryson and D. T. Jones, Scalable web services for the PSIPRED Protein Analysis Workbench, Nucleic Acids Res., 41, W340 (2013).
15. B. Rost and C. Sander, Combining evolutionary information and neural networks to predict protein secondary structure, Proteins: Struct. Funct. Bioinf., 19, 55 (1994).
16. C. Combet, C. Blanchet, C. Geourjon and G. Deleage, NPS@: Network Protein Sequence Analysis, Tibs., 25, 147 (2000).
17. S. F. Altschul, W. Gish, W. Miller, E. W. Myers and D. J. Lipman, Basic local alignment search tool, J. Mol. Biol., 215, 403 (1990).
18. K. Tamura, D. Peterson, N. Peterson, G. Stecher, M. Nei and S. Kumar, MEGA5: molecular evolutionary genetics analysis using maximum likelihood, evolutionary distance, and maximum parsimony methods, Mol. Biol. Evol., 28, 2731 (2011).
19. A. Ikai, Thermostability and aliphatic index of globular proteins, J. Biochem., 88, 1895 (1980).
20. M. Kearse, R. Moir, A. Wilson, S. Stones-Havas, M. Cheung, S. Sturrock, S. Buxton, A. Cooper, S. Markowitz, C. Duran and T. Thierer, Geneious Basic: An integrated and extendable desktop software platform for the organization and analysis of sequence data, Bioinformatics., 28, 1647 (2012).
21. E. Zuckerkandl and L. Pauling, Evolutionary divergence and convergence in proteins. Edited in Evolving Genes and Proteins by V. Bryson and H.J. Vogel, pp. 97, Academic Press, New York (1965).
22. J. Felsenstein, Confidence limits on phylogenies: An approach using the bootstrap, Evolution., 39, 783 (1985).
23. G. Ramachandran, C. Ramakrishnan and V. Sasisekharan, Stereochemistry of polypeptide chain configuration, J Mol. Biol., 7, 95 (1963).
24. J. Meher, M. K. Raval, G. Dash and P. K. Meher, Prediction of hydrophobic regions effectively in transmembrane proteins using digital filter. J. Biomed. Sci. Eng., 4, 562 (2011).
25. F. Cordes, J. Bright and M. Sansom, Proline induced distortions of transmembrane helices. J. Mol. Biol., 323, 951 (2002).
26. C. B. Anfinsen, Principles that govern the folding of protein chains, Science., 181, 223 (1973).
27. H. Qian, Prediction of [alpha]-helices in proteins based on thermodynamic parameters from solution chemistry, J. Mol. Biol., 256, 663 (1996).
28. P. Mohapatra, A. Khamari and M. Raval, A method for structural analysis of [alpha]-helices of membrane proteins, J. Mol. Biol., 10, 393 (2004).
29. M. B. Swindells, A procedure for the automatic determination of hydrophobic cores in protein structures, Prot. Sci., 4, 93 (1995).
30. M. E. Halatsch, U. Schmidt, J. Behnke-Mursch, A. Unterberg and C. R. Wirtz, Epidermal growth factor receptor inhibition for the treatment of glioblastoma multiforme and other malignant brain tumours, Cancer Treat. Rev., 32, 74 (2006).
31. H. Xie, S. Vucetic, L. M. Iakoucheva, C. J. Oldfield, A. K. Dunker, V. N. Uversky and Z. Obradovic, Functional anthology of intrinsic disorder. 1. Biological processes and functions of proteins with long disordered regions, J. Proteome Res., 6, 1882 (2007).
32. B. Meszaros, I. Simon and Z. Dosztanyi, Prediction of protein binding regions in disordered proteins, PLoS Comput. Biol., 5, e1000376 (2009).
33. J. Teyra, M. Paszkowski-Rogacz, G. Anders and M. T. Pisabarro, SCOWLP classification: structural comparison and analysis of protein binding regions, BMC Bioinformatics., 9, 9 (2008).
34. T. Kerr, E. H. Roalson and B. D. Rodgers, Phylogenetic analysis of the myostatin gene sub-family and the differential expression of a novel member in zebrafish, Evol Dev., 7, 390 (2005).
|Author:||Mustafa, Ghulam; Iqbal, Muhammad Javid; Hassan, Muhammad; Jamil, Amer|
|Publication:||Journal of the Chemical Society of Pakistan|
|Date:||Dec 31, 2017|