Genome wide association study on feed conversion ratio using imputed sequence data in chickens.
Increasing the productivity of livestock species and minimizing their environmental impact are the major goals in the livestock husbandry , Improving feed efficiency would assist in meeting these challenges because feed consumption is about 70% of the total costs in livestock production . Previous studies demonstrated that genetic selection could account for 85% to 90% of phenotypic improvement and plays a predominant role in underlying genetic architecture of increasing feed efficiency, while nutrition and management only explained 10% to 15% of phenotypic improvement . Feed conversion ratio (FCR) is the most widely used measurement of feed efficiency by indicating how much feed mass livestock converts into the desired output. Up to now, a total of 32 quantitative trait loci (QTFs) associated with FCR have been reported. However, most QTFs are mapped with microsatellites markers by using linkage analysis  and genome scans  resulting in low resolution.
Compared with previous studies, genome-wide association studies (GWAS) can accurately identify genes involved in economically important traits of cattle , pigs , and chickens . With the reduction of sequencing costs and rapid development of imputation technology, GWAS is the preferred option for such analyses. Using imputed whole genome sequence data, GWAS could take full advantage of all markers and detect variants associated with interesting traits, without being affected or influenced by the linkage disequilibrium between single nucleotide polymorphisms (SNPs) and the underlying genes . These observations suggest that whole genome sequence data is an effective pipeline to enhance the power of GWAS.
The objective of this study was to identify candidate genes associated with FCR via GWAS in a Chinese indigenous chicken population using sequence data imputed from a SNP panel.
MATERIALS AND METHODS
The study population was obtained from a Chinese indigenous chicken breed that has been maintained for 25 generations by the Wens Nanfang Poultry Breeding Co. Ltd (Yun Fu, Guangdong, China). We obtained 435 male birds from the 25th generation, which were generated by 30 males and 360 females from the 24th generation and were reared with the same recommended nutritional and environmental conditions. The FCR was calculated according to recorded data during the feeding trial. For more details about this population, please refer to Zhang et al .
Chip data and imputed sequences data
All of 435 male chickens from the 25th generation and 15 sires of the 24th generation were selected for genotyping. The genomic DNA from the 450 birds was extracted from blood samples using the NRBC Blood DNA Kit (Omega Bio-Tek, Norcross, GA, USA) according to the manufacturers instructions, and DNA samples were analyzed for genome DNA concentration and integrity. DNA samples of the appropriate quality were genotyped using 600K Affymetrix Axiom HD chicken genotyping array, which contains 580,961 SNPs distributed on 28 autosomes, two linkage groups (TGE64 and TGE22C19W28_E50C23), and two sex chromosomes. Then, twenty-four key individuals were selected for sequencing from the 435 birds and their 15 sires based on a strategy of maximizing the expected genetic relationship, while maximizing the proportion of unique genomes sequenced in the population. In this process, we used G matrix to calculate the genetic relationship between key individuals and the remaining population, details of the selection of key individuals in our study were described by Ye et al [11,12]. Finally, 21 males and 3 sires were selected as key individuals and re-sequenced with 150-bp paired-end reads on the Illumina HiSeq 3000 platform with the average sequence depth of 14.62. Sequencing reads were aligned to the Gallus gallus 4.0 genome using the Burrows-Wheeler Alignment tool . Duplicated reads were removed using Picard release 1.119 (http://sourceforge.net/projects/ picard/files/picard20tools/1.119/). GATK software  was used for SNP calling with the default parameters. Finally, 11,645,758 SNPs remained for further analysis and the concordance between chip data and sequence data was 98.0%. After that, the SNP chip was imputed to sequence data using Fimpute software  with key sequenced individuals as the reference. For more details about the genotype imputation, please refer to Ye et al . Quality control both for the chip and imputed sequences genotypes were conducted use PFINK 1.90 software  with the same criteria: SNP call rates >0.97, minor allele frequencies >0.05, obeys Hardy-Weinberg equilibrium (p>0.00001), and individual call rate >0.95. After quality control, 447,766 SNPs from SNP chip and 8,626,020 SNPs from imputed sequences remained for subsequent analysis. These SNPs distributed on autosome with the average distance between adjacent SNPs ranged from 92.63 to 168.27 bp among different autosomes.
Mixed linear model was used for the GWAS, and the analysis was performed with GEMMA software (Zhou and Stephens ). The statistical model is described as follows:
Y = Xb + Sa + Zu + e
where Y is the vector of phenotypic values for all individuals, b is a vector of fixed effects including the batch effect, which has three levels, a is the substitution effect of the SNP under consideration, u is the random additive genetic effect; X and Z are the design matrices for b and u, respectively; S is the design vector for a; e is the vector of random residuals. In this model, u and e were assumed to have the structure u~N(0, G[[omega].sup.2.sub.a]) and e~N(0, I[[omega].sup.2.sub.e]), where G is the additive genomic relationship matrix which was derived from the sequence data, G = 1/p [[SIGMA].sup.p.sub.i=1] ([X.sub.i] - [I.sub.n][[bar.X].sub.i]) [([X.sub.i] - [I.sub.n][[bar.X].sub.i]).sup.T], X is the n x p matrix of genotypes of n individuals' p markers, [X.sub.i] as its ith column representing genotypes of ith SNP, [[bar.X].sub.i] as the sample mean and [I.sub.n] as a n x 1 vector of 1's, [[sigma].sup.2.sub.a] is the polygenic additive variance, I is an identity matrix and [[sigma].sup.2.sub.e] is the residual variance. To compare the GWAS using the sequence data with chip data, the GWAS was performed using the chip data with the same model. DMU software  was used to estimate the heritability explained by the sequence data with the same model as the one described above.
To visualize the result from the GWAS, Manhattan plots and quantile-quantile plots were drawn by the qqman package  in R for each trait. In this study, the significant level was adjusted based on the effective number of independent tests, which was calculated by the simpleM software .
Candidate genes which were located within or nearby the significant SNPs (within 0.3 Mb) were identified by Ensembl (http://www.ensembl.org/index.html) and NCBI (https://www. ncbi.nlm.nih.gov/) annotation of the Gallusgallus 4.0 genome version.
Comparing significant regions with reported quantitative trait loci
SNPs with p-value <1.0 x [10.sup.-5] were compared with all reported QTLs that associated with FCR. The QTLs were obtained from the Chicken QTLdb (https://www.animalgenome.org/cgi-bin/ QTLdb/GG/index). Based on the physical location, QTLs close to the significant SNPs were extracted, with a maximum distance of 5 Mb between the QTL and the SNP to be compared.
Phenotype statistics and analysis of genetic parameter
Descriptive statistics for trait FCR are presented in Table 1. We found that the phenotypes of FCR were normally distributed (Shapiro-Wilk test, p>0.05), and its heritability was 0.33.
The effective number of independent tests was 631,181 as calculated by the simple method. Hence, the threshold p-value was adjusted to 7.92 x [10.sup.-8] (0.05/631,181) for a genome-wide significance level, and to 1.58 x [10.sup.-6] (1.00/631,181) for a genomewide suggestive significance level.
In the case of using imputed whole sequence, one genomewide significant SNP on chromosome 8 locating at 5th intron of zinc finger and BTB domain containing 41 (ZBTB41) was found to be associated with FCR, and 9 SNPs reached the suggestive significance level (Table 2). Among these 9 SNPs, 3 of them were located between 45.43 Mb and 45.61 Mb of chromosome 1 with the nearest genes being ubiquitin specific peptidase 44 (USP44), leukotriene A4 hydrolase (LTA4H), and ETS transcription factor (ELK3), respectively. One SNP was located on chromosome 2 and 60 Kb distance away from its nearest R-spondin 2 (RSP02) gene. Two SNPs were located on a narrow region of chromosome 4 (from 15.59 Mb to 16.04 Mb), and the nearest genes were inhibitor of apoptosis protein 3 (IAP3) and sosondowah ankyrin repeat domain family member D (SOWAHD). Another two SNPs were located on the positions of 1.39 Mb and 2.71 Mb of chromosome 8 respectively, with the first SNP locating on the 14th intron of calmodulin regulated spectrin associated protein family member 2 (CAMSAP2) and the other locating on the 4th intron of potassium sodium-activated channel subfamily T member 2 (KCNT2). The remaining 1 SNP was located on the position of 3.20 Mb of chromosome 26 and member of RAS oncogene family (RAP1A) was its nearest gene. In the case of using chip data, only one SNP reach a suggestive significance level, and the SNP was also detected by using imputed whole sequence (Figure 1).
Comparing significant regions with reported quantitative trait loci
In our study, p-value of 31 SNPs was smaller than 1.0 x [10.sup.-5]. They were compared with the reported QTLs collected from Chicken QTLdb. The results are shown in Supplementary Table S1.A total of 32 QTLs on autosomes were reported to be associated with FCR, and 8 of those QTLs were found near the 31 SNPs (Supplementary Table SI). Besides, seven SNPs located on chromosome 1 and 2 located far from reported QTL. Three out of the 7 SNPs located on a narrow region from 102.70 Mb to 106.46 Mb. The distribution of chicken FCR related QTLs as compared with SNPs with p-value lower than 1 x [10.sup.-5] are graphically displayed in Figure 2.
Feeding traits are economically important in chicken industry, as they largely determine the edible percentage of the chicken and display moderate to high heritability. In this study, with the scientific feeding management and systematic record of phenotypes, a total of 435 birds' average daily feed intake and average daily gain were used to calculate the FCR. Through the statistical test, the FCR presented the normal distribution (Shapiro-Wilk test, p>0.05). The hereditability estimates for FCR were higher than the estimate of Aggrey et al . Based on the genetic parameter estimates, exploring the genetic mechanisms and identifying major genes would be useful for improving feed efficiency. In our study, we performed GWAS for FCR in a Chinese indigenous chicken population using both imputed whole sequence data and chip data.
To our expectation, the imputed sequence data including more SNPs could capture more genetic variation and detect more signals than a SNP array. With the same statistical model and significance level for both datasets, the SNPs detected from imputed whole sequence data were more than that from the chip data. On one hand, this demonstrated the power of imputed whole sequence data. On another hand, the stronger associations with imputed genotypes were partly due to some imputation artefacts driven by allele frequencies in the imputed loci.
In this study, the ten candidate genes detected for FCR are partly functionally related to feeding traits. Functional study showed that USP44 prevents the premature activation of the anaphase-promoting complex and regulating centrosome separation, positioning, and mitotic spindle geometry. LTA4H is an enzyme that would generate leukotriene B4 (LTB4) . LTB4 is an extremely pro-inflammatory lipid mediator that can exert its activity by binding to receptors BLT1 or BLT2 . ELK3 is a ternary complex factor and transcription repressor, which belongs to the ETS family involved in angiogenesis during embryonic development . In addition, individuals lacking ELK3 protein had smaller tumors due to their inability to become vascularized and oxygenated. RSP02 is a member of R-spondin family of proteins. These proteins are secreted ligands of leucine-rich repeat containing G protein-coupled receptors that enhance Wnt signaling through the inhibition of ubiquitin E3 ligases . IAP3 encodes a protein that belongs to a family of apoptotic suppressor proteins. This protein functions through binding to tumor necrosis factor receptor-associated factors TRAF1 and TRAF2 and inhibits apoptosis induced by menadione, a potent inducer of free radicals, and interleukin 1-[beta] converting enzyme . SOWAHD is a protein coding gene linked to Iroquois genes, suggesting that regulatory constraints underlie the maintenance of the Iroquois-Sowah syntenic block . CAMSAP2 specifically binds the minus-end of non-centrosomal microtubules, which can regulate the dynamics, organization ,and polymerization of microtubules [28,29]. ZBTB41 is a protein gene, which maybe involved in transcriptional regulation . KCNT2 is a human gene that encodes the [K.sub.Na] protein potassium channel activated by internal raises in sodium or chloride ions . RAP1A encodes a member of the Ras family of small GTPases. The encoded protein undergoes a change in conformational state and activity, depending on whether it is bound to guanosine triphosphate (GTP) or guanosine diphosphate. This protein is activated by several types of guanine nucleotide exchange factors, and inactivated by two groups of GTPase-activating proteins .
QTLs that reported to be associated with chicken FCR are distributed across the whole genome. Part of them were reported several times in previous studies and overlapped with significant SNPs detected in this study, which suggested that the candidate regions detected in this study were reliable. Other SNPs that reached the suggestive significant levels were not located in QTL region may be the new candidate loci or statistically false positive signals.
The selection of significant level was seriously considered in this study. Though Bonferroni correction is widely used, it is rigorous in the genotype data with high linkage disequilibrium. Especially for such a high density imputed sequence data, the false negative results are not neglectable . Hence, the number of effectively independent tests through principal component analysis  were used in this study.
In this study, we conducted a genome wide association study for FCR using both imputed whole sequence data and SNP array in a Chinese indigenous chicken population. A list of significant SNPs and ten candidate genes were identified, and several of these regions were overlapped with QTTs reported in previous studies. Results from this study provided valuable prior information for chicken genomic breeding program and would potentially improve our understanding of the molecular mechanism for feeding traits.
CONFLICT OF INTEREST
We certify that there is no conflict of interest with any financial organization regarding the material discussed in the manuscript.
This work was supported by the National Natural Science Foundation of China (31772556), the earmarked fund for China Agriculture Research System (CARS-41), the S&T Planning Project of Guangdong (2015A020209159), and the Pearl River S&T Nova Program of Guangzhou (201506010027).
[1.] Hume DA, Whitelaw CBA, Archibald AL. The future of animal production: improving productivity and sustainability. J Agric Sci 2011;149:9-16.
[2.] Zhang W, Aggrey SE. Genetic variation in feed utilization efficiency of meat-type chickens. World Poult Sci J 2003;59:32839.
[3.] Havenstein GB, Ferket PR, Qureshi MA. Growth, livability, and feed conversion of 1957 versus 2001 broilers when fed representative 1957 and 2001 broiler diets. Poult Sci 2003;82: 1500-8.
[4.] Rao Y, Shen X, Xia M, et al. SNP mapping of QTL affecting growth and fatness on chicken GGA1. Genet Sel Evol 2007;39: 569-82.
[5.] van Kaam JB, Groenen MA, Bovenhuis H, et al. Whole genome scan in chickens for quantitative trait loci affecting growth and feed efficiency. Poult Sci 1999;78:15-23.
[6.] Abo-Ismail MK, Vander Voort G, Squires JJ, et al. Single nucleotide polymorphisms for feed efficiency and performance in crossbred beef cattle. BMC Genet 2014;15:14.
[7.] Onteru SK, Gorbach DM, Young JM, et al. Whole genome association studies of residual feed intake and related traits in the pig. PLoS One 2013;8:e61756.
[8.] Xu Z, Ji C, Zhang Y, et al. Combination analysis of genomewide association and transcriptome sequencing of residual feed intake in quality chickens. BMC Genomics 2016;17:594.
[9.] Isotouru T, Sahana G, Guldbrandtsen B, Lund MS, Vilkki J. Genome-wide association analysis of milk yield traits in Nordic Red Cattle using imputed whole genome sequence variants. BMC Genet 2016;17:55.
[10.] Zhang Z, Xu ZQ, Luo YY, et al. Whole genomic prediction of growth and carcass traits in a Chinese quality chicken population. J Anim Sci 2017;95:72-80.
[11.] Druet T, Macleod IM, Hayes BJ. Toward genomic prediction from whole-genome sequence data: impact of sequencing design on genotype imputation and accuracy of predictions. Heredity (Edinb) 2014;112:39-47.
[12.] Ye S, Yuan X, Lin X, et al. Imputation from SNP chip to sequence: a case study in a Chinese indigenous chicken population. J Anim Sci Biotechnol 2018;9:30.
[13.] Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 2009;25:1754-60.
[14.] Mckenna A, Hanna M, Banks E, et al. The Genome Analysis Toolkit: a Map Reduce framework for analyzing next-generation DNA sequencing data. Genome Res 2010;20:1297-303.
[15.] Sargolzaei M, Chesnais JP, Schenkel FS. A new approach for efficient genotype imputation using information from relatives. BMC Genomics 2014; 15:478.
[16.] Purcell S, Neale B, Toddbrown K, et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559-75.
[17.] Zhou X, Stephens M. Genome-wide efficient mixed-model analysis for association studies. Nat Genet 2012;44:821-4.
[18.] Madsen P, Sorensen P, Su G, et al. DMU--a package for analyzing multivariate mixed models. In: Proceedings of the 8th World Congress on Genetics Applied to Livestock Production, Belo Horizonte, MG, Brazil, 13-18 August, 2006; 2014. p. 27-11.
[19.] Turner SD. qqman: an R package for visualizing GWAS results using Q-Q and manhattan plots. Biorxiv 2014 May 14 [Epub], https://doi.org/10.1101/005165
[20.] Gao X, Starmer J, Martin ER. A multiple testing correction method for genetic association studies using correlated single nucleotide polymorphisms. Genet Epidemiol 2008;32:361-9.
[21.] Aggrey SE, Karnuah AB, Sebastian B, Anthony NB. Genetic properties of feed efficiency parameters in meat-type chickens. Genet Sel Evol 2010;42:25.
[22.] Fuchs G, Shema E, Vesterman R, et al. RNF20 and USP44 regulate stem cell differentiation by modulating H2B monoubiquitylation. Mol Cell 2012;46:662-73.
[23.] Low CM, Akthar S, Patel DF, et al. The development of novel LTA4H modulators to selectively target LTB4 generation. Sci Rep 2017;7:44449.
[24.] Rogers CD, Phillips JL, Bronner ME. Elk3 is essential for the progression from progenitor to definitive neural crest cell. Dev Biol 2013;374:255-63.
[25.] Ilmer M, Recio BA, Regel I, et al. RSP02 enhances canonical Wnt signaling to confer sternness-associated traits to susceptible pancreatic cancer cells. Pancreatology 2015;15:S23.
[26.] Pachlopnik SJ, Canioni D, Moshous D, et al. Clinical similarities and differences of patients with X-linked lymphoproliferative syndrome type 1 (XLP-1/SAP deficiency) versus type 2 (XLP-2/XIAP deficiency). Blood 2011;117:1522-9.
[27.] Maeso I, Irimia M, Tena JJ, et al. An ancient genomic regulatory block conserved across bilaterians and its dismantling in tetrapods by retrogene replacement. Genome Res 2012;22:642-55.
[28.] Jiang K, Hua S, Mohan R, et al. Microtubule minus-end stabilization by polymerization-driven CAMSAP deposition. Dev Cell 2014;28:295-309.
[29.] Hendershott MC, Vale RD. Regulation of microtubule minus-end dynamics by CAMSAPs and Patronin. Proc Natl Acad Sci USA 2014;111:5860-5.
[30.] Yokoyama S, Ito Y, Ueno-Kudoh H, et al. A systems approach reveals that the myogenesis genome network is regulated by the transcriptional repressor RP58. Dev Cell 2009;17:836-48.
[31.] Santi CM, Ferreira G, Yang B, et al. Opposite regulation of Slick and Slack K+ channels by neuromodulators. J Neurosci 2006; 26:5059-68.
[32.] Sayyah J, Bartakova A, Nogal N, et al. The Ras-related Protein, Rap1A, mediates thrombin-stimulated, integrin-dependent glioblastoma cell proliferation and tumor growth. J Biol Chem 2014;289:17689-98.
[33.] Lautenberger JA, Troyer JL, Nelson GW, et al. Accounting for multiple comparisons in a genome-wide association study (GWAS). BMC Genomics 2010;11:724.
Jiaying Wang (1), Xiaolong Yuan (1), Shaopan Ye (1), Shuwen Huang (1), Yingting He (1), Hao Zhang (1), Jiaqi Li (1), Xiquan Zhang (1), and Zhe Zhang (1), *
* Corresponding Author: Zhe Zhang
Tel: +86-0-20-85282019, Fax: +86-0-20-85282019, E-mail: firstname.lastname@example.org
(1) Guangdong Provincial Key Lab of Agro-Animal Genomics and Molecular Breeding, College of Animal Science, South China Agricultural University, Guangzhou, 510642, China
Submitted Apr 23, 2018; Revised Jul 20, 2018; Accepted Sept 20, 2018
Caption: Figure 1. Manhattan plot (left) and quantile-quantile plot (right) of the observed p-values for feed conversion ratio (FCR) using chip data (top) and imputed whole sequence data (bottom). In the Manhattan plots, the position of each single nucleotide polymorphism on the chromosome was plotted against its - [log.sub.10]-transformed p-value. The dotted red line and solid red line in the Manhattan plots represent the significant threshold of 1.58 x [10.sup.-6] and 7.92 x [10.sup.-8] respectively. For quantile-quantile plot, the x-axis represents the expected - log 10-transformed p-values, and the y-axis represents the observed -log 10- transformed p-values.
Caption: Figure 2. Genome-wide association study (GWAS) result of feed conversion ratio compared with reported quantitative trait loci (QTLs) associated with feed conversion ratio (FCR). The inner and outer circles were used to indicates the reported QTLs obtained from AnimalQTL database and from this study, respectively. The different shades of grey in the inner circle were used to distinguish the adjacent chromosomes. The red lines on the inner circle indicate the times of reported QTLs at the corresponding genome region. The outside circle is a Manhattan plot of the FCR with two thresholds at (1.58 x [10.sup.-6]) and 5.0.
Table 1. Descriptive statistics for phenotypes Trait N Mean (SD) Min Max CV (%) FCR (%) 435 3.94 [+ or -] 0.49 2.9 6.92 12.44 Trait [h.sup.2](se) (1)) FCR (%) 0.33 (0.11) SD, standard deviation; CV, coefficient of variation; se, standard errors; FCR, feed conversion ratio; SNP, single nucleotide polymorphism. (1)) Heritability and standard error estimated by DMU software package based on SNP genotypes. Table 2. SNPs suggestive significantly associated with feed conversion ratio Chr (1)) Position MAF Beta (2)) Se (3)) -log10(P) 1 45439962 0.053 0.36 0.07 5.92 1 45582623 0.051 0.36 0.07 5.81 1 45616618 0.053 0.38 0.07 6.40 2 131217033 0.051 0.36 0.07 6.34 4 15596919 0.161 0.23 0.05 5.97 4 16040593 0.100 0.29 0.05 6.87 8 1395613 0.401 0.17 0.03 6.16 8 2583927 0.182 0.24 0.04 7.14 8 2717880 0.181 0.22 0.04 5.97 26 3202060 0.374 0.18 0.04 5.98 Chr (1)) Near-gene Distance (Kb) (4)) 1 USP44 U2.39 1 LTA4H U1.72 1 ELK3 U5.16 2 RSPO2 U60.14 4 IAP3 U15.85 4 SOWAHD U23.81 8 CAMSAP2 intro14 8 ZBTB41 intro5 8 KCNT2 intro4 26 RAPA U3.72 SNP, single nucleotide polymorphism; MAF, minor allele frequency; USP44, ubiquitin specific peptidase 44; LTA4H, leukotriene A4 hydrolase; ELK3, ETS transcription factor; RSPO2, R-spondin 2; IAP3, inhibitor of apoptosis protein 3; SOWAHD, sosondowah ankyrin repeat domain family member D; CAMSAP2, calmodulin regulated spectrin asso-ciated protein family member 2; ZBTB41, zinc finger and BTB domain containing 41; KCNT2, potassium sodium-activated channel subfamily T member 2; RAP1A, member of RAS oncogene family. (1)) Chicken chromosome. (2)) The effect size for marker. (3)) The standard errors for effect. (4)) D and U indicate the SNP is upstream and downstream of a gene, respectively.
|Printer friendly Cite/link Email Feedback|
|Title Annotation:||Open Access|
|Author:||Wang, Jiaying; Yuan, Xiaolong; Ye, Shaopan; Huang, Shuwen; He, Yingting; Zhang, Hao; Li, Jiaqi; Zhan|
|Publication:||Asian - Australasian Journal of Animal Sciences|
|Date:||Apr 1, 2019|
|Previous Article:||Analysis of genetic characteristics of pig breeds using information on single nucleotide polymorphisms.|
|Next Article:||Maternal lineage of Okinawa indigenous Agu pig inferred from mitochondrial DNA control region.|