Cancer as the Disintegration of Robustness: Population-Level Variance in Gene Expression Identifies Key Differences Between Tobacco- and HPV-Associated Oropharyngeal Carcinogenesis.
Robustness is a phenomenon whereby environmental variation, as well as the underlying genetic variation, does not get expressed at the phenotypic level. (16) We now know that this phenomenon may be the result of multiple factors; however, the original assumption by Waddington (17-19) was that given a trait under selection, the mechanism that brings about this trait--that is, the gene regulatory network that results in the phenotype under observation--would be under selection. Because variation due to random mutation is an unavoidable process, Waddington argued that any mechanism that would buffer such variation to be expressed at the phenotypic level would be selected for. Theoretical studies by Bergman and Siegal (20,21) show that this is not a necessary condition for the evolution of robustness. One way to achieve robustness is to have hierarchical and complex genotype-phenotype maps. (22) Complex traits, namely traits that result from complex genotype-phenotype maps, may exhibit robustness even though there may not be direct selection for a particular phenotype. This phenomenon of robustness allows the harboring of underlying genetic variation that is considered cryptic. (23,24)
The genotype is fairly easy to describe: it is the specific sequence for the genetic material for all biological organisms. The phenotype is less obvious. In this study we look at gene expression as a fundamental level of the phenotype. Thus, the phenomenon of robustness that we are examining is the insensitivity of the gene expression level, our phenotype, to genetic variation. Robustness may seem to hinder evolution; however, at times of stress, when a major perturbation occurs, cryptic variation percolates to the phenotypic level, enabling selection. Such perturbation may be the result of major environmental changes or large genotypic aberration. Cancer, we argue, may be the outcome resulting from such breakage of robustness.
In contrast to breaking of robustness in carcinogen-driven tumorigenesis, viral infections, such as HPV, hepatitis B virus, and Epstein-Barr virus, induce tumorigenesis through systematic and potentially predictable mechanisms. (25-28) Viruses essentially hijack the infected cell's molecular machinery to promote their own survival. In the process, the virus needs to shut down apoptotic and cellular senescence signals, which can ultimately lead to tumorigenesis. Because the virus uses the same mechanisms for shutting down tumor suppressive pathways in each infected cell, variation at the level of gene expression is hypothesized to be lower among multiple patient tumors.
Unlike previous studies of tumorigenesis and progression, we look at cancer as the result of normal, nontumorigenic cells losing the robustness of their complex genotypephenotype maps. As a result, we hypothesized that tumors will exhibit a greater degree of phenotypic variation, as measured by gene expression, than their histologically normal counterparts. To study the increase in expression variation, that is, loss of robustness, within OPSCC, we designed an approach to detect genes that exhibit a high degree of variability in expression across tumor samples. Our approach does not look at the consequences of consistent upregulation and down-regulation for genes involving tumor suppressors or oncogenes, but at variation that is increased in pathways. From our results, we identified multiple genes involved in cancer pathways that have previously been associated with cancer, as well as singled out those not yet directly implicated in tumorigenesis. We tested the hypothesis that variability in expression will differ in environmentally driven cancers (tobacco smoking) versus virally driven cancers (HPV infection). Patient tumors were compared with adjacent normal tissues, which are considered to be "wild type," thus robust, whereas the tumor cells are considered to be those in which robustness is broken. In turn, we were able to apply our method to [HPV.sup.-] tumors from smoking patients and [HPV.sup.+] tumors from nonsmoking patients to identify genes that are highly variable within the context of smoking. Finally, we also applied the same approach to miRNA expression data to assess the loss of robustness at an epigenetic level.
MATERIALS AND METHODS
Patient Tissue Samples
Histologically confirmed OPSCC tumors were obtained by biopsy or surgical resection from patients undergoing treatment at Montefiore Medical Center in Bronx, New York. All patients provided written consent for participation in this study under a protocol approved by the Institutional Review Board at Montefiore Medical Center. Tumors and adjacent histologically normal tissue were snap frozen in liquid nitrogen within 30 minutes of surgical resection or biopsy. Samples were kept at -80[degrees]C until used for microarray analyses.
All tumors were tested for HPV DNA and RNA (including for HPV-16, the most prevalent type found in OPSCC) using polymerase chain reaction assays as described previously. (9) One sample was negative for HPV-16 but positive for HPV-35. Human papillomavirus type 35 is closely related phylogenetically to HPV-16. (29) Tumors testing positive for low-risk (nononcogenic) types were considered [HPV.sup.-]. The sample sizes used to analyze the messenger RNA (mRNA) and miRNA data sets are listed in Table 1.
mRNA Microarray Preparation
The protocol for the miRNA expression microarrays was previously described. (9) For each primary tumor and matched normal sample, total RNA (500 ng) was amplified and biotin labeled with the Illumina Total Prep RNA Amplification Kit (Ambion, Austin, Texas). Whole-genome expression was analyzed by RNA hybridization to the Illumina HumanHT-12-v3 Expression BeadChip (Illumina, San Diego, California). Probes were matched to known genes and alternative splice variants using the RefSeq database release 17 (www.ncbi.nlm.nih.gov/refseq/) and UniGene build 188 (www.ncbi.nlm.nih.gov/unigene). Controls for each RNA sample were used to confirm RNA quality, biotin labeling success, hybridization stringency, and signal levels. A total of 39 364 probes from the microarray were used for subsequent analyses.
Microarray expression values were quantile normalized within BeadStudio (Illumina) prior to analysis. Expression data were batch corrected using the ComBat function from the sva R package. (30) For each sample, the median expression values from 750 negative control genes were subtracted from the other expression values, and expression values less than or equal to 1 were converted to a value of 1. Only genes with 70% of their values greater than 1 were considered valid and included in subsequent analyses. Table 2 lists the final number of probes analyzed for each group.
Coefficients of Variation for mRNA Data
Coefficients of variation (CVs; [sigma]/[mu]) were calculated for each gene with valid expression across 70% of the samples. Coefficients of variation were plotted to detect differences in genetic robustness between tumor and normal tissues. The differentiation of CV(tumor) - CV(normal) and CV(normal) - CV(tumor) were used to identify potentially relevant genes in [HPV.sup.-] and [HPV.sup.+] tumor-normal pairs. The threshold for high CV values is case dependent but has been reported to be as high as 0.5 for gene expression studies. (31) We increased the threshold of the CV value for genes with highly variable expression to 1 to ensure that the expression levels of the genes selected were indeed highly variable. Then, the CVs of the genes were plotted for 20 [HPV.sup.-] and 6 [HPV.sup.+] tumor samples. A threshold of CV([HPV.sup.-]) greater than 1 and CV([HPV.sup.+]) less than 0.4 was used to identify 43 genes with expression levels that were highly variable only in [HPV.sup.-] samples (see Supplemental Figure 1 for threshold selection in the supplemental digital content at www.archivesofpathology.org in the November 2015 table of contents). To validate the relevancy of these 43 genes, an expression heat map was generated using the supraHex R package (32) (www.r-project.org). Patients were clustered according to a dendrogram at the top of the heat map using a Spearman correlation.
Weighted Gene Coexpression Network Analysis for mRNA Data
The Weighted Gene Co-Expression Network Analysis R package (33) was used to establish a coexpression network for selected genes. A typical workflow for Weighted Gene Co-Expression Network Analysis was used with a few modifications. (33,34) Briefly, for the 43 selected genes, we used a power adjacency matrix ([beta] = 6) to generate modules of coexpressed genes as described by Zhang and Horvath. (34) The selected genes were placed into 4 modules based on the Topological Overlap Matrix-based dissimilarity (DistTOM) of their expression. A DistTOM heat map and multidimensional scaling plot were generated to determine which modules would form networks of coexpression. Coexpression networks were visualized in VisANT (35) with a weight cutoff of 0.4 for the blue network and 0.5 for the red network. Gene ontology (GO) was then done using PANTHER (36) and EMBL-EBI QuickGO (http://www.ebi.ac.uk/QuickGO/, accessed January 12, 2015).
miRNA Microarray Preparation
The miRNA microarray protocol was previously described. (37) For each primary tumor and matched normal sample, total RNA (200 ng) was used for miRNA microarray analysis using the Illumina DASL (cDNA-mediated annealing, selection, extension, and ligation) in the 96-well-plate Sentrix Array Matrix format (Illumina). RNA quality was assessed using the Agilent Bioanalyzer (Agilent, Santa Clara, California). Results were analyzed within GenomeStudio (Illumina) using its miRNA labeling and hybridization quality controls. Seven hundred and thirty-nine miRNAs that satisfied quality control entries from the microarray were used for subsequent analyses.
CVs for miRNA Data
Coefficients of variation were calculated for each miRNA on the array. Like the mRNA data, CVs were plotted for tumor-normal pairs and for [HPV.sup.-] and [HPV.sup.+] tumors. Thresholds of CV(tumor)--CV(normal) greater than 1 and CV(normal)--CV(tumor) greater than 1 were used to determine the number of miRNAs with highly variable expression levels in each tumor-normal set. Thresholds of CV([HPV.sup.-]) greater than 1 and CV([HPV.sup.+]) less than 0.5 were used to select 18 miRNAs with expression levels that were highly variable only in the [HPV.sup.-] tumors. A literature search was performed for the selected miRNAs and those with published data were included in the results. TargetScan (38) and Target Explorer (Ingenuity, Redwood City, California) were used to generate a table of cellular functions for the remaining miRNAs based on their inferred targets.
To measure a loss of robustness in OPSCC samples, we examined gene expression. The mRNA microarrays provided a high-throughput mechanism for measuring gene expression across both [HPV.sup.-] and [HPV.sup.+] patient samples from the oropharynx. To test our hypothesis that there is a greater number of genes with high expression variance in tumors than in normal tissue, we analyzed the CVs for mRNA expression in [HPV.sup.-] and [HPV.sup.+] primary tumornormal pairs (see "Materials and Methods"). Patients were categorized by HPV and smoking status, resulting in 4 groups for analysis. Here we defined nonsmokers as patients who have never smoked tobacco. To visualize expression variation between tumor and normal tissues, the CVs for each gene were plotted, and genes with higher CVs in either tumor or normal tissues were identified (Figure 1, A through D, highlighted in red). The 2 patient smoking groups had similar sample sizes, with 15 tumor-normal pairs for [HPV.sup.-] patient samples and 16 pairs for [HPV.sup.+] patient samples. Likewise, the 2 nonsmoking groups had identical sample sizes, 4 tumor-normal pairs for both [HPV.sup.-] and [HPV.sup.+] patient samples. From our results, we observed a higher number of genes with variant expression in [HPV.sup.-] tumors compared with the matched normal tissue, 70 versus 27 genes for smokers (Figure 1, A) and 106 versus 8 genes for nonsmokers (Figure 1, B). Similarly, there were more genes with highly variable expression in [HPV.sup.+] tumors than in normal tissues, 45 versus 10 genes for smokers (Figure 1, C) and 55 versus 14 genes for nonsmokers (Figure 1, D). This confirmed our hypothesis that tumors exhibit more genes with a higher variance in expression than their matched histologically normal tissue. Interestingly, the [HPV.sup.-] and [HPV.sup.+] tumors from the smoking groups had a larger range of CV values, with a maximal CV value above 3, whereas the maximal CV value for the nonsmoking groups did not exceed 2. This translates to a higher degree of expression variation for tumors from smoking patients.
Additionally, there were about twice the number of genes with highly variable expression in [HPV.sup.-] tumors compared with [HPV.sup.+] tumors regardless of smoking status (70 versus 45 for the smoking patients and 106 versus 55 for the nonsmoking patients), which supports our second hypothesis that virally driven tumorigenesis is more controlled than carcinogen-driven tumorigenesis that is induced by random gene mutation.
To identify genes with highly variable expression in smoking-induced tumors only, the CVs of each gene were plotted for tumors from 20 [HPV.sup.-] smoker and 6 [HPV.sup.+] nonsmoker OPSCC patients (Figure 2, A). Because smoking added expression variation to the [HPV.sup.+] samples, the [HPV.sup.+] smoking patients were excluded from this analysis. Although the sample sizes between the 2 groups differed largely (20 versus 6), as we have shown with the [HPV.sup.-] nonsmokers (n = 4), the sample sizes did not affect the number of genes with highly variable expression in the tumor samples (Figure 1). Because we defined a CV of 1 as high expression variation, CVs greater than 1 in [HPV.sup.-] tumors from smokers and less than 0.4 in [HPV.sup.+] tumors from nonsmokers were considered to be highly variable in smokers only (see Supplemental Figure 1 for [HPV.sup.+] threshold selection). From this analysis, we identified 43 genes that showed high expression variation only in tumors from [HPV.sup.-] smokers. Expression values for the 43 genes with highly variable expression were then used to group tumor samples by hierarchical clustering (Figure 2, B). The [HPV.sup.+] samples (colored black) formed one cluster at the right end of the dendrogram, whereas the [HPV.sup.-] samples (colored red) were broken up into several clusters.
Because the genes with high expression variation in smokers with [HPV.sup.-] tumors potentially represent an important difference between [HPV.sup.-] and [HPV.sup.+] OPSCC, we analyzed the expression relationships among the selected genes. To gain insight into the possible relationship between the genes with highly variable expression, we used a gene coexpression network approach. This method identifies networks of genes that share expression levels. Our goal was to identify coexpression networks with hub genes that were highly coexpressed with the rest of the network. These hub genes would have the potential to be key players in the breakage of robustness within tumors from [HPV.sup.-] smokers. All 20 [HPV.sup.-] tumors from smokers were used to generate a dendrogram based on the expression values for the 43 selected genes (data not shown). One outlier was identified and removed from all subsequent analyses. A power adjacency matrix was generated for the remaining 19 tumors to establish coexpression relationships among the variably expressed genes. The matrix consists of the coefficient variations for all possible combinations of genes across the patients used in the analysis. The matrix was then raised to a [beta] value to provide weighted relationships, which better represents genetic data. (34) To visualize the coexpression networks, a DistTOM dendrogram was constructed, and the resulting clusters were placed into modules labeled by unique colors (Figure 3, A). Two major clusters were identified, shown as blue and red modules. To highlight the gene expression overlap within the identified clusters, a DistTOM heat map was generated (Figure 3, B). A multidimensional scaling plot was used as an additional method for visualizing the gene clusters (Figure 3, C). Based on a strong DistTOM overlap, as shown by dark coloring on the heat map and a scattering of genes on the multidimensional plot, the blue and red modules formed clear networks of coexpression. The gray module did not form a coexpression network based on a weak DistTOM overlap shown as light coloring on the heat map and the clustering of genes on the multidimensional scaling plot. The green cluster included 2 transcripts from the same gene and it was not considered for network analysis. The blue and red modules were then used to construct 2 coexpression networks, which are referred to as the blue and red networks for the remainder of this article.
Using the Weighted Gene Co-Expression Network Analysis package, (33) we identified 2 networks (the blue and red networks) of coexpression within the 43 selected genes. Once the networks were identified, we assessed the number of connections for each gene within its network to isolate hub genes. The blue coexpression network consisted of 16 variably expressed genes (see Supplemental Table 1 for GO) and included phosphatidylinositol glycan anchor biosynthesis class X (PIGX) as a potential hub gene based on a connection weight of 0.4 (Figure 4, A). The red coexpression network was slightly larger with 20 genes (see Supplemental Table 2 for GO) and included 2 potential hub genes, cysteine-rich PAK1 inhibitor (CRIPAK) and forkhead box P4 (FOXP4), based on a connection weight of 0.5 (Figure 4, B).
We also assessed whether similar associations with HPV and smoking existed for miRNA expression. Variable levels of typically stable miRNAs in normal tissues could have significant implications during tumorigenesis. As with the mRNA array, we assessed miRNA expression levels in [HPV.sup.-] and [HPV.sup.+] tumors (see "Materials and Methods"). Compared with the mRNA microarray, the miRNA array probed far fewer targets. The CVs for all the miRNAs were plotted for tumor and normal tissues and separated by patient HPV and smoking status. Because there were only 2 nonsmoking patients with [HPV.sup.-] tumors, they were not included in the analyses. The same parameters used for CV analysis in the mRNA data set were used to analyze the miRNA array data. The miRNAs that had highly variable expression in both tumor and normal tissue samples for each group were highlighted in red. For the smokers with [HPV.sup.-] tumors, there were 8 miRNAs that had high expression variation among the tumors compared with no miRNAs with highly variable expression in the normal tissue samples (Figure 5, A). Conversely, there were no miRNAs with highly variable expression in either the tumor or normal tissue for nonsmokers with [HPV.sup.+] tumors (Figure 5, B). However, for the smokers with [HPV.sup.+] tumors, there was 1 miRNA with high expression variation among the tumors and there were no miRNAs with highly variable expression in the normal tissue (Figure 5, C). Because there were more genes with high expression variation for the smokers, the CV analyses support our first hypothesis that tumors from smokers gain genetic variation during tumorigenesis. Secondly, because there were fewer miRNAs with highly variable expression in the [HPV.sup.+] tumors of both smokers and nonsmokers compared with the [HPV.sup.-] tumors of smokers, the miRNA data set may support our second hypothesis that virally driven tumorigenesis is genetically a more controlled process. More [HPV.sup.-] tumors from nonsmokers need to be analyzed to fully interpret these results.
As we did with the mRNA expression data, we next compared the [HPV.sup.-] tumors from smokers with the [HPV.sup.+] tumors from nonsmokers (Figure 5, D). The CVs for each miRNA were plotted and the miRNAs that had high expression variation among the [HPV.sup.-] tumors were highlighted in red (see "Materials and Methods" for the parameters used). As expected, there was an observable increase in the number of miRNAs that had highly variable levels of expression and an increase in the range of expression variation for [HPV.sup.-] tumors from smokers compared with the [HPV.sup.+] tumors from nonsmokers. Table 3 lists the 18 miRNAs that had highly variable levels of expression in [HPV.sup.-] tumors along with inferred or experimentally confirmed cellular functions. Unlike mRNAs, which are transcribed into a limited number of proteins, miRNAs can target multiple mRNAs in various pathways, resulting in a variety of potential mechanisms by which miRNAs can induce tumorigenesis.
We describe a unique model for studying cancer and tumorigenesis. The down-regulation of tumor suppressors and the upregulation of oncogenes have been the most frequently used method for studying cancer. The identification of tumor suppressors and oncogenes, such as Rb and Ras, respectively, led to the rapid progression of cancer research for tumors where these key regulators induced tumorigenesis. (39-41) However, many tumors do not exhibit alterations solely in tumor suppressors or oncogenes, but rather display multiple mutations in the genome. Alternative models of tumorigenesis and tumor progression have been proposed and studied extensively. One approach is focused on the transformation to a more "stem-cell-like" phenotype as a model for cancer progression, which looks for the expression of genes involved in development, motility, and invasion. (42-44) The stemlike model provides a mechanism for tumor invasion and metastasis, but does not begin to tackle the process of tumorigenesis itself. Additionally, cancer has been modeled through a different form of robustness, (45) here defined as the capacity of tumor cells to avoid apoptosis. (46-48) Although this approach may be useful for explaining the variance seen among tumors of the same type, it fails to address the mechanism by which normal cells lose robustness of their own complex genotype-phenotype maps. Here we look at cancer as the result of various mechanisms whereby the robustness of normal, nontumorigenic cells has been broken.
Robustness, which evolved as a result of multiple possible mechanisms, buffers genotypic alterations from manifesting at the phenotypic level. (16,20,21) That is, robustness allows for the harboring of cryptic genetic variation without disrupting the complex genotypic-phenotypic maps. (23,24) This enables individuals to exhibit genetic variation without compromising important cellular functions. Under certain stressful conditions, robustness may be lost, exposing the cryptic genetic variation at the phenotypic level. We argue that cancer goes through such a process, whereby increased genetic mutations result in a breakage of robustness leading to tumorigenesis. In our study, we consider gene expression as the most basic phenotype of genetic variation. Thus, we look for an increase in gene expression variation as a result of breakage of robustness. This hypothesis requires the accumulation of a certain number of key mutations to achieve malignant transformation. If a normal cell is primed to become a cancer cell through a different mechanism, such as viral infection by HPV, such loss of robustness may not occur as it does in carcinogen-driven cancers. During virally induced tumorigenesis, random mutations may be less likely because viral infection of the cell disrupts the cell cycle and apoptotic pathways consistently to promote its own survival. However, we do expect to see some degree of variation among genes from [HPV.sup.+] tumors, because both expression of viral genes and viral integration (49) may disrupt a number of important genes, thereby leading to a breakage of robustness. There are other models available to study virally driven tumorigenesis, such as HPV infection in the cervix, hepatitis B virus infection in the liver, and the role of Epstein-Barr virus infection during B-cell transformation. (25-27) In this study, we chose OPSCC as a model wherein both carcinogen- and virally induced tumorigenesis may occur to test 2 hypotheses: the first being that tumor cells experience a breakage of robustness, and the second that virally induced tumorigenesis is a more controlled process. We predicted that tumor samples would exhibit increased gene expression variation with respect to their adjacent histologically normal tissue. Furthermore, we predicted that smoking-related tumors would increase expression variability compared with virus-related tumors.
Through our analyses, we identified a number of genes with high expression variation among the [HPV.sup.-] tumors from smokers compared with [HPV.sup.+] tumors from nonsmokers. These genes were divided into networks of coexpression with hub genes, which potentially represent or interact with drivers of phenotypic variance during tumorigenesis. Because we did not include the full microarray for the coexpression network construction, it is likely that the hub genes may not be key drivers. Rather, the hubs may represent a downstream player of a key driver of tumorigenesis, or simply share transcription factor regulation with the rest of the network. Epigenetic mechanisms such as altered chromatin marks, DNA methylation, and/or miRNA levels may also be driving network construction. Because these hub genes may not be directly regulating the rest of the network, it is key to look at them within the context of their GO and pathways to understand their importance in cancer. One network included hub gene PIGX, a key player during glycosylphosphatidylinositol anchor synthesis. (50) The glycosylphosphatidylinositol anchor is a posttranslational lipid modification added to cell surface transmembrane proteins. (51) We propose that during tumorigenesis, variation in PIGX expression may perturb the cell signaling and cell-to-cell cross talk of important proteins that are unable to anchor into the plasma membrane. Another network included the hub gene CRIPAK, a regulator of cellular senescence through Pak1 inhibition. (52) We hypothesize that variation in CRIPAK expression may represent a mechanism for tumorigenesis that may occur through dysregulation of senescence. A second hub gene from this network is FOXP4, a transcription factor important during development. (53) FOXP4 expression by neural epithelial cells post-differentiation suppresses N-cadherin, (54) suggesting its importance for normal cellular functioning. N-cadherin expression is a marker for epithelial-mesenchymal transition, (55) and we think the variation in FOXP4 expression may represent the breakage of robustness of the epithelial phenotype. For both networks, the GO for the nonhub genes revealed a range of functions for apoptotic, proliferation, cell signaling, and structural genes. These genes warrant further study because their expression values were more variant in [HPV.sup.-] tumors from smokers and could be used as key markers to differentiate between carcinogen-induced and virally induced tumorigenesis.
As with gene expression, a number of miRNAs were identified to have high expression variation among the [HPV.sup.-] tumors from smokers compared with [HPV.sup.+] tumors from nonsmokers. We observed very little variation for miRNA expression among [HPV.sup.+] tumors regardless of smoking status, suggesting that miRNAs are under viral control. (56) However, we cannot speculate about smoking-induced miRNA expression variation in the absence of viral infection because of the lack of [HPV.sup.-] tumors from nonsmoking patients. There are many algorithms to predict potential miRNA targets based on the seed sequences; however, the only way to truly identify a real miRNA target or function is through experimental validation. (57) For this reason, we only discuss 2 miRNAs with highly variable expression that have been validated in the literature to date. One of the miRNAs with high expression variation, hsamiR-302b, has been shown to be a regulator of cell reprogramming and targets genes responsible for epithelial-mesenchymal transition. (58) Not only does this miRNA have an important role in tumorigenesis, its identification in our list of variant miRNAs also supports our approach. We propose that variant expression levels of hsa-miR-302b indicate a loss of robustness with respect to its targets and complex genotype-phenotype networks. The second miRNA, hsa-miR-376b, has been identified as a miRNA expressed by endothelial cells during hypoxia, (59) and more recently has been shown to regulate autophagy induced by starvation. (60) Hypoxia and metabolic starvation are both consequences of tumor growth. The other miRNA functions/interactions have only been inferred and not experimentally validated; however, a few may be relevant to the study of loss of robustness in tumorigenesis and warrant further exploration.
In addition to mRNA and miRNA expression, methylation patterns, copy number variation, and mutation rates for OPSCC should be considered for study using our approach to reveal the underlying mechanisms for gene expression alterations. Recently, a study published by the Cancer Genome Atlas Network (61) reported differences in copy number and somatic mutations between [HPV.sup.-] and [HPV.sup.+] head and neck squamous cell carcinomas. The Cancer Genome Atlas Network findings also revealed differential methylation patterns, copy number variation, and somatic mutations between [HPV.sup.+] head and neck squamous cell carcinomas with integrated and nonintegrated genomes, which may indicate varying mechanisms of HPV-driven tumorigenesis. (61,62) Finally, although we identified genes and pathways that have not been previously associated with a loss of robustness in cancer cells, these findings need to be corroborated experimentally and additional analyses of what might have contributed to increased gene expression variation need to be done. As we have seen in yeast knockouts, not all disruptions result in increased phenotypic variation. (21) What would be required for a better understanding would be to identify the characteristics of genes that increase phenotypic variation when either mutated or disrupted epigenetically. Characteristics, in this case, are defined as the biological functions, downstream targets, and position in the network hierarchy that are relevant for the progression of cancer. Once identified, these characteristics may reveal novel underlying driving mechanisms for the cancer phenotype.
This work is supported in part by National Institute of Dental & Craniofacial Research and National Cancer Institute grants (CA131648, CA115243, and DE023941 to Drs Schlecht and Belbin and 1-R01-AG028872-01A1 to Dr Bergman).
Please Note: Illustration(s) are not available due to copyright restrictions.
(1.) Jemal A, Bray F, Center MM, Ferlay J, Ward E, Forman D. Global cancer statistics. CA Cancer J Clin. 2011; 61(2):69-90.
(2.) Jaber MA, Porter SR, Gilthorpe MS, Bedi R, Scully C. Risk factors for oral epithelial dysplasia--the role of smoking and alcohol. Oral Oncol. 1999; 35(2): 151-156.
(3.) Pytynia KB, Dahlstrom KR, Sturgis EM. Epidemiology of HPV-associated oropharyngeal cancer. Oral Oncol. 2014; 50(5):380-386.
(4.) Kreimer AR, Clifford GM, Boyle P, Franceschi S. Human papillomavirus types in head and neck squamous cell carcinomas worldwide: a systematic review. Cancer Epidemiol Biomarkers Prev. 2005; 14(2):467-475.
(5.) Vidal L, Gillison ML. Human papillomavirus in HNSCC: recognition of a distinct disease type. Hematol Oncol Clin North Am. 2008; 22(6):1125-1142, vii.
(6.) Schlecht Nf, Burk RD, Adrien L, et al. Gene expression profiles in HPV-infected head and neck cancer. J Pathol. 2007; 213(3):283-293.
(7.) Troy JD, Weissfeld JL, Youk AO, Thomas S, Wang L, Grandis JR. Expression of EGFR, VEGF, and NOTCH1 suggest differences in tumor angiogenesis in HPV-positive and HPV-negative head and neck squamous cell carcinoma. Head Neck Pathol. 2013; 7(4):344-355.
(8.) van Kempen PM, Noorlag R, Braunius WW, Stegeman I, Willems SM, Grolman W. Differences in methylation profiles between HPV-positive and HPV-negative oropharynx squamous cell carcinoma: a systematic review. Epigenetics. 2014; 9(2):194-203.
(9.) Lleras RA, Smith RV, Adrien LR, et al. Unique DNA methylation loci distinguish anatomic site and HPV status in head and neck squamous cell carcinoma. Clin Cancer Res. 2013; 19(19):5444-5455.
(10.) Lajer CB, Garnaes E, Friis-Hansen L, et al. The role of miRNAs in human papilloma virus (HPV)-associated cancers: bridging between HPV-related head and neck cancer and cervical cancer. Br J Cancer. 2012; 106(9):1526-1534.
(11.) Bol V, Gregoire V. Biological basis for increased sensitivity to radiation therapy in HPV-positive head and neck cancers. Biomed Res Int. 2014; 2014: 696028.
(12.) Mirghani H, Amen F, Blanchard P, et al. Treatment de-escalation in HPV-positive oropharyngeal carcinoma: ongoing trials, critical issues and perspectives. Int J Cancer. 2015; 136(7):1494-1503.
(13.) Salazar CR, Anayannis N, Smith RV, et al. Combined P16 and human papillomavirus testing predicts head and neck cancer survival. IntJ Cancer. 2014; 135(10):2404-2412.
(14.) Nygard M, Aagnes B, Bray F, Moller B, Mork J. Population-based evidence of increased survival in human papillomavirus-related head and neck cancer. Eur J Cancer. 2012; 48(9):1341-1346.
(15.) Sethi S, Ali-Fehmi R, Franceschi S, et al. Characteristics and survival of head and neck cancer by HPV status: a cancer registry-based study. Int J Cancer. 2012; 131 (5):1179-1186.
(16.) Masel J, Siegal ML. Robustness: mechanisms and consequences. Trends Genet. 2009; 25(9):395-403.
(17.) Waddington CH. Canalization of development and the inheritance of acquired characters. Nature. 1942(150):563-565.
(18.) Waddington CH. The Strategy of the Genes. London, England: Allen & Unwin; 1957.
(19.) Waddington CH. Evolutionary systems--animal and human. Nature. 1959(183):1634-1638.
(20.) Siegal ML, Bergman A. Waddington's canalization revisited: developmental stability and evolution. Proc Natl Acad Sci U S A. 2002; 99(16):10528-10532.
(21.) Bergman A, Siegal ML. Evolutionary capacitance as a general feature of complex gene networks. Nature. 2003; 424(6948):549-552.
(22.) Belbin TJ, Bergman A, Brandwein-Gensler M, et al. Head and neck cancer: reduce and integrate for optimal outcome. Cytogenet Genome Res. 2007; 118(24):92-109.
(23.) Paaby AB, Rockman MV. Cryptic genetic variation: evolution's hidden substrate. Nat Rev Genet. 2014; 15(4):247-258.
(24.) Gibson G, Dworkin I. Uncovering cryptic genetic variation. Nat Rev Genet. 2004; 5(9):681-690.
(25.) Hawley-Nelson P, Vousden KH, Hubbert NL, Lowy DR, Schiller JT. HPV16 E6 and E7 proteins cooperate to immortalize human foreskin keratinocytes. EMBO J. 1989; 8(12):3905-3910.
(26.) Paterlini-Brechot P, Saigo K, Murakami Y, et al. Hepatitis B virus-related insertional mutagenesis occurs frequently in human liver cancers and recurrently targets human telomerase gene. Oncogene. 2003; 22(25):3911-3916.
(27.) Price AM, Luftig MA. Dynamic Epstein-Barr virus gene expression on the path to B-cell transformation. Adv Virus Res. 2014; 88:279-313.
(28.) Chen Y, Williams V, Filippova M, Filippov V, Duerksen-Hughes P. Viral carcinogenesis: factors inducing DNA damage and virus integration. Cancers (Basel). 2014; 6(4):2155-2186.
(29.) Chen Z, Freitas LB, Burk RD. Evolution and classification of oncogenic human papillomavirus types and variants associated with cervical cancer. Methods Mol Biol. 2015; 1249:3-26.
(30.) Leek JT, Johnson WE, Parker HS, Fertig EJ, Jaffe AE. Storey JD. sva: Surrogate Variable Analysis. R package version 3.12.0.
(31.) Sultan M, Piccini I, Balzereit D, et al. Gene expression variation in Down's syndrome mice allows prioritization of candidate genes. Genome Biol. 2007; 8(5): R91.
(32.) Fang H, Gough J. supraHex: an R/Bioconductor package for tabular omics data analysis using a supra-hexagonal map. Biochem Biophys Res Commun. 2014; 443(1): 285-289.
(33.) Langfelder P, Horvath S. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics. 2008; 9:559.
(34.) Zhang B, Horvath S. A general framework for weighted gene co-expression network analysis. Stat Appl Genet Mol Biol. 2005; 4: Article 17.
(35.) Hu Z, Mellor J, Wu J, DeLisi C. VisANT: an online visualization and analysis tool for biological interaction data. BMC Bioinformatics. 2004; 5:17.
(36.) Thomas PD, Campbell MJ, Kejariwal A, et al. PANTHER: a library of protein families and subfamilies indexed by function. Genome Res. 2003; 13(9): 2129-2141.
(37.) Harris T, Jimenez L, Kawachi N, et al. Low-level expression of miR-375 correlates with poor outcome and metastasis while altering the invasive properties of head and neck squamous cell carcinomas. Am j Pathol. 2012; 180(3):917-928.
(38.) Lewis BP, Burge CB, Bartel DP. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 2005; 120(1):15-20.
(39.) Murphree AL, Benedict WF. Retinoblastoma: clues to human oncogenesis. Science. 1984; 223 (4640):1028-1033.
(40.) Parada LF, Tabin CJ, Shih C, Weinberg RA. Human EJ bladder carcinoma oncogene is homologue of Harvey sarcoma virus ras gene. Nature. 1982; 297(5866):474-478.
(41.) Weinberg RA. Oncogenes and tumor suppressor genes. CA Cancer j Clin. 1994; 44(3):160-170.
(42.) Ksiazkiewicz M, Markiewicz A, Zaczek AJ. Epithelial-mesenchymal transition: a hallmark in metastasis formation linking circulating tumor cells and cancer stem cells. Pathobiology. 2012; 79(4):195-208.
(43.) Chen C, Zimmermann M, Tinhofer I, Kaufmann AM, Albers AE. Epithelialto-mesenchymal transition and cancer stem(-like) cells in head and neck squamous cell carcinoma. Cancer Lett. 2013; 338(1):47-56.
(44.) Lamouille S, Xu J, Derynck R. Molecular mechanisms of epithelialmesenchymal transition. Nat Rev Mol Cell Biol. 2014; 15(3):178-196.
(45.) Kaneko K. Characterization of stem cells and cancer cells on the basis of gene expression profile stability, plasticity, and robustness: dynamical systems theory of gene expressions under cell-cell interaction explains mutational robustness of differentiated cells and suggests how cancer cells emerge. Bioessays. 2011; 33(6):403-413.
(46.) Tian T, Olson S, Whitacre JM, Harding A. The origins of cancer robustness and evolvability. Integr Biol. 2011; 3(1):17-30.
(47.) Masuda M, Toh S, Wakasaki T, Suzui M, Joe AK. Somatic evolution of head and neck cancer--biological robustness and latent vulnerability. Mol Oncol. 2013; 7(1):14-28.
(48.) Radisavljevic Z. AKT as locus of cancer positive feedback loops and extreme robustness. J Cell Physiol. 2013; 228(3):522-524.
(49.) Akagi K, Li J, Broutian TR, et al. Genome-wide analysis of HPV integration in human cancers reveals recurrent, focal genomic instability. Genome Res. 2014; 24(2):185-199.
(50.) Ashida H, Hong Y, Murakami Y, et al. Mammalian PIG-X and yeast Pbn1p are the essential components of glycosylphosphatidylinositol-mannosyltransferase I. Mol Biol Cell. 2005; 16(3):1439-1448.
(51.) Chatterjee S, Mayor S. The GPI-anchor and protein sorting. Cell Mol Life Sci. 2001; 58(14):1969-1987.
(52.) Talukder AH, Meng Q, Kumar R. CRIPak, a novel endogenous Pak1 inhibitor. Oncogene. 2006; 25(9):1311-1319.
(53.) Teufel A, Wong EA, Mukhopadhyay M, Malik N, Westphal H. FoxP4, a novel forkhead transcription factor. Biochim Biophys Acta. 2003; 1627(2-3):147-152.
(54.) Rousso DL, Pearson CA, Gaber ZB, et al. Foxp-mediated suppression of Ncadherin regulates neuroepithelial character and progenitor maintenance in the CNS. Neuron. 2012; 74(2):314-330.
(55.) Hazan RB, Phillips GR, Qiao RF, Norton L, Aaronson SA. Exogenous expression of N-cadherin in breast cancer cells induces cell migration, invasion, and metastasis. J Cell Biol. 2000; 148(4):779-790.
(56.) Salazar C, Calvopina D, Punyadeera C. miRNAs in human papilloma virus associated oral and oropharyngeal squamous cell carcinomas. Expert Rev Mol Diagn. 2014; 14(8):1033-1040.
(57.) Kuhn DE, Martin MM, Feldman DS, Terry AV Jr, Nuovo GJ, Elton TS. Experimental validation of miRNA targets. Methods. 2008; 44(1):47-54.
(58.) Subramanyam D, Lamouille S, Judson RL, et al. Multiple targets of miR-302 and miR-372 promote reprogramming of human fibroblasts to induced pluripotent stem cells. Nat Biotechnol. 2011; 29(5):443-448.
(59.) Voellenkle C, Rooij J, Guffanti A, et al. Deep-sequencing of endothelial cells exposed to hypoxia reveals the complexity of known and novel microRNAs. RNA. 2012; 18(3):472-484.
(60.) Korkmaz G, le Sage C, Tekirdag KA, Agami R, Gozuacik D. miR-376b controls starvation and mTOR inhibition-related autophagy by targeting ATG4C and BECN1. Autophagy. 2012; 8(2):165-176.
(61.) Cancer Genome Atlas Network. Comprehensive genomic characterization of head and neck squamous cell carcinomas. Nature. 2015; 517(7536):576-582.
(62.) Parfenov M, Pedamallu CS, Gehlenborg N, et al. Characterization of HPV and host genome interactions in primary head and neck cancers. Proc Natl Acad Sci USA. 2014; 111 (43):15544-15549.
Miriam M. Ben-Dayan, MS; Thomas MacCarthy, PhD; Nicolas F. Schlecht, PhD; Thomas J. Belbin, PhD; Geoffrey Childs, PhD; Richard V. Smith, MD; Michael B. Prystowsky, MD, PhD; Aviv Bergman, PhD
Accepted for publication March 13, 2015.
Published as an Early Online Release July 1, 2015.
Supplemental digital content is available for this article at www archivesofpathology.org in the November 2015 table of contents.
From the Departments of Pathology (Ms Ben-Dayan and Drs Belbin, Childs, and Prystowsky), Epidemiology and Population Health (Dr Schlecht), and Computational and Systems Biology (Dr Bergman), Albert Einstein College of Medicine, Bronx, New York; the Department of Applied Mathematics and Statistics, SUNY Stony Brook, Stony Brook, New York (Dr MacCarthy); and the Department of Otorhinolaryngology, Montefiore Medical Center, Bronx, New York (Dr Smith).
The authors have no relevant financial interest in the products or companies described in this article.
Reprints: Aviv Bergman, PhD, Computational and Systems Biology, Albert Einstein College of Medicine, Michael F. Price Center, 1301 Morris Park Ave, Room 153C, Bronx, NY 10461 (email: firstname.lastname@example.org).
Caption: Figure 1. Tumor messenger RNA (mRNA) expression variation in human papillomavirus-negative ([HPV.sup.-]) and human papillomavirus-positive ([HPV.sup.+]) patients. Scatterplots representing individual mRNA coefficients of variation (CVs) for tumors and matched normal tissue. Genes with a CV > 1 both for tumors and for normal tissues are colored red. A, Plot of CVs for 15 paired [HPV.sup.-] tumors from smokers. Seventy genes had greater CVs in tumor samples compared with 27 genes in the corresponding normal tissues, suggesting a greater amount of genes with high expression variation among tumors. B, Plot of CVs for 4 paired [HPV.sup.-] tumors from nonsmokers. Again, there is a greater amount of genes with high expression variation among the tumors, as demonstrated by 106 variably expressed genes in the tumors and only 8 variably expressed genes in the normal tissue. C, Plot of CVs for 16 paired [HPV.sup.+] tumors from smokers. There were 45 variably expressed genes in the tumors compared with 10 variably expressed genes in the normal tissue. D, Plot of CVs for 4 paired [HPV.sup.+] tumors from nonsmokers. Similarly, 55 genes were variant in the tumors compared with 14 genes in the matched normal tissue.
Caption: Figure 2. Greater degree of messenger RNA (mRNA) expression variation in human papillomavirus-negative ([HPV.sup.-]) tumors from smokers compared with human papillomavirus-positive ([HPV.sup.+]) tumors from nonsmokers. A, Scatterplot representing individual mRNA coefficients of variation for 20 [HPV.sup.-] tumors from smokers and 6 [HPV.sup.+] tumors from nonsmokers. Forty-three genes with high expression variation only in [HPV.sup.-] tumors are colored red. B, Heat map representing the expression of the 43 genes with high expression variation in [HPV.sup.-] tumors. The patient sample dendrogram is above the heat map. The 20 [HPV.sup.-] tumors from smokers are denoted with a red bar, and the 6 [HPV.sup.+] tumors from nonsmokers are indicated by the black bar. The expression values were log2 transformed for the heat map, and the gray scale was divided into 6 distinct colors to highlight the extreme differences in expression within the 2 groups of tumors.
Caption: Figure 3. Weighted Gene Co-Expression Network (WGCNA) construction. Network construction using the WGCNA R package. (33) A, Topological Overlap Matrix-based dissimilarity cluster dendrogram of genes with high expression variation in 19 human papillomavirus-negative ([HPV.sup.-]) tumors from smokers. Gene clusters are assigned to colored modules (blue, gray, red, and green). B, Heat map representation of the clusters with module colors. Dark red signifies high network overlap between 2 genes. C, Multidimensional scaling visualization of the colored modules. Scattered modules represent strong coexpression networks. Ends of the scattered modules represent hub genes (ie, the genes with the most connections).
Caption: Figure 4. Weighted Gene Co-Expression Networks. Expression networks were constructed for the blue and red modules. Genes that are colored black are the hub genes (the genes with the most connections within the network). A, The network representation of the blue module. A connection weight cutoff of 0.4 was used within the VisANT software (55) to visualize the genes with the most connections and identify the hub gene, phosphatidylinositol glycan anchor biosynthesis class X (PIGX). B, The network representation of the red module. A connection weight cutoff of 0.5 was used to identify 2 hub genes, cysteine-rich PAK1 inhibitor (CRIPAK) and forkhead box P4 (FOXP4).
Caption: Figure 5. Increased microRNA (miRNA) expression in human papillomavirus-negative ([HPV.sup.-]) tumors from smokers. A through C, Scatterplots representing individual miRNA coefficients of variation (CVs) for tumors and matched normal tissue. Genes with CV greater than 1 for both tumors and normal tissues are colored red. No [HPV.sup.-] tumors from nonsmokers are shown because of a sample size of 2. A, Plot of CVs for 14 paired [HPV.sup.-] tumors from smokers. Eight miRNAs had greater CVs in tumor samples compared with no miRNAs in the corresponding normal tissue, suggesting a greater amount of expression variation among tumor miRNAs. B, Plot of CVs for 3 paired human papillomavirus-positive ([HPV.sup.+]) tumors from nonsmokers. Conversely, there were no miRNAs with high expression variation in the [HPV.sup.+] tumors from the nonsmoking group. C, Plot of CVs for 18 paired [HPV.sup.+] smokers. Only 1 miRNA had a high CV in the tumors, and there were no miRNAs with highly variable expression in the normal tissue. D, Scatterplot representing individual miRNA CVs for 14 [HPV.sup.-] tumors from smokers and 3 [HPV.sup.+] tumors from nonsmokers. Eighteen miRNAs with high expression variation in [HPV.sup.-] tumors from smokers are colored red.
Table 1. Sample Sizes (a) [HPV.sup.-] smokers mRNA array No. of tumors 20 No. of tumors with matched normal tissue 15 miRNA array No. of tumors with matched normal tissue 14 [HPV.sup.-] nonsmokers mRNA array No. of tumors 4 No. of tumors with matched normal tissue 4 [HPV.sup.+] smokers mRNA array No. of tumors 21 No. of tumors with matched normal tissue 16 miRNA array No. of tumors with matched normal tissue 18 [HPV.sup.+] nonsmokers mRNA array No. of tumors 6 No. of tumors with matched normal tissue 4 miRNA array No. of tumors with matched normal tissue 3 Abbreviations: HPV, human papillomavirus; miRNA, microRNA; mRNA, messenger RNA. (a) There were not enough [HPV.sup.-] tumors from nonsmokers to do a coefficient of variation analysis for the miRNA array. Table 2. Percentage of Probes Used for Messenger RNA (mRNA) Analyses No. of % of Group Probes Total Probes [HPV.sup.-] smokers (T-N) 15 670 40 [HPV.sup.-] nonsmokers (T-N) 16 784 43 [HPV.sup.+] smokers (T-N) 16 167 41 [HPV.sup.+] nonsmokers (T-N) 16 700 42 [HPV.sup.-] smokers and [HPV.sup.+] nonsmokers 14 285 36 Abbreviations: HPV, human papillomavirus; miRNA, microRNA; N, normal; T, tumor. A total of 39 364 probes (not including controls) were used for the mRNA analyses. Table 3. Functions and Interactions for 18 MicroRNAs (miRNAs) With High Expression Variation in Human Papillomavirus-Negative Tumors From Nonsmoking Patients miRNA Cellular Function/Interactions HS_240 Implicated in esophageal carcinoma hsa-miR-122 (a) Liver-specific miRNA hsa-miR-154 (a) ZNFs are inferred targets hsa-miR-302a G1/S phase transition hsa-miR-302a (a) Nonfunctional hsa-miR-302b Epithelial-mesenchymal transition (57) hsa-miR-302d G1/S phase transition hsa-miR-376b Expressed by endothelial cells exposed to hypoxia (58) hsa-miR-431 Binds CD81 (signal transduction) hsa-miR-506 Binds CDK6 (phosphorylates Rb) hsa-miR-507 Colony formation, invasion, growth, and apoptosis (inferred) hsa-miR-508 MBD1, p53, and Myc are inferred targets hsa-miR-509 Binds NTRK3 (neuronal differentiation and survival) hsa-miR-510 Binds SPDEF (SAM pointed domain containing ETS transcription factor) hsa-miR-513 Association with skin cancer (inferred) hsa-miR-514 Association with skin cancer (inferred) hsa-miR-539 Colony formation, invasion, growth, and apoptosis (inferred) hsa-miR-551b DNMT3A, TRAF1, and E2F6 are inferred targets Abbreviations: CD81, cluster of differentiation 81; CDK6, cyclin-dependent kinase 6; DNMT3A, DNA (cytosine-5)-methyltransferase 3A; E2F6, transcription factor E2F6; ETS, E26 transformation specific; MBD1, methyl-CpG-binding domain protein 1; NTRK3, neurotrophic tyrosine kinase receptor, type 3; p53, tumor protein 53; SAM, serine alpha motif; TRAF1, tumor necrosis factor receptor-associated factor 1; ZNF, zinc finger protein. (a) Less prominent form of the miRNA.