Printer Friendly

Global gene expression profiling of a population exposed to a range of benzene levels.

BACKGROUND: Benzene, an established cause of acute myeloid leukemia (AML), may also cause one or more lymphoid malignancies in humans. Previously, we identified genes and pathways associated with exposure to high (< 10 ppm) levels of benzene through transcriptomic analyses of blood cells from a small number of occupationally exposed workers.

OBJECTIVES: The goals of this study were to identify potential biomarkers of benzene exposure and/or early effects and to elucidate mechanisms relevant to risk of hematotoxicity, Leukemia, and lymphoid malignancy in occupationally exposed individuals, many of whom were exposed to benzene levels > 1 ppm, the current U.S. occupational standard.

METHODS: We analyzed global gene expression in the peripheral blood mononuclear cells of 125 workers exposed to benzene levels ranging from > 1 ppm to < 10 ppm. Study design and analysis with a mixed-effects model minimized potential confounding and experimental variability.

RESULTS: We observed highly significant widespread perturbation of gene expression at all exposure levels. The AML pathway was among the pathways most significantly associated with benzene exposure. Immune response pathways were associated with most exposure levels, potentially providing biological plausibility for an association between lymphoma and benzene exposure. We identified a 16-gene expression signature associated with all levels of benzene exposure.

CONCLUSIONS: Our findings suggest that chronic benzene exposure, even at levels below the current U.S. occupational standard, perturbs many genes, biological processes, and pathways. These findings expand our understanding of the mechanisms by which benzene may induce hematotoxicity, leukemia, and lymphoma and reveal relevant potential biomarkers associated with a range of exposures.

Key WORDS: benzene, biomarker, human, microarray, transcriptomics. Environ Health Perspect 119:628-634(2011). doi:10.1289/ehp.1002546 [Online 13 December 2010]

Benzene is an established cause of acute myeloid leukemia (AML) and myelodysplastic syndromes, and is a probable cause of lymphocytic malignancies (Baan et al. 2009; Vlaanderen et al. 2010), including non-Hodgkin lymphoma (NHL) in humans, as recently reviewed by Smith (2010). Benzene is also hematotoxic, even at relatively tow levels of exposure (Lan et al. 2004). Possible mechanisms underlying these pathologies include the generation of free radicals leading to oxidative stress, immune system dysfunction, and decreased immune surveillance (Smith 2010). Studies of global gene expression in the bone marrow of very highly exposed mice have revealed additional potential mechanisms of benzene toxicity (Faiola et al. 2004; Yoon et al. 2003), but their relevance to risk in occupationally exposed individuals is uncertain. Toxicogenomic studies of exposed human populations are an important alternative approach to the human health risk assessment of environmental exposures. Such studies that have examined environmental exposures have identified potential biomarkers of early effects and revealed potential mechanisms underlying associated diseases (McHale et al. 2010). However, these studies have been of limited size, have mainly addressed high levels of exposure, and have often lacked precise, individual estimates of exposure. Further, such studies are limited by confounding effects and laboratory variation, especially at low doses.

We previously compared global gene expression in the peripheral blood mononuclear cell (PBMC) fractions of six to eight pairs of unexposed controls and workers exposed to high levels of benzene (< 10 ppm) and identified potential biomarkers of exposure and mechanisms of toxicity (Forrest et al. 2005; McHale et al. 2009). We chose PBMCs because they are widely used in human toxicogenomic studies. As an extension of these earlier studies, here we sought to identify potential gene expression biomarkers of exposure and early effects, as well as mechanisms of toxicity, in 125 individuals occupationally exposed to a range of benzene levels, including > 1 ppm, the current U.S. occupational standard (Occupational Safety and Health Administration 1987). In the cross-sectional molecular epidemiological study population, which includes the 125 individuals analyzed here, we previously found that white blood cell counts were decreased in workers exposed to > 1 ppm benzene compared with controls and that a highly significant dose-response relationship was present (Lan et al. 2004), with no apparent threshold within the occupational exposure range (0.2-75 ppm benzene) (Lan et al. 2006). We employed a rigorous study design that included randomization of samples across experimental variables, incorporation of precise individual measurements of exposure, and analysis with a mixed-effects model, with the aim of removing sources of biological and experimental variability (nuisance variability).

Materials and Methods

Study subjects and exposure assessment. All subjects were from a molecular epidemiology study of occupational exposure to benzene that comprised 250 benzene-exposed shoe manufacturing workers and 140 unexposed age-and sex-matched controls who worked in three clothes-manufacturing factories in the same region near Tianjin, China (Lan et al. 2004; Vermeulen et al. 2004). This study complied with all applicable requirements of U.S. and Chinese regulations, including institutional review board approval. Participation was voluntary, and written informed consent was obtained.

Exposure assessment to benzene was performed as described previously (Vermeulen et al. 2004). For this study, we categorized exposure groups using mean individual air benzene measurements obtained during the 3 months preceding phlebotomy. A subgroup of subjects was selected from each benzene exposure category as follows: 13 workers with very high exposure (> 10 ppm), 11 workers with high exposure (5-10 ppm), 30 workers with low exposure (< 1 ppm; average < 1 ppm), and 29 workers with very low exposure (<< 1 ppm; average < 1 ppm, with most individual measurements < 1 ppm) (Table 1). We previously reported that urinary benzene and mean individual air levels of benzene were strongly correlated (Spearman r = 0.88, p < 0.0001) in the epidemiological study population (Lan et al. 2004). Among the individuals with occupational exposure to benzene in the present study for which urinary benzene levels were available (n = 82), a similar correlation was noted (Spearman r = 0.76, p < 0.0001). A group of 42 unexposed controls were frequency matched to the exposed subjects on the basis of age and sex. Mean age ([+ or -] SD) was 29.5 [+ or -] 8.7 years for the 83 exposed workers and 29.5 [+ or -] 8.2 years for the controls.

Biological sample collection was described previously (Forrest et al. 2005; Vermeulen et al. 2004). We transferred field-stabilized samples on dry ice. We isolated RNAs using the mirVana miRNA (microRNA) isolation kit (Applied Biosystems, Austin, TX, USA), stored them in aliquots at -80[degrees]C, and thawed them immediately before microartay analysis. All RNA samples analyzed had absorbance ratios for A260:A280 and A260:A230 between 1.7 and 2.1, and we confirmed integrity by the presence of sharp 28S and 18S rRNA bands and a ratio of 28S:18S intensity of approximately 2:1 after denaturing gel electrophoresis.

Microarray study design and analysis. We randomized samples, and thus exposure groups, across labeling and hybridization reactions and across chips as uniformly as possible [see Supplemental Material, Table 1 (doi:10.1289/ehp.1002546)]. Technical replicates (n = 19), randomly chosen from among the 125 study subject samples, were included in the study to assess variability in the labeling, hybridization, and chip steps of the microarray procedure. We labeled samples (200 ng) in batches of 24 using the Illumina RNA Amplification kit (Ambion, Austin, TX, USA) and hybridized them to Illumina HumanRef-8 V2 BeadChips in batches of 32 (four chips) following the manufacturer's protocol. All sample processing was performed in a blinded manner.
Table 1. Characteristics of study subjects.

Benzene       Subjects  Air         WBC count
exposure                benzene

category      (n)       (ppm) (a)   (per [mu]L    Age
(ppm)                               blood)        (years)

Control (--)  42        < 0.04 (b)  6454.8 [+ or  29.5 [+ or
                                    -] 1746.5     -] 8.2

Very low      29        0.3 [+ or   5524.1 [+ or  30.3 [+ or
([much less             -] 0.9      -] 1369.2     -] 9.2
than]1) (c)

Low (< 1)     30        0.8 [+ or   5510.0 [+ or  27.9 [+ or
(d)                     -] 0.8      -] 1170.7     -] 7.2

High (5-10)   11        7.2 [+ or   5418.2 [+ or  29.7 [+ or
                        -] 1.3      -] 1376.8     -] 9.1

Very high (>  13        24.7 [+ or  5176.9 [+ or  30.9 [+ or
10)                     -] 15.7     -] 1326.8     -] 10.5

Benzene           Sex [n (%)]       Currently smoking [n (%)]

category      Male         Female   Yes               No

Control (--)  17 (33)      25 (34)  9 (35)            33 (33)

Very low      8 (16)       21 (28)  6 (23)            23 (23)
([much less
than]1) (c)

Low (< 1)     19 (37)      11 (15)  5 (19)            25 (25)

High (5-10)   1 (2)        10 (14)  1 (4)             10 (10)

Very high (>  6 (12)       7 (9)    5 (19)            8 (8)

WBC, white blood cell. Values for air benzene, WBC count, and age
are mean [+ or -] SD.
(a) Air benzene level in the 3 months preceding phlebotomy. (b) The
limit of detection for benzene was 0.04 ppm (Lan et al. 2004).
(c) The average level of benzene was < 1 ppm and dosimetry levels
were < 1 ppm at most measurements in the 3 months preceding phlebotomy
and at all measurements in the prior month. (d) The average level of
benzene was < 1 ppm (in the 3 months preceding phlebotomy) but
dosimetry levels were not always < 1 ppm in the previous 3 months.

Data analysis. We conducted variance components analysis using a linear mixed model (Laird and Ware 1982) to assess the proportion of total variation due to variation between subjects, hybridizations, labels, and chips, both before and after normalization [quantile normalization in the affy package (Gamier et al. 2004) in R (R Development Core Team 2010)]. For each probe, we estimated the association between exposure level and expression level using a mixed-effects model with random intercepts that accounted for clustering by subject, hybridization, and label. The fixed effects in our model, in addition to benzene exposure level, included sex (1 = male, 0 = female), current smoking status (1 = yes, 0 = no), and age (in years, linear term) as potential confounders of" associations between gene expression and benzene exposure. We fitted the mixed-effects model in .R with the Imer function in the lme4 package (Bates and Maechler 2010). We identified differentially expressed probes as those with a statistically significant log-fold change (based on likelihood ratio tests). We computed p-values adjusted for multiple testing by controlling the false discovery rate (FDR) with the Benjamini-Hochberg procedure (Benjamini and Hochberg 1995), using the multtest package in R. These values are FDR-adjusted p-values and were considered significant if they were [less than or equal to] 0.05, the traditional experiment-wise type I error rate. The raw data discussed here have been deposited in the National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) (Edgar et al. 2002) and are accessible through the GEO database (accession number GSE21862; NCBI 2002).

Pathway analysis. We imported microarray probe IDs into Pathway Studio software (Ariadne Genomics, Rockville, MD, USA), and queried the ResNet 7.0 database (Ariadne Genomics) for interactions among genes and gene products derived from the current literature (Nikitin et al. 2003). We also used a method known as "structurally enhanced pathway enrichment analysis" (SEPEA_NT3) (Thomas et al. 2009), which incorporates the associated network information of KEGG (Kyoto Encyclopedia of Genes and Genomes) biochemical pathways (Kanehisa and Goto 2000; Kyoto Encyclopedia of Genes and Genomes 2000). KEGG pathways are manually drawn pathway maps representing current knowledge on the molecular interaction and reaction networks involved in cellular processes such as merabolism and the cell cycle.

Gene Ontology (GO) analysis. The GO project (The Gene Ontology Consortium 2000) provides an ontology of defined terms representing gene product properties in the domains, cellular components, molecular functions, and biological processes. GO has a hierarchical structure that forms a directed acyclic graph in which each term has defined relationships to one or more other terms in the same domain, which can be described as parent-child relationships. Every GO term is represented by a node in this graph, and the nodes are annotated with a set of genes. We used TopGO (topology-based GO scoring; Bioconductor 2010) to calculate the significance of biological terms from gene expression data taking the GO structure into account (Alexa et al. 2006). We used the "elim" algorithm, which differs from standard GO analyses in that it eliminates genes from parent nodes that are members of "significant" child nodes. The elim score is the p-value returned by Fisher's exact test, and a node is marked as significant if the p-value is smaller than a previously defined threshold (Alexa et al. 2006). Typically this threshold is set to be 0.01 divided by the number of nodes in the GO graph with at least one annotated gene. This corresponds to a Bonferroni adjustment of the p-values. The most highly significant nodes thus derived are denoted as key nodes.

Both TopGO and SEPEA_NT3 have limitations (Barry et al. 2005; Nettleton et al. 2008). They assume independence between expressions of the genes, violation of which can lead to greater false positives than allowed by the nominal threshold set. These methods were chosen over more computationally intensive permutation-based subject sampling approaches.

Hierarchical clustering. We performed simple supervised clustering based on complete linkage (Murtagh 1985) in order to make heat maps [hierarchical agglomerative clustering with complete linkage; implemented in the hclust function in R (R Development Core Team 2010), called by the hearmap.2 function available with the gplots library in Bioconductor (Gentleman et al. 2004)]. Input data consisted of the four columns of [log.sub.2]-adjusted ratios (the coefficients from the linear mixed-effects models adjusted for both random and fixed effects). This provides clusters driven by average responses within dose groups rather than by potential confounding within groups.


Application of a mixed-effects model to analyze gene expression. We applied a mixed model (variance components analysis) to assess the proportion of total variation due to variation among subjects, hybridizations, labels, and chips, among the randomly selected within-subject replicates (n = 19). Plotting the distribution of the contribution of variance across all probes after notmalization revealed that the greatest source of variation was between subjects and was therefore consistent with biological causes (Figure 1). We also found substantial variation between labeling reactions. Therefore, for each probe, we estimated the association between exposure level and expression level using a mixed-effects model with (crossed) random intercepts that account for clustering by subject and by label (Laird and Ware 1982). Because the study design included randomization of samples--and thus exposures--across labeling reactions, an inferential procedure was necessary that allowed the existence of nonnested sources of correlation (labeling and subject). Thus, we used mixed models with so-called crossed random effects (Fitzmaurice et al. 2004), with the goal of providing more trustworthy inference than procedures that would have ignored, for instance, the variability caused by the labeling. (Many microarray studies are not designed to partition out the sources of variability and thus, if such sources are important, could provide misleading inference. In addition, it is often assumed that normalization will eliminate these sources of variability, but this assumption cannot be verified unless the study design allows for partitioning of the variance.) In the model, we also adjusted, as simple fixed effects, for biological variation in expression associated with differences in sex, age, and smoking status.

Effects of benzene exposure on gene expression, biological processes, and pathways. Analysis of the overall effect of benzene across the four exposure categories (very high, high, low, and very low) relative to unexposed controls (n = 42) revealed significantly altered expression (FDR-adjusted p-values [less than or equal to] 0.05) of 3,007 probes representing 2,846 genes [see Supplemental Material, Table 2 (doi:10.1289/ehp.1002546)]. Immune response (p = 3.78E-07) was the most significant key node among the GO processes associated with exposure (see Supplemental Material, Table 3), as determined by TopGO analysis. Pathway analysis by SEPEA_NT3 (Thomas et al. 2009) revealed highly significant (p < 0.001) impacts on the Toll-like receptor signaling pathway, oxidative phosphorylation, B-cell receptor signaling pathway, apoptosis, AML, and T-cell receptor signaling (see Supplemental Material, Table 4).
Table 2. Summary of GO categories overrepresented at each benzene
exposure category.

                           Total no.  Very low (n = 29)
GO ID (a)   GO term        of genes   No.        p-Value
                           (b)        genes      (c)

GO:0006412  translation    456        64         2.0E-06

GO:0006512  ubiquitin      480        48         7.5E-04

GO:0006917  induction of   216        27         4.1E-04

GO:0006955  immune         653        58         3.7E-03
            response                             (d)

GO:0015986  ATP synthesis  40         11         2.2E-05

GO:0006915  apoptosis      804        80         5.6E-03

GO:0030301  cholesterol    8          5          4.4E-05

GO:0006954  inflammatory   318

              Low (n = 30)             High (n = 11)
GO ID (a)   No.            p-Value    No.        p-Value
            genes          (c)        genes      (c)

GO:0006412  93             1.2E-03

GO:0006512  98             1.6E-05

GO:0006917  49             1.6E-04    19         1.5E-03

GO:0006955  124            4.6E-05    54         4.9E-06

GO:0015986  14             5.0E-04

GO:0006915  158            9.2E-04

GO:0030301  4              1.5E-02

GO:0006954  60             4.6E-03    34         2.8E-05

1           Very high (n = 13)
GO ID (a)   No. genes      p-Value




GO:0006955  97             1.1E-04

GO:0015986  11             1.8E-03

GO:0006915  107            2.7E-03

GO:0030301  4              5.5E-03


(a) GO categories that are significant at [greater than or equal to]
2 doses. (b) Number of annotated genes included on the chip.
(c) p-Values were determined using the elim method in TopGO, which
computes the statistical significance of a parent node dependent on the
significance of its children by Fisher's exact test; nodes are
significant if the p-value is smaller than a previously defined
threshold (Alexa et al. 2006), 0.01 divided by the number of nodes in
the GO graph with at least one annotated gene. (d) Significantly
enriched term in classic analysis (which does not take GO hierarchy
into account) but not in elim analysis in TopGO. Complete GO data are
available in Supplemental Material, Table 9 (doi:10.1289/ehp.1002546).

Table 3. p-Values for pathways altered at each benzene exposure

                            Benzene exposure category

Pathway name (a)      Very low (n  Low (n =  High (n =  Very high
                      = 29)        30)       11)        (n = 13)

Chronic myeloid             0.034     0.033

Pancreatic cancer           0.023     0.007

Oxidative                 < 0.001     0.003      0.001
phosphorylation (b)

Small-cell lung             0.004     0.002      0.027
cancer (b)

B-cell receptor             0.008     0.003      0.004
signaling pathway

Insulin signaling           0.015     0.035      0.052

Adipocytokine               0.034     0.002      0.019
signaling pathway

Circadian                    0.04     0.045      0.004

RNA polymerase            < 0.001                0.048

Toll-like receptor        < 0.001     0.002      0.001      0.004
signaling pathway

Epithelial cell           < 0.001     0.003      0.006      0.011
signaling in
Helicobacter pylori
infection (b)

GPI-anchor                < 0.001     0.041    < 0.001      0.007
biosynthesis (b)

T-cell receptor             0.005     0.002      0.005      0.018
signaling pathway

Apoptosis (b)               0.007     0.002      0.007      0.013

Cytokine--cytokine          0.036     0.011      0.030      0.004
receptor interaction

AML (b)                     0.037     0.002                 0.045

Fatty acid                  0.037                0.049      0.033

Nucleotide excision         0.001                0.008      0.005

Renal cell carcinoma                  0.024      0.015

Protein export                        0.053      0.024

Steroid biosynthesis                             0.004      0.034

Fc epsilon RI                         0.006                 0.046
signaling pathway

Jak-STAT signaling                                          0.048

MAPK signaling                        0.009                 0.023

(a) KEGG pathways that are significant at [greater than or equal to]
2 doses. (b) FDR-adjusted p-value (Benjamini and Hochberg 1995)
< 0.005 in overall analysis. Details of all KEGG pathways are
available from Kyoto Encyclopedia of Genes and Genomes (2000).

Table 4. Potential biomarkers of benzene exposure based on gene
expression ratios relative to unexposed controls.

                                          Benzene exposure category

                                          Very low (n = 29)
Probe    Symbol     Definition            Ratio      p-Value
ID                                                   (a)

5090327  SERPINB2   serpin peptidase           2.47    0.002
         (b)        inhibitor, clade B,
                    member 2

2370524  TNFAIP6    tumor necrosis             2.26    0.000
                    protein 6

6590338  IL1A (b)   interleukin 1,             2.00    0.001

1260746  KCNJ2      potassium                  1.97    0.000
                    channel, subfamily

2230131  PTX3 (b)   pentraxin-related          1.80    0.000
                    gene, rapidly
                    induced by IL-1

5860333  F3         coagulation factor         1.73    0.003
                    III (thromboplastin,
                    tissue factor)

1410189  CD44 (b)   CD44 antigen (Indian       1.64    0.000
                    blood group)

2470100  CCL20      chemokine (C-C             1.63    0.005
                    motif) ligand 20

4880717  ACSL1      acyl-CoA synthetase        1.63    0.001
                    long-chain family
                    member 1

1470682  PTGS2      prostaglandin-             1.60    0.000
         (b)        endoperoxide
                    synthase 2

1770152  CLEC5A     C-type lectin domain       1.57    0.009
                    family 5, member A

4060674  IL1RN      interleukin 1              1.55    0.003
                    receptor antagonist

7320646  PRG2       proteoglycan 2, bone       1.37    0.011

650709   SLC2A6     solute carrier             1.36    0.005
                    family 2, member 6

2900286  GPR132     G protein- coupled         1.34    0.047
                    receptor 132

3710379  PLAUR      plasminogen                1.29    0.035
                    activator, urokinase

            Low (n = 30)                     High (n = 11)
Probe         Ratio              p-Value      Ratio  p-Value
ID                                  (a)               (a)

5090327       5.19                 0.000       3.03    0.005

2370524       2.94                 0.000       1.72    0.030

6590338       3.03                 0.000       2.36    0.000

1260746       2.54                 0.000       2.09    0.000

2230131       2.30                 0.000       1.62    0.003

5860333       2.83                 0.000       1.78    0.034

1410189       1.76                 0.000       1.64    0.005

2470100       2.30                 0.000       1.59    0.041

4880717       1.79                 0.000       1.59    0.010

1470682       1.98                 0.000       1.68    0.003

1770152       2.26                 0.000       1.78    0.014

4060674       2.26                 0.000       1.54    0.020

7320646       1.83                 0.000        1.5    0.007

650709        1.72                 0.000        1.5    0.000

2900286       1.87                 0.000        1.6    0.003

3710379       1.80                 0.000        1.6    0.002

          Very high (n = 13)

Probe         Ratio  p-Value
ID                   (a)

5090327       3.39     0.001

2370524       2.13     0.000

6590338       2.53     0.000

1260746       1.56     0.012

2230131       1.81     0.000

5860333       2.41     0.001

1410189       1.78     0.000

2470100       2.11     0.000

4880717       1.68     0.002

1470682       1.75     0.000

1770152       2.26     0.000

4060674       1.61     0.004

7320646       1.69     0.000

650709        1.60     0.000

2900286       1.80     0.000

3710379       1.58     0.001

Genes shown are up- or down-regulated [greater than or equal to]
1.5-fold relative to unexposed controls at three or four doses.
(a) FDR-adjusted p-value (Benjamini and Hochberg 1995). (b) Genes
that have p-values [less than or equal to] 0.005 at all four doses.

Large numbers of genes were significantly differentially expressed (FDR-adjusted p-values [less than or equal to] 0.05) in samples from each of the four exposure categories relative to controls [see Supplemental Material, Figure I and Tables 5-8 (doi:10.1289/ehp.1002546)]. We identified several GO processes implicated in the overall analysis as key nodes across three to four dose categories, including immune response, apoptosis, and ATP synthesiscoupled proton transport [Table 2; for complete data, see Supplemental Material, Table 9).

Similarly, multiple pathways found to be highly significant in the overall analysis (p [less than or equal to] 0.005), including Toll-like receptor signaling, oxidative phosphorylation, B-cell receptor signaling, apoprosis, AML, and T-cell receptor signaling, were enriched among the differentially expressed genes associated with three (including the very low dose category) or four exposure categories [Table 3; for complete data, see Supplemental Material, Table 10 (doi:10.1289/ehp.102546)].

Twelve genes were up-regulated [greater than or equal to] 1.5-fold at all four doses relative to unexposed controls, including five genes [PTX3 (pentraxin-related gene), CD44 (CD44 antigen), PTGS2 (prostaglandin-endoperoxide synthase 2), IL1A (interleukin 1, alpha), and SERPINB2 (serpin peptidase inhibitor, clade B, member 2) with FDR-adjusted p-values [less than or equal to] 0.005. An additional four genes were up-regulated > 1.5-fold at the top three doses, and > 1.3-fold at the lowest dose (Table 4). Expression of each of the 16 signature genes across the five exposure categories shows a distinct pattern, with the highest expression in the < 1-ppm (low) exposure group [see Supplemental Material, Figure 2 (doi:10.1289/ehp.1002546)]. The 16 genes are involved in immune response, inflammatory response, cell adhesion, cell-matrix adhesion, and blood coagulation (see Supplemental Material, Table 11). Ten of the 16 genes (or their products), 7 of which are involved in inflammatory response (p = 1.4E-12), form a network (Figure 2) with central roles for IL1A and PTGS2.

Dose-specific effects. We used supervised hierarchical clustering to generate a heat map to allow visualization of patterns of gene expression across exposure categories. One group of genes (~ 100) exhibited reduced expression (ratios < 1) with increasing dose relative to controls, whereas a second group (~ 100) appeared to be elevated at all doses but more so at low-dose exposure (Figure 3).

We also observed dose-dependent effects on biological processes and pathways. For example, nucleosome assembly [see Supplemental Material, Table 9 (doi:10.1289/ehp.1002546)] and the ATP-binding cassette (ABC) transporter pathway (see Supplemental Material, Table 10) appeared to be deregulated only at the very high exposure level. Among 78 genes that were highly significantly (FDR p-value [less than or equal to] 0.05) associated with a [greater than or equal to] 1.5-fold increase in expression in the very high exposure group, and not significantly altered at any of the other exposure categories relative to controls, a network involving 19 genes (or their products) was apparent, in which v-src sarcoma viral oncogene homolog (SRC) and matrix metallopeptidase 9 (MMP9) play central roles (see Supplemental Material, Figure 3). Among 29 genes significantly altered only at low-dose benzene exposure, we identified a network of 15 genes involved in immune response (p = 4E-12), with central roles for interferon gamma (IFNG) and tumor necrosis factor (TNF) (see Supplemental Material, Figure 4). Together, these data suggest that benzene induces dose-dependent effects, with the caveat that differences in power among the different exposure categories may have influenced the resulting significant gene lists.


Technical variation is often ignored in human toxicogenomic studies, leading to potential bias in differential expression arising from correlation with technical variation. In the present study, we applied a rigorous study design to assess sources of both potential confounding and experimental variability (nuisance variation) and analyzed the data using statistical techniques that incorporate non-nested sources of variation (i.e., those not eliminated by normalization) and that return estimates of least variability with accurate inference (linear mixed-effects models). This approach increased the power to detect associations between benzene exposure and gene expression, even at low-dose exposure levels.

More genes remained significantly up- or down-regulated compared with controls after multiple test correction in the present study than in an earlier study examining samples from eight pairs of exposed workers and unexposed controls on the Illumina platform (McHale et al. 2009), likely because of the increased number of individuals and the rigorous approach to study design. Nonetheless, we identified 247 genes in both study populations using the Illumina platform. Of 488 significant genes cross-validated on both Illumina and Affymetrix platforms (McHale et al. 2009), 147 genes were significant in the present study. ZNF331 (zinc finger protein 331), significant after multiple test correction in individuals occupationally exposed to benzene at levels > 10 ppm compared with controls in two earlier studies (Forrest et al. 2005; McHale et al. 2009), was significantly up-regulated at both < 1 ppm and > 10 ppm in the present study.

The finding that genes in the AML pathway were strongly associated with multiple exposure levels of benzene provides support for our approach because epidemiological studies have established that benzene causes AML (Baan et al. 2009; Smith 2010). However, such disease associations must be treated cautiously because the KEGG pathway information, on which the pathway analyses were based, is limited for AML, and a KEGG pathway for NHL has not been defined. Information about altered molecular and cellular processes can provide biological plausibility for probable disease associations. Immune response, previously found to be associated with > 10 ppm benzene exposure in our earlier transcriptomic study of eight high-exposed control pairs (McHale et al. 2009), was one of the major processes significantly altered across multiple exposure levels in the present study, involving both innate (Toll-like receptor signaling) and adaptive (B-cell receptor signaling and T-cell receptor signaling pathway) responses. Additionally, we found central roles for the proinflammatory cytokines IFNG and TNF among genes uniquely altered at low-dose exposure in the present study. A single nucleotide polymorphism in TNF-[alpha] was previously associated with susceptibility to bone marrow dysplasia in chronic benzene poisoning (Lv et al. 2007). Further, genetic variation in TNF (Rothman et al. 2006), Toll-like receptor genes (Purdue et al. 2009), and IFNG (Colt et al. 2009) has previously been associated with NHL risk. Deregulation of pathways involving these genes through sustained alterations in expression provides biological plausibility for the association of benzene with lymphoid neoplasms.

Findings from the present study are consistent with previous reports of adverse effects of benzene on oxidative stress (Kolachana et al. 1993) and mitochondria (Inayat-Hussain and Ross 2005). Here, we found highly significant associations with ATP synthesis--coupled proton transport and oxidative phosphorylation at all levels of benzene exposure relative to unexposed controls. Expression of superoxide dismutase (SOD), a mitochondrial defense against reactive oxygen species, was up-regulated in the present study by 50-100% relative to controls. HMOX1 [heme oxygenase (decycling) 1], an antioxidant and suppressor of TNF-[alpha] signaling (Lee et al. 2009), was down-regulated in the low-dose benzene exposure group. Increased mitochondrial membrane permeability potential induced by benzene metabolites (Inayat-Hussain and Ross 2005) can lead to the initiation of apoptosis. Indeed, apoptosis was associated with all benzene doses in the present study, consistent with our earlier observation of an association with high-dose benzene exposure (> 10 ppm) (McHale et al. 2009).

Previously, we found that chromatin assembly was significantly altered after high-dose benzene exposure (McHale et al. 2009). The finding that nucleosome assembly (a GO category nested within chromatin assembly) was overrepresented in the highest exposure category in the present study confirms and clarifies this potential mechanism of benzene-associated leukemia.

Although significant involvement of the p53 response pathway was previously found in mice exposed to very high levels of benzene (Faiola et al. 2004; Yoon et al. 2003), we did not find such involvement in the present study or in our earlier studies, and the immune and inflammatory effects we found here in humans were not recapitulated in the mouse micro-array studies (Faiola et al. 2004; Yoon et al. 2003). These differences suggest that human toxicogenomic studies may be more relevant than animal studies, although differences in exposure levels, tissues examined, and uncontrolled confounding in the human study could also be contributing factors.

Our findings suggest two novel hypotheses regarding benzene toxicity. Glycosylphosphatidylinositol (GPI)-anchor biosynthesis was associated with all doses of benzene exposure in the present study. The GPI anchor is a C-terminal posttranslational modification that anchors the modified protein in the outer leaflet of the cell membrane and putatively plays roles in lipid raft partitioning, signal transduction, and cellular communication (Paulick and Bertozzi 2008). Because epigenetic silencing of genes involved in GPI-anchor biosynthesis may be important in human disease, including lymphomas (Hu et al. 2009), further investigation of its role in benzene-associated disease is warranted.

ABC transporters were associated highly significantly with only the highest (> 10 ppm) benzene dose. In addition to their capacity to extrude cytotoxic drugs, ABC transporters are known to play important roles in the development, differentiation, and maturation of immune cells and are involved in migration of immune effector cells to sites of inflammation (van de Ven et al. 2009).

Our findings also suggest a potential gene expression signature of benzene exposure. In particular, IL1A and PTGS2 played central roles in the interaction network characterizing the gene expression signature associated with benzene in this study. Both molecules are produced by activated macrophages and other cells in inflammatory responses. A single nucleotide polymorphism that increases IL1A mRNA expression has been inversely associated with granulocyte count in benzene-exposed individuals (Lan et al. 2005). Overexpression of PTGS2, which occurs frequently in premalignant and malignant neoplasms, including hematological malignancies (Bernard et al. 2008), together with overexpression of the prostaglandin cascade, leads to carcinogenesis through a progressive series of highly specific cellular and molecular changes (Harris 2009).

The expression pattern of the signature genes suggests a nonlinear response to benzene. Other biomarkers evaluated in populations exposed to benzene have shown similar patterns, including hematotoxicity (Lan et al. 2004), benzene metabolism (Kim et al. 2006), and the generation of protein adducts (Rappaport et al. 2002, 2005). Further characterization of the expression levels of these genes across a range of benzene exposures in a larger, independent study is necessary to determine the applicability of the signature genes as biomarkers of early effects and to explore more formally the shape of the dose--response curve.


We have identified gene expression biomarkers of early effects across a range of benzene exposures. Our findings support previously reported mechanisms relevant to adverse effects of benzene and suggest potential novel mechanisms for benzene toxicity. Future work should include validation of the potential biomarkers and determining whether the gene expression changes are effected through epigenetic processes such as DNA methylation (Bollati et al. 2007) and miRNA expression.


Alexa A, Rahnenfuhrer J, Lengauer T. 2006. Improved scoring of functional groups from gene expression data by decorrelating GO graph structure. Bioinformatics 22(13): 1600-1607.

Baan R, Grosse Y, Straif K, Secretan B, El Ghissassi F, Bouvard V, et al. 2009. A review of human carcinogens--part F: chemical agents and related occupations. Lancet Oncol 10(12): 1143-1144.

Barry WT, Nobel AB, Wright FA. 2005. Significance analysis of functional categories in gene expression studies: a structured permutation approach. Bioinformatics 21(9): 1943-1949.

Bates D, Maechler M. 2010. lme4: Linear Mixed-Effects Models Using S4 Classes. R Package Version 0.999375-34. Available: [accessed 30 September 2010].

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B (57):289-300.

Bernard MP, Bancos S, Sime PJ, Phipps RP. 2008. Targeting cyclooxygenase-2 in hematological malignancies: rationale and promise. Curr Pharm Des 14(21):2051-2060.

Bioconductor. 2010. topGO: Enrichment analysis for Gene Ontology. Available: [accessed 18 March 2011].

Bollati V, Baccarelli A, Hou L, Bonzini M, Fustinoni S, Cavallo D, et al. 2007. Changes in DNA methylation patterns in subjects exposed to low-dose benzene. Cancer Res 67(3):876-880.

Colt JS, Rothman N, Severson RK, Hartge P, Cerhan JR, Chatterjee N, et al. 2009. Organochlorine exposure, immune gene variation, and risk of non-Hodgkin lymphoma. Blood 113(9):1899-1905.

Edgar R, Domrachev M, Lash AE. 2002. Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res 30(1):207-210.

Faiola B, Fuller ES, Wong VA, Recio L. 2004. Gene expression profile in bone marrow and hematopoietic stem cells in mice exposed to inhaled benzene. Mutat Res 549(1-2):195-212.

Fitzmaurice GM, Laird NM, Ware JH. 2004. Applied Longitudinal Analysis. New York:John Wiley and Sons.

Forrest MS, Lan Q, Hubbard AE, Zhang L, Vermeulen R, Zhao X, et al. 2005. Discovery of novel biomarkers by microarray analysis of peripheral blood mononuclear cell gene expression in benzene-exposed workers. Environ Health Perspect 113:801-807.

Gautier L, Cope L, Bolstad BM, Irizarry RA. 2004. affy--Analysis of Affymetrix GeneChip data at the probe level. Bioinformatics 20(3):307-315.

Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, et al. 2004. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol 5(10):R80; doi:10.1186/gb-2004-5-10-r80 [Online 15 September 2004].

Harris RE. 2009. Cyclooxygenase-2 (cox-2) blockade in the chemoprevention of cancers of the colon, breast, prostate, and lung. Inflammopharmacology 17(2):55-67.

Hu R, Mukhina GL, Lee SH, Jones RJ, Englund PT, Brown P, et al. 2009. Silencing of genes required for glycosylphosphatidylinositol anchor biosynthesis in Burkitt lymphoma. Exp Hematol 37(4):423-434.

Inayat-Hussain SH, Ross D. 2005. Intrinsic pathway of hydroquinone induced apoptosis occurs via both caspase-dependent and caspase-independent mechanisms. Chem Res Toxicol 18(3):420-427.

Kanehisa M, Goto S. 2000. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res 28(1):27-30.

Kim S, Vermeulen R, Waidyanatha S, Johnson BA, Lan Q, Smith MT, et al. 2006. Modeling human metabolism of benzene following occupational and environmental exposures. Cancer Epidemiol Biomarkers Prev 15(11):2246-2252.

Kolachana P, Subrahmanyam VV, Meyer KB, Zhang L, Smith MT. 1993. Benzene and its phenolic metabolites produca oxidative DNA damage in HL60 cells in vitro and in the bone marrow in vivo. Cancer Res 53(5):1023-1026.

Kyoto Encyclopedia of Genes and Genomes. 2000. KEGG Pathway Database, Available: [accessed 18 March 2011].

Laird NM, Ware JH. 1982. Random-effects models for longitudinal data. Biometrics 38(4):963-974.

Lan Q, Vermeulen R, Zhang L, Li G, Rosenberg PS, Alter BP, et al. 2006. Benzene exposure and hematotoxicity: response [Letter]. Science 312(5776):998-999.

Lan Q, Zhang L, Li G, Vermeulen R, Weinberg RS, Dosemeci M, et al. 2004. Hematotoxicity in workers exposed to low levels of benzene. Science 306(5702):1774-1776.

Lan Q, Zhang L, Shen M, Smith MT, Li G, Vermeulen R, et al. 2005. Polymorphisms in cytokine and cellular adhesion molecule genes and susceptibility to hematotoxicity among workers exposed to benzene. Cancer Res 65(20):9574-9581.

Lee IT, Luo SF, Lee CW, Wang SW, Lin CC, Chang CC, et al. 2009. Overexpression of H0-1 protects against TNF-[alpha]-mediated airway inflammation by down-regulation of TNFR1-dependent oxidative stress. Am J Pathol 175(2):519-532.

Lv L, Kerzic P, Lin G, Schnatter AR, Bao L, Yang Y, et al. 2007. The TNF-[alpha] 238A polymorphism is associated with susceptibility to persistent bone marrow dysplasia following chronic exposure to benzene. Leuk Res 31(11):1479-1485.

McHale CM, Zhang L, Hubbard AE, Smith MT. 2010. Toxicogenomic profiling of chemically exposed humans in risk assessment. Mutat Res 705(3):172-183.

McHale CM, Zhang L, Lan Q, Li G, Hubbard AE, Forrest MS, et al. 2009. Changes in the peripheral blood transcriptome associated with occupational benzene exposure identified by cross-comparison on two microarray platforms. Genomics 93:343-349.

Murtagh F. 1985. Multidimensional Clustering Algorithms. Compstat Lectures 4. Vienna:Phvsica-Verlag.

NCBI (National Center for Biotechnology Information). 2002. Gene Expression Omnibus (GEO). Available: http://www.ncbi.nlm.nih.gou/geo/ [accessed 22 March 2011].

Nettleton D, Recknor J, Reecy JM. 2008. Identification of differentially expressed gene categories in microarray studies using nonparametric multivariate analysis. Bioinformatics 24(2):192-201.

Nikitin A, Egorov S, Daraselia N, Mazo l. 2003. Pathway studio--the analysis and navigation of molecular networks. Bioinformatics 19(16):2155-2157.

Occupational Safety and Health Administration. 1987. Occupational exposure to benzene. Fed Reg 52:34460-34579.

Paulick MG, Bertozzi CR. 2008. The glycosylphosphatidylinositol anchor: a complex membrane-anchoring structure for proteins. Biochemistry 47(27):6991-7000.

Purdue MP, Lan Q, Wang SS, Kricker A, Menashe I, Zheng TZ, et al. 2009. A pooled investigation of Toll-like receptor gene variants and risk of non-Hodgkin lymphoma. Carcinogenesis 30(2):275-281.

R Development Core Team. 2010. R: A Language and Environment for Statistical Computing. Available: [accessed 25 March 2011].

Rappaport SM, Waidyanatha S, Yeowell-O'Connell K, Rothman N, Smith MT, Zhang L, et al. 2005. Protein adducts as biomarkers of human benzene metabolism. Chem Biol Interact 153-154:103-109.

Rappaport SM, Yeowell-O'Connell K, Smith MT, Dosemeci M, Hayes RB, Zhang L, et al. 2002. Non-linear production of benzene oxide-albumin adducts with human exposure to benzene. J Chromatogr B Analyt Technol Biomed Life Sci 778(1-2):367-374.

Rothman N, Skibola CF, Wang SS, Morgan G, Lan Q, Smith MT, et al. 2006. Genetic variation in TNF and IL10 and risk of non-Hodgkin lymphoma: a report from the InterLymph Consortium. Lancet Oncol 7(1):27-38.

Smith MT. 2010. Advances in understanding benzene health effects and susceptibility. Annu Rev Public Health 31:133-148.

The Gene Ontology Consortium 2000. Gene ontology: tool for the unification of biology. Nat Genet 25(1):25-29. Available: [accessed 18 March 2011].

Thomas R, Gohlke JM, Stopper GF, Parham FM, Portier CJ. 2009. Choosing the right path: enhancement of biologically relevant sets of genes or proteins using pathway structure. Genome Biol 10(4):R44; doi:10.1186/gb-2009-10-4-r44 [Online 24 April 2009].

van de Ven R, Oerlemans R, van der Heijden JW, Scheffer GL, de Gruijl TD, Jansen G, et al. 2009. ABC drug transporters and immunity: novel therapeutic targets in autoimmunity and cancer. J Leukoc Biol 86(5):1075-1087.

Vermeulen R, Li G, Lan Q, Dosemeci M, Rappaport SM, Bohong X, et al. 2004. Detailed exposure assessment for a molecular epidemiology study of benzene in two shoe factories in China. Ann Occup Hyg 48(2):105-116.

Vlaanderen J, Lan Q, Krombout H, Rothman N, Vermeulen R. 2010. Occupational benzene exposure and the risk of lymphoma subtypes: a meta-analysis of cohort studies incorporating three study quality dimensions. Environmental Health Perspect 119:159-167.

Yoon Bl, Li GX, Kitada K, Kawasaki Y, Igarashi K, Kodama Y, et al. 2003. Mechanisms of benzene-induced hematotoxicity and leukemogenicity: cDNA microarray analyses using mouse bone marrow tissue. Environ Health Perspect 111:1411-1420.

Cliona M. McHale, (1) Luoping Zhang, (1) Qing Lan, (2) Roel Vermeulen, (3) Guilan Li, (4) Alan E. Hubbard, (1) Kristin E. Porter, (1) Reuben Thomas, (5) Christopher J. Portier, (5) Min Shen, (2) Stephen M. Rappaport, (1) Songnian Yin, (4) Martyn T. Smith, (1) and Nathaniel Rothman (2)

(1) School of Public Health, University of California-Berkeley, Berkeley, California, USA; (2) Occupational and Environmental Epidemiology Branch, Division of Cancer Epidemiology and Genetics, National Cancer Institute, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland, USA; (3) Institute of Risk Assessment Sciences, Utrecht University, Utrecht, the Netherlands; (4) Institute of Occupational Health and Poison Control, Chinese Center for Disease Control and Prevention, Beijing, China; 5 Environmental Systems Biology Group, Laboratory of Molecular Toxicology, National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Human Services, Research Triangle Park, North Carolina, USA

Address correspondence to C.M. McHale, School of Public Health, 211 Hildebrand Hall, University of California--Berkeley, Berkeley, CA 94720 USA. Telephone: (510) 643-5349. Fax: (510) 642-0470. -mail:

Supplemental Material is available online (doi:10.1289/ehp.1002546 via

We thank the participants for taking part in this study.

This research was supported by National Institutes of Health (NIH) grants R01ES06721 and P42ES04705 (to M.T.S.), National Institute of Environmental Health Sciences grants P42ES05948 and P30ES10126 (to S.M.R.), and the intramural research program of the National Cancer Institute.

G.L. has received funds from the American Petroleum Institute for consulting on benzene-related health research. S.M.R. has received consulting and expert testimony fees from law firms representing plaintiffs' cases involving exposure to benzene and has received research support from the American Petroleum Institute and the American Chemistry Council. M.T.S. has received consulting and expert testimony fees from law firms representing both plaintiffs and defendants in cases involving exposure to benzene. The other authors declare they have no actual or potential competing financial interests.

Received 10 June 2010; accepted 13 December 2010.
COPYRIGHT 2011 National Institute of Environmental Health Sciences
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2011 Gale, Cengage Learning. All rights reserved.

Article Details
Printer friendly Cite/link Email Feedback
Title Annotation:Research
Author:McHale, Cliona M.; Zhang, Luoping; Lan, Qing; Vermeulen, Roel; Li, Guilan; Hubbard, Alan E.; Porter,
Publication:Environmental Health Perspectives
Article Type:Report
Geographic Code:9CHIN
Date:May 1, 2011
Previous Article:Effects of short-term exposure to inhalable particulate matter on telomere length, telomerase expression, and telomerase methylation in steel workers.
Next Article:Epigenetic alterations in liver of C57BL/6J mice after short-term inhalational exposure to 1,3-butadiene.

Terms of use | Copyright © 2018 Farlex, Inc. | Feedback | For webmasters