Sex-specific associations between particulate matter exposure and gene expression in independent discovery and validation cohorts of middle-aged men and women.


Particulate matter (PM) is a complex mixture of small particles and liquid droplets that contains a number of components, including acids, organic chemicals, metals, and soil or dust particles. PM exposure is known to increase overall mortality and morbidity, mainly due to its effect on the cardiorespiratory system (Alfaro-Moreno et al. 2007; Pope et al. 2004). Exposure to PM may disturb normal physiological pathways that maintain homeostasis and this may activate cellular processes that mediate the adverse effects of PM (Kleensang et al. 2014). Gene expression changes play an important role in the activation of pathways of toxicity and gene signatures have the potential to serve as biomarkers of exposure (van Leeuwen et al. 2008; van Breda et al. 2015) and recent reports demonstrate their potential use as biomarkers of effect (La Rocca et al. 2014; Fink et al. 2014). As it has been shown previously that transcriptomic responses to diverse environmental stimuli (i.e., chemical exposure, smoking) can be significantly different between men and women (De Coster et al. 2013; Paul and Amundson 2014), we have opted to perform a sex-specific analysis.

Several studies have suggested that elevated oxidative stress may mediate toxic effects of air pollutants (Donaldson et al. 2005; Nel et al. 2001). The systemic inflammatory response following acute inhalation exposure to PM can induce leukocytosis and monocyte release from the bone marrow (Fujii et al. 2002). Controlled exposure studies of recent diesel exhaust exposure (Pettit et al. 2012) and recent exposure to ultra-fine particles (Huang et al. 2010) have reported evidence of altered gene expression in leukocytes but, to our knowledge, associations between patterns of gene expression and long-term particulate air pollution have not been studied in general populations.

Materials and Methods

Study Design

As our goal was to identify transcriptomic biomarkers of exposure and effect in a healthy adult population, we started by applying microarray analysis in a discovery cohort of 98 adults for which we modelled particulate matter exposure. On the resulting dataset containing significantly modulated genes and pathways, we applied a literature and bio-informatics approach to identify potential exposure effect biomarkers. Subsequently, these were validated using qPCR analysis in an independent cohort with similar characteristics as the discovery population (Figure 1). Study protocols for the discovery and validation cohort were approved by the Institutional Review Board and the Ethical Committee of Antwerp University and informed consent was obtained from all participants.

Study Population

Discovery cohort. The original study population was described previously (van Leeuwen et al. 2008) and consisted of 398 participants from eight different regions of residence in Flanders (Belgium), as part of the first Flemish Environment and Health Survey (FLEHS I) during the period 2001-2006. Participants were recruited in several communities based on random sampling. Inclusion criteria were age 50-65 years, living in Flanders > 5 years, and being able to complete questionnaires in Dutch. Prior to blood collection, informed consent was obtained from all individuals. A subset of 98 samples was selected for microarray analysis based on previously measured exposure levels to several pollutants including cadmium, lead, polychlorinated biphenyls (PCBs) (138, 153, and 180), dioxins, polycyclic aromatic hydrocarbons (PAHs), and benzene. The overall exposure to these pollutants was estimated using a z-score for each pollutant, and study participants with both low and high exposure levels were chosen for inclusion. Z-scores were not correlated with long-term [PM.sub.10] exposure ([r.sup.2] = 0.0012). Smokers were excluded from the study population. PAXgene tubes (PreAnalytiX GmbH, Hombrechtikon, Switzerland) were used for RNA collection.

Validation cohort. The quantitative polymerase chain reaction (qPCR) validation study was performed in an independent cohort of 175 adults who were part of the third Flemish Environment and Health Survey (FLEHS III) during the period 2012-2015. Healthy volunteers who were between 50 and 65 years of age, living at the same residential address for at least 10 years, and able to complete questionnaires in Dutch were recruited through registers of general medical practices. Prior to blood collection, informed consent was obtained from all individuals. Participants completed a questionnaire covering age, sex, and smoking habits, among other demographic characteristics; they donated blood and urine samples; and subclinical measurements including height, weight, and blood pressure were determined. The sampling campaign lasted from May 2014 until 30 December 2014. We used PAXgene tubes (PreAnalytiX GmbH, Hombrechtikon, Switzerland) to stabilize whole blood RNA for storage.

Exposure Estimates

The [PM.sub.10] and [PM.sub.2.5] (10 and 2.5 [micro]m in aerodynamic diameter) concentrations for participants' residential addresses were calculated using a spatial temporal interpolation method (Kriging) that takes into account land cover data from satellite images [CORINE (coordination of information on the environment) land cover data set; http://www.] for interpolating the measurement data of the monitoring stations from the Belgian telemetric air quality network as described previously (Maiheu et al. 2013; Jacobs et al. 2010; Janssen et al. 2008). Validation statistics of the interpolation tool gave a temporal explained variance of > 0.7 for hourly [PM.sub.10] averages as well as for annual mean [PM.sub.10] (Maiheu et al. 2013). In combination with the Immission Frequency Distribution Model (IFDM) using emissions from line sources and point sources, the model chain provides daily [PM.sub.10] and [PM.sub.2.5] values on a 25 x 25 m receptor grid (Lefebvre et al. 2013). Our model is based on input data from 38 monitoring stations in the study area. The Initiative on Harmonisation within Atmospheric Dispersion Modelling for Regulatory Use in Europe was the incentive for intensive model intercomparison. IFDM was thoroughly compared with other models currently in use for regulatory purposes in Europe (Olesen 1995; Maes et al. 1995; Cosemans et al. 1995, 2001; Mensink and Maes 1996).


Mean daily temperatures and relative humidity for the study region were provided by the Royal Meteorological Institute (Brussels, Belgium), and apparent temperature was calculated (Steadman 1979; Kalkstein and Valimont 1986).

All our estimates were annual mean exposures over a 2-year period because we were interested in developing biomarkers for long-term exposure. For the discovery cohort, annual means were based on 2011-2012 as these were the earliest years for which detailed 25 x 25 m grid information became available. Distribution patterns were used for the year 2008. We assumed that relative differences in annual mean concentrations of particulate matter were generally consistent from year to year. For the validation cohort, annual means were based on the 2 years prior to blood sampling (i.e., 2012-2013).

RNA Isolation

Total RNA was isolated from 2.5 mL whole blood from PAXgene Blood RNA vacutainers using the PAXgene Blood RNA system (PreAnalytiX, Qiagen, Hilden, Germany), according to the manufacturer's instructions. A globin reduction assay (GLOBINclear[TM] Kit by Ambion, Austin, TX, USA) was performed in order to remove hemoglobin mRNA from samples that were submitted to microarray analysis. RNA integrity was assessed using the BioAnalyzer (Agilent, Palo Alto, CA, USA) and purity was measured spectrophotometrically. Labeled samples were checked for specific activity and dye incorporation.

Microarray Preparation and Hybridization

We used 0.2 [micro]g total RNA from each sample to synthesize dye-labeled cRNA (Cy3) following the Agilent one-color Quick-Amp labeling protocol (Agilent Technologies). Individual samples were hybridized on Agilent 4 X 44 K Whole Human Genome microarrays (design ID 014850).

Microarray Data Analysis

Microarrays were scanned on an Agilent G2505C DNA Microarray Scanner (Agilent Technologies, Amstelveen, Netherlands). Raw data on pixel intensities were extracted from the scanned images using Agilent Feature Extraction Software (version; Agilent Technologies, Amstelveen, Netherlands), protocol GE1_107.sep09. Raw data were pre-processed using an in-house developed quality control pipeline in R (version 2.15.3; R Project for Statistical Computing) as follows: local background correction, flagging of bad spots, controls and spots with intensities below background, log2 transformation and quantile normalization. The R-scripts of the pipeline and additional information on the flagging can be found at https://github.com/BiGCAT-UM/arrayQC_Module. Genes with more than 30% flagged data were omitted from the processed data files, after which the data files were transferred to the Gene Expression Pattern Analysis Suite, GEPAS 2010 (Montaner et al. 2006) for further preprocessing, including merging replicate probes (based on median), and imputing missing values by means of K-nearest neighbor imputation (K = 15). Filtering for flat peaks was used with root mean square value 0.25. The filtered data, containing 28,786 genes, were used for further statistical analyses. Microarray gene expression data were analyzed and stratified for sex. In the original microarray data set, 28,786 unique Agilent probe IDs (out of 43,376 Agilent probe IDs) were initially annotated to 22,390 EntrezGene IDs. In case of multiple replicates (i.e., multiple probes for the same gene), the replicate with highest interquartile range (IQR) in relative gene expression was selected. This resulted in 15,589 unique EntrezGene IDs.
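Two of the preprocessing steps described above, quantile normalization and selection of the probe with the highest IQR per gene, can be illustrated with a minimal Python sketch. This is not the authors' R pipeline or the GEPAS implementation; the function names, the simple tie handling, and the default quartile method of the standard library are assumptions made for illustration only.

```python
from statistics import mean, quantiles


def quantile_normalize(columns):
    """Quantile-normalize equal-length sample columns (ties are not averaged)."""
    n = len(columns[0])
    sorted_cols = [sorted(col) for col in columns]
    # The reference distribution is the mean of the k-th smallest value across samples.
    reference = [mean(col[k] for col in sorted_cols) for k in range(n)]
    normalized = []
    for col in columns:
        order = sorted(range(n), key=lambda i: col[i])  # indices, smallest value first
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = reference[rank]
        normalized.append(out)
    return normalized


def iqr(values):
    """Interquartile range using the standard library's default quartile method."""
    q1, _, q3 = quantiles(values, n=4)
    return q3 - q1


def pick_probe_per_gene(probe_to_gene, expression):
    """For each gene, keep the probe whose expression has the largest IQR across samples."""
    best = {}
    for probe, gene in probe_to_gene.items():
        spread = iqr(expression[probe])
        if gene not in best or spread > best[gene][1]:
            best[gene] = (probe, spread)
    return {gene: probe for gene, (probe, _) in best.items()}
```

In this sketch each column is one sample (array) and each position within a column is one probe; the probe and gene identifiers passed to `pick_probe_per_gene` are hypothetical.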

Gene Expression Analysis

Using linear regression models adjusted for age, body mass index (BMI), socioeconomic status (SES, classified in three groups: no high school degree, high school degree, or further education degree), daytime and season of blood sampling, we obtained estimates for each gene as the [log.sub.2]-fold change in gene expression for an increment of 5 [micro]g/[m.sup.3] in exposure. p-Values < 0.05 were considered statistically significant. p-Values were corrected for multiplicity using the Benjamini-Hochberg false discovery rate (FDR) correction. p-Values corrected for multiple testing are referred to as q-values.
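The Benjamini-Hochberg correction can be sketched in a few lines of Python. The function name and the example p-values are hypothetical, and the sketch covers only the step from per-gene p-values to adjusted values, not the regression models themselves.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical per-gene p-values, for illustration only.
example = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

A gene is then declared significant at an FDR of 5% when its adjusted value is below 0.05.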

Pathway Analysis

Gene Set Enrichment Analysis was performed utilizing the online pathway analysis tool Consensus Pathway Data Base (CPDB). CPDB contains ~ 5,200 pathways including protein complexes, metabolic, signaling, and gene regulatory pathways, as well as drug-target interactions. Data originate from 32 public resources curated from the literature (Kamburov et al. 2013). Gene Set Enrichment Analysis was performed in a sex-specific manner using the [log.sub.2]-fold changes of the gene expression data for all genes analyzed at the gene expression level, without preselection. For every predefined gene set in each pathway, a Wilcoxon signed-rank test was calculated, testing the null hypothesis that the distribution of their fold changes was around zero. As input, all genes without a priori selection (EntrezGene IDs) were uploaded with their fold changes in their gene expression. We selected the biological processes using pathways as output. The p-values were corrected for multiplicity and were presented as q-values. We defined significant biological processes and pathways by a threshold on the adjusted p-value (q < 0.05 or FDR 5%) and we included gene sets with a size between 5 and 100 members.
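The per-gene-set test can be illustrated with a minimal Python sketch of a two-sided Wilcoxon signed-rank test against zero. It uses a normal approximation without tie or continuity correction, which is an assumption of this sketch and is rough for the smallest gene sets; it is not the CPDB implementation, and the function name is hypothetical.

```python
import math


def wilcoxon_signed_rank(fold_changes):
    """Two-sided Wilcoxon signed-rank test of log2 fold changes against zero.

    Returns (W+, p) using a normal approximation; zeros are dropped.
    """
    values = [x for x in fold_changes if x != 0]
    n = len(values)
    if n == 0:
        raise ValueError("all fold changes are zero")
    order = sorted(range(n), key=lambda i: abs(values[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied absolute values and give each the average rank.
        j = i
        while j + 1 < n and abs(values[order[j + 1]]) == abs(values[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, x in zip(ranks, values) if x > 0)
    mean_w = n * (n + 1) / 4
    sd_w = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean_w) / sd_w
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail of the standard normal
    return w_plus, p
```

Applied to each gene set in turn, the resulting p-values would then be adjusted for multiplicity to give the q-values used for the 5% FDR threshold.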

Selection of Potential Exposure/ Effect Biomarker Genes

We used a modified version of the meet-in-the-middle approach for biomarker identification in relation to clinical relevance (Vineis et al. 2013); a schematic representation is shown in Figure 1. We first identified the top 50 genes associated with [PM.sub.10] (i.e., with the smallest uncorrected p-values) in men and women, respectively, then performed a literature search using PubMed and ScienceDirect to identify genes within each sex-specific set that have been associated with air pollution-related health outcomes. Specifically, we searched for the name of each gene in combination with any of the following diseases or processes: allergy (Magnussen et al. 1993), chronic obstructive pulmonary disease (COPD) (Ko et al. 2007), asthma (Bowatte et al. 2015), lung cancer (Raaschou-Nielsen et al. 2011), cardiovascular disease (CVD) (Mills et al. 2009), cerebrovascular disease (CeVD) (Johnson et al. 2010), Alzheimer's disease (Finkelstein and Jerrett 2007) and cognition (Dadvand et al. 2015). Genes with the lowest p-values and a published link to air pollution (AP)-related diseases were chosen for validation. For men, DNAJB5, RAC3, EAPP, HDLBP, PRG2, PER1, PIK31, and SLA2 were selected for validation, whereas for women the gene list for validation included AKAP6, LIMK1, SIRT7, ARHGAP4, ATG16L2, TPM3, 5-HT1B, and PYGO2.

Validation of Candidate Biomarker Genes by qPCR

qPCR. Total RNA was reverse transcribed into cDNA by means of the GoScript Reverse Transcription System (Promega, Madison, WI, USA) using a Veriti 96 well Thermal cycler (TC-5000, Techne, Burlington, NJ, USA). A maximum of 3 pg of total RNA was used as input, and we used the protocol with an equal amount of oligo(dT) and random hexamer primers according to the manufacturer's instructions. cDNA was stored at -20[degrees]C until further measurements. A quantitative real-time polymerase chain reaction (qPCR) was set up by adding 2 [micro]L of a 10 ng/[micro]L dilution of cDNA together with TaqMan Fast Advanced Master Mix (Life Technologies, Foster City, CA, USA) and [PrimeTime.sup.TM] assay (Integrated DNA Technologies, Coralville, IA, USA), in a final reaction volume of 10 [micro]L. Standard cycling conditions were used to analyze samples in a 7900HT Fast Real-Time PCR system (Life Technologies, Foster City, CA, USA). Expression of eight candidate biomarker genes for each sex was studied and Cq values were collected with SDS 2.3 software. Minimum Information for Publication of Quantitative Real-Time PCR Experiments (MIQE) guidelines were taken into account (Bustin et al. 2009). Amplification efficiencies were between 90% and 110% for all assays. Raw data were processed to normalized relative gene expression values with qBase plus (Biogazelle, Zwijnaarde, Belgium) (Hellemans et al. 2007). Triplicates were run for all samples; technical replicates were included when the difference in Cq value was < 0.5. A set of three genes was used for data normalization, namely Hypoxanthine Phosphoribosyltransferase 1 (HPRT), Importin 8 (IPO8) and tyrosine 3-mono-oxygenase/tryptophan 5-monooxygenase activation protein, zeta (YWHAZ).
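The idea behind normalizing to several reference genes can be sketched in Python as follows. This is a simplified illustration rather than the qBase plus algorithm: it assumes a single amplification efficiency of 2 (i.e., 100%) for all assays, averages the technical replicates, and uses hypothetical Cq values and function names.

```python
from statistics import mean

# Reference genes named in the text.
REFERENCE_GENES = ("HPRT", "IPO8", "YWHAZ")


def normalized_expression(cq, target, efficiency=2.0):
    """Relative expression of `target` in one sample, normalized to the reference genes.

    `cq` maps a gene symbol to the list of its technical-replicate Cq values.
    With a common efficiency E, the geometric mean of E**(-Cq) over the
    reference genes equals E**(-mean Cq), so the ratio reduces to E**(-delta Cq).
    """
    reference_cq = mean(mean(cq[gene]) for gene in REFERENCE_GENES)
    delta_cq = mean(cq[target]) - reference_cq
    return efficiency ** (-delta_cq)


# Hypothetical triplicate Cq values for one sample.
sample = {
    "HPRT": [24.0, 24.0, 24.0],
    "IPO8": [26.0, 26.0, 26.0],
    "YWHAZ": [22.0, 22.0, 22.0],
    "EAPP": [25.0, 25.0, 25.0],
}
relative_eapp = normalized_expression(sample, "EAPP")
```

A target that crosses the threshold one cycle later than the averaged reference genes is therefore reported at half their level under this assumption.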

Data analysis. Statistical analyses were carried out using SAS software (version 9.3, SAS Institute Inc., Cary, NC, USA). Continuous data were presented as mean [+ or -] standard deviation (SD) and categorical data as percentages (%) and frequencies. Models were adjusted for age, body mass index (BMI), SES, smoking (categorized as smokers, former smokers and never smokers), white blood cell counts (absolute number of leukocytes and percentage of neutrophils), time of day (< 1200 hours, 1200-1500 hours, 1500-1800 hours, > 2000 hours) and season (October-March or April-September) of blood sampling. p-Values < 0.05 were considered statistically significant; p-values corrected for multiple testing are referred to as q-values. We plotted residuals for each gene to check whether significance was driven by outliers; these were removed where appropriate. To indicate significance of selected biomarker genes for each sex, we included an interaction term for sex in our main analysis. p-Values for the interaction term sex were calculated for all genes under study, not only those that were significant.

In validation analysis, we examined the association between gene expression and [PM.sub.10] exposure, stratified by sex using linear regression models for the eight selected genes for each sex.

ROC Curves Exposure Prediction

We calculated the ability to predict [PM.sub.10] exposure based on expression of the set of eight validated genes significantly associated with [PM.sub.10] exposure in the discovery cohort for each sex. For this purpose, we estimated sensitivity and specificity of the prediction using receiver operating characteristic (ROC) plots. Subjects were stratified according to their long-term [PM.sub.10] exposure levels with the 75th percentile as cut-off point (25.7 [micro]g/[m.sup.3] annual mean for women, 24.5 [micro]g/[m.sup.3] for men). All analyses were repeated similarly using long-term [PM.sub.2.5] exposure levels; the cut-off point for long-term [PM.sub.2.5] exposure (the 75th percentile of exposure) was 16.0 [micro]g/[m.sup.3] for both men and women.
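The ROC step can be sketched in Python as dichotomizing exposure at the cut-off and computing the area under the curve from a per-subject prediction score. The text does not specify how the eight genes were combined into a score, so the scores below are hypothetical, as are the function names and the choice of a strict "greater than" comparison at the cut-off.

```python
def dichotomize(exposure, cutoff):
    """Label subjects as highly exposed (True) when exposure exceeds the cut-off."""
    return [value > cutoff for value in exposure]


def roc_auc(scores, labels):
    """Area under the ROC curve as the probability that a highly exposed subject
    scores higher than a less exposed one (ties count one half)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("both exposure groups must be present")
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))


# Hypothetical annual mean PM10 values (micrograms per cubic metre) and model scores.
labels = dichotomize([20.0, 22.0, 26.0, 27.0], cutoff=25.7)
auc = roc_auc([0.1, 0.4, 0.35, 0.8], labels)
```

An AUC of 0.5 corresponds to no discrimination between exposure groups and 1.0 to perfect separation.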


Results

Population Characteristics

Table 1 lists the characteristics of the study cohorts. All participants were of European origin. Distribution of sex, SES, age and BMI as well as exposure did not differ between the discovery and validation cohort. Both cohorts included just under 50% men, and mean (SD) age was 57.9 (4.3) years. Season of sampling differed between the cohorts, with sampling for the discovery phase of the study mainly occurring throughout the warm months of the year, whereas sampling for the validation study was mainly performed during the cold season. However, because we worked with average annual exposures over a 2-year period, the exposure estimates themselves are not affected by differences across seasons. Blood sampling was done [less than or equal to] 1500 hours for all discovery cohort participants, while most validation cohort participants had samples drawn after 1500 hours. The discovery cohort consisted only of non-smokers, whereas the validation cohort included smokers (n = 21).

Gene Level Analysis

Table 2 displays the 20 top genes for [PM.sub.10] and [PM.sub.2.5] exposure in men and women. Excel File Tables S1-S4 display the extended top 50 lists for each exposure/sex combination. An overview of the total number of significant genes identified in our analysis, indicating the overlap between men and women, is given in Figure 2. For 199 gene transcripts we noticed significant sex by particulate matter exposure ([PM.sub.10]) interactions (data not shown). The corresponding number of gene transcripts for [PM.sub.2.5] with a significant sex by exposure interaction was 601 (data not shown). In men, there were significant associations between 47 genes and [PM.sub.10] only, 149 genes and [PM.sub.2.5] only, and there were 92 genes associated with both exposures. In women there were significant associations between 91 genes and [PM.sub.10] only, 1,067 genes and [PM.sub.2.5] only, and there were 498 genes associated with both exposures. We identified two genes in common between long-term [PM.sub.10] exposure in men and women, namely RAC3 and DNAJB5, which ranked as the 290th and 331st most significant genes, respectively, for [PM.sub.10] exposure in women (out of 592 genes). Furthermore, RAC3 was also significantly associated with long-term [PM.sub.2.5] exposure in men and DNAJB5 with long-term [PM.sub.2.5] exposure in women. We did not observe any significant FDR-corrected q-values in the discovery phase of our study.

Pathway Analysis

There were 1,251 and 966 pathways significantly associated with [PM.sub.10] and [PM.sub.2.5], respectively, in men, and 280 and 182 pathways significantly associated with [PM.sub.10] and [PM.sub.2.5] in women, based on uncorrected p-values. The top 5 identified pathways for each indicator of exposure are summarized in Table 3.

Long-term [PM.sub.10] exposure in men is associated with response to elevated platelet cytosolic [Ca.sup.2+], the prolactin signaling pathway and platelet degranulation. The 5th top significant pathway in association with [PM.sub.10] exposure in men is signaling by insulin receptor, which ranks 4th when analyzing long-term [PM.sub.2.5] exposure. Other pathways associated with [PM.sub.2.5] exposure in men are cell-cell communication and signaling by Type 1 Insulin-like Growth Factor and Insulin receptor signaling cascade. For women, long-term [PM.sub.10] exposure was associated with, in descending order of significance, respiratory electron transport, packaging of telomere ends, electron transport chain, respiratory electron transport and telomere maintenance. [PM.sub.2.5] exposure was associated with respiratory electron transport, and the proteasome in women (Table 3).

Transcriptome Signature in Relation to Long-Term Exposure

We selected eight genes for each sex that were significantly (p < 0.05) associated with long-term [PM.sub.10] exposure in the microarray study and have a published link with air pollution-related disease (Table 4) for validation in an independent cohort. Of these, we could confirm two out of eight genes for men [DnaJ homolog, subfamily B, member 5 (DNAJB5) and E2F-associated phosphoprotein (EAPP)]; that is, they were also significantly associated with [PM.sub.10] in the validation cohort based on uncorrected p-values, and associations were in the same direction as in the discovery cohort. For women, one out of eight genes [Rho GTPase activating protein 4 (ARHGAP4)] was borderline significantly (p = 0.0535) associated with [PM.sub.10] exposure (Table 4). AKAP6 (p = 0.02) and LIMK1 (p = 0.006) were significantly associated with [PM.sub.10] in women in the validation cohort, albeit with significantly lower expression instead of higher expression as in the discovery cohort. We also tested the same sets of eight genes for each sex for associations with [PM.sub.2.5] exposure in the validation cohort, since all but one of the candidate genes (PYGO2 in women, which also was not significant for [PM.sub.10] in the discovery cohort) were significantly associated with long-term [PM.sub.2.5] exposure in the discovery cohort. For [PM.sub.2.5] exposure, we could confirm two out of eight genes [DNAJB5 (borderline significant, p = 0.059) and EAPP] for men and four out of eight genes for women [ARHGAP4, PYGO2, sirtuin 7 (SIRT7) and autophagy related 16-like 2 (ATG16L2)] (see Table S1). Excluding 21 current smokers (14 of 94 women and 7 of 75 men) from the validation cohort did not alter our conclusions, based on the similarity in the effect estimates, apart from expression of ARHGAP4 in association with long-term [PM.sub.10] exposure (see Table S2).

Validation Set

To determine whether gene expression candidate biomarkers identified in the discovery cohort were robust exposure markers, we performed ROC curve analysis with long-term [PM.sub.10] exposure level 24.5 [micro]g/[m.sup.3] (75th percentile) as cut-off point in men. Figure 3A shows the sensitivity and 1 minus specificity (false-positive rate) of [PM.sub.10] exposure levels for men in association with the candidate biomarker genes. The model including the eight genes in men had an area under the curve (AUC) value of 0.92 [95% confidence interval (CI): 0.85, 1.00; p = 0.0002]. In women the model including the eight genes had an AUC of 0.86 (95% CI: 0.76, 0.96; p = 0.004) (Figure 3B, cut-off point 25.7 [micro]g/[m.sup.3]). The combined gene set performed better in both men and women than the individual genes. Similarly, for [PM.sub.2.5] exposure prediction, the model for men had an AUC of 0.91 (95% CI: 0.83, 0.97; p = 0.007) (Figure 3C), and the model for women had an AUC of 0.90 (95% CI: 0.81, 0.98; p = 0.0002) (Figure 3D).



Discussion

We identified and validated transcriptome signatures that are associated with long-term exposure to particulate air pollution in apparently healthy men and women. These sets of eight sex-specific genes were predictive of exposure in the validation cohort, and including all eight genes in one model provided a better prediction than the eight genes individually. We found DNAJB5 and EAPP in men and ARHGAP4 in women based on a discovery set and a validation analysis to be significantly associated with [PM.sub.10] exposure. Besides ARHGAP4, our [PM.sub.2.5]-exposure analysis for women identified PYGO2, SIRT7, and ATG16L2 as significantly associated with particulate matter exposure. However, we cannot assume these associations indicate causal relations due to the observational nature of our study. ROC analysis revealed excellent separation between individuals with high and low exposure to long-term particulate air pollution using the genes selected for validation. We believe gene expression levels have the potential to be used as biomarkers of exposure and effect with high specificity to link particulate air pollution to its health consequences, as these can be measured at the personal level rather than be obtained through exposure modelling at the population level. Further studies looking at different age and ethnic groups are warranted to explore the capabilities of gene expression levels as predictors in more depth. Longitudinal studies that monitor disease incidence, exposure and gene expression over time would be excellent to provide more insights.

We observed different transcriptomic expression levels in association with particulate air pollution exposure in men and women. Sex-specific differences may be explained by differences in inflammatory responses between men and women. Immunologic differences between men and women have been reported based on gene expression profiles in blood between smokers and nonsmokers, where women seem to have a more specific (involving less extensive pathways) immunologic response to smoking than men (Faner et al. 2014). Furthermore, sex-specific associations were also reported for microarray expression profiles in relation to environmental exposure to diverse compounds such as PCBs, dioxin, benzene, and PAHs (De Coster et al. 2013). The sex-specific associations between PM and gene expression that we observed are in line with previous reports of sex-specific associations with other exposures. For example, prenatal exposure to bisphenol A (BPA) led to differential responses in murine placentae of female and male embryos (Imanishi et al. 2003). Prenatal stress exposure in rats was associated with sex-specific differences in gene expression and behavioral effects in male and female offspring (Van den Hove et al. 2013). The latter study clearly shows that the same biological exposure (i.e., prenatal stress) leads to a highly differential response in male and female offspring.

To date, limited human data is available on microarray gene expression profiling in response to air pollution exposure. However, in an attempt to study the effects of in utero carcinogenic exposures, gene expression profiles in cord blood from 111 babies participating in the Norwegian BraMat cohort were assessed and correlation analyses of gene expression levels with biomarkers of exposure measured showed variable numbers of significantly correlating genes. Overall, separate analyses for male and female newborns resulted in higher numbers of significantly correlating genes per sex with low overlap of similarly expressed genes between the two sexes, thus indicating a clear sex-specific toxicogenomic response. More specifically, the authors reported only 1 gene in common between girls (39 significant genes) and boys (331 significant genes) for dioxin exposure (Hochstenbach et al. 2012).

Given evidence of the differential responses to PM exposure both at the gene and pathway levels between men and women, we hypothesize that different pathways could lead to the same disease outcome in both sexes. Recently, it was reported that the same personal exposure (i.e., smoking) could lead to disease in a differential manner in men and women. As such, Paul and Amundson (2014) described microarray analysis in smokers and nonsmoking men and women. They utilized a population of 24 middle-aged smoking men (n = 12) and women (n = 12) and an equal number of nonsmoking controls. The gene set correlated with smoking in men was incapable of separating female smokers from nonsmokers and vice versa. They identified a large number of oncogenic pathway gene sets that were significantly different in female smokers compared with male smokers with Gene Set Enrichment Analysis of microarray data. In addition, functional annotation with Ingenuity Pathway Analysis (IPA) identified smoking-correlated genes associated with biological functions in male and female smokers that are directly relevant to well-known smoking related pathologies. However, these relevant biological functions were overrepresented in female smokers compared with male smokers. Identified pathway categories in women were xenobiotic metabolism signaling, actin metabolism signaling, clathrin-mediated signaling, eicosanoid signaling, thrombin signaling, tight junction signaling, molecular mechanism of cancer, and natural killer cell signaling (Paul and Amundson 2014).

The expression of ARHGAP4 was borderline significantly associated with long-term [PM.sub.10] exposure in women in the discovery cohort, and borderline significant (p = 0.0535) in the validation cohort. ARHGAP4, SIRT7, and ATG16L2 were furthermore significantly associated with long-term [PM.sub.2.5] exposure in women in the discovery cohort and validation cohort.

ARHGAP4 is a RhoGAP that regulates the cytoskeletal dynamics that control cell motility and axon outgrowth (Vogt et al. 2007). Pygopus 2 (PYGO2) is a component of the Wnt signaling pathway required for [beta]-catenin/T-cell factor (TCF)-dependent transcription and has been shown to be upregulated in lung cancer both in vitro in non-small cell lung cancer cell lines and in vivo in human primary tumor tissue samples (Zhou et al. 2014).

In vitro experiments using hematopoietic stem cells from sirtuin 7 (SIRT7) knockout mice have shown SIRT7 regulates mitochondrial activity and its inactivation causes reduced quiescence, increased mitochondrial protein folding stress, and compromised regenerative capacity of hematopoietic stem cells (Mohrin et al. 2015; Liu and Chen 2015). Mitochondrial DNA and function have been shown to be associated with chronic air pollution exposure in populations of newborns (Janssen et al. 2012) and elderly men (Zhong et al. 2016), hence NAD-dependent deacetylase SIRT7 might provide insight into a molecular mechanism underlying the mitochondrial damage following air pollution exposure. Autophagy related 16-like 2 (ATG16L2) is a core autophagy gene. Previously, we found in newborns epigenetic modifications in the mitochondrial genome, in association with [PM.sub.2.5] exposure during gestation and placental mtDNA content, which could reflect signs of mitophagy and mitochondrial death (Janssen et al. 2012).

The expression of the genes DNAJB5 and EAPP was significantly associated with [PM.sub.10] air pollution exposure in men, both in the discovery cohort and in the validation cohort. DNAJB5 is a member of the evolutionarily conserved DNAJ/HSP40 family of proteins, which regulate molecular chaperone activity by stimulating ATPase activity (Ohtsuka and Hata 2000). DNAJB5 contains a cysteine-rich domain which renders the protein sensitive to ROS. The protein forms a multiprotein complex together with Trx1 and class II histone deacetylases (HDACs) that functions as a master negative regulator of cardiac hypertrophy (Ago et al. 2008). E2F-associated phosphoprotein (EAPP) is a nuclear phosphoprotein that interacts with the activating members of the E2F transcription factor family. In vitro overexpression of EAPP increased the fraction of G1 cells and led to heightened resistance against DNA damage. EAPP itself becomes upregulated after DNA damage and stimulates the expression of p21 independently of p53 (Andorfer and Rotheneder 2011).

In pathway analyses, we identified several respiratory chain related pathways significantly associated with long-term [PM.sub.10] and [PM.sub.2.5] exposure in women. Rossner et al. (2015) reported deregulation of expression of respiratory chain, oxidative phosphorylation, and mitochondrial membrane pathways when comparing gene expression profiles in adult nonsmoking men from a heavily polluted area versus a control region in the Czech Republic across different seasons (winter and summer 2009 and winter 2010).

Although sex-related differences have been observed for different environmental pollutants, to our knowledge, this is the first study of microarray gene expression profiles in association with long-term air pollution exposure among middle-aged men and women.

Our study has strengths and limitations. We conducted our investigations in two independent cohorts for discovery and validation, using the same exposure modeling and the gold standard qPCR as the validation tool (Canales et al. 2006). Although the sample size of the discovery cohort was limited, we believe that validation in an independent cohort with a reliable method such as qPCR indicates the robustness of our analyses. Some limitations are inherent to the cross-sectional nature of our study. We used 2010-2012 air pollution data to develop our high-resolution exposure models, which we applied to the participants' baseline addresses (2004). Studies in the Netherlands (Brauer et al. 2003), Italy (Rome) (Rosenlund et al. 2008), the United Kingdom (Briggs et al. 2000), and Canada (Vancouver) (Henderson et al. 2007) have shown that existing land use regression models predicted historic spatial contrasts well over periods of about 10 years and longer. The use of a relatively homogeneous population limits the potential generalizability of our study to populations with different ages, races and ethnicities, or locations. Lastly, our study design did not allow us to control for cell counts in the discovery phase. Cell counts were not performed on the samples used for microarray analysis, and there is no good means of imputing these values for Agilent 4 x 44K arrays during data analysis.


To our knowledge, this is the first time that expression levels of candidate genes have been used to predict air pollution exposure levels ([PM.sub.10], [PM.sub.2.5]). For this purpose, we established ROC curves based on the genes selected for validation in an independent cohort and were able to separate low- (< 75th percentile) from high- (> 75th percentile) exposed individuals. ROC curves are commonly used to compare the diagnostic performance of two or more tests, as they give a good indication of both the sensitivity and specificity of the studied test (Greiner et al. 2000). Gene expression signatures have been shown to predict survival, for instance in pancreatic cancer (Newhook et al. 2014) and non-small cell lung cancer (Lu et al. 2006). In 2009, this technique was applied for the first time in an environmental epidemiology setting, showing that specific DNA methylation patterns could accurately predict the relationship between exposure to airborne PAHs and childhood asthma incidence. Perera et al. (2009) investigated PAH levels in cord blood samples from 20 newborns and replicated the association between PAH levels and candidate region methylation in 56 other newborns from the Columbia Center for Children's Environmental Health (CCCEH) cohort, which recruits nonsmoking Dominican and African American women and their children residing in different areas of New York in the United States. However, the application of this approach to gene expression data in association with air pollution exposure is novel.
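As a rough illustration of the split described above, the following Python sketch labels participants as high or low exposed using the 75th percentile of exposure and then computes sensitivity and specificity for one cutoff on a gene-expression score. The exposure values, scores, score cutoff, and function names are all hypothetical; this is a minimal sketch under those assumptions, not the analysis code used in the study.

```python
from statistics import quantiles


def split_by_percentile(exposures, pct=75):
    """Return (labels, cutoff); label 1 means exposure above the pct-th percentile."""
    # quantiles(n=100) yields the 99 percentile cut points; item pct-1 is the pct-th.
    cutoff = quantiles(exposures, n=100, method="inclusive")[pct - 1]
    labels = [1 if x > cutoff else 0 for x in exposures]
    return labels, cutoff


def sensitivity_specificity(labels, scores, threshold):
    """Classify score >= threshold as 'high exposed' and compare with the labels."""
    tp = sum(1 for y, s in zip(labels, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(labels, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(labels, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(labels, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical long-term PM10 exposures (ug/m3) and gene-set scores (made-up numbers).
pm10 = [20.1, 21.0, 22.3, 23.0, 23.8, 24.4, 25.2, 27.0]
scores = [0.10, 0.60, 0.15, 0.30, 0.40, 0.35, 0.80, 0.45]

labels, cutoff = split_by_percentile(pm10)
sens, spec = sensitivity_specificity(labels, scores, threshold=0.5)
print(f"cutoff={cutoff:.1f}, sensitivity={sens:.2f}, specificity={spec:.2f}")
```

A ROC curve is obtained by repeating the second step over all possible score cutoffs and plotting sensitivity against 1 - specificity.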

In ROC curve analysis, an area under the curve (AUC) of 0.80 is considered to indicate good separation, and an AUC of 0.90 excellent separation, of true positives from false positives. We have identified sex-specific gene sets that fulfill these criteria for [PM.sub.10] and [PM.sub.2.5] exposure. However, we must interpret the current gene sets within the context of the limitations inherent to the cross-sectional nature of our study.
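The AUC can also be read as the probability that a randomly chosen high-exposed individual has a higher score than a randomly chosen low-exposed individual, with ties counted as one half. The sketch below computes the AUC that way and applies the 0.80 and 0.90 thresholds mentioned above. The labels and scores are invented for illustration, and the helper names are hypothetical.

```python
def auc(labels, scores):
    """AUC as the chance that a high-exposed subject outscores a low-exposed one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count pairs in which the high-exposed score wins; a tie counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def describe(auc_value):
    """Qualitative label using the 0.80 / 0.90 thresholds quoted in the text."""
    if auc_value >= 0.90:
        return "excellent"
    if auc_value >= 0.80:
        return "good"
    return "below the 'good' threshold"


# Made-up labels (1 = high exposed) and gene-set scores.
labels = [0, 0, 0, 0, 0, 0, 1, 1]
scores = [0.10, 0.60, 0.15, 0.30, 0.40, 0.35, 0.80, 0.45]

value = auc(labels, scores)
print(f"AUC={value:.3f} ({describe(value)})")
```

The pairwise count is quadratic in the number of subjects, which is acceptable for cohorts of this size; rank-based formulas give the same value more efficiently for large data sets.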


In conclusion, microarray analysis has identified different gene-expression levels in response to long-term air pollution in men and women. From gene-level analysis, candidate biomarker genes with a reported link to air pollution-related disease were selected and validated (i.e., significantly associated with PM exposure with the same direction of regulation of expression) in an independent cohort. For men, we propose DNAJB5 and EAPP as biomarkers of exposure. For women, we identified ARHGAP4, PYGO2, SIRT7, and ATG16L2 as biomarker genes of exposure. ROC analysis revealed that the genes were able to predict high or low [PM.sub.10] exposure accurately. Prospective studies in other populations are needed to confirm our findings with regard to sex-specific expression of these genes in association with PM exposure. Furthermore, it would be highly relevant to analyze the expression of these sex-specific gene-sets in cohorts with higher PM exposure as well as in subjects at different stages of life, including the more vulnerable stages such as early childhood and puberty.

doi: 10.1289/EHP370


Ago T, Liu T, Zhai P, Chen W, Li H, Molkentin JD, et al. 2008. A redox-dependent pathway for regulating class II HDACs and cardiac hypertrophy. Cell 133(6):978-993.

Alfaro-Moreno E, Nawrot TS, Nemmar A, Nemery B. 2007. Particulate matter in the environment: pulmonary and cardiovascular effects. Curr Opin Pulm Med 13(2):98-106.

Andorfer P, Rotheneder H. 2011. EAPP: gatekeeper at the crossroad of apoptosis and p21-mediated cell-cycle arrest. Oncogene 30(23):2679-2690.

Bowatte G, Lodge C, Lowe AJ, Erbas B, Perret J, Abramson MJ, et al. 2015. The influence of childhood traffic-related air pollution exposure on asthma, allergy and sensitization: a systematic review and a meta-analysis of birth cohort studies. Allergy 70(3):245-256.

Brauer M, Hoek G, van Vliet P, Meliefste K, Fischer P, Gehring U, et al. 2003. Estimating long-term average particulate air pollution concentrations: application of traffic indicators and geographic information systems. Epidemiology 14:228-239.

Briggs DJ, de Hoogh C, Gulliver J, Wills J, Elliott P, Kingham S, et al. 2000. A regression-based method for mapping traffic-related air pollution: application and testing in four contrasting urban environments. Sci Total Environ 253(1-3):151-167.

Bustin SA, Benes V, Garson JA, Hellemans J, Huggett J, Kubista M, et al. 2009. The MIQE guidelines: minimum information for publication of quantitative real-time PCR experiments. Clin Chem 55(4):611-622.

Canales RD, Luo Y, Willey JC, Austermiller B, Barbacioru CC, Boysen C, et al. 2006. Evaluation of DNA microarray results with quantitative gene expression platforms. Nat Biotechnol 24(9):1115-1122.

Chen Q, Jiao D, Hu H, Song J, Yan J, Wu L, et al. 2013. Downregulation of LIMK1 level inhibits migration of lung cancer cells and enhances sensitivity to chemotherapy drugs. Oncol Res 20(11):491-498.

Cherpokova D, Bender M, Morowski M, Kraft P, Schuhmann MK, Akbar SM, et al. 2015. SLAP/SLAP2 prevent excessive platelet (hem)ITAM signaling in thrombosis and ischemic stroke in mice. Blood 125(1):185-194.

Cosemans G, Kretzschmar J, Janssen L, Maes G. 1995. The Third Workshop's environmental impact assessment model intercomparison exercise. Int J Environ Pollution 5(4-6):785-798.

Cosemans G, Ruts R, Kretzschmar JG. 2001. Impact assessment with the Belgian dispersion model IFDM and the New Dutch National Model. In: Proceedings of the 7th International Conference on Harmonisation within Atmospheric Dispersion Modelling for Regulatory Purposes, 28-31 May 2001, Belgirate, Italy. Belgirate, Italy:JRC-EI (Institutes of the European Commission's Joint Research Centre), 125-129.

Dadvand P, Nieuwenhuijsen MJ, Esnaola M, Forns J, Basagana X, Alvarez-Pedrerol M, et al. 2015. Green spaces and cognitive development in primary schoolchildren. Proc Natl Acad Sci USA 112(26):7937-7942.

De Coster S, van Leeuwen DM, Jennen DG, Koppen G, Den Hond E, Nelen V, et al. 2013. Gender-specific transcriptomic response to environmental exposure in Flemish adults. Environ Mol Mutagen 54(7):574-588.

DeMuth JP, Jackson CM, Weaver DA, Crawford EL, Durzinsky DS, Durham SJ, et al. 1998. The gene expression index c-myc x E2F-1/p21 is highly predictive of malignant phenotype in human bronchial epithelial cells. Am J Respir Cell Mol Biol 19(1):18-24.

Donaldson K, Tran L, Jimenez LA, Duffin R, Newby DE, Mills N, et al. 2005. Combustion-derived nanoparticles: a review of their toxicology following inhalation exposure. Part Fibre Toxicol 2:10, doi: 10.1186/1743-8977-2-10.

Faner R, Gonzalez N, Cruz T, Kalko SG, Agusti A. 2014. Systemic inflammatory response to smoking in chronic obstructive pulmonary disease: evidence of a gender effect. PLoS One 9(5):e97491, doi: 10.1371/journal.pone.0097491.

Fink SP, Yamauchi M, Nishihara R, Jung S, Kuchiba A, Wu K, et al. 2014. Aspirin and the risk of colorectal cancer in relation to the expression of 15-hydroxyprostaglandin dehydrogenase (HPGD). Sci Transl Med 6(233):233re2, doi: 10.1126/scitranslmed.3008481.

Finkelstein MM, Jerrett M. 2007. A study of the relationships between Parkinson's disease and markers of traffic-derived and environmental manganese air pollution in two Canadian cities. Environ Res 104(3):420-432.

Fujii T, Hayashi S, Hogg JC, Mukae H, Suwa T, Goto Y, et al. 2002. Interaction of alveolar macrophages and airway epithelial cells following exposure to particulate matter produces mediators that stimulate the bone marrow. Am J Respir Cell Mol Biol 27(1):34-41.

Greiner M, Pfeiffer D, Smith RD. 2000. Principles and practical application of the receiver-operating characteristic analysis for diagnostic tests. Prev Vet Med 45(1-2):23-41.

Hellemans J, Mortier G, De Paepe A, Speleman F, Vandesompele J. 2007. qBase relative quantification framework and software for management and automated analysis of real-time quantitative PCR data. Genome Biol 8(2):R19, doi: 10.1186/gb-2007-8-2-r19.

Henderson SB, Beckerman B, Jerrett M, Brauer M. 2007. Application of land use regression to estimate long-term concentrations of traffic-related nitrogen oxides and fine particulate matter. Environ Sci Technol 41:2422-2428.

Heredia L, Helguera P, de Olmos S, Kedikian G, Sola Vigo F, LaFerla F, et al. 2006. Phosphorylation of actin-depolymerizing factor/cofilin by LIM-kinase mediates amyloid β-induced degeneration: a potential mechanism of neuronal dystrophy in Alzheimer's disease. J Neurosci 26(24):6533-6542.

Hochstenbach K, van Leeuwen DM, Gmuender H, Gottschalk RW, Løvik M, Granum B, et al. 2012. Global gene expression analysis in cord blood reveals gender-specific differences in response to carcinogenic exposure in utero. Cancer Epidemiol Biomarkers Prev 21(10):1756-1767.

Huang L, Poke G, Gecz J, Gibson K. 2012. A novel contiguous gene deletion of AVPR2 and ARHGAP4 genes in male dizygotic twins with nephrogenic diabetes insipidus and intellectual disability. Am J Med Genet A 158A(10):2511-2518.

Huang YC, Schmitt M, Yang Z, Que LG, Stewart JC, Frampton MW, et al. 2010. Gene expression profile in circulating mononuclear cells after exposure to ultrafine carbon particles. Inhal Toxicol 22(10):835-846.

Husten L. 1998. More data reported for HDL's role in heart disease. Lancet 352(9140):1603, doi: 10.1016/S0140-6736(98)00065-8.

Imanishi S, Manabe N, Nishizawa H, Morita M, Sugimoto M, Iwahori M, et al. 2003. Effects of oral exposure of bisphenol A on mRNA expression of nuclear receptors in murine placentae assessed by DNA microarray. J Reprod Dev 49(4):329-336.

Iwabayashi M, Taniyama Y, Sanada F, Azuma J, Iekushi K, Kusunoki H, et al. 2012. Role of serotonin in angiogenesis: induction of angiogenesis by sarpogrelate via endothelial 5-HT1B/Akt/eNOS pathway in diabetic mice. Atherosclerosis 220(2):337-342.

Jacobs L, Emmerechts J, Mathieu C, Hoylaerts MF, Fierens F, Hoet PH, et al. 2010. Air pollution related prothrombotic changes in persons with diabetes. Environ Health Perspect 118:191-196, doi: 10.1289/ehp.0900942.

Janssen BG, Munters E, Pieters N, Smeets K, Cox B, Cuypers A, et al. 2012. Placental mitochondrial DNA content and particulate air pollution during in utero life. Environ Health Perspect 120:1346-1352, doi: 10.1289/ehp.1104458.

Janssen S, Dumont G, Fierens F, Mensink C. 2008. Spatial interpolation of air pollution measurements using CORINE land cover data. Atmos Environ 42(20):4884-4903.

Johnson JY, Rowe BH, Villeneuve PJ. 2010. Ecological analysis of long-term exposure to ambient air pollution and the incidence of stroke in Edmonton, Alberta, Canada. Stroke 41(7):1319-1325.

Kalkstein LS, Valimont KM. 1986. An evaluation of summer discomfort in the United States using a relative climatological index. Bull Am Meteorol Soc 67(7):842-848.

Kamburov A, Stelzl U, Lehrach H, Herwig R. 2013. The ConsensusPathDB interaction database: 2013 update. Nucleic Acids Res 41(database issue):D793-D800.

Kleensang A, Maertens A, Rosenberg M, Fitzpatrick S, Lamb J, Auerbach S, et al. 2014. [t.sup.4] workshop report: Pathways of Toxicity. ALTEX 31(1):53-61.

Ko FW, Tam W, Wong TW, Chan DP, Tung AH, Lai CK, et al. 2007. Temporal relationship between air pollutants and hospital admissions for chronic obstructive pulmonary disease in Hong Kong. Thorax 62(9):780-785.

La Rocca C, Tait S, Guerranti C, Busani L, Ciardo F, Bergamasco B, et al. 2014. Exposure to endocrine disrupters and nuclear receptor gene expression in infertile and fertile women from different Italian areas. Int J Environ Res Public Health 11(10):10146-10164.

Lefebvre W, Van Poppel M, Maiheu B, Janssen S, Dons E. 2013. Evaluation of the RIO-IFDM-street canyon model chain. Atmos Environ 77:325-337.

Li SH, Luo YL, Lai WY. 2006. Expression of eosinophil major basic protein mRNA in bronchial asthma [in Chinese]. Nan Fang Yi Ke Da Xue Xue Bao 26(9):1330-1333.

Liu JP, Chen R. 2015. Stressed SIRT7: facing a crossroad of senescence and immortality. Clin Exp Pharmacol Physiol 42(6):567-569.

Liu TQ, Wang GB, Li ZJ, Tong XD, Liu HX. 2015. Silencing of Rac3 inhibits proliferation and induces apoptosis of human lung cancer cells. Asian Pac J Cancer Prev 16(7):3061-3065.

Liu Y, Dong QZ, Wang S, Fang CQ, Miao Y, Wang L, et al. 2013. Abnormal expression of Pygopus 2 correlates with a malignant phenotype in human lung cancer. BMC Cancer 13:346, doi: 10.1186/1471-2407-13-346.

Lu Y, Lemon W, Liu PY, Yi Y, Morrison C, Yang P, et al. 2006. A gene expression signature predicts survival of patients with stage I non-small cell lung cancer. PLoS Med 3(12):e467, doi: 10.1371/journal.pmed.0030467.

Maes G, Cosemans G, Kretzschmar J, Janssen L, Van Tongerloo J. 1995. Comparison of six Gaussian dispersion models used for regulatory purposes in different countries of the EU. Int J Environ Pollution 5(4-6):734-747.

Magne J, Gustafsson P, Jin H, Maegdefessel L, Hultenby K, Wernerson A, et al. 2015. ATG16L1 Expression in carotid atherosclerotic plaques is associated with plaque vulnerability. Arterioscler Thromb Vasc Biol 35(5):1226-1235.

Magnussen H, Jorres R, Nowak D. 1993. Effect of air pollution on the prevalence of asthma and allergy: lessons from the German reunification. Thorax 48(9):879-881.

Maiheu B, Veldeman B, Viaene P, De Ridder K, Lauwaet D, Smeets N, et al. 2013. Identifying the Best Available Large-Scale Concentration Maps for Air Quality in Belgium [in Dutch]. Flanders Environment Report. http://www.milieurapport.be/Upload/main/0_onderzoeksrapporten/2013/Eindrapport_Concentratiekaarten_29_01_2013_TW.pdf [accessed 1 December 2014].

Melchior JT, Sawyer JK, Kelley KL, Shah R, Wilson MD, Hantgan RR, et al. 2013. LDL particle core enrichment in cholesteryl oleate increases proteoglycan binding and promotes atherosclerosis. J Lipid Res 54(9):2495-2503.

Mensink C, Maes G. 1996. Comparative sensitivity study for operational short-range atmospheric dispersion models. Int J Environ Pollution 8(3-6):356-366.

Mills NL, Donaldson K, Hadoke PW, Boon NA, MacNee W, Cassee FR, et al. 2009. Adverse cardiovascular effects of air pollution. Nat Clin Pract Cardiovasc Med 6(1):36-44.

Mohrin M, Shin J, Liu Y, Brown K, Luo H, Xi Y, et al. 2015. Stem cell aging. A mitochondrial UPR-mediated metabolic checkpoint regulates hematopoietic stem cell aging. Science 347(6228):1374-1377.

Montaner D, Tarraga J, Huerta-Cepas J, Burguet J, Vaquerizas JM, Conde L, et al. 2006. Next station in microarray data analysis: GEPAS. Nucleic Acids Res 34(web server issue):W486-W491.

Nel AE, Diaz-Sanchez D, Li N. 2001. The role of particulate pollutants in pulmonary inflammation and asthma: evidence for the involvement of organic chemicals and oxidative stress. Curr Opin Pulm Med 7(1):20-26.

Newhook TE, Blais EM, Lindberg JM, Adair SJ, Xin W, Lee JK, et al. 2014. A thirteen-gene expression signature predicts survival of patients with pancreatic cancer and identifies new genes of interest. PLoS One 9(9):e105631, doi: 10.1371/journal.pone.0105631.

Ohtsuka K, Hata M. 2000. Mammalian HSP40/DNAJ homologs: cloning of novel cDNAs and a proposal for their classification and nomenclature. Cell Stress Chaperones 5(2):98-112.

Olesen H. 1995. The model validation exercise at Mol: overview of results. Int J Environ Pollution 5(4-6):761-784.

Oti M, Snel B, Huynen MA, Brunner HG. 2006. Predicting disease genes using protein-protein interactions. J Med Genet 43(8):691-698.

Paul S, Amundson SA. 2014. Differential effect of active smoking on gene expression in male and female smokers. J Carcinog Mutagen 5:1000198, doi: 10.4172/2157-2518.1000198.

Perera F, Tang WY, Herbstman J, Tang D, Levin L, Miller R, et al. 2009. Relation of DNA methylation of 5'-CpG island of ACSL3 to transplacental exposure to airborne polycyclic aromatic hydrocarbons and childhood asthma. PLoS One 4(2):e4488, doi: 10.1371/journal.pone.0004488.

Pettit AP, Brooks A, Laumbach R, Fiedler N, Wang Q, Strickland PO, et al. 2012. Alteration of peripheral blood monocyte gene expression in humans following diesel exhaust inhalation. Inhal Toxicol 24(3):172-181.

Pope CA III, Burnett RT, Thurston GD, Thun MJ, Calle EE, Krewski D, et al. 2004. Cardiovascular mortality and long-term exposure to particulate air pollution: epidemiological evidence of general pathophysiological pathways of disease. Circulation 109(1):71-77.

Raaschou-Nielsen O, Andersen ZJ, Hvidberg M, Jensen SS, Ketzel M, Sørensen M, et al. 2011. Lung cancer incidence and long-term exposure to air pollution from traffic. Environ Health Perspect 119:860-865, doi: 10.1289/ehp.1002353.

Rosenlund M, Forastiere F, Stafoggia M, Porta D, Perucci M, Ranzi A, et al. 2008. Comparison of regression models with land-use and emissions data to predict the spatial distribution of traffic-related air pollution in Rome. J Expo Sci Environ Epidemiol 18:192-199.

Rossner P Jr, Tulupova E, Rossnerova A, Libalova H, Honkova K, Gmuender H, et al. 2015. Reduced gene expression levels after chronic exposure to high concentrations of air pollutants. Mutat Res 780:60-70.

Rostila A, Puustinen A, Toljamo T, Vuopala K, Lindstrom I, Nyman TA, et al. 2012. Peroxiredoxins and tropomyosins as plasma biomarkers for lung cancer and asbestos exposure. Lung Cancer 77(2):450-459.

Steadman RG. 1979. Assessment of sultriness. Part 1. A temperature-humidity index based on human physiology and clothing science. J Appl Meteorol 18(7):861-873.

Vakhrusheva O, Smolka C, Gajawada P, Kostin S, Boettger T, Kubin T, et al. 2008. Sirt7 increases stress resistance of cardiomyocytes and prevents apoptosis and inflammatory cardiomyopathy in mice. Circ Res 102(6):703-710.

van Breda SG, Wilms LC, Gaj S, Jennen DG, Briede JJ, Kleinjans JC, et al. 2015. The exposome concept in a human nutrigenomics study: evaluating the impact of exposure to a complex mixture of phytochemicals using transcriptomics signatures. Mutagenesis 30(6):723-731.

Van den Hove DL, Kenis G, Brass A, Opstelten R, Rutten BP, Bruschettini M, et al. 2013. Vulnerability versus resilience to prenatal stress in male and female rats; implications from gene expression profiles in the hippocampus and frontal cortex. Eur Neuropsychopharmacol 23(10):1226-1246.

van Leeuwen DM, Gottschalk RW, Schoeters G, van Larebeke NA, Nelen V, Baeyens WF, et al. 2008. Transcriptome analysis in peripheral blood of humans exposed to environmental carcinogens: a promising new biomarker in environmental health studies. Environ Health Perspect 116:1519-1525, doi: 10.1289/ehp.11401.

Vineis P, van Veldhoven K, Chadeau-Hyam M, Athersuch TJ. 2013. Advancing the application of omics-based biomarkers in environmental epidemiology. Environ Mol Mutagen 54(7):461-467.

Vogt DL, Gray CD, Young WS III, Orellana SA, Malouf AT. 2007. ARHGAP4 is a novel RhoGAP that mediates inhibition of cell motility and axon outgrowth. Mol Cell Neurosci 36(3):332-342.

Young ME, Razeghi P, Taegtmeyer H. 2001. Clock genes in the heart: characterization and attenuation with hypertrophy. Circ Res 88(11):1142-1150.

Zhong J, Cayir A, Trevisi L, Sanchez-Guerra M, Lin X, Peng C, et al. 2016. Traffic-related air pollution, blood pressure, and adaptive response of mitochondrial abundance. Circulation 133(4):378-387.

Zhou SY, Xu ML, Wang SQ, Zhang F, Wang L, Wang HQ. 2014. Overexpression of Pygopus-2 is required for canonical Wnt activation in human lung cancer. Oncol Lett 7(1):233-238.

Karen Vrijens, (1) Ellen Winckelmans, (1) Maria Tsamou, (1) Willy Baeyens, (2) Patrick De Boever, (1,3) Danyel Jennen, (4) Theo M. de Kok, (4) Elly Den Hond, (3,5) Wouter Lefebvre, (3) Michelle Plusquin, (1) Hans Reynders, (6) Greet Schoeters, (3,7,8) Nicolas Van Larebeke, (9) Charlotte Vanpoucke, (10) Jos Kleinjans, (4) and Tim S. Nawrot (1,11)

(1) Centre for Environmental Sciences, Hasselt University, Diepenbeek, Belgium; (2) Department of Analytical and Environmental Chemistry, Free University of Brussels, Brussels, Belgium; (3) Environmental Risk and Health Unit, Flemish Institute for Technological Research (VITO), Mol, Belgium; (4) Department of Toxicogenomics, Maastricht University, Maastricht, Netherlands; (5) Provincial Institute for Hygiene, Antwerp, Belgium; (6) Environment, Nature and Energy Department, Flemish Government, Brussels, Belgium; (7) Department of Biomedical Sciences, University of Antwerp, Antwerp, Belgium; (8) University of Southern Denmark, Institute of Public Health, Department of Environmental Medicine, Odense, Denmark; (9) Department of Radiotherapy and Nuclear Medicine, Ghent University, Ghent, Belgium; (10) Belgian Interregional Environment Agency (IRCEL), Brussels, Belgium; (11) Department of Public Health and Primary Care, Leuven University, Leuven, Belgium

Address correspondence to T.S. Nawrot, Centre for Environmental Sciences, Hasselt University, Agoralaan gebouw D, B-3590 Diepenbeek, Belgium. Telephone: 0032/11-26.83.82. E-mail:

Supplemental Material is available online (http://

The project was funded by the Environment, Nature and Energy Department of the Flemish government (LNE/OL201100023/13034/M&G), Steunpunt Milieu- en Gezondheid and European Research Council (ERC-2012-StG 310898). K.V. is a postdoctoral Fellow of the Research Foundation-Flanders (12D7714N).

The authors declare they have no actual or potential competing financial interests.

Received: 21 December 2015; Revised: 12 August 2016; Accepted: 22 August 2016; Published: 14 October 2016.


Caption: Figure 1. Schematic representation of the application of the modified version of the meet-in-the-middle approach to identify biomarkers of disease. Note: CeVD, cerebrovascular disease; COPD, chronic obstructive pulmonary disease; CVD, cardiovascular disease.

Caption: Figure 2. Venn diagram showing the overlap of all genes significantly associated with long-term [PM.sub.10] and [PM.sub.2.5] exposure in men and women in the discovery cohort.

Caption: Figure 3. Receiver operating characteristic (ROC) curves for leukocyte gene expression of gene sets distinguishing between high and low long-term [PM.sub.10] or [PM.sub.2.5] exposure, based on the eight genes selected for validation for each sex. (A) Performance of the gene set consisting of DNAJB5, RAC3, SLA2, HDLBP, PRG2, PER1, PIK3R1, and EAPP in men to distinguish between high (above 75th percentile, corresponding to 24.5 [micro]g/[m.sup.3]) and low (< 24.5 [micro]g/[m.sup.3]) long-term residential [PM.sub.10] exposure. (B) Performance of the gene set consisting of ARHGAP4, AKAP6, PYGO2, HTR1B, ATG16L2, SIRT7, TPM3, and LIMK1 in women to distinguish between high (above 75th percentile, corresponding to 25.7 [micro]g/[m.sup.3]) and low (< 25.7 [micro]g/[m.sup.3]) long-term residential [PM.sub.10] exposure. (C) Performance of the same male-specific gene set in men and (D) the female-specific gene set in women to distinguish between high (above 75th percentile, corresponding to 16.0 [micro]g/[m.sup.3]) and low (< 16.0 [micro]g/[m.sup.3]) long-term residential [PM.sub.2.5] exposure.
Table 1. Study population and exposure characteristics.

Characteristics            Discovery cohort      Validation cohort
                             Men (n = 48)           Men (n = 75)

Age, years                 58.0 [+ or -] 4.5     58.0 [+ or -] 4.1
Body mass index,           27.4 [+ or -] 3.5     26.1 [+ or -] 3.8
Socioeconomic status
  Low                          20 (41.7)             14 (18.7)
  Medium                       15 (31.3)             26 (34.7)
  High                         13 (27.1)             35 (46.7)
Smoking status
  Nonsmokers                  48 (100.0)             25 (33.3)
  Former smoker                   NA                 43 (57.3)
  Current smoker                  NA                  7 (9.3)
Season of blood sampling
  Cold (October-March)         40 (83.3)             27 (36.0)
  Warm (April-September)       8 (16.7)              48 (64.0)
Time of blood sampling
  < 1200 hours                 41 (85.4)              0 (0.0)
  1200-1500 hours              7 (14.6)              20 (26.7)
  1500-1800 hours               0 (0.0)              32 (42.7)
  > 2000 hours                  0 (0.0)              23 (30.7)
White blood cell count
  Leukocytes (#/[micro]L)         --           6981.5 [+ or -] 1632.1
  Neutrophils (%)                 --             56.8 [+ or -] 8.1
[PM.sub.10] long-term      25.8 (21.5-30.4)       23.1 (20.3-27.4)
[PM.sub.2.5] long-term     17.7 (15.5-20.8)       15.5 (14.5-17.6)

Characteristics            Discovery cohort      Validation cohort
                            Women (n = 50)         Women (n = 94)

Age, years                 57.8 [+ or -] 4.2     58.1 [+ or -] 4.0
Body mass index,           25.8 [+ or -] 3.7     25.5 [+ or -] 4.7
Socioeconomic status
  Low                          28 (56.0)             23 (24.5)
  Medium                       7 (14.0)              16 (17.0)
  High                         15 (30.0)             55 (58.5)
Smoking status
  Nonsmokers                  50 (100.0)             49 (52.1)
  Former smoker                   NA                 31 (33.0)
  Current smoker                  NA                 14 (14.9)
Season of blood sampling
  Cold (October-March)         40 (80.0)             40 (42.6)
  Warm (April-September)       10 (20.0)             54 (57.4)
Time of blood sampling
  < 1200 hours                 44 (88.0)              7 (7.5)
  1200-1500 hours              6 (12.0)              25 (26.6)
  1500-1800 hours               0 (0.0)              43 (45.7)
  > 2000 hours                  0 (0.0)              19 (20.2)
White blood cell count
  Leukocytes (#/[micro]L)         --           6981.5 [+ or -] 1632.1
  Neutrophils (%)                 --             56.8 [+ or -] 8.1
[PM.sub.10] long-term      26.0 (20.5-35.3)       24.2 (20.4-28.2)
[PM.sub.2.5] long-term     17.8 (15.4-20.9)       16.0 (14.7-18.3)

Note: Data are mean [+ or -] SE or number (%); exposure data are mean
(5-95th percentile). --, data not available; NA, not applicable.

Table 2. Top 20 significant genes in association with a 5-[micro]g/[m.sup.3]
increase in long-term [PM.sub.10] and [PM.sub.2.5] exposure for men and women.


Men

Rank                 [PM.sub.10]                  [PM.sub.2.5]

         Gene        FC (95% CI)       Gene        FC (95% CI)
1        EAPP     1.15 (1.07, 1.24)    ISL2     2.45 (1.58, 3.78)
2       DCTN6     1.23 (1.10, 1.38)    HDLBP    1.31 (1.14, 1.50)
3       DNAJB5    1.36 (1.14, 1.63)   B3GNT3    1.42 (1.18, 1.70)
4        ISL2     1.55 (1.17, 2.06)   RNF144    1.83 (1.28, 2.62)
5      KIAA1914   1.23 (1.07, 1.42)    ATOH8    2.24 (1.37, 3.66)
6       HDLBP     1.14 (1.04, 1.24)    RAC3     1.62 (1.21, 2.18)
7       B3GNT3    1.19 (1.06, 1.34)    ADCK1    1.49 (1.17, 1.91)
8       ATOH8     1.55 (1.14, 2.10)   DNAJB5    1.62 (1.20, 2.18)
9       LSM12     0.86 (0.77, 0.95)    ALX3     1.40 (1.13, 1.73)
10      ZNF187    1.16 (1.04, 1.28)   MAN2A2    1.39 (1.13, 1.72)
11     ARHGAP25   1.11 (1.03, 1.20)    DCTN6    1.35 (1.11, 1.64)
12      SERF1B    0.83 (0.72, 0.95)     DAK     1.34 (1.10, 1.64)
13      ANXA1     1.19 (1.05, 1.36)    PER1     1.37 (1.11, 1.69)
14      TKTL1     1.36 (1.09, 1.71)   GUCA2B    1.78 (1.20, 2.62)
15       PRG2     1.29 (1.07, 1.56)   ATXN7L3   1.30 (1.09, 1.55)
16       PER1     1.19 (1.05, 1.36)    LSM12    0.77 (0.64, 0.92)
17      GUCA2B    1.38 (1.09, 1.75)    PRG2     1.57 (1.15, 2.13)
18       ST14     1.20 (1.05, 1.37)    ABL2     1.40 (1.11, 1.78)
19       CDV3     0.84 (0.73, 0.96)    MAST3    1.27 (1.07, 1.49)
20      TTC30B    1.20 (1.04, 1.37)   PIK3R1    1.46 (1.12, 1.89)


Women

Rank                  [PM.sub.10]                [PM.sub.2.5]

         Gene        FC (95% CI)       Gene        FC (95% CI)
1      ATG16L2    0.81 (0.73, 0.90)    EFNB1    0.64 (0.53, 0.77)
2       EFNB1     0.79 (0.69, 0.89)   SLC6A7    1.52 (1.25, 1.86)
3       SYTL1     0.86 (0.79, 0.93)     FXN     0.73 (0.62, 0.85)
4        SMG5     0.84 (0.76, 0.92)    SFPQ     1.41 (1.18, 1.67)
5      TBC1D10C   0.85 (0.78, 0.93)    NACAL    0.69 (0.58, 0.84)
6       NACAL     0.81 (0.72, 0.91)   ATG16L2   0.72 (0.61, 0.86)
7       NFKBIE    0.85 (0.78, 0.93)   SLC24A2   1.73 (1.28, 2.32)
8       CEMP1     0.80 (0.70, 0.91)    THEX1    0.34 (0.19, 0.62)
9      DCUN1D2    0.84 (0.76, 0.93)   TBC1D13   0.78 (0.68, 0.90)
10      SLC6A7    1.25 (1.10, 1.43)    VAPB     1.21 (1.09, 1.35)
11      DHRSX     1.21 (1.08, 1.36)    TPM3     0.44 (0.28, 0.70)
12     TBC1D13    0.86 (0.79, 0.94)   CYB5D1    0.39 (0.23, 0.67)
13       SFPQ     1.21 (1.08, 1.35)    ZNF77    0.61 (0.46, 0.81)
14      MAPK3     1.19 (1.07, 1.32)    GABRD    0.40 (0.24, 0.67)
15     ZFYVE27    0.91 (0.86, 0.96)   NFKBIE    0.78 (0.68, 0.90)
16     SLC39A2    1.32 (1.11, 1.55)   CEACAM3   1.67 (1.24, 2.23)
17      TSPAN4    1.37 (1.13, 1.65)   TSPAN4    1.69 (1.25, 2.28)
18      DNAJC5    0.87 (0.81, 0.95)   GPR137    0.64 (0.50, 0.83)
19       MIA      0.82 (0.73, 0.93)   DNAJC5    0.80 (0.70, 0.91)
20       CES2     1.16 (1.06, 1.28)    HSF1     0.85 (0.77, 0.93)

Note: The rank of a gene indicates its position for that particular
exposure and sex based on the level of significance of the identified
association; the gene ranked no. 1 has the lowest p-value. FC, fold change.

Table 3. The top five significant pathways defined by gene set
enrichment analysis for each indicator of exposure.

                                                           # measured/
Exposure/pathway                                q-Value    # genes in


  Response to elevated platelet cytosolic Ca2+    3.11E-07   76/87
  Prolactin signaling pathway                     5.78E-07   61/72
  Platelet degranulation                          5.90E-07   71/82
  Leukocyte transendothelial migration            1.25E-06   98/118
  Signaling by insulin receptor                   5.18E-06   89/109


  Cell-cell communication                         1.35E-08   95/130
  Chagas disease (American trypanosomiasis)       1.40E-06   92/104
  Signaling by type 1 insulin-like                1.40E-06   76/96
    growth factor 1 receptor (IGF1R)
  Signaling by insulin receptor                   1.93E-06   96/120
  Insulin receptor signaling cascade              2.33E-06   74/76

Women


  Respiratory electron transport,                 2.08E-04   89/97
  ATP synthesis by chemiosmotic
    coupling, and heat production by
    uncoupling proteins
  Packaging of telomere ends                      3.98E-04   46/53
  Electron transport chain                        8.11E-04   94/103
  Respiratory electron transport                  9.59E-04   71/76
  Telomere maintenance                            1.50E-03   72/81


  Respiratory electron transport                  9.07E-04   81/92
  Respiratory electron transport,                 1.77E-03   99/113
  ATP synthesis by chemiosmotic
    coupling, and heat production
    by uncoupling proteins
  Packaging of telomere ends                      4.54E-03   45/52
  Proteasome                                      4.93E-03   41/44
  Transcriptional regulation by small RNAs        4.93E-03   95/106

Note: Pathways were identified using the Gene Set Enrichment Analysis
Tool from the online Consensus Pathway Data Base (http:// #, number.

Table 4. Selection of biomarker candidate genes and their fold
changes for an increase of 5 [micro]g/[m.sup.3] long-term [PM.sub.10]

Gene              Gene description


DNAJB5    DnaJ (Hsp40) homolog,
            subfamily B, member 5
RAC3      Ras-related C3 botulinum toxin
            substrate 3 (rho family, small
            GTP binding protein Rac3)
EAPP      E2F associated phosphoprotein

HDLBP     High density lipoprotein
            binding protein (vigilin)
PRG2      Proteoglycan 2

PER1      Period homolog 1 (Drosophila)
PIK3R1    Phosphoinositide-3-kinase,
            regulatory subunit 1 (p85
SLA2      Src-like adaptor 2

AKAP6     A kinase (PRKA) anchor
            protein 6
LIMK1     LIM domain kinase 1

SIRT7     Sirtuin (silent mating type
            information regulation 2
            homolog) 7 (S. cerevisiae)
ARHGAP4   Rho GTPase Activating
            protein 4

ATG16L2   Autophagy related 16-like 2
            (S. cerevisiae)
TPM3      Tropomyosin 3

5-HTR1B   5-Hydroxytryptamine
            (serotonin) receptor 1B
PYGO2     Pygopus homolog 2

Gene            Gene function                Link to disease


DNAJB5    Heat shock protein 40       CVD (Ago et al. 2008)

RAC3      Regulation of cellular      Lung cancer (Liu et al. 2015)
            responses (cell growth)

EAPP      Cell cycle/apoptosis        Lung cancer (DeMuth et al.
HDLBP     Sterol metabolism           CVD (Husten 1998)

PRG2      Eosinophil major basic      CVD (Melchior et al. 2013),
            protein                     asthma (Li et al. 2006)
PER1      Circadian rhythm            CVD (Young et al. 2001)
PIK3R1    Insulin metabolism          Lung cancer (Lu et al. 2006)

SLA2      SLAP adapter protein        CVD (Cherpokova et al. 2015)

AKAP6     Regulatory subunit of       CVD (Oti et al. 2006)
            protein kinase A
LIMK1     Regulation of actin         Lung cancer (Chen et al.
            filament dynamics           2013), Alzheimer's (Heredia
                                        et al. 2006)
SIRT7     Transcription repressor     CVD (Vakhrusheva et al.

ARHGAP4   Regulation of small         Cognition (Huang et al. 2012)
            GTP-binding proteins
            from the RAS
ATG16L2   Autophagy                   CVD (Magne et al. 2015)

TPM3      Actin-binding protein       Lung cancer (Rostila et al.
5-HTR1B   Neurotransmitter/           CVD (Iwabayashi et al. 2012)
PYGO2     Related to Wnt signaling    Lung cancer (Liu et al. 2013)

Gene      Discovery cohort    p-Value   Validation cohort    p-Value   q-Value
name         FC (95% CI)                   FC (95% CI)

DNAJB5    1.36 (1.14, 1.63)   0.0014    1.64 (1.20, 2.23)    0.0026    0.02
RAC3      1.25 (1.04, 1.51)   0.024     1.26 (0.94, 1.96)    0.10      0.18
EAPP      1.15 (1.0, 1.24)    0.00055   1.18 (1.02, 1.38)    0.028     0.12
HDLBP     1.14 (1.04, 1.24)   0.0065    1.02 (0.88, 1.19)    0.75      0.86
PRG2      1.29 (1.07, 1.56)   0.012     1.29 (0.98, 1.71)    0.066     0.18
PER1      1.19 (1.05, 1.36)   0.012     0.95 (0.74, 1.23)    0.72      0.86
PIK3R1    1.22 (1.03, 1.43)   0.023     1.01 (0.82, 1.26)    0.91      0.91
SLA2      1.22 (1.03, 1.44)   0.027     1.16 (0.97, 1.39)    0.11      0.18

AKAP6     1.21 (1.07, 1.36)   0.0036    0.72 (0.55, 0.94)    0.017     0.05
LIMK1     1.28 (1.06, 1.55)   0.01      0.75 (0.61, 0.91)    0.0057    0.03
SIRT7     0.89 (0.82, 0.96)   0.0038    0.80 (0.6, 1.07)     0.14      0.22
ARHGAP4   0.88 (0.81, 0.95)   0.0035    0.62 (0.38, 1.00)    0.054     0.11
ATG16L2   0.81 (0.73, 0.90)   0.00028   0.81 (0.59, 1.11)    0.19      0.25
TPM3      0.65 (0.48, 0.88)   0.0086    1.02 (0.83, 1.26)    0.85      0.85
5-HTR1B   1.31 (1.08, 1.59)   0.0097    1.28 (0.49, 3.34)    0.62      0.71
PYGO2     0.93 (0.85, 1.01)   0.097     0.75 (0.61, 0.92)    0.0078    0.03

Note: Models adjusted for age, BMI, SES, smoking (validation cohort),
leukocyte and neutrophil count, daytime of blood sampling and season.
p-Values corrected for multiple testing are represented as q-values.
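
As an illustration of how p-values can be converted to q-values, the following minimal sketch applies the Benjamini-Hochberg step-up adjustment to the validation-cohort p-values of the last eight genes listed in Table 4 (AKAP6 through PYGO2). Two points are assumptions and are not stated in the note: that the correction is Benjamini-Hochberg, and that it is applied within a group of eight genes. The function name bh_qvalues is hypothetical. Under these assumptions, the adjusted values rounded to two decimals agree with the q-values listed for these eight genes.

```python
def bh_qvalues(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: the table note does not name the correction method;
    Benjamini-Hochberg is used here purely for illustration.
    """
    n = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, keeping the adjusted values monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Validation-cohort p-values copied from Table 4 (last eight genes).
# Assumption: these eight genes are corrected together as one group.
validation_p = {
    "AKAP6": 0.017,
    "LIMK1": 0.0057,
    "SIRT7": 0.14,
    "ARHGAP4": 0.054,
    "ATG16L2": 0.19,
    "TPM3": 0.85,
    "5-HTR1B": 0.62,
    "PYGO2": 0.0078,
}

genes = list(validation_p)
q = dict(zip(genes, bh_qvalues([validation_p[g] for g in genes])))

for gene in genes:
    print(f"{gene:8s} p = {validation_p[gene]:<7} q = {q[gene]:.2f}")
```

Because the p-values in the table are themselves rounded, small differences in the second decimal can be expected when the same calculation is repeated for other genes.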
COPYRIGHT 2017 National Institute of Environmental Health Sciences

Article Details
Title Annotation:Research
Author:Vrijens, Karen; Winckelmans, Ellen; Tsamou, Maria; Baeyens, Willy; De Boever, Patrick; Jennen, Danye
Publication:Environmental Health Perspectives
Geographic Code:4EUBL
Date:Apr 1, 2017