Estimating causal effects of long-term [PM.sub.2.5] exposure on mortality in New Jersey.
Many studies have reported the association of long-term exposure with fine particulate matter ([PM.sub.2.5]) with mortality by following cohorts of subjects over time (Beelen et al. 2008; Dockery et al. 1993; Jerrett et al. 2013; Krewski et al. 2009; Lepeule et al. 2012; Pope et al. 1995; Puett et al. 2009). Initial studies [the Harvard Six Cities (HSC) and the American Cancer Society (ACS) study] contrasted exposure across cities of residence (Dockery et al. 1993; Pope et al. 1995), and, more recently, land-use regression has been used to assign exposure, such as in the ACS Cancer Prevention II study (CPS-II) and the Nurses' Health Study (NHS) (Jerrett et al. 2013; Puett et al. 2009).
However, a number of issues remain unresolved. First, the cohorts were convenience samples, which are not representative of the population as a whole and often under-represent minorities. For example, both the ACS cohort and the NHS cohort examined populations with considerably higher levels of education than average (Pope et al. 1995; Puett et al. 2009). In addition, most cohorts (HSC, ACS, CPS-II, NHS) restricted the study population to city dwellers (Jerrett et al. 2013; Krewski et al. 2009; Lepeule et al. 2012; Puett et al. 2009), raising further issues about generalizability to the whole population. Second, temporal resolution of exposure has been limited. Because many land-use regression models rely on extensive monitoring in a single year (Henderson et al. 2007; Hoek et al. 2008) to supplement routine monitoring, they are only capable of estimating exposure for 1 year, which is taken as typical. Hence, only spatial variations in exposure can be used. In other studies, which used routine monitoring (Lepeule et al. 2012; Miller et al. 2007; Pope et al. 2009), lack of monitoring for [PM.sub.2.5] likewise limited exposure contrasts to geographic variations because the [PM.sub.2.5] level at the nearest monitoring site was assigned, and often, only a few monitoring sites were available for each city. This limitation makes control for geographic confounders critical in all of these studies.
Further, the causal modeling approach has not been used to estimate the effects of long-term exposure to [PM.sub.2.5] on mortality. To estimate causal effects, we need a counterfactual framework. Causal modeling seeks to estimate the difference in value of the expected mortality in the population under the exposure they received versus what it would have been had they received an alternative exposure. Because that counterfactual cannot be observed, various methods seek legitimate surrogates for the unobserved potential outcome. Randomized trials are one approach but are not feasible for environmental exposures. Causal methods in observational epidemiology seek alternative ways to estimate a substitute for the counterfactual outcome (Baiocchi et al. 2014; Hernan et al. 2008; Rubin 1997). One approach uses formal modeling techniques, such as inverse probability weighting and propensity scores, to make the exposure independent of all measured predictors and relies on the untestable assumption of no unmeasured confounding (Cole and Hernan 2008; Stampf et al. 2010). Another approach relies on natural experiments or "random shocks," which are used as instrumental variables. The variation in such an instrumental variable is a subset of the variation in exposure that is believed to be independent of measured and unmeasured confounders. However, the assumption that exposure variations caused by the instrumental variable are randomly assigned with respect to all measured or unmeasured confounders is untestable and often relies on external information for justification. When using natural experiments or random shocks, some studies made use of the temporal variation in exposure caused by the random shock. For example, Clancy et al. (2002) compared the mortality rates before (1984-1990) and after (1990-1996) the ban on coal sales in Dublin, Ireland (Clancy et al. 2002). The ban is an instrumental variable that was related to a substantial reduction in air pollution after its implementation. It is likely that the ban or a change in policy was independent of measured or unmeasured variables that confounded the association between air pollution and mortality. Other studies relied on the spatio-temporal variation in exposure caused by the instrumental variable, an example of which is the difference-in-differences approach. For example, Card and Krueger evaluated the difference in fast-food employment in New Jersey between February 1992 (2 months before an increase in the minimum wage) and November 1992 (5 months after the increase) and compared it with the difference in fast-food employment between February and November 1992 in Pennsylvania, a neighboring state that did not change its minimum wage (Card and Krueger 1994). The increase in the minimum wage was a random shock. In other words, the authors estimated the difference in the change (difference) in employment over time between the two states. Measured or unmeasured factors that might have confounded the association between the minimum wage and fast-food employment at each point in time (e.g., education) might have varied between the two states, but as long as any temporal variation in such factors was comparable between the states, they would not confound the difference in the change in employment over time between the states. Therefore, if the untestable assumption that the change in the minimum wage was the only factor influencing the difference in the rate of change in fast-food employment between New Jersey and Pennsylvania was true, the difference in differences was unconfounded.
In this paper, we describe a variant of the differences-in-differences approach to estimate the causal relationship between annual average [PM.sub.2.5] and mortality in > 1,900 census tracts within New Jersey during 2004-2009.
Death certificates in New Jersey from 2004 to 2009, including age, race, and the census tract of residence at the time of death for each individual, were obtained from the New Jersey Department of Health (NJDOH 2013). We only considered all natural-cause deaths. People who died of external causes including injuries and poisoning were excluded [i.e., International Statistical Classification of Diseases, 10th Revision (ICD-10) codes S00 through U99]. We regarded census tract as the unit of the analysis and aggregated annual natural-cause death in each of the census tracts.
The exposure assessment was based on a previously published hybrid model incorporating daily satellite remote sensing data at 1 km x 1 km spatial resolution (Kloog et al. 2014a). Briefly, we made use of a new algorithm developed by the National Aeronautics and Space Administration-Multi-Angle Implementation to Atmospheric Correction (NASA-MAIAC). The MAIAC algorithm provides aerosol optical depth (AOD) data that allow us to use high-resolution 1 km x 1 km (versus currently available 10 km) AOD data. [PM.sub.2.5] was predicted using a mixed model with AOD and spatial and temporal predictors including meteorology, land use, and point emission. For the whole prediction area, the northeastern United States, the mean out-of-sample [R.sup.2] values obtained from 10-fold cross-validation and slope of predictions were 0.88 and 0.99, respectively, suggesting excellent prediction ability. The annual [PM.sub.2.5] of a census tract in a given year was computed by averaging the predicted daily [PM.sub.2.5] over all 1 km x 1 km grids within that census tract in that year.
The daily mean air temperature at each 1 km x 1 km grid in New Jersey was estimated using a similar mixed, spatiotemporal-resolved, and satellite-based model with Moderate Resolution Imaging Spectroradiometer (MODIS)-measured surface temperature in 1 km x 1 km spatial resolution (Kloog et al. 2014b). For the whole prediction area, the northeastern United States, the mean out-of-sample [R.sup.2] value obtained from 10-fold cross-validation was 0.95 when surface temperature was available and 0.94 when surface temperature was not, suggesting excellent prediction performance. Additional details have been published elsewhere (Kloog et al. 2014b). The mean summer temperature of a census tract in a given year was computed by averaging the daily predicted air temperature from June to August in that year over all 1 kmx 1 km grids within that census tract, and the mean winter temperatures were the averages in January, February, and December. We controlled for the mean summer and winter temperatures when estimating the association between [PM.sub.2.5] and mortality. These two variables were also tested as potential effect modifiers.
Socioeconomic and Behavioral Data
From the U.S. Census for 2000, summary file 3, we obtained census tract-level data on population, socioeconomic status (SES) including percentage of black residents, median household income, and median value of owner-occupied homes (U.S. Census Bureau 2000). We also obtained age-adjusted yearly prevalence estimates of diabetes and smoking at the county level from 2004 to 2009 from the Centers for Disease Control and Prevention (CDC) Behavioral Risk Factor Surveillance System (BRFSS) (CDC 2013).
We begin with the potential outcomes framework of the Rubin Causal Model (Rubin 1991). Let [Y.sup.A = a.sub.c,t] be the potential outcome (aggregated number of deaths) in the population of census tract c if exposed to A = a in year t, and let [Y.sup.A = a'.sub.c,t] be the potential outcome under the alternative exposure a'. We would like to estimate E([Y.sup.A = a.sub.c,t])/E([Y.sup.A = a'.sub.c,t]). We assume that the potential outcome depends on predictors in the following manner:
ln(E([Y.sup.a.sub.c,t])) = [[beta].sub.0] + [[beta].sub.1]a + [[beta].sub.2][Z.sub.c] + [[beta].sub.3][U.sub.t] + [[beta].sub.4][W.sub.c,t] + ln ([P.sub.c]), 
where [Z.sub.c] represents spatial confounders that vary among census tracts but not over the time period of the study (e.g., SES and diet), [U.sub.t] represents temporal confounders that vary over time but not among census tracts, [W.sub.c,t] represents confounders that vary over time and among census tracts, and ln([P.sub.c]) is an offset term representing the natural log of the population in census tract c.
Although Equation 1 uses the aggregated number of deaths in a census tract in a year (in an ecological form), it is closely related to an individual-level model. Ecological bias is a potential concern when nonlinear dose--response relationships and within-area variability exist because an individual risk model may have a different form from the ecological model (Jackson et al. 2006). However, as shown by Lu and Zeger (2007), a model of aggregated event counts can be derived from an individual risk model when the exposure is common across individuals (Lu and Zeger 2007), as was the case for the present study, where [PM.sub.2.5] for each individual during each year was assigned as the average value over all 1 km x 1 km geographic grids within their census tract in that year. Although such assignment introduces Berkson error in exposure assessment, it will not bias the effect estimates.
Specifically, for individual i in census tract c in year t, the risk of death ([lambda]) could be modeled as follows:
[[lambda].sub.ci](t, [PM.sub.cit]) = [[lambda].sub.0ci](t)exp([[beta].sub.1] [PM.sub.cit]) = [[lambda].sub.0ci]exp([[beta].sub.1] [PM.sub.cit] + [[gamma].sub.cit]), 
where [[lambda].sub.0] represents the baseline risk of mortality, and [gamma] represents the individual-level confounders. Using the condition that [PM.sub.cit] = [PM.sub.ct],
[[lambda].sub.ci](t, [PM.sub.cit]) = [[lambda].sub.0ci] exp ([[beta].sub.1] [PM.sub.ct] + [[gamma].sub.cit]). 
This step introduces Berkson error. Then, we sum up both sides of Equation  over all of the subjects in tract c and year t,
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], 
where [[mu].sub.ct] is the expected mortality in tract c in year t. Because ln([[summation].sub.i][[lambda].sub.0ci]exp([[gamma].sub.cit])) is a function of t in tract c, we have
[[mu].sub.ct] = exp([[beta].sub.1] [PM.sub.ct] + [f.sub.c](t)), 
where [f.sub.c](t) is a function of time for each census tract that could be decomposed into a tract-specific component that is constant over time ([Z.sub.c]), a time-varying component that is homogeneous over all tracts ([U.sub.t]), and a component that varies over time and among census tracts ([W.sub.c,t]), which is essentially the same as Equation 1.
Then, let us look at Equation 1 again. If we look at differences between adjoining years, where the exposure in the other year is a', we have the following:
ln(E([Y.sup.a.sub.c,t])) - 1n(E([Y.sup.a'.sub.c,t-1])) = [[beta].sub.1](a - a') + [[beta].sub.3]([U.sub.t] - [U.sub.t-1]) + [[beta].sub.4]([W.sub.c,t] - [W.sub.c,t - 1]), 
and [Z.sub.c] and [[beta].sub.0] have disappeared. If we then take the difference of these differences between census tracts c and c', we have
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], 
where b and b' are the exposures in tract c' at times t and t - 1, respectively. If the change in [W.sub.c,t] over a year is the same in both locations, then ([W.sub.c,t] - [W.sub.ct-1]) - ([W.sub.c'-t] - [W.sub.c',t-1]) is zero, and the difference between locations in these within-location differences will only depend on the difference in their exposure differences; hence, this estimate will be causal. It is also a marginal, not a conditional, estimate because it is not conditioned on [Z.sub.c], [U.sub.t], and [W.sub.ct]. Alternatively, if differences in the rate of change of [W.sub.c,t] are uncorrelated with differences in the rate of change of exposure in different locations, then the results are still causal. This is the key assumption of this approach. The advantage of this approach is that when this assumption holds, the ability to control for unmeasured confounders ([Z.sub.c], [U.sub.t], and [W.sub.c,t]) need not be observed because they cancel out.
We can generalize this equation to include many census tracts instead of two, and to include 6 years instead of 2, and to deal with nonlinear changes over time. Estimating differences between years (Equation ) removes confounding by variables that vary by census tract but not by time ([Z.sub.c]). In the context of multiple tracts, we can accomplish this by controlling for indicator variables for each tract. Estimating differences between census tracts (Equation ) removes confounding by covariates that vary over time but are constant between census tracts ([U.sub.t]). Again, using indicator variables for each of the 6 years accomplishes the same thing even if the trend over time is not linear. More formally, from Equation 1, we have
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], 
where [I.sub.c] and [I.sub.t] (indicator variables for tract c and year t, respectively) effectively control for [Z.sub.c] and [U.sub.t] under multi-tract and multi-year scenarios, in the same way that the differencing in Equations 6 and 7 controls for [Z.sub.c] and [U.sub.t] when there are only two tracts and two years. [[beta].sub.c] is the time-invariant component for tract c, and [[beta].sub.t] is time trend for year t. We used [c.sub.R] to denote the reference census tract and [t.sub.R] to denote the reference year. In summary, spatial and temporal confounders are controlled because differences among census tracts and time trends are controlled by [I.sub.c] and [I.sub.t], and there is no confounding by person-specific factors that vary within years and census tracts because all persons in a census tract during a given year have the same exposure.
For the above to be a causal estimate, we must also assume that differences in [W.sub.c,t] from the tract-level mean (captured by the dummy variable for tract) and the state-level trend are uncorrelated with the same differences in exposure. This is the untestable hypothesis, which must be judged on external information. How plausible is it? Factors such as SES and smoking rate vary across census tracts in New Jersey, and it is possible that these variations might be correlated with air pollution. But all differences between census tracts in any such variables are removed by using a dummy variable for each tract. What remains is variation in, for example, smoking rates that varied differentially among census tracts and over time. These variations would have to be correlated with variations in [PM.sub.2.5] from the census tract average and mean yearly change in New Jersey for confounding to remain. This outcome seems highly implausible. Indeed, these tract-specific pollution changes mostly depend on EPA regulatory changes and on year-to-year variations in back trajectories (more- or less-polluted areas upwind), mixing height, and other meteorological factors that are unlikely to be related to smoking or to any other covariate over this 6-year time period, except temperature. Therefore, to account for potential confounding by temperature, we adjust for functions of temperatures as shown in Equation 9, where the difference-indifferences approach is modeled using Poisson regression with overdispersion (Donohue and Ho 2007):
[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], 
where [PM.sub.c,t] is the [PM.sub.2.5] concentration in tract c at time t, [I.sub.c] and [I.sub.t] represent indicators for each census tract and year, and [Ts.sub.c,t] and [Tw.sub.c,t] represent the mean summer and winter temperatures for each tract and year, which are modeled as linear splines (function s) with a single knot at their means to account for possible nonlinear associations of temperature with mortality. Seasonal temperatures are linked to mortality (Shi et al. 2015) and may also be related to aerosol concentration (Rosenfeld et al. 2014). Because an increase in temperature in the winter may have a different effect (and sign) on mortality than an increase in the summer, we chose to use the mean summer and mean winter temperature as two weather-related variables (as opposed to annual mean temperature) that may influence annual mortality rates (Shi et al. 2015). To summarize, the difference-in-differences approach controlled for a) geographical differences using dummy variables for each tract; b) a state-wide time trend using dummy variables for each year; and c) variables that varied differentially over time and across space that is correlated with [PM.sub.2.5], which are seasonal temperatures. For the estimate to be causal, we assumed that no variable other than temperature that changed differentially across space and over time confounded the association between the exposure and the outcome.
The difference-in-differences approach was applied to estimate the causal effects of long-term exposure to [PM.sub.2.5] on mortality among people in New Jersey. We also estimated the association for people > 65 years old and people [less than or equal to] 65 years old by stratification. We tested if the association was modified by the mean summer temperature and by the mean winter temperature. We performed this test by adding into the model two sets of product terms: one set comprised the product terms between the spline of the mean summer temperature and [PM.sub.2.5], and the other set comprised the product terms between the spline of the mean winter temperature and [PM.sub.2.5]. We also tested if the association was modified by ecological SES variables at the census tract level using Census 2000 data (the percentage of black residents, the median household income, and median home values) and by ecological health condition at the county level using BRFSS data from 2004-2009 (age-adjusted prevalence of diabetes and smoking). These effect modifications were tested by adding a product term between [PM.sub.2.5] and the modifier into the model. Not only did we test these effect modifications among the whole population, we also tested them in a subgroup analysis by restricting the study population to the white residents (70% of the total population) to determine whether the results were consistent within a race group.
Consistency could reflect whether the association estimated using the whole population was confounded by individual-level race group. We did not repeat the analysis for other race groups owing to insufficient power to detect effect modifications. In addition, because these effect modifiers all reflected the SES of a census tract and were potentially related to each other, we fitted a model with simultaneous interactions of [PM.sub.2.5] with percent of black residents, home value, household income, smoking rate, and diabetes rate to determine the most robust modifiers. We used backward elimination to select the modifiers. Specifically, we started with a model with all five interaction terms. Then, the interaction term with the largest Rvalue was dropped, and a model without that interaction term was refitted. We repeated this procedure and stopped dropping variables until each of the remaining interaction terms had a p-value < 0.05.
To compare the difference-in-differences approach with an estimate derived using only the within-tract variation of the exposure, we performed a sensitivity analysis fitting Poisson regression within each of the census tracts that regressed total mortality against [PM.sub.2.5] and pooled the effect estimates using random-effects meta-analysis.
All statistical analyses were performed using R 3.1.2 (R Core Team 2014). Statistical significance was defined as p-value < 0.05.
Using population counts from Census 2000 data, we studied 1,938 census tracts in New Jersey during 2004-2009. In total, there were 365,530 deaths from 2004 to 2009, among which 281,170 deaths were at ages > 65, representing 77% of the total. Table 1 and Table 2 summarize the spatial and temporal variation of mortality, [PM.sub.2.5], and temperature. The spatial variation of mortality was calculated by first averaging the annual deaths from 2004 to 2009 in each of the census tracts and then summarizing the distribution using these death counts. The spatial distribution of mortality had a mean of 31.4 deaths per year per census tract. Much of the variation in deaths was due to variations in the age distribution and size of the population in each tract. For example, the 5th-95th percentile range in the annual mortality rate of persons > 65 years old across census tracts was from 22.1 to 62.8 per thousand. The 5th-95th percentile range of average annual [PM.sub.2.5] over 6 years ranged from 9.9 to 12.9 [micro]g/[m.sup.3] across census tracts with a mean of 11.3 [micro]g/[m.sup.3]. The 5th-95th percentile range of mean temperature varied from 17.2[degrees]C to 19.6[degrees]C in summer, and from 4.6[degrees]C to 7.0[degrees]C in winter. The temporal trend is presented using the average of the variables over all of the census tracts in New Jersey in each year from 2004 to 2009. Mortality counts went down in 2006 and 2007 compared with 2004 and 2005, but they went back up slightly in 2008 and 2009, indicative of nonlinear or random pattern in temporal variation.
On the basis of the difference-in-differences approach (Equation 9), we found a 3.0% [95% confidence interval (CI): 0.2, 5.9%] increase in all natural-cause mortality for each interquartile range (IQR) increase in [PM.sub.2.5] (2 [micro]g/[m.sup.3]) among all residents in 1,938 census tracts in New Jersey during 2004-2009. By comparison, the meta-analysis pooling all within-census-tract effects showed a similar increase of 3.7% (95% CI: 2.9, 4.5%) in mortality per IQR increase in [PM.sub.2.5]. Restricting the study population to age of death > 65 years, we obtained a similar effect estimate: there was a 3.5% (95% CI: 0.1, 6.9%) increase in mortality per IQR increase in [PM.sub.2.5]. For people [less than or equal to] 65 years old, the percent change in mortality was similar, 3.1% (95% CI: -1.8, 8.2%), albeit with a wider confidence interval.
The percent change in mortality with an IQR increase in [PM.sub.2.5] was 1.8% (95% CI: -1.6, 5.2%) if the mean summer and winter temperatures were at the average across all tracts and years (Table 3). By comparison, the percent change in mortality with an IQR increase in [PM.sub.2.5] was -1.6% (95% CI: -4.2, 1.1%) if the mean summer temperature was 1[degrees]C below the average across tracts and years and the mean winter temperature was at the average (interaction p-value < 0.01); the percent change was 1.6% (95% CI: -0.6, 3.8%) if the mean summer temperature was 1[degrees]C above the average across tracts and years and the mean winter temperature was at the average (interaction p-value 0.73). The percent change in mortality was 1.6% (95% CI: -0.6, 3.9%) if the mean winter temperature was 1[degrees]C below the average across tracts and years and the mean summer temperature was at the average (interaction p-value 0.82); the percent change was 5.3% (95% CI: 2.9, 7.8%) if the mean winter temperature was 1[degrees]C above the average across tracts and years and the mean summer temperature was at the average (interaction p-value < 0.01).
Figure 1 shows the estimated effects per IQR increase in [PM.sub.2.5] on mortality rates in the upper and lower deciles of census tract--level percent of black residents, median home value, and median household income from Census 2000 data and age-adjusted diabetes and smoking rates from BRFSS data during 2004-2009. Among the whole population, the percent change in mortality associated with [PM.sub.2.5] was modified by the percent of black residents (interaction p < 0.01), median income (interaction p < 0.01), and home values (interaction p = 0.02). We did not find effect modifications by smoking rate (interaction p = 0.60) or percent of persons with diabetes (interaction p = 0.06). Using backward elimination to select interaction terms from the simultaneous interaction model, we found that median household income was the only robust modifier that finally remained in the model. We also tested the consistency of these results among white residents (70% of the total population). We found that [PM.sub.2.5] significantly interacted with percent of black residents (interactionp < 0.01), age-adjusted diabetes (interaction p < 0.01), and median income (interaction p < 0.01), but not with smoking rate (interaction p = 0.63) or median home value (interaction p = 0.13).
The present study used a variant of the difference-in-differences approach to estimate the causal effect of long-term exposure to [PM.sub.2.5] on mortality in a large and general population.
We estimated the association between [PM.sub.2.5] and mortality using a counterfactual framework. We accounted for SES, behavioral, and other risk factors that vary among census tracts by modeling dummy variables for each tract. We limited potential changes over time in such risk factors by focusing on a short time period (6 years) and by adjusting for average changes from year to year in New Jersey as a whole. If our assumption that yearly deviations from the state-wide yearly fluctuations in [PM.sub.2.5] by tract (mostly resulting from regulatory and meteorological fluctuations) are unlikely to be associated with changes in other risk factors holds, we have identified a causal association.
The results add to the still relatively small body of literature that uses the general population, including both high and low SES individuals, all occupations, and both rural and urban residents.
[FIGURE 1 OMITTED]
We have identified interactions between [PM.sub.2.5] and seasonal temperature. Very few studies have looked at the health effects of long-term temperature. An increase in the mean summer temperature, a decrease in the mean winter temperature, or an increase in the variability of summer or winter temperature was associated with a decrease in the risk of death among Medicare beneficiaries in New England during 2000-2008 (Shi et al. 2015). There are also very few studies that have investigated the interaction between long-term temperature and long-term [PM.sub.2.5]. A survival analysis among > 35 million Medicare beneficiaries residing in 207 U.S. cities during 2000-2010 found that an increase in annual, summer, or winter temperature was associated with an increase in the hazard ratio of death associated with [PM.sub.2.5] (Kioumourtzoglou et al. 2016). We consistently found that an increase in the mean winter temperature was associated with an increase in the effects of [PM.sub.2.5] on mortality. With regard to summer, the association between an IQR increase in [PM.sub.2.5] and mortality in tracts with mean summer temperatures that were higher than the average was similar to the overall association. Here, the interaction was driven by a reduced risk of mortality in association with [PM.sub.2.5] when mean summer temperatures were lower than the average. Under changing climate conditions, a rise in temperature not only would increase mortality through the direct effects of temperature but also would increase the effects of long-term [PM.sub.2.5] exposure on mortality.
By analyzing the population of an entire state, we had sufficient power to test interaction and found that the effects of [PM.sub.2.5] were greater in census tracts with a higher percentage of black residents, lower median home value, or lower median home income. Median household income was the most robust variable among these three SES variables. All of these analyses consistently suggested that the effects of [PM.sub.2.5] were greater in tracts with lower SES. Consistent with our findings, in a recent study, Kioumourtzoglou et al. (2016) also found that a unit increase in [PM.sub.2.5] in cities with higher percentages of black residents or lower household incomes was associated with a larger percent increase in mortality among > 35 million Medicare beneficiaries residing in 207 U.S. cities during 2000-2010 (Kioumourtzoglou et al. 2016). When restricting the analysis to white residents, we found that the interactions were basically consistent with the analyses for the whole population. This finding suggests that the estimates obtained using the whole population for [PM.sub.2.5] were not confounded by individual-level race. The consistency between these two analyses also suggested that the SES of the neighborhood (or other people) would be associated with an individual's susceptibility, which is a contextual effect.
We identified this association in a location and during a time period with low concentrations of [PM.sub.2.5]. The average [PM.sub.2.5] over the period of study was 11.3 [micro]g/[m.sup.3], and the range across the census tracts was from 8.2 [micro]g/[m.sup.3] to 13.7 [micro]g/[m.sup.3]. Hence, this association was estimated at [PM.sub.2.5] levels completely below the old EPA annual standard of 15 [micro]g/[m.sup.3] (U.S. EPA 1997) and predominantly below the current standard of 12 [micro]g/[m.sup.3] (U.S. EPA 2013).
For comparison with previous studies, we converted the percent change in mortality from our study to reflect a 10 [micro]g/[m.sup.3] increase. We found a 15.5% (95% CI: 0.8, 32.3%) increase in all natural-cause mortality for the entire population of New Jersey. By comparison, the HSC study reported an estimate of 13% (95% CI: 4, 23%), and its extended study reported a 14% (95% CI: 7, 22%) increase in mortality (Dockery et al. 1993; Lepeule et al. 2012). The ACS cohort, which examined the association among 500,000 residents of 51 cities found a 6% (95% CI: 2, 10%) increase in mortality (Pope et al. 1995, 2002). The NHS cohort, which examined the association with all-cause mortality among women, reported an increase of 26% (95% CI: 2, 54%) (Puett et al. 2009). Our results were at the higher end of the range compared with those of the cohort studies, possibly because we used a spatially resolved exposure model. The NHS study, which used geographically resolved exposure assessment, also tended to show a large effect size (Puett et al. 2009). Further, our model had a higher cross-validation [R.sup.2] than most land-use regression models. Hoek et al. (2008) summarized a number of land-use regressions. The highest [R.sup.2] of the model (typically higher than the cross-validation [R.sup.2]) was 0.82 (Hoek et al. 2008). It is typical for most models to have an [R.sup.2] value < 0.7 (Hoek et al. 2008). The land-use regression used in the NHS study had cross-validation [R.sup.2] values of 0.77 and 0.69 for post- and pre-1999 periods, respectively (Yanosky et al. 2009). By comparison, our model had a cross- validation R2 of 0.88, which produced exposure predictions with less measurement error. We found that the percent change in mortality among people > 65 years of age in New Jersey was 18.1% (95% CI: 0.6, 38.6%) for each 10 [micro]g/[m.sup.3] increase in long-term [PM.sub.2.5]. This estimate is larger than the estimated 4% (95% CI: 3, 6%) increase in all-cause mortality among Medicare beneficiaries residing in 4,568 ZIP codes (people [greater than or equal to] 65 years old) during 2000-2005 (Zeger et al. 2008), which was calculated by using average [PM.sub.2.5] concentrations measured by monitors within 6 mi of a ZIP code to approximate exposure. A lower exposure measurement error may be one of the reasons why our study found a larger effect of [PM.sub.2.5]. The sensitivity analysis (meta-analysis pooling within-census-tract effects) found a 3.7% (95% CI: 2.9, 4.5%) increase in mortality per IQR increase in [PM.sub.2.5], suggesting that our result was close to the result obtained using the within-census-tract analysis.
We acknowledge that our study has limitations. First, we did not control for some of the differential changes over time across census tracts. Although temperature may be the strongest confounder between [PM.sub.2.5] and mortality, the change over time in other variables such as the employment rate may also confound the relationship. Second, we did not measure individual-level predictors of mortality. Variations in these predictors within a census tract, however, cannot confound [PM.sub.2.5] because they are not correlated with exposure (everyone in the tract has the same exposure in the same year). Nor can these variations confound associations between census tracts because there is no exposure contrast between tracts (because of the dummy variables for each tract). Furthermore, they cannot confound over time because the dummy variables for each year remove that pattern from outcome and exposure. For these variations to confound, their difference from the general trend by tract would have to be correlated with the differences around the trend in [PM.sub.2.5], and we can see no mechanism that would produce this correlation. Although variations in the individual-level predictors cannot confound the association, we acknowledge that exposure misclassification can occur from assigning the same yearly averaged [PM.sub.2.5] in census tracts for all residents. This variation in exposure for each individual around a small area should be Berksonian, which should not bias our estimates but would increase the confidence intervals. By comparison, cohort studies assigning exposure for each subject according to the date of death will not suffer from this problem if they have address-specific exposure. Moreover, our model is not susceptible to the typical ecological bias in which those who are exposed may not be those who develop the outcome; here, everyone within a census tract was assigned to the same geographically averaged exposure. Third, using [PM.sub.2.5] at the census-tract level to assess exposure is still not as accurate as using [PM.sub.2.5] predictions at the address level. Fourth, in our analysis, the strong control for spatial confounding and temporal trend using dummy variables for each census tract and each year substantially lowered the exposure contrast across tracts and over time, which potentially increased the standard error of effect of [PM.sub.2.5]. Fifth, the population in each census tract was likely to have changed from 2004 to 2009. Our analyses used population data from Census 2000 to approximate the population in 2004-2009, which may have reduced the accuracy of the estimates.
Under the assumption that no variable changing differentially over time across census tracts other than seasonal temperatures could confound the association, we found causal associations between [PM.sub.2.5] and all natural-cause mortality. The effect estimates of [PM.sub.2.5] from our analyses were comparable to those of previous cohort studies, but on the higher end of the range. The association was modified by seasonal temperatures and by ecological SES variables.
Baiocchi M, Cheng J, Small DS. 2014. Instrumental variable methods for causal inference. Stat Med 33:2297-2340.
Beelen R, Hoek G, van den Brandt PA, Goldbohm RA, Fischer P, Schouten LJ, et al. 2008. Long-term effects of traffic-related air pollution on mortality in a Dutch cohort (NLCS-AIR Study). Environ Health Perspect 116:196-202, doi: 10.1289/ehp.10767.
Card D, Krueger AB. 1994. Minimum wages and employment: a case study of the fast-food industry in New Jersey and Pennsylvania. Am Econ Rev 84:772-793.
CDC (Centers for Disease Control and Prevention). 2013. Behavioral Risk Factor Surveillance System. BRFSS 2013 Survey Data and Documentation. Available: http://www.cdc.gov/brfss/annual_data/ annual_2013.html [accessed 15 November 2013].
Clancy L, Goodman P, Sinclair H, Dockery DW. 2002. Effect of air-pollution control on death rates in Dublin, Ireland: an intervention study. Lancet 360:1210-1214.
Cole SR, Hernan MA. 2008. Constructing inverse probability weights for marginal structural models. Am J Epidemiol 168:656-664.
Dockery DW, Pope CA III, Xu X, Spengler JD, Ware JH, Fay ME, et al. 1993. An association between air pollution and mortality in six U.S. cities. N Engl J Med 329:1753-1759.
Donohue JJ, Ho DE. 2007. The impact of damage caps on malpractice claims: randomization inference with difference-in-differences. J Empir Leg Stud 4:69-102.
Henderson SB, Beckerman B, Jerrett M, Brauer M. 2007. Application of land use regression to estimate long-term concentrations of traffic-related nitrogen oxides and fine particulate matter. Environ Sci Technol 41:2422-2428.
Hernan MA, Alonso A, Logan R, Grodstein F, Michels KB, Willett WC, et al. 2008. Observational studies analyzed like randomized experiments: an application to postmenopausal hormone therapy and coronary heart disease. Epidemiology 19:766-779.
Hoek G, Beelen R, de Hoogh K, Vienneau D, Gulliver J, Fischer P, et al. 2008. A review of land-use regression models to assess spatial variation of outdoor air pollution. Atmos Environ 42:7561-7578.
Jackson C, Best N, Richardson S. 2006. Improving ecological inference using individual-level data. Stat Med 25:2136-2159.
Jerrett M, Burnett RT, Beckerman BS, Turner MC, Krewski D, Thurston G, et al. 2013. Spatial analysis of air pollution and mortality in California. Am J Respir Crit Care Med 188:593-599.
Kioumourtzoglou MA, Schwartz J, James P, Dominici F, Zanobetti A. 2016. PM2.5 and mortality in 207 US cities: modification by temperature and city characteristics. Epidemiology 27:221-227.
Kloog I, Chudnovsky AA, Just AC, Nordio F, Koutrakis P, Coull BA, et al. 2014a. A new hybrid spatio-temporal model for estimating daily multiyear [PM.sub.2.5] concentrations across northeastern USA using high resolution aerosol optical depth data. Atmos Environ 95:581-590.
Kloog I, Nordio F, Coull BA, Schwartz J. 2014b. Predicting spatiotemporal mean air temperature using MODIS satellite surface temperature measurements across the Northeastern USA. Remote Sens Environ 150:132-139.
Krewski D, Jerrett M, Burnett RT, Ma R, Hughes E, Shi Y, et al. 2009. Extended follow-up and spatial analysis of the American Cancer Society Study linking particulate air pollution and mortality. Res Rep Health Eff Inst 140:5-114; discussion 115-136.
Lepeule J, Laden F, Dockery D, Schwartz J. 2012. Chronic exposure to fine particles and mortality: an extended follow-up of the Harvard Six Cities Study from 1974 to 2009. Environ Health Perspect 120:965-970, doi: 10.1289/ehp.1104660.
Lu Y, Zeger SL. 2007. On the equivalence of case-crossover and time series methods in environmental epidemiology. Biostatistics 8:337-344.
Miller KA, Siscovick DS, Sheppard L, Shepherd K, Sullivan JH, Anderson GL, et al. 2007. Long-term exposure to air pollution and incidence of cardiovascular events in women. N Engl J Med 356:447-458.
NJDOH (State of New Jersey, Department of Health). 2013. Vital Statistics. Available: http://www.state. nj.us/health/vital/ [accessed 15 November 2013].
Pope CA III, Burnett RT, Thun MJ, Calle EE, Krewski D, Ito K, et al. 2002. Lung cancer, cardiopulmonary mortality, and long-term exposure to fine particulate air pollution. JAMA 287:1132-1141.
Pope CA III, Ezzati M, Dockery DW. 2009. Fine-particulate air pollution and life expectancy in the United States. N Engl J Med 360:376-386.
Pope CA III, Thun MJ, Namboodiri MM, Dockery DW, Evans JS, Speizer FE, et al. 1995. Particulate air pollution as a predictor of mortality in a prospective study of U.S. adults. Am J Respir Crit Care Med 151(3 pt 1):669-674.
Puett RC, Hart JE, Yanosky JD, Paciorek C, Schwartz J, Suh H, et al. 2009. Chronic fine and coarse particulate exposure, mortality, and coronary heart disease in the Nurses' Health Study. Environ Health Perspect 117:1697-1701, doi: 10.1289/ehp.0900572.
Rosenfeld D, Andreae MO, Asmi A, Chin M, de Leeuw G, Donovan DP, et al. 2014. Global observations of aerosol-cloud-precipitation-climate interactions. Rev Geophys 52:750-808.
R Core Team. 2014. R: A Language and Environment for Statistical Computing. Vienna, Austria:R Foundation for Statistical Computing. Available: http://www.R-project.org [accessed 1 July 2014].
Rubin DB. 1991. Practical implications of modes of statistical inference for causal effects and the critical role of the assignment mechanism. Biometrics 47:1213-1234.
Rubin DB. 1997. Estimating causal effects from large data sets using propensity scores. Ann Intern Med 127(8 pt 2):757-763.
Shi L, Kloog I, Zanobetti A, Liu P, Schwartz JD. 2015. Impacts of temperature and its variability on mortality in New England. Nat Clim Chang 5:988-991.
Stampf S, Graf E, Schmoor C, Schumacher M. 2010. Estimators and confidence intervals for the marginal odds ratio using logistic regression and propensity score stratification. Stat Med 29:760-769.
U.S. Census Bureau. 2000. Summary File 3 (SF 3). Available: http://www.census.gov/census2000/ sumfile3.html [accessed 15 November 2013].
U.S. EPA (U.S. Environmental Protection Agency). 1997. 40 CFR Part 50. National Ambient Air Quality Standards for particulate matter. Final rule. Fed Reg 62:38652-38460.
U.S. EPA. 2013. 40 CFR Parts 50, 51, 52, 53 and 58. National Ambient Air Quality Standards for particulate matter. Final rule. Fed Reg 78:3086-3287.
Yanosky JD, Paciorek CJ, Suh HH. 2009. Predicting chronic fine and coarse particulate exposures using spatiotemporal models for the Northeastern and Midwestern United States. Environ Health Perspect 117:522-529, doi: 10.1289/ehp.11692.
Zeger SL, Dominici F, McDermott A, Samet JM. 2008. Mortality in the Medicare population and chronic exposure to fine particulate air pollution in urban centers (2000-2005). Environ Health Perspect 116:1614-1619, doi: 10.1289/ehp.11449.
Yan Wang, (1) Itai Kloog, (2) Brent A. Coull, (3) Anna Kosheleva, (1) Antonella Zanobetti, (1) and Joel D. Schwartz (1)
(1) Department of Environmental Health, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA; (2) Department of Geography, Ben Gurion University, Beer-Sheva, Israel; (3) Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, Massachusetts, USA
Address correspondence to J.D. Schwartz, Department of Environmental Health, Harvard T.H. Chan School of Public Health, 401 Park Dr., Suite 404H, PO Box 15677, Boston, MA 02215 USA. Telephone: (617) 384-8752. E-mail: firstname.lastname@example.org
We would like to thank L. Shi for fruitful discussions on this work.
This study was supported by National Institute of Environmental Health Sciences (NIEHS) grant ES000002 and by U.S. Environmental Protection Agency (EPA) grant RD-83479801.
The contents of this study are solely the responsibility of the grantee and do not necessarily represent the official views of the U.S. EPA. Further, the agency does not endorse the purchase of any commercial products or services mentioned in the publication.
The authors declare they have no actual or potential competing financial interests.
Received: 31 December 2014; Revised: 8 September 2015; Accepted: 23 March 2016; Published: 15 April 2016.
Caption: Figure 1. Percent change in mortality with 95% confidence intervals for each interquartile range (2.0 [micro]g/[m.sup.3]) increase in [PM.sub.2.5] at the upper and lower decile of each modifier: percent of black residents (10th percentile = 0.2%, 90th percentile = 52.0%), percent of persons with diabetes (10th percentile = 6.1%, 90th percentile = 9.2%), smoking rate (10th percentile = 7.8%, 90th percentile = 15.9%), median home value (10th percentile = 189,300, 90th percentile = 578,600 USD), and median household income (10th percentile = 35,625, 90th percentile = 115,049 USD) among (A) the whole population and (B) the white residents in New Jersey. Census tract-specific percent of black residents, median home value, and median household income came from Census 2000 data (U.S. Census Bureau 2000). County-level percent diabetics and smoking rate came from BRFSS data from 2004 to 2009 (CDC 2013).
* Indicates interaction p < 0.05.
Table 1. Distribution of census tract-specific mean values for 2004 through 2009 for annual all natural-cause mortality, annual mean P[M.sub.25], mean summer temperature, and mean winter temperature among 1,938 census tracts in New Jersey. 5th 25th Variable Mean percentile percentile Death counts per census tract 31.4 7.7 17.8 per year (all age groups) Mortality rate (all age 7.3 3.0 4.9 groups, per 1,000) Population [all age groups, based 4,412 1,853 3,152 on Census 2000 data (U.S. Census Bureau 2000)] Death counts per census tract per 24.2 4.3 12.5 year (age > 65) Mortality rate (age > 65, per 40.1 22.1 31.2 1,000) Population [age > 65, based on 598 175 350 (U.S. Census Bureau 2000)] Death counts per census tract per 7.2 2.0 4.5 year (age [less than or equal to] 65) Mortality rate (age < 65, per 2.1 0.8 1.3 1,000) Population [age < 65, based on 3,814 1,535 2,712 Census 2000 data (U.S. Census Bureau 2000)] Annual PM25 (pg/[m.sup.3]) 11.3 9.9 10.8 Summer temperature (a) ([degrees]C) 18.6 17.2 18.2 Winter temperature (a) ([degrees]C) 5.9 4.6 5.6 75th 95th Variable Median percentile percentile Death counts per census tract 27.0 39.8 70.0 per year (all age groups) Mortality rate (all age 6.6 8.5 13.6 groups, per 1,000) Population [all age groups, based 4,181 5,562 7,527 on Census 2000 data (U.S. Census Bureau 2000)] Death counts per census tract per 19.5 30.3 58.7 year (age > 65) Mortality rate (age > 65, per 38.5 47.2 62.8 1,000) Population [age > 65, based on 525 756 1,207 (U.S. Census Bureau 2000)] Death counts per census tract per 6.7 9.3 14.8 year (age [less than or equal to] 65) Mortality rate (age < 65, per 1.8 2.4 4.2 1,000) Population [age < 65, based on 3,639 4,868 6,555 Census 2000 data (U.S. Census Bureau 2000)] Annual PM25 (pg/[m.sup.3]) 11.2 11.9 12.9 Summer temperature (a) ([degrees]C) 18.7 19.1 19.6 Winter temperature (a) ([degrees]C) 5.9 6.2 7.0 (a) Summer (winter) temperature is an average of the predicted daily temperatures across all 1 km x 1 km grids in a given census tract during June, July, and August (January, February, and December) in a given year. Table 2. Annual mean values ([+ or -] SD) across 1,938 New Jersey census tracts for all natural-cause mortality, annual mean P[M.sub.2.5], mean summer temperature, and mean winter temperature. Variable 2004 2005 Death counts per census 34.3 [+ or -] 23.9 34.2 [+ or -] 23.7 tract per year (all age groups) Death counts per census 26.4 [+ or -] 21.2 26.5 [+ or -] 21.0 tract per year (age > 65) Death counts per census 7.9 [+ or -] 5.4 7.7 [+ or -] 5.2 tract per year (age [less than or equal to] 65) Annual P[M.sub.2.5] 12.3 [+ or -] 1.0 12.8 [+ or -] 1.2 ([micro]g/[m.sup.3]) Summer temperature (a) 18.1 [+ or -] 0.6 20.3 [+ or -] 0.8 ([degrees]C) Winter temperature (a) 4.3 [+ or -] 0.7 5.0 [+ or -] 0.7 ([degrees]C) Variable 2006 2007 Death counts per census 29.2 [+ or -] 22.5 28.7 [+ or -] 21.6 tract per year (all age groups) Death counts per census 22.2 [+ or -] 19.8 22.2 [+ or -] 19.1 tract per year (age > 65) Death counts per census 7.0 [+ or -] 5.1 6.6 [+ or -] 4.7 tract per year (age [less than or equal to] 65) Annual P[M.sub.2.5] 11.7 [+ or -] 0.9 11.6 [+ or -] 1.0 ([micro]g/[m.sup.3]) Summer temperature (a) 19.1 [+ or -] 0.7 18.4 [+ or -] 0.7 ([degrees]C) Winter temperature (a) 7.8 [+ or -] 0.6 5.9 [+ or -] 0.7 ([degrees]C) Variable 2008 2009 Death counts per census 30.2 [+ or -] 21.3 32.0 [+ or -] 22.6 tract per year (all age groups) Death counts per census 23.2 [+ or -] 19.1 24.6 [+ or -] 20.2 tract per year (age > 65) Death counts per census 7.0 [+ or -] 4.6 7.4 [+ or -] 4.7 tract per year (age [less than or equal to] 65) Annual P[M.sub.2.5] 10.6 [+ or -] 0.8 9.1 [+ or -] 0.7 ([micro]g/[m.sup.3]) Summer temperature (a) 18.6 [+ or -] 0.8 17.3 [+ or -] 0.7 ([degrees]C) Winter temperature (a) 6.7 [+ or -] 0.7 5.7 [+ or -] 0.8 ([degrees]C) (a) Summer (winter) temperature is an average of the predicted daily temperatures across all 1 km x 1 km grids in a given census tract during June, July, and August (January, February, and December) in a given year. Table 3. Percent change (95% confidence interval) in mortality per interquartile range increase (2 [micro]g/[m.sup.3]) increase in P[M.sub.2.5] at given summer and winter temperatures. Mean summer Mean winter Percent change (95% CI) temperature temperature in mortality per IQR ([degrees]C) ([degrees]C) increase in P[M.sub.2.5] 18.6 (a) (Average) 5.9 (b) (Average) 1.8% (-1.6, 5.2%) 17.6 (Average -1) 5.9 (Average) -1.6% (-4.2, 1.1%) 19.6 (Average +1) 5.9 (Average) 1.6% (-0.6, 3.8%) 18.6 (Average) 4.9 (Average -1) 1.6% (-0.6, 3.9%) 18.6 (Average) 6.9 (Average +1) 5.3% (2.9, 7.8%) Abbreviations: CI, confidence interval; IQR, interquartile range (a) Average of the census tract-specific mean summer temperature across 1,938 census tracts during 2004-2009. (b) Average of the census tract-specific mean winter temperature across 1,938 census tracts during 2004-2009.
|Printer friendly Cite/link Email Feedback|
|Author:||Wang, Yan; Kloog, Itai; Coull, Brent A.; Kosheleva, Anna; Zanobetti, Antonella; Schwartz, Joel D.|
|Publication:||Environmental Health Perspectives|
|Date:||Aug 1, 2016|
|Previous Article:||Prenatal triclosan exposure and anthropometric measures including anogenital distance in Danish infants.|
|Next Article:||Childhood exposure to ambient air pollutants and the onset of asthma: an administrative cohort study in Quebec.|