GIS approaches for the estimation of residential-level ambient PM concentrations.

Spatial estimations are increasingly used to estimate geocoded ambient particulate matter (PM) concentrations in epidemiologic studies because measures of daily PM concentrations are unavailable in most U.S. locations. This study was conducted to a) assess the feasibility of large-scale kriging estimations of daily residential-level ambient PM concentrations, b) perform and compare cross-validations of different kriging models, c) contrast three popular kriging approaches, and d) calculate SE of the kriging estimations. We used PM data for PM with aerodynamic diameter ≤ 10 µm (PM10) and aerodynamic diameter ≤ 2.5 µm (PM2.5) from the U.S. Environmental Protection Agency for the year 2000.
Kriging estimations were performed at 94,135 geocoded addresses of Women's Health Initiative study participants using the ArcView geographic information system (GIS). We developed a semiautomated program to enable large-scale daily kriging estimation and assessed validity of semivariogram models using prediction error (PE), standardized prediction error (SPE), root mean square standardized (RMSS), and SE of the estimated PM. National- and regional-scale kriging performed satisfactorily, with the former slightly better. The average PE, SPE, and RMSS of daily PM10 semivariograms using regular ordinary kriging with a spherical model were 0.0629, -0.0011, and 1.255 µg/m³, respectively; the average SE of the estimated residential-level PM10 was 27.36 µg/m³. The values for PM2.5 were 0.049, 0.0085, 1.389, and 4.13 µg/m³, respectively. Lognormal ordinary kriging yielded a smaller average SE and effectively eliminated out-of-range predicted values compared to regular ordinary kriging. Semiautomated daily kriging estimations and semivariogram cross-validations are feasible on a national scale. Lognormal ordinary kriging with a spherical model is valid for estimating daily ambient PM at geocoded residential addresses.

Key words: cross-validation, geographic information systems, kriging, particulate air pollution, population-based studies. Environ Health Perspect 114:1374-1380 (2006). doi:10.1289/ehp.9169 available via http://dx.doi.org/ [Online 8 June 2006]

Large-scale, population-based epidemiologic investigations of the health effects of ambient air pollution often rely on measurements from a network of air quality monitors maintained by the U.S. Environmental Protection Agency (U.S. EPA 1995a, 1995b, 2005).
The Air Quality System (AQS) is the only national ambient air pollution database currently available for public use in the United States. The availability of individual-level health outcome and covariable data from national-scale studies that often characterize participants over the course of several years enables researchers to study the acute effects of ambient air pollution using individual-level data (Liao et al. 2004, 2005a; Sullivan et al. 2005; Wellenius et al. 2005; Whitsel et al. 2004). This approach requires measures of daily particulate matter (PM) exposures, ideally assessed as close to the individual level as possible, such as at participant residences or in immediate proximity to participants themselves. Because daily measures of ambient PM concentrations from the AQS are unavailable in the large majority of locations, spatial estimation methods using geographic information systems (GIS), such as kriging, are increasingly being considered to estimate geocoded location-specific ambient PM concentrations. Important methodologic and practical issues still need to be resolved, however.
This study was designed to a) assess the feasibility of large-scale kriging estimation of daily residential-level ambient PM concentrations, b) perform and compare cross-validations of different kriging models, c) determine and contrast the most appropriate kriging approaches, and d) calculate the SEs of the kriging estimations.

Materials and Methods

We obtained from AQS the PM10 and PM2.5 (PM with aerodynamic diameter ≤ 10 and 2.5 µm, respectively) data from 1993-2004 (U.S. EPA 2005). The data from 2000 were used for this study after eliminating duplicate records and converting all measures to the same units and denominator. We calculated "monitor-specific" daily averages based on ≥ 18 hourly measures. Monitor-specific daily averages were set to missing for monitors reporting < 18 hourly measures on any given day. If more than one monitor was operating at the same location on a given day, we then computed "site-specific" daily PM10 and PM2.5 averages by taking the mean of the monitors' measures. We also obtained the longitude and latitude for each site from the AQS database. These data served as pollutant- and site-specific daily source data for our study (Liao et al. 2005b).
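The averaging rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the semiautomated program used in the study: the record layout `(site_id, monitor_id, date, value)`, the function names, and the sample values are all hypothetical, and only the 18-hour completeness threshold and the monitor-then-site averaging order come from the text.

```python
from collections import defaultdict
from statistics import mean

MIN_HOURLY = 18  # completeness threshold stated in the text


def monitor_daily_means(hourly_records):
    """Return {(site, monitor, date): daily mean, or None if < 18 hourly values}."""
    groups = defaultdict(list)
    for site_id, monitor_id, date, value in hourly_records:
        groups[(site_id, monitor_id, date)].append(value)
    return {
        key: (mean(values) if len(values) >= MIN_HOURLY else None)
        for key, values in groups.items()
    }


def site_daily_means(monitor_means):
    """Average the non-missing monitor-specific means at each site and day."""
    by_site = defaultdict(list)
    for (site_id, _monitor_id, date), value in monitor_means.items():
        if value is not None:  # monitors set to missing are skipped
            by_site[(site_id, date)].append(value)
    return {key: mean(values) for key, values in by_site.items()}


# Hypothetical example: one site, three collocated monitors on one day.
day = "2000-01-01"
records = (
    [("A", "m1", day, 10.0)] * 24    # complete day
    + [("A", "m2", day, 20.0)] * 18  # exactly at the threshold, kept
    + [("A", "m3", day, 100.0)] * 17 # too few hours, set to missing
)
monitor_means = monitor_daily_means(records)
site_means = site_daily_means(monitor_means)
```

In this example the third monitor is excluded because it reports only 17 hourly values, so the site-specific mean is the average of the two remaining monitor-specific means.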
We geocoded 94,135 addresses of Women's Health Initiative (WHI) Clinical Trial (CT) participant residences and examination sites in the contiguous 48 United States and District of Columbia, after assessing geocoding vendor error (Whitsel et al. 2004, 2005). Daily PM10 and PM2.5 concentrations and the associated estimation errors (SEs) are estimated at these geographic locations by the Environmental Epidemiology of Arrhythmogenesis in WHI study (Whitsel 2006). We used ArcView GIS (version 8.3) and its Geostatistical Analyst Extension (ESRI Inc., Redlands, CA) for semivariogram determination and cross-validation and for subsequent spatial estimation of daily location-specific PM concentrations. Three frequently referenced spatial models (spherical, exponential, Gaussian) (Cressie 1993a; Davis 2002) were considered using the weighted least-squares method (Gribov et al. 2004; Jian et al. 1996) to obtain the "optimal" daily semivariogram parameters (range, partial sill, and nugget). Based on the daily semivariograms, we performed ordinary kriging to estimate the daily mean PM concentration and its SE at each of the 94,135 geocoded addresses.

Next, we performed the standard cross-validation, an iterative procedure that omits site-specific PM data points one at a time and refits the model using the remaining data to estimate the PM concentration at the site of the omitted observation. We assessed the validity (also termed "goodness of fit") of each semivariogram using three cross-validation parameters readily available from the ArcView software package: a) the average prediction error (PE), where PE is the difference between the predicted and measured daily PM values at each monitoring site, averaged across all sites; b) the average standardized prediction error (SPE), where SPE is the PE at a site divided by its SE of estimation, averaged across all sites; and c) root mean square standardized (RMSS), the standard deviation (SD) of all SPEs across all sites. Additionally, we assessed the goodness of fit of each semivariogram by the average of the SEs of the estimations, generated by the kriging procedure, across all 94,135 geocoded addresses. The expectations for a good-fitting semivariogram and kriging model are an average PE and SPE near 0, an RMSS near 1, and a small SE. If RMSS < 1, there is a tendency toward overestimation of the variance; if > 1, there is a tendency toward underestimation (ESRI Inc. 2001). These criteria were consistently used to guide our model selection processes throughout this study (Liao et al. 2005c).

As an alternative to using the automatically calculated semivariogram, which is obtained using the weighted least-squares method (Gribov et al. 2004; Jian et al. 1996), one can also manually specify the semivariogram parameters to improve the cross-validation parameters in ArcView.
We selected the six least satisfactory daily semivariograms from the year 2000 and manually adjusted the semivariogram parameters to obtain the best achievable average RMSS and SPE (RMSS as close to 1 and average SPE as close to 0 as possible). The cross-validation parameters from the weighted least-squares method-calculated semivariograms were then compared to those of the manually adjusted semivariograms.

We performed daily ordinary krigings on both the original scale (regular ordinary kriging) and the lognormal scale (lognormal ordinary kriging) (Cressie 1993b; Johnston 2001) for all WHI CT addresses for the year 2000 and compared the cross-validation parameters between the two kriging procedures. Lognormal ordinary kriging was used because it can eliminate negative predicted values, which are a problem in regular ordinary kriging, especially when the source data contain extreme values.

Results

Characteristics of the site-specific daily average PM10 and PM2.5 concentrations. During 1994-2003, the number of monitoring sites that provided GIS-usable daily PM10 data varied widely (range, 120-1,340). On 17% of days, GIS-usable data were provided by ≥ 400 monitoring sites; on 39% of days, by 200-400 sites; and on 44% of days, by 120-200 sites. The corresponding values for PM2.5 during 1999-2003 were 33% of days by ≥ 400 sites and 67% of days by 148-400 sites. Specific to the year 2000, there were averages of 325 PM10 and 456 PM2.5 monitoring sites operating per day across the contiguous United States, with minima and maxima of 148 and 1,061 sites for PM10 and 178 and 1,019 sites for PM2.5. As a result, there were 118,791 site-days during 2000 for which we could retrieve measured PM10 data and 166,796 site-days for PM2.5 data.
The means (± SD) of PM10 and PM2.5 from these retrievable site-days were 26.29 ± 58.13 and 13.14 ± 8.59 µg/m³, respectively, with medians of 21.33 and 11.20 µg/m³, respectively. A right-skewed distribution is evident for both PM10 and PM2.5, especially for PM10. Figure 1 illustrates the spatial relationships between the geocoded addresses and the PM monitoring sites on an optimal day and a typical day. The mean distance between each address and its nearest PM monitor was 12.35 km, with an SD of 13.98 km, a median of 7.81 km, an interquartile range of 10.53 km, and a 99th percentile of 68.36 km.

Comparisons of three widely used spatial models. Tables 1 and 2 present summary statistics of the cross-validation parameters (PE, SPE, and RMSS) comparing three widely used spatial models (spherical, exponential, Gaussian) for PM10 and PM2.5, respectively. In general, both average PE and average SPE are very close to 0, with a very narrow range of variation from the 366 daily cross-validations. More specifically, > 95% of average PEs were within ± 2 µg/m³ of measured PM10 and within ± 0.5 µg/m³ of measured PM2.5, an average measurement error that we considered acceptable. In terms of RMSS, we considered > 95% of cross-validations as acceptable, but there were days when RMSS indicated a slight over- or underestimation of the prediction variability.
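For readers who wish to reproduce the three cross-validation parameters outside ArcView, the following is a minimal Python sketch; it is not the Geostatistical Analyst implementation. The function name and the sample numbers are hypothetical. RMSS is computed here as the root mean square of the standardized errors, which coincides with their SD only when the average SPE is 0.

```python
from math import sqrt


def cross_validation_summary(measured, predicted, std_errors):
    """Return (average PE, average SPE, RMSS) from leave-one-out results.

    measured, predicted, std_errors: equal-length sequences, one entry per
    monitoring site, where std_errors holds the kriging SE at each site.
    """
    n = len(measured)
    errors = [p - m for p, m in zip(predicted, measured)]
    standardized = [e / s for e, s in zip(errors, std_errors)]
    average_pe = sum(errors) / n
    average_spe = sum(standardized) / n
    rmss = sqrt(sum(z * z for z in standardized) / n)
    return average_pe, average_spe, rmss


# Hypothetical leave-one-out results at four sites (values in µg/m³).
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
std_errors = [2.0, 2.0, 3.0, 3.0]

average_pe, average_spe, rmss = cross_validation_summary(
    measured, predicted, std_errors
)
```

In this constructed example every prediction misses by exactly one SE, with the signs balanced, so the average PE and average SPE are 0 and the RMSS is 1, which matches the stated expectations for a well-fitting semivariogram.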
These data support the overall validity of using kriging-based estimation approaches to estimate location-specific PM concentrations across the contiguous United States.

Comparisons of default and manually adjusted semivariograms. Table 3 presents the cross-validations and actual kriging estimations from the weighted least-squares method-calculated (default) semivariograms and the manually adjusted semivariograms. For the 6 days when the PE, SPE, or RMSS indicated a less satisfactory default-calculated semivariogram, these three cross-validation parameters could be improved satisfactorily through adjustment of the semivariogram parameters by an operator. However, the application of such "improved" semivariograms to the estimation of PM10 concentrations at geocoded locations across the United States did not necessarily provide better estimation of location-specific PM (i.e., smaller SEs). To the contrary, the average SEs from the default semivariograms were smaller than those from manually adjusted semivariograms. Because each average SPE of the default-calculated daily semivariograms was close to 0, and each default-calculated daily semivariogram produced a smaller estimation error, we recommend using the default-calculated semivariogram, even though the RMSS from the default-calculated semivariogram was not fully satisfactory.

Comparisons of regular versus lognormal ordinary krigings. We applied regular ordinary kriging (spherical model, default-calculated daily semivariograms) to estimate daily PM10 concentrations at geocoded addresses (n = 94,135) of WHI CT participants and examination sites in the contiguous United States. We examined the estimated PM10 concentrations and identified 22 days during 2000 when estimated values exceeded the range of observed values. In some cases, the estimated values were negative. The number of addresses affected by this problem ranged from a few on most days to 3.5% of all addresses.
This problem was related to skewed PM10 distributions and to small numbers of extreme outlying values or operating sites on some days. We therefore compared regular ordinary kriging and lognormal ordinary kriging, anticipating that lognormal kriging would attenuate this problem. Table 4 lists the 22 days on which regular ordinary kriging yielded estimated PM10 values that were outside the range of measured values. For comparison, the minima and maxima of the measured and estimated PM10 concentrations from both regular and lognormal ordinary krigings are also listed in Table 4. In summary, during 2000, lognormal ordinary kriging effectively reduced the number of problematic days from 22 to 1. Even on this one day, lognormal ordinary kriging yielded a minimum value that was closer to the range of measured data than that from regular ordinary kriging.

Table 5 shows the mean values of cross-validation parameters of daily PM10 semivariograms for both regular ordinary kriging and lognormal ordinary kriging. Cross-validation parameters were within the acceptable range from both regular and lognormal ordinary krigings, except for the 22 "out-of-range" days as defined above.
On these out-of-range days, the SPE was well within the acceptable range for both regular and lognormal krigings, but the RMSS was > 1 from both approaches. Even so, for these out-of-range days the RMSS from lognormal ordinary kriging was closer to 1 than that from regular ordinary kriging.

We then performed regular and lognormal ordinary kriging to estimate PM10 concentrations at geocoded addresses of WHI CT participants and examination sites, based on year 2000 PM10 data (94,135 locations and 366 days). The mean, SD, median, and maximum of the daily mean SE of the estimated PM10 from the regular ordinary kriging were 27.36, 83.35, 13.93, and 1,160.20 µg/m³, respectively. In contrast, those from the lognormal ordinary kriging were 16.29, 6.65, 15.05, and 67.46 µg/m³. Clearly, the distribution of the estimation errors from lognormal ordinary kriging was considerably less skewed and had fewer outlying values than that from regular ordinary kriging. Alternative methods (winsorizing extreme PM10 values; using ArcView's "no-sector" option to search for measured data points within a circle centered on the location that needs an estimate, i.e., disabling the default "sector" search for measured data points in the four sectors of a circle; reducing the range or nugget) were less effective in keeping predicted values within the range of measured values (data not shown).

Similar to the situation observed in PM10 estimations, lognormal ordinary kriging also effectively eliminated the negative or out-of-range problem that occurred in about 5% of PM2.5 data when using regular ordinary kriging. Other cross-validation parameters were comparable between the lognormal and regular ordinary krigings (data not shown).
Comparisons between national and regional krigings. From the 61 days when 900 or more monitoring sites were operating in the year 2000 in the 48 contiguous states, the first of such days from each month was selected for comparisons between ordinary kriging models on a national versus regional scale. National krigings and cross-validations were performed on these 12 selected days using daily site-specific PM10 data. Regional krigings and cross-validations were performed on the same data using the regional map (Figure 1) that divides the continental United States into five regions (northwest, southwest, middle north, southeast, and northeast). These five regions were created based on the assumption that different semivariogram parameters would be needed for different geographic areas. In general, for both regional and national krigings, the average SPE and RMSS from cross-validations of semivariograms calculated for the 12 selected days were very close to 0 and 1, respectively (Table 6).

Discussion

Classical methods often assume that measures are uniformly or randomly distributed. The assumptions are often inappropriate for analysis of environmental measures because values at neighboring locations are rarely independent, particularly over short distances. This form of dependence (spatial autocorrelation) nonetheless makes it possible to interpolate values at unmonitored locations from known values at monitored locations. Kriging is one such interpolation method, originally developed by mining engineers (Krige 1966). It is especially attractive in this setting because it takes the spatial autocorrelation structure function (variogram) into account by considering known values from monitored locations, weighting them with values read from the variogram at corresponding distances, and splitting weights among adjacent locations. The method thereby ensures that interpolations do not depend on monitor density (Legendre and Fortin 1989). By doing so, kriging yields best linear unbiased estimates, in this setting, of location-specific daily mean ambient PM concentrations and their SEs.

Large-scale population-based epidemiologic investigations of the health effects of ambient air pollution often rely on data collected from a network of air quality monitors maintained by the U.S. EPA, the AQS data (U.S. EPA 1995a, 1995b, 2005). It is revealing to compare kriging with interpolation methods used in the well-known time-series and cohort studies of PM effects on mortality and cardiovascular disease (Abbey et al. 1991, 1999; Dockery et al. 1993; Katsouyanni et al. 1996, 2001; Miller et al. 2004, 2005; Pope et al. 2004; Samet et al. 2000a, 2000b). These studies uniformly estimated PM exposures using area-based arithmetic averaging or nearest-neighbor imputation, alternative methods that have important limitations (Moore and Carpenter 1999). Such limitations include the assumption of homogeneous exposures within study areas and the inability (or failure) to estimate exposures or associated PEs. For example, when daily exposure was of interest and there were no operating PM monitors within a study area, data pairs (daily PM concentrations, death counts) were unavailable in these studies. In addition, when longer-term (monthly to yearly) exposure was of interest, area-aggregated exposures were based on available measurements within a given time frame. If there were five 24-hr measures in a month, for example, the monthly average exposure was calculated as the mean of the five readings. In contrast, our kriging-based approach estimated daily mean exposures and SEs at geocoded addresses of participants and their examination sites across the contiguous United States that can be readily integrated over time with little influence of missing data. Studies in the geosciences have also found that kriging provides consistently improved interpolation accuracy over traditional inverse-distance weighting and other, simpler spatial interpolation methods (Zimmerman 1999). Another important advantage of GIS-based estimation over the traditional area-average approach is the availability of both the location-specific estimated pollutant concentrations and their SEs.

Our goal in this study was to contribute methodologic and practical insights toward standardized, semiautomated GIS approaches to estimation of daily air pollution concentrations and their associated estimation errors. The air pollution data estimated using these approaches will support the Environmental Epidemiology of Arrhythmogenesis in WHI study (Whitsel 2006) examining the cardiac effects of air pollution in 68,133 postmenopausal women 50-79 years of age at baseline in the WHI CT (WHI Study Group 1998). Here we describe our experience resolving several important methodologic and practical issues in adopting a systematic, standardized, and semiautomated kriging approach to estimate daily air pollution concentrations and the associated estimation errors at geocoded addresses across the contiguous United States over 10 years.

We successfully downloaded from AQS the PM10 and PM2.5 raw data from 1993-2004. We then cleaned, calculated, and reconstructed site-specific daily PM concentration data ready for GIS applications. It is well known that the monitoring sites in AQS are not randomly distributed, although random distribution is one of the assumptions in kriging estimation, and that the density of the monitoring sites is relatively low given the size of the contiguous United States. However, the AQS is the only currently available nationwide database.
Our cross-validation studies suggest that the AQS data can be used as source data for kriging estimation of ambient pollution concentrations at various locations across the 48 contiguous states.

In this study, we performed cross-validation to assess the goodness of fit of various semivariogram and spatial models using four major parameters: the average PE, SPE, RMSS, and SE of estimation. Details can be found elsewhere (Webster and Oliver 2001), but it is worth noting that in addition to using the SE as a measure of the goodness of fit of a kriging model, one could improve the health effects models by incorporating SE in the models to account for the error in the estimation of location-specific PM concentrations. We consider this an important advantage of GIS-based estimation over the traditional area-average approach and are performing studies of using SE in health effects models.

We compared the performance of three widely used spatial models (spherical, exponential, Gaussian) for PM10 and PM2.5 estimations using regular ordinary kriging on a national scale (Tables 1 and 2). In general, the cross-validation parameters suggest that all three models performed fairly well. Overall, the spherical model seemed to perform slightly better, consistent with the observation that the spatial distribution pattern of ambient air pollutants is closest to the assumption of the spherical model. The spherical model has been used most often in modeling spatially distributed data, providing a further rationale for its use in our large-scale population-based study of the health effects of PM.
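
For readers who want to see how the three spatial models differ in shape, the following is a minimal sketch rather than the procedure used in this study. It assumes the common "practical range" parameterization (a factor of 3 in the exponential and Gaussian forms), which may not match the exact forms used by ArcView, and the function names are hypothetical. The parameter values are the median spherical-model estimates for PM10 reported in Table 1.

```python
import math

def spherical(h, nugget, partial_sill, rng):
    """Spherical model: rises to the sill and stays flat beyond the range."""
    if h <= 0:
        return 0.0  # semivariance is zero at zero separation
    if h >= rng:
        return nugget + partial_sill
    r = h / rng
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def exponential(h, nugget, partial_sill, rng):
    """Exponential model: approaches the sill asymptotically (assumed factor 3)."""
    if h <= 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * h / rng))

def gaussian(h, nugget, partial_sill, rng):
    """Gaussian model: parabolic near the origin (assumed factor 3)."""
    if h <= 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-3.0 * (h / rng) ** 2))

# Median spherical-model estimates for PM10 from Table 1 (range in metres).
NUGGET, PARTIAL_SILL, RANGE_M = 142.955, 201.215, 1_424_050.0

models = [("spherical", spherical), ("exponential", exponential), ("gaussian", gaussian)]
for name, model in models:
    # Semivariance at 100 km, 500 km, 1,000 km, and at the fitted range.
    values = [model(h, NUGGET, PARTIAL_SILL, RANGE_M) for h in (1e5, 5e5, 1e6, RANGE_M)]
    print(name, [round(v, 1) for v in values])
```

Under these assumptions only the spherical model reaches the sill (nugget plus partial sill) exactly at the range; the other two approach it asymptotically.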
Furthermore, from the perspective of the cross-validation results, both average PE and average SPE are very close to 0, with a very narrow range of variation across the 366 daily cross-validations. These data support the overall validity of using kriging-based estimation approaches to estimate location-specific PM concentrations across the contiguous United States.

We completed an empirical analysis to investigate whether manually adjusting semivariogram parameters improves a) cross-validation parameters and b) estimated PM10 concentrations and their SEs (Table 3). From these data, we conclude that manually adjusting semivariogram parameters improves cross-validation parameters. However, the application of such "improved" semivariograms to the estimation of PM10 concentrations at geocoded locations across the United States did not necessarily provide better estimation of location-specific PM. Therefore, we recommend using the default-calculated semivariogram.

Semivariograms are sensitive to strong positive skewness. As a result, regular ordinary kriging can yield negative predicted values or values exceeding the range of the source data. Kriging works best if the input data have a normal distribution. One solution is to log-transform the input data, an approach known as "lognormal kriging." In the ArcView software package, lognormal kriging is a standard option. This option log-transforms the input data to normalize their distribution and attenuate the impact of very large values.
It also back-transforms the estimated values and the "unbiased" SE of the estimation to the original scale (Cressie 1993b; Johnston 2001). Our results comparing lognormal ordinary kriging versus regular-scale ordinary kriging suggest that lognormal ordinary kriging not only effectively estimated location-specific PM concentrations within the range of the measured data for the days on which regular ordinary kriging yielded negative or "out of range" PM estimations, but also yielded a smaller average SE than did regular ordinary kriging estimations. Therefore, our results support the use of lognormal ordinary kriging as an acceptable solution to the problem commonly posed by positively skewed distributions of environmental data.

Our comparisons of national- versus regional-scale kriging indicate that, in terms of cross-validation results, both performed similarly. However, such comparisons are based on krigings using the source data from optimal days (when > 900 sites across the country were reporting data), which account for only 17% of all days in a year. Therefore, there is additional justification for using national-scale kriging: Usually, there were very few operating sites within a region. On typical days, when only about 200 monitoring sites were operating, the ability to derive stable and meaningful semivariograms was greatly impaired. Regional kriging also poses problems for estimation at locations near regional borders. For example, at locations within Washington State but near the Washington-Idaho border, regional kriging is based solely on PM10 concentrations in the "Washington/Oregon, Northern California" region. It is not based on PM10 concentrations measured immediately across the border in Idaho, despite the real possibility that they would have the largest weights in national-scale kriging estimation. For all these reasons, we recommend national-scale kriging.

Considering the number of study participants and the length of study period (1994-2003) for the Environmental Epidemiology of Arrhythmogenesis in WHI study, development of an automated procedure enabling large-scale daily krigings and semivariogram cross-validations was critical. In this study, we decided to use ArcView for predicting individuals' PM exposure concentrations because of the flexibility it offers for automation. Because ArcView GIS relies on either the weighted least-squares method or visual adjustment to create semivariograms, we did not compare the relative performance of semivariograms generated using alternative methods such as maximum likelihood and restricted maximum likelihood. For generating semivariograms, we compared only three popular spatial models (spherical, exponential, and Gaussian). Our results, however, do not invalidate alternative spatial models (e.g., power). In the end, we selected the spherical model for our study because it is the most studied model, and its assumption pertaining to the spatial correlation of data is probably closest to our pollutant data.
Furthermore, the spherical model seemed to perform as well as or slightly better than the remaining models in terms of cross-validation parameters.

We chose ordinary kriging instead of universal or simple kriging for several reasons. First, the assumption for simple kriging of a known mean concentration on any given day across space is not practical for our data. Second, although universal kriging may seem more appropriate because it assumes a "varying mean" concentration across the contiguous United States, it requires a predetermined set of "exploratory variables" to explain the varying means. The candidates, many of which are spatial variables, include emissions, land use, population, road network distribution, altitude, rainfall, latitude, climatology, and other quality data. Denby et al. (2005) recently recommended a method that uses measured concentration data in combination with some "exploratory variables" as suggested above. However, their approach may not be feasible for a national-scale study such as ours, because little guiding information is available as to how to identify a set of widely acceptable variables that can be applied to the entire nation. Moreover, even if we could identify a set of exploratory variables, we do not know the forms or shapes of their independent and joint relations to the air pollution measures.
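
For readers less familiar with the estimator chosen here, ordinary kriging at a single location can be sketched as follows. This is an illustrative sketch under stated assumptions, not the ArcView implementation used in the study: the monitor coordinates and PM10 values are made up, the helper names (`solve`, `ordinary_krige`) are hypothetical, and the semivariogram is a spherical model with the median PM10 parameters from Table 1. The unknown mean is handled through the constraint that the weights sum to 1, which is what distinguishes ordinary from simple kriging.

```python
import math

def spherical(h, nugget, partial_sill, rng):
    """Spherical semivariogram; zero at h = 0, flat at the sill beyond rng."""
    if h <= 0:
        return 0.0
    if h >= rng:
        return nugget + partial_sill
    r = h / rng
    return nugget + partial_sill * (1.5 * r - 0.5 * r ** 3)

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ordinary_krige(sites, values, target, gamma):
    """Return (estimate, standard error, weights) at target from monitor data."""
    n = len(sites)
    # Semivariances between monitors, bordered by the sum-to-one constraint.
    a = [[gamma(math.dist(sites[i], sites[j])) for j in range(n)] + [1.0]
         for i in range(n)]
    a.append([1.0] * n + [0.0])
    g0 = [gamma(math.dist(s, target)) for s in sites]
    solution = solve(a, g0 + [1.0])
    weights, lagrange = solution[:n], solution[n]
    estimate = sum(w * z for w, z in zip(weights, values))
    variance = sum(w * g for w, g in zip(weights, g0)) + lagrange
    return estimate, math.sqrt(max(variance, 0.0)), weights

# Hypothetical monitors on a projected grid (metres) with made-up PM10 values.
sites = [(0.0, 0.0), (1e5, 0.0), (0.0, 1e5), (1e5, 1e5)]
pm10 = [20.0, 30.0, 25.0, 35.0]
gamma = lambda h: spherical(h, 142.955, 201.215, 1_424_050.0)  # Table 1 medians

estimate, se, weights = ordinary_krige(sites, pm10, (5e4, 5e4), gamma)
print(round(estimate, 2), round(se, 2), [round(w, 3) for w in weights])
```

Because the target sits at the centre of a symmetric layout, the four monitors receive equal weights here; with an irregular layout, nearer monitors generally receive larger weights, and the SE grows with distance from the monitors.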
Further studies that apply universal kriging to large-scale national data are still needed. In this study, we empirically tested whether the nonconstant mean assumption for universal kriging was needed; we performed five regional ordinary krigings so that different parts of the country would assume a different mean PM concentration. Our data suggested that regional and national ordinary kriging performed similarly. Therefore, our data indirectly validated and supported the use of national ordinary kriging.

Although the primary objective of our study is to assess the short-term relationship between PM and cardiac responses, the proposed kriging method also enables us to calculate the long-term cumulative exposure of an individual by taking into account the change of his or her residences over time, because the WHI study recorded the residential location history over 10 years. Nevertheless, from the environmental perspective, an inherent limitation of the kriging-based approach is that the estimations of the PM concentrations will provide only surrogates, or the best guesses, of the true exposure levels at the locations of interest.
Thus, the accuracy of the estimations depends highly on the quality of the measured data and their spatial correlation. Even if the estimations were made with a high level of confidence, they cannot be directly interpreted as the true individual-level exposures. However, to correlate individual-level cardiac responses with a surrogate of location-specific exposure, our approach represents one of the best available methods for a large-scale population-based study.

In summary, our investigation of GIS approaches for estimating daily mean geocoded location-specific air pollutant concentrations and their SEs supports the use of a spherical model to perform lognormal ordinary kriging on a national scale. Our findings also support the use of default-generated semivariograms (estimated using the weighted least-squares method) without visual adjustment. We developed a semiautomated program to access and execute ArcView to implement these approaches for large-scale daily kriging estimations and semivariogram cross-validations. Detailed information about this program can be obtained on request.

REFERENCES

Abbey DE, Moore J, Petersen F, Beeson L. 1991. Estimating cumulative ambient concentrations of air pollutants: description and precision of methods used for an epidemiological study. Arch Environ Health 46:281-287.
Abbey DE, Nishino N, McDonnell WF, Burchette RJ, Knutsen SF, Lawrence Beeson W, et al. 1999. Long-term inhalable particles and other air pollutants related to mortality in nonsmokers. Am J Respir Crit Care Med 159:373-382.
Cressie NAC. 1993a. Geostatistics. In: Statistics for Spatial Data (Cressie NAC, ed). New York:John Wiley & Sons.
Cressie NAC. 1993b. Spatial prediction and kriging. In: Statistics for Spatial Data (Cressie NAC, ed). New York:John Wiley & Sons, 105-209.
Davis JC. 2002. Analysis of sequences of data. In: Statistics and Data Analysis in Geography (Davis JC, ed). 3rd ed. New York:John Wiley & Sons, 159-292.
Denby B, Walker SE, Horalek OJ, Eben K, Fiala J. 2005. Interpolation and Assimilation Methods for European Scale Air Quality Assessment and Mapping. Part I: Review and Recommendations. European Topic Centre on Air and Climate Change Technical Paper 2005/7. Bilthoven, the Netherlands:European Topic Centre on Air and Climate Change.
Dockery DW, Pope CA III, Xu X, Spengler JD, Ware JH, Fay ME, et al. 1993. An association between air pollution and mortality in six U.S. cities. N Engl J Med 329:1753-1759.
ESRI Inc. 2001. Using analytic tools when generating surfaces. In: Geostatistical Analyst Extension. Redlands, CA:ESRI Inc., 128-167.
Gribov A, Krivoruchko K, Hoef JMV. 2004. Modeling the semivariogram: new approach, methods comparison and case study. In: Stochastic Modeling and Geostatistics--Principles, Methods and Case Studies (Yarus JM, Chambers RL, eds). Vol 2. Bath, UK:American Association of Petroleum Geologists. Available: http://campus.esri.com/campus/library/books/GeostatisticsTeam/Krivoruchko_2001_Modeling.pdf [accessed 31 July 2006].
Jian X, Olea RA, Yu YS. 1996. Semivariogram modeling by weighted least squares.
Johnston K. 2001. Lognormal linear kriging. In: Using ArcGIS Geostatistical Analyst (ESRI, ed). Redlands, CA:ESRI Press, 247-273.
Katsouyanni K, Schwartz J, Spix C, Touloumi G, Zmirou D, Zanobetti A, et al. 1996. Short term effects of air pollution on health: a European approach using epidemiologic time series data: the APHEA protocol. J Epidemiol Community Health 50(suppl 1):S12-S18.
Katsouyanni K, Touloumi G, Samoli E, Gryparis A, Le Tertre A, Monopolis Y, et al. 2001. Confounding and effect modification in the short-term effects of ambient particles on total mortality: results from 29 European cities within the APHEA2 project. Epidemiology 12:521-531.
Krige DG. 1966. Two-dimensional weighted moving average trend surfaces for ore evaluation. J S Afr Inst Min Metall 66:13-38.
Legendre P, Fortin M-J. 1989. Spatial pattern and ecological analysis. Vegetatio 80:107-138.
Liao D, Duan Y, Whitsel EA, Zheng ZJ, Heiss G, Chinchilli VM, et al. 2004. Association of higher levels of ambient criteria pollutants with impaired cardiac autonomic control: a population-based study. Am J Epidemiol 159:768-777.
Liao D, Heiss G, Chinchilli VM, Duan Y, Folsom AR, Lin H, et al. 2005a. Association of criteria pollutants with plasma hemostatic/inflammatory markers--a population-based study. J Expo Anal Environ Epidemiol 15:319-328.
Liao D, Peuquet DJ, Duan Y, Whitsel EA, Dou J, Smith RL, et al. 2005b. Estimation of residential-level ambient PM concentrations from the U.S. EPA's air quality monitoring database [Abstract]. Epidemiology 16(5):S27-S28.
Liao D, Peuquet DJ, Duan Y, Dou J, Smith RL, Whitsel EA, et al. 2005c. GIS approaches for estimation of residential-level ambient PM concentrations [Abstract]. Epidemiology 16(5):S28.
Miller KA, Siscovick DS, Sheppard L, Anderson GL, Kaufman JD. 2004. Air pollution and cardiovascular disease events in the women's health initiative observational (WHI-OS) study [Abstract]. Circulation 109:E189.
Miller KA, Siscovick DS, Sheppard L, Sheppard K, Anderson GL, Kaufman JD. 2005. Effect of traditional risk factors on the association of air pollution and incident cardiovascular disease in the women's health initiative observational study (WHI-OS) [Abstract]. Circulation 111:E228-E229.
Moore DA, Carpenter TE. 1999. Spatial analytical methods and geographic information systems: use in health research and epidemiology. Epidemiol Rev 21:143-161.
Pope CA III, Burnett RT, Thurston GD, Thun MJ, Calle EE, Krewski D, et al. 2004. Cardiovascular mortality and long-term exposure to particulate air pollution: epidemiological evidence of general pathophysiological pathways of disease. Circulation 109:71-77.
Samet JM, Dominici F, Curriero FC, Coursac I, Zeger SL. 2000a. Fine particulate air pollution and mortality in 20 U.S. cities, 1987-1994. N Engl J Med 343:1742-1749.
Samet JM, Zeger SL, Dominici F, Curriero F, Coursac I, Dockery DW, et al. 2000b. The National Morbidity, Mortality, and Air Pollution Study. Part II: Morbidity and mortality
Sullivan J, Sheppard L, Schreuder A, Ishikawa N, Siscovick D, Kaufman J. 2005. Relation between short-term fine-particulate matter exposure and onset of myocardial infarction. Epidemiology 16:41-48.
U.S. EPA. 1995a. Air Quality Criteria for Particulate Matter. Vol 1. EPA/600/p-95/001aF. Research Triangle Park, NC:U.S. Environmental Protection Agency, Environmental Criteria and Assessment Office.
U.S. EPA. 1995b. Office of Air Quality Planning and Standards: Aerometric Information Retrieval System (AIRS). Vol 2. Research Triangle Park, NC:U.S. Environmental Protection Agency.
U.S. EPA. 2005. Air Quality System. Research Triangle Park, NC:U.S. Environmental Protection Agency. Available: http://www.epa.gov/ttn/AQS/AQSaqs/ [accessed 15 December 2005].
Webster R, Oliver MA. 2001. Geostatistics for environmental scientists. In: Statistics in Practice (Barnett V, ed). New York:John Wiley & Sons, 149-192.
Wellenius GA, Schwartz J, Mittleman MA. 2005. Air pollution and hospital admissions for ischemic and hemorrhagic stroke among Medicare beneficiaries. Stroke 36:2549-2553.
WHI Study Group. 1998. Design of the Women's Health Initiative Clinical Trial and Observational Study. Control Clin Trials 19:61-109.
Whitsel EA. 2006. The environmental epidemiology of arrhythmogenesis in WHI [Abstract]. Available: http://crisp.cit.nih.gov/crisp/CRISP_LIB.getdoc?textkey=6599396&p_grant_num=1R01ES012238-1&p_query=&ticket=6776514&p_audit_session_id=30381838&p_keywords= [accessed 15 December 2005].
Whitsel EA, Rose KM, Wood JL, Henley AC, Liao D, Heiss G. 2004. Accuracy and repeatability of commercial geocoding. Am J Epidemiol 160:1023-1029.
Whitsel EA, Rose KM, Wood JL, Henley AC, Liao D, Smith RL, et al. 2005. Accuracy and repeatability of commercial geocoding [Abstract]. Circulation 111:237.
Zimmerman D. 1999. An experimental comparison of ordinary and universal kriging and inverse distance weighting. Math Geol 31:375-390.

Duanping Liao, (1) Donna J. Peuquet, (2) Yinkang Duan, (1) Eric A. Whitsel, (3,4) Jianwei Dou, (2) Richard L. Smith, (5) Hung-Mo Lin, (1) Jiu-Chiuan Chen, (3) and Gerardo Heiss (3)

(1) Department of Health Evaluation Sciences, Pennsylvania State University College of Medicine, Hershey, Pennsylvania, USA; (2) Department of Geography, Pennsylvania State University, College Park, Pennsylvania, USA; (3) Department of Epidemiology, (4) Department of Medicine, and (5) Department of Statistics, University of North Carolina-Chapel Hill, Chapel Hill, North Carolina, USA

Address correspondence to D. Liao, Department of Health Evaluation Sciences, Pennsylvania State University College of Medicine, 600 Centerview Dr., A210, Hershey, PA 17033 USA. Telephone: (717) 531-4149. Fax: (717) 531-5779. E-mail: dliao@psu.edu

We acknowledge the contributions of WHI investigators and institutions (Appendix). The National Institute of Environmental Health Sciences funded this ancillary study (5-R01-ES012238).
The National Heart, Lung, and Blood Institute, U.S. Department of Health and Human Services, funded the Women's Health Initiative (WHI) program.

The authors declare they have no competing financial interests.

Received 15 March 2006; accepted 8 June 2006.

Table 1. Cross-validation summary statistics and semivariogram parameter estimates for PM10 from three different spatial models, year 2000.

                                                                      2.5th       97.5th
Model          Days (a)         Mean           SD       Median   percentile   percentile
PE (µg/m³)
Exponential         366       0.2347       1.3212       0.0294      -0.6437       1.6690
Gaussian            366      -0.1097       1.0509      -0.1216      -1.1230       1.0020
Spherical           366       0.0629       1.1999      -0.0705      -0.7914       1.4810
RMSS
Exponential         366       1.8374       1.5431       1.1410       0.8638       6.0240
Gaussian            366       1.1709       0.9891       1.0070       0.8140       2.2660
Spherical           366       1.2549       0.7988       1.0270       0.8094       4.1550
SPE
Exponential         366       0.0118       0.0330       0.0036      -0.0274       0.1058
Gaussian            366      -0.0094       0.0333      -0.0071      -0.0418       0.0274
Spherical           366      -0.0011       0.0212      -0.0034      -0.0318       0.0470
Nugget (µg/m³)
Exponential         366     2,837.28     27,839.3      93.5230       0.0000     5,332.40
Gaussian            366     4,096.10     38,738.9      181.975      26.6230     7,466.20
Spherical           366     3,515.02     33,349.0      142.955       0.0000     7,143.10
Partial sill (µg/m³)
Exponential         366     7,957.38     91,589.2      258.515      49.1340     23,007.0
Gaussian            366     6,483.31     73,915.9      176.240      39.4570     23,716.0
Spherical           366     6,374.25     71,024.0      201.215      36.6550     22,736.0
Range (m)
Exponential         366    2,696,226    2,832,621    1,392,250      282,500    9,064,200
Gaussian            366    2,163,126    2,277,023    1,207,050      262,460    8,958,300
Spherical           366    2,447,936    2,375,933    1,424,050      280,820    8,958,300

(a) Daily operating monitoring sites range from 148 to 1,061 sites.
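
As a reading aid for the PE, SPE, and RMSS rows of Table 1, the following minimal sketch shows how such daily summaries are commonly computed from cross-validation output. It assumes leave-one-out cross-validation with the error defined as predicted minus measured, which is a common convention but is not spelled out in this article; the function name and the three-monitor data are hypothetical.

```python
import math

def cross_validation_summary(measured, predicted, std_errors):
    """Return (PE, SPE, RMSS) for one day's cross-validation results."""
    # Prediction error at each monitor when that monitor is left out.
    errors = [p - z for p, z in zip(predicted, measured)]
    # Standardize each error by the kriging SE at the same monitor.
    standardized = [e / s for e, s in zip(errors, std_errors)]
    n = len(errors)
    pe = sum(errors) / n                   # mean prediction error
    spe = sum(standardized) / n            # mean standardized prediction error
    rmss = math.sqrt(sum(u * u for u in standardized) / n)
    return pe, spe, rmss

# Hypothetical values for three monitors on one day (µg/m³).
measured = [20.0, 30.0, 40.0]
predicted = [22.0, 29.0, 41.0]
std_errors = [2.0, 2.0, 2.0]

pe, spe, rmss = cross_validation_summary(measured, predicted, std_errors)
print(pe, spe, rmss)
```

PE and SPE near 0 indicate little systematic bias. RMSS near 1 indicates that the kriging SEs are of about the right size, whereas values well above 1 suggest that the SEs understate the actual prediction variability.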

Table 2. Cross-validation summary statistics and semivariogram parameter estimates for PM2.5 from three different spatial models, year 2000.

                                                                      2.5th       97.5th
Model          Days (a)         Mean           SD       Median   percentile   percentile
PE (µg/m³)
Exponential         366       0.1067       0.1162       0.0857      -0.0756       0.3835
Gaussian            366      -0.0323       0.0846      -0.0349      -0.2084       0.1187
Spherical           366       0.0491       0.0883       0.0413      -0.1033       0.2571
RMSS
Exponential         366       2.0953       1.6086       1.4365       0.5974       6.1640
Gaussian            366       0.9562       0.4500       0.9114       0.5517       1.5960
Spherical           366       1.3887       1.3037       1.0014       0.5532       4.5810
SPE
Exponential         366       0.0253       0.0356       0.0127      -0.0178       0.1097
Gaussian            366      -0.0102       0.0155      -0.0096      -0.0379       0.0178
Spherical           366       0.0085       0.0242       0.0038      -0.0219       0.0749
Nugget (µg/m³)
Exponential         366       9.4120      14.0622       4.2819       0.0000      46.2270
Gaussian            366      26.8536      19.8300      22.2560       3.3694      76.4140
Spherical           366      16.4381      16.5187      12.0995       0.0000      64.1640
Partial sill (µg/m³)
Exponential         366      94.0859      81.4191      70.0215      13.0410      304.610
Gaussian            366      80.2910      102.183      49.9360       8.8309      326.550
Spherical           366      84.3554      82.4740      56.7625      10.1850      299.980
Range (m)
Exponential         366    4,944,054    3,364,623    4,047,800      758,590    9,064,200
Gaussian            366    3,137,407    2,199,286    2,683,950      564,450    8,904,000
Spherical           366    3,840,664    2,669,710    3,370,250      667,310    8,944,000

(a) Daily operating monitoring sites range from 178 to 1,019 sites.
Table 3. Comparison of estimated PM10 (µg/m³) at 94,135 geocoded
addresses of WHI CT participant residences and examination sites using
default and manually modified semivariograms.
Summary statistics of cross-validations
                   PE                 RMSS                 SPE
Date         Default  Modified   Default  Modified   Default  Modified
02/16/2000    0.0122   -0.0099     5.034     1.037    0.0470    0.0021
03/05/2000    0.1660    0.0474     5.134     1.360    0.0469    0.0058
07/15/2000    0.5278    0.0193     5.564     1.180    0.0674   -0.0024
08/07/2000    0.5524   -0.1056     6.183     1.134    0.1417   -0.0053
08/19/2000    0.7609    0.3651     4.744     1.146    0.0963    0.0142
10/28/2000    0.4590    0.0363     4.243     1.276    0.0780    0.0018
Summary statistics of estimation
                Mean PM10            Mean SE        PM10 difference (default minus modified)
Date         Default  Modified   Default  Modified      Mean        SD
02/16/2000     31.19     28.76      9.73     14.02      2.43      3.61
03/05/2000     20.85     20.10     10.99     13.80      0.75      4.24
07/15/2000     24.01     23.83      7.57     10.13      0.18      3.11
08/07/2000     34.84     33.79     14.09     17.27      1.06      2.70
08/19/2000     25.07     24.76     13.59     13.54      0.30      3.64
10/28/2000     25.17     24.23      5.57      7.40      0.93      2.16
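The three cross-validation statistics reported in Table 3 can be illustrated with a short sketch. It assumes the usual geostatistical definitions: PE is the mean of (predicted minus observed), SPE is the mean of those errors divided by the kriging SE at each site, and RMSS is the root mean square of the standardized errors. The function name and the four toy monitor values are hypothetical and are not taken from the study data.

```python
import math

def cross_validation_summary(observed, predicted, std_errors):
    """Return PE, SPE, and RMSS for leave-one-out kriging predictions.

    Assumed definitions (standard usage, not quoted from the article):
      PE   = mean(predicted - observed)
      SPE  = mean((predicted - observed) / SE)
      RMSS = sqrt(mean(((predicted - observed) / SE) ** 2))
    """
    if not (len(observed) == len(predicted) == len(std_errors)) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - o for o, p in zip(observed, predicted)]
    standardized = [e / s for e, s in zip(errors, std_errors)]
    n = len(errors)
    return {
        "PE": sum(errors) / n,
        "SPE": sum(standardized) / n,
        "RMSS": math.sqrt(sum(z * z for z in standardized) / n),
    }

# Hypothetical toy values for four monitors (µg/m³), for illustration only
observed = [20.0, 30.0, 25.0, 40.0]
predicted = [22.0, 27.0, 26.0, 38.0]
std_errors = [2.0, 3.0, 1.0, 2.0]

print(cross_validation_summary(observed, predicted, std_errors))
```

Under these definitions, PE and SPE near 0 indicate little systematic bias, and RMSS near 1 indicates that the kriging SEs are consistent with the observed prediction errors; RMSS well above 1, as for the default semivariograms in Table 3, would suggest that the SEs understate the actual variability.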
Table 4. Minima and maxima of measured and estimated PM10 (µg/m³) on
the 22 days in 2000 when estimated values exceeded the range of
measured values.
                    Estimated from                     Estimated from
        Minimum     ordinary krigings        Maximum   ordinary krigings
Date   measured    Regular  Lognormal     measured    Regular  Lognormal
01/11      3.80     -3.135      5.535       712.00    534.814    261.951
01/15      3.00      2.106     11.756       194.88    162.905     88.078
01/16      3.00      2.102      9.150       167.60    107.460     70.224
02/13      1.00     -4.006      5.938       196.13    100.739     33.147
02/28      3.00     -0.005      7.196       138.50    135.518     77.630
03/05      3.68     -5.278      7.281       186.48    103.308     36.171
03/11      4.00      2.945      9.064       109.15    106.438     42.912
03/18      5.29      3.179      8.841       117.35    108.649     43.124
04/08      4.00    -43.540      9.097       690.00    534.630     78.059
04/16      0.14     -3.759      0.901       171.13    164.973     69.290
05/04      5.65     -5.768     14.550      1063.00    808.397     61.646
05/09      2.00    -15.362     10.889      3059.00    895.213     66.493
05/10      3.00    -18.598     13.805      1513.00    1023.12    252.891
05/14      6.00      5.472      6.175        82.00     79.383     79.051
06/07      9.13    -49.164     18.164      1642.00    1234.99     64.426
06/10      8.00      7.456      8.224       111.79     69.293     74.018
06/15      7.22      5.282     12.582       242.42    235.167     83.429
07/04      7.00      6.946      9.128        90.00     80.347     74.346
08/02      3.00     -1.224     16.587       441.00    356.964     76.597
08/17      8.22      5.296      7.132       200.00    194.675    198.473
08/20      5.00      4.244      5.899       135.00    134.182     83.798
08/30      7.00      6.074     11.696       140.00    112.957     83.781
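The screening behind Table 4 can be sketched as a simple range check: for each day, the smallest and largest kriged estimates are compared with the smallest and largest measured concentrations, and negative estimates are flagged separately because a concentration cannot be negative. The function name `classify` is hypothetical, and the four rows are copied from Table 4 for illustration; this is not the authors' program.

```python
# (date, min measured, min regular, min lognormal,
#        max measured, max regular, max lognormal), copied from Table 4
ROWS = [
    ("01/11", 3.80, -3.135, 5.535, 712.00, 534.814, 261.951),
    ("05/14", 6.00, 5.472, 6.175, 82.00, 79.383, 79.051),
    ("06/07", 9.13, -49.164, 18.164, 1642.00, 1234.99, 64.426),
    ("08/17", 8.22, 5.296, 7.132, 200.00, 194.675, 198.473),
]

def classify(est_min, est_max, meas_min, meas_max):
    """List the ways a day's estimates fall outside the measured range."""
    flags = []
    if est_min < 0:
        flags.append("negative")
    if est_min < meas_min:
        flags.append("below measured minimum")
    if est_max > meas_max:
        flags.append("above measured maximum")
    return flags

for date, mn, reg_mn, log_mn, mx, reg_mx, log_mx in ROWS:
    print(date,
          "regular:", classify(reg_mn, reg_mx, mn, mx) or ["within range"],
          "lognormal:", classify(log_mn, log_mx, mn, mx) or ["within range"])
```

On these four rows the regular ordinary kriging estimates fall below the measured minimum on every day and are negative on two of them, while the lognormal estimates fall below the measured minimum only on 08/17.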
Table 5. Means ± SDs of the cross-validation summary statistics
from both ordinary and lognormal krigings, year 2000.
All days (n = 366)
            SPE                       RMSS
Ordinary    -0.0011 ± 0.0212          1.2549 ± 0.7988
Lognormal   -0.05012 ± 0.10191        1.390834 ± 1.56927
Out-of-range days (n = 22)
            SPE                       RMSS
Ordinary    0.018489 ± 0.04202        3.329227 ± 1.93762
Lognormal   -0.10918 ± 0.12434        2.374532 ± 2.18070
Within-range days (n = 344)
            SPE                       RMSS
Ordinary    -0.00147 ± 0.02018        1.18206 ± 0.67478
Lognormal   -0.04635 ± 0.09933        1.327924 ± 1.50445
Table 6. Comparisons of goodness of fit between national and regional
scale krigings of the 12 days studied in 2000.
SPE
Date Natl SW NW MN SE NE
01/01 0.0106 0.0193 -0.0238 0.0008 -0.0168 -0.0125
02/06 0.0034 0.0320 -0.0013 0.0087 0.0241 -0.0126
03/01 0.0159 0.0089 0.0456 0.0062 -0.0079 -0.0215
04/06 -0.0015 0.0038 0.0286 -0.0032 0.0014 0.0140
05/06 -0.0052 0.0284 -0.0420 -0.0075 -0.0095 -0.0178
06/05 -0.0079 0.0150 0.0105 -0.0228 -0.0086 -0.0058
07/05 0.0031 0.0010 0.0083 -0.0571 -0.0233 0.0054
08/04 0.0108 0.0220 -0.0025 0.0069 -0.0208 0.0165
09/03 0.0053 0.0086 -0.0013 -0.0022 0.0054 0.0130
10/03 0.0055 0.0245 0.0164 -0.0314 0.0287 0.0137
11/02 0.0190 0.0565 0.0364 -0.0155 0.0432 0.0080
12/02 0.0130 0.0193 0.0379 -0.0016 0.0308 0.0010
Mean 0.0060 0.0199 0.0094 -0.0099 0.0039 0.0001
Median 0.0054 0.0193 0.0094 -0.0027 -0.0033 0.0032
SD 0.0083 0.0150 0.0259 0.0192 0.0225 0.0136
RMSS
Date Natl SW NW MN SE NE
01/01 0.9843 0.9976 1.0042 1.0064 0.9642 0.8617
02/06 0.9996 1.0034 1.0203 0.9335 1.0370 0.9816
03/01 1.0237 1.0701 1.0505 1.0021 1.0067 1.0397
04/06 0.9992 0.9693 1.0927 0.8644 1.0000 0.9995
05/06 1.0732 1.0027 1.0162 0.9997 0.9938 0.9361
06/05 0.9966 0.9694 1.1046 0.9131 1.1005 1.0638
07/05 0.9938 0.9052 1.1020 0.9489 1.0043 1.0048
08/04 0.9922 0.9990 1.2180 1.0243 0.9932 1.0014
09/03 0.9731 1.0328 1.0393 1.0030 0.8441 1.0008
10/03 0.9692 1.0014 1.0052 0.9925 0.9948 0.9619
11/02 0.9956 0.9984 0.9210 0.9964 0.9933 1.0103
12/02 0.9956 0.9976 1.0037 1.0082 1.0454 1.1159
Mean 0.9997 0.9956 1.0481 0.9744 0.9981 0.9981
Median 0.9956 0.9987 1.0298 0.9981 0.9974 1.0011
SD 0.0269 0.0389 0.0743 0.0485 0.0598 0.0634
Abbreviations: Natl, national-scale kriging; MN, kriging in middle north
region; NE, kriging in northeast region; NW, kriging in northwest
region; SE, kriging in southeast region; SW, kriging in southwest
region.
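The Mean, Median, and SD rows of Table 6 can be reproduced from the 12 daily values in each column. The sketch below does so for the national-scale SPE column using Python's standard `statistics` module; the variable names are illustrative, and using the sample (n - 1) standard deviation is an assumption that happens to agree with the tabulated value.

```python
import statistics

# National-scale SPE for the 12 days studied, copied from Table 6
natl_spe = [0.0106, 0.0034, 0.0159, -0.0015, -0.0052, -0.0079,
            0.0031, 0.0108, 0.0053, 0.0055, 0.0190, 0.0130]

summary = {
    "mean": round(statistics.mean(natl_spe), 4),
    "median": round(statistics.median(natl_spe), 4),
    "sd": round(statistics.stdev(natl_spe), 4),  # sample SD, n - 1 denominator
}
print(summary)
```

The same three calls can be applied to any regional column (SW, NW, MN, SE, NE) to compare its bias and spread with the national-scale values.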
Appendix 1. WHI Institutions and Investigators
WHI Program Office, National Heart, Lung, and Blood Institute, Bethesda, MD: Barbara Alving, Jacques Rossouw, Shari Ludlam, Linda Pottern, Joan McGowan, Leslie Ford, Nancy Geller
Clinical Coordinating Centers
Fred Hutchinson Cancer Research Center, Seattle, WA: Ross Prentice, Garnet Anderson, Andrea LaCroix, Charles L. Kooperberg, Ruth E. Patterson, Anne McTiernan, Shirley Beresford
Wake Forest University School of Medicine, Winston-Salem, NC: Sally Shumaker
Medical Research Labs, Highland Heights, KY: Evan Stein
University of California-San Francisco, San Francisco, CA: Steven Cummings
Clinical Centers
Albert Einstein College of Medicine, Bronx, NY: Sylvia Wassertheil-Smoller
Baylor College of Medicine, Houston, TX: Jennifer Hays
Brigham and Women's Hospital, Harvard Medical School, Boston, MA: JoAnn Manson
Brown University, Providence, RI: Annlouise R. Assaf
Emory University, Atlanta, GA: Lawrence Phillips
George Washington University Medical Center, Washington, DC: Judith Hsia
Harbor-UCLA Research and Education Institute, Torrance, CA: Rowan Chlebowski
Kaiser Permanente Center for Health Research, Portland, OR: Evelyn Whitlock
Kaiser Permanente Division of Research, Oakland, CA: Bette Caan
Medical College of Wisconsin, Milwaukee, WI: Jane Morley Kotchen
MedStar Research Institute/Howard University, Washington, DC: Barbara V. Howard
Northwestern University, Chicago/Evanston, IL: Linda Van Horn
Ohio State University, Columbus, OH: Rebecca Jackson
Rush Medical Center, Chicago, IL: Henry Black
Stanford Prevention Research Center, Stanford, CA: Marcia L. Stefanick
State University of New York-Stony Brook, Stony Brook, NY: Dorothy Lane
University of Alabama at Birmingham, Birmingham, AL: Cora E. Lewis
University of Arizona, Tucson/Phoenix, AZ: Tamsen Bassford
University at Buffalo, Buffalo, NY: Jean Wactawski-Wende
University of California-Davis, Sacramento, CA: John Robbins
University of California-Irvine, Irvine, CA: F. Allan Hubbell
University of California-Los Angeles, Los Angeles, CA: Howard Judd
University of California-San Diego, La Jolla/Chula Vista, CA: Robert D. Langer
University of Cincinnati, Cincinnati, OH: Margery Gass
University of Florida, Gainesville/Jacksonville, FL: Marian Limacher
University of Hawaii, Honolulu, HI: David Curb
University of Iowa, Iowa City/Davenport, IA: Robert Wallace
University of Massachusetts/Fallon Clinic, Worcester, MA: Judith Ockene
University of Medicine and Dentistry of New Jersey, Newark, NJ: Norman Lasser
University of Miami, Miami, FL: Mary Jo O'Sullivan
University of Minnesota, Minneapolis, MN: Karen Margolis
University of Nevada, Reno, NV: Robert Brunner
University of North Carolina-Chapel Hill, Chapel Hill, NC: Gerardo Heiss
University of Pittsburgh, Pittsburgh, PA: Lewis Kuller
University of Tennessee, Memphis, TN: Karen C. Johnson
University of Texas Health Science Center, San Antonio, TX: Robert Brzyski
University of Wisconsin, Madison, WI: Gloria E. Sarto
Wake Forest University School of Medicine, Winston-Salem, NC: Denise Bonds
Wayne State University School of Medicine/Hutzel Hospital, Detroit, MI: Susan Hendrix