Dear Editor:

While reading the article "Lead in Drinking Water: Sampling in Primary Schools and Preschools in South Central Kansas" in the March 2012 (JEH 74[7]) edition of the Journal of Environmental Health, I came across a few issues with the figure, tables, and conclusions that I would like to bring to your attention.

Figure 1 implies that 35.7% of water samples contained lead and that 64.3% were below lead detection limits. This figure, however, inflates the actual percentage of samples that contained lead and underreports the percentage of samples that were below detection limits. The area that represents the percent with detectable lead (32.1%) already captures the two samples that exceeded U.S. Environmental Protection Agency (U.S. EPA) guidance levels. Therefore it is inappropriate to add another area (3.6%) to represent these two data points again. By "double counting" these data points and subtracting them from 100%, the author arrived at 64.3% for the percentage of the samples that were below the detection limit. In actuality, only 32.1% (18/56) of the samples contained detectable lead (two of which exceeded U.S. EPA guidelines) while 67.9% (38/56) were below lead detection limits.
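
The arithmetic behind these percentages can be checked with a short Python sketch. The counts (56 samples, 18 with detectable lead, 2 above the guidance level) are taken from the paragraph above; the variable and function names are illustrative only.

```python
# Counts taken from the letter; names are illustrative.
total_samples = 56
detectable = 18        # already includes the 2 samples above the guidance level
above_guidance = 2


def pct(count, total):
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


pct_detectable = pct(detectable, total_samples)
pct_below_limit = pct(total_samples - detectable, total_samples)
pct_above = pct(above_guidance, total_samples)

# What Figure 1 appears to do: stack the exceedances on top of the
# detectable share, which counts the same two samples twice.
pct_double_counted = round(pct_detectable + pct_above, 1)

print(pct_detectable, pct_below_limit, pct_above, pct_double_counted)
```

Subtracting the double-counted share from 100% reproduces the 64.3% shown in the figure, whereas the share below the detection limit computed directly from the counts is 67.9%.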

Tables 1-3 used the Chi-square test for independence to assess the relationship between multiple variables and the concentration of lead in drinking water. When conducting the test, the author divided the number of observations in each row of the table by the number of columns to arrive at the expected values for each cell of the expected contingency table (not shown in article) instead of using the formula expected value = [(Σ Row × Σ Column)/n]. When using the correct formula to calculate expected values, it is clear that the Chi-square test is not appropriate for these data (less than 80% of the expected values are greater than five and one or more values are less than one). Even if one ignores the limitations of the Chi-square test with these data, calculations using χ² = Σ[(Observed − Expected)²/Expected] result in χ² = 8.9, 12.6, and 15.5 for Tables 1-3, respectively. Using these values, the test fails to reject all null hypotheses, indicating independence between the concentration of lead in drinking water and all covariates analyzed here.
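
The two ways of computing expected values, the test statistic, and the adequacy check described above can be sketched in Python. The contingency table below is HYPOTHETICAL and is not the data from the article; the function names are likewise illustrative. The sketch only shows how the calculations differ, not what the article's tables contain.

```python
def expected_counts(observed):
    """Expected counts under independence: row total * column total / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def row_split_counts(observed):
    """Approach described in the letter: each row total spread evenly over the columns."""
    k = len(observed[0])
    return [[sum(row) / k] * k for row in observed]


def chi_square(observed, expected):
    """Sum of (observed - expected)^2 / expected over all cells."""
    return sum(
        (o - e) ** 2 / e
        for o_row, e_row in zip(observed, expected)
        for o, e in zip(o_row, e_row)
    )


def passes_adequacy_check(expected):
    """True if no expected count is below 1 and at least 80% are 5 or more."""
    cells = [e for row in expected for e in row]
    share_at_least_5 = sum(e >= 5 for e in cells) / len(cells)
    return min(cells) >= 1 and share_at_least_5 >= 0.8


# HYPOTHETICAL 3 x 3 table (rows: a covariate; columns: lead categories).
observed = [[12, 3, 0],
            [10, 5, 1],
            [8, 6, 1]]

expected = expected_counts(observed)
naive = row_split_counts(observed)

print("chi-square, correct expected values:", chi_square(observed, expected))
print("chi-square, row-split expected values:", chi_square(observed, naive))
print("adequacy check passed:", passes_adequacy_check(expected))
```

For this made-up table the adequacy check fails, because several expected counts fall below five and some fall below one.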

In conclusion, lead in drinking water is a very important issue, especially when we consider its impact on youth. As such, regular sampling of drinking water in primary schools and preschools is an important step to help preserve the health of this population. Although it is logical that the age of a building and the corrosiveness and temperature of its water may affect the concentration of lead in drinking water, these associations cannot be drawn from the data presented here. The associations claimed in this article are based upon improperly conducted statistical tests and are spurious.

MAJ Joseph J. Hout, MSPH, REHS

Division of Occupational & Environmental Health Sciences

Uniformed Services University of the Health Sciences

The Authors Respond

Dear Editor:

MAJ Hout is correct in his observations regarding our data analysis and we thank him for his careful attention to our statistics. We did make a simple mathematical error in calculating the percentages in Figure 1, and the expected values were calculated incorrectly for Tables 1-3. The expected values of each column were actually calculated as being equal (total observations divided by number of columns), not as MAJ Hout described in his letter.

Ms. Massey was a graduate student in our online Master of Science program when she conducted this research. Her faculty advisor abruptly resigned and left Ms. Massey (and many other students) without a research mentor. As director of the program I stepped in and helped Ms. Massey and the other students finish their independent research projects, even though many projects were well outside my area of expertise (cardiovascular physiology). As the senior researcher on this project, I accept full responsibility for not verifying the accuracy of the data analysis. The errors are completely unintentional but inexcusable nonetheless.

MAJ Hout states that the Chi-square test is inappropriate for these data. We disagree, as do the reviewers of our manuscript, one of whom specifically commented that the "statistical methods applied to the data are appropriate." None of the zero values were structural zeros; all were sampling zeros and could not have been anticipated. MAJ Hout cites a "rule of thumb" attributed to Cochran (1954) that suggests avoiding the use of the Chi-square test when there are expected cell frequencies less than 1 or when more than 20% of the table cells have expected cell frequencies less than 5. This "rule," however, is considered by some to be overly conservative (Larntz, 1978).

While our data fail to meet these suggested criteria, there is a second "rule of thumb" for determining if data may be analyzed using the Chi-square test. According to Roscoe and Byars (1971), the Chi-square test may be used if the average expected cell frequency is at least 2 when the expected cell frequencies are not equal and p < .05. Our data fit this second "rule of thumb."
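
The average expected cell frequency in this second rule is simply the total number of observations divided by the number of cells. A minimal Python sketch follows; the 3 x 3 table shape is an ASSUMPTION made for illustration (the dimensions of Tables 1-3 are not restated here), and the function names are hypothetical.

```python
def average_expected_frequency(n, n_rows, n_cols):
    """Average expected count per cell: total observations / number of cells."""
    return n / (n_rows * n_cols)


def meets_average_rule(n, n_rows, n_cols, minimum=2.0):
    """True if the average expected cell frequency is at least `minimum`."""
    return average_expected_frequency(n, n_rows, n_cols) >= minimum


# 56 samples spread over an ASSUMED 3 x 3 table.
print(average_expected_frequency(56, 3, 3))
print(meets_average_rule(56, 3, 3))
```

Because the check depends only on the total count and the table shape, it is less strict than a rule that looks at each expected cell count individually.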

Reanalysis of the data using Fisher's exact test, which is appropriate when the expected numbers are small but is most commonly applied to 2x2 contingency tables, reveals that lead contamination of drinking water is not significantly related to building age or water corrosiveness but is significantly related to water temperature (p = .026).
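
For readers unfamiliar with the test, the 2x2 case can be sketched in a few lines of Python. This is an illustration only: the counts below are HYPOTHETICAL, the function name is made up, and the sketch does not reproduce the reanalysis reported above, which may have involved tables larger than 2x2.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Sum the probabilities of all tables no more likely than the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# HYPOTHETICAL counts: rows = two temperature groups,
# columns = lead detected / not detected.
p_value = fisher_exact_2x2(7, 3, 2, 8)
print(p_value)
```

The test conditions on the row and column totals and enumerates every table consistent with them, which is why it remains valid when expected counts are small.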

Our observation that approximately one-third of the schools and child care facilities sampled in this study had measurable levels of lead in the drinking water is still meaningful. The Lead and Copper Rule requires sampling of single-family dwellings only, so it is likely that schools and many child care facilities are not monitored. There are 782 elementary schools and nearly 7,000 licensed child care facilities in the state of Kansas. Our observation that 3.5% of the facilities sampled in this study had lead contamination in drinking water that exceeded the U.S. Environmental Protection Agency guidance level should be of concern to us all because as many as 27 elementary schools and 245 child care facilities in the state of Kansas could be affected.
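
The statewide figures in the preceding paragraph follow from applying the 3.5% rate to the facility counts, as this short Python sketch shows. It assumes, as the paragraph does, that the rate observed in the sample applies uniformly across the state; the variable names are illustrative.

```python
# Figures quoted in the response; names are illustrative.
exceedance_rate = 0.035      # 3.5% of facilities sampled
elementary_schools = 782
child_care_facilities = 7000

# Simple extrapolation, ASSUMING the sampled rate holds statewide.
schools_affected = round(elementary_schools * exceedance_rate)
child_care_affected = round(child_care_facilities * exceedance_rate)

print(schools_affected, child_care_affected)
```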

Janet E. Steele, PhD

Professor of Biology

Director, Online MS Program

University of Nebraska at Kearney

References

Cochran, W.G. (1954). Some methods of strengthening the common Chi-square tests. Biometrics, 10(4), 417-451.

Larntz, K. (1978). Small-sample comparisons of exact levels for the Chi-squared goodness-of-fit statistics. Journal of the American Statistical Association, 73, 253-263.

Roscoe, J.T., & Byars, J.A. (1971). An investigation of the restraints with respect to sample size commonly imposed on the use of the Chi-square statistic. Journal of the American Statistical Association, 66, 755-759.
COPYRIGHT 2012 National Environmental Health Association