Hutchinson study, gold standard or Spruce Goose: an epistemological view of prevention research.
The research design of the Hutchinson Smoking Prevention Project (Peterson, Kealey, Mann, Marek, & Sarason, 2000) was acclaimed as a new "gold standard in Prevention Science" (Clayton, Scutchfield, & Wyatt, 2000). The Hutchinson study was a large-scale, long-term application of Newtonian linear science to a social intervention, more complete and thorough than any previous application. In addition to acclamation for the study design, the findings of the study were hailed for their definitive importance (a single study unequivocally rejecting CDC-recommended "Best Practices" that were developed from numerous studies conducted over several decades). The profound importance attributed to the study, and the implication that its design should serve as the model for future prevention research, demand a more thorough analysis of the limitations to complement the acclaim heaped on the study.
The Hutchinson study was designed to assess the long-term impact of a theory-based, social influences intervention, using smoking prevalence as the primary outcome. The article presenting the research findings (Peterson, Kealey, Mann, Marek, & Sarason, 2000) concluded that there is no evidence from this trial that a school-based social-influences approach is effective in the long-term deterrence of smoking among youth. Information about the study released for broad public consumption, through a press release from the Fred Hutchinson Cancer Research Center in Seattle, Washington, was far more provocative. The press release declared, "The most ambitious, school-based smoking-prevention study of its kind has found that teaching youth how to identify and resist social influences to smoke--the main focus of smoking-prevention education and research for more than two decades--simply doesn't work." The press release went on to quote the principal investigator of the study: "The trial's results, while unfortunate, are definitive." The gold standard editorial accompanying the journal article echoed the press release's declaration that the study was definitive.
The Hutchinson study was substantial, if not monumental, involving 40 school districts and more than 8,400 children over a 15-year period (1984 to 1999). A critical element of the study design was randomized assignment of the intervention to the experimental school districts, with the control school districts not receiving the intervention provided by the Hutchinson Cancer Center. The intervention, the social influences approach, was well developed, utilizing critical components for school-based tobacco prevention recommended by both the National Cancer Institute and the Centers for Disease Control, with the number of hours of intervention exceeding the recommended levels of exposure. Other positive elements described in the literature were the relatively high degree of fidelity in implementing the intervention, the thorough training of the teachers, very low attrition, self-reported behavior verified with saliva cotinine tests on a sample of 12th graders, and design controls limiting mixing of controls with experimentals (Peterson, Kealey, Mann, Marek, & Sarason, 2000; Clayton, Scutchfield, & Wyatt, 2000).
Results focused on prevalence rates, which varied extensively among the school districts. However, when the rates were aggregated, the experimental school districts had only slightly lower (statistically insignificant) prevalence rates (24.4% among the girls and 26.3% among the boys) than the control school districts (24.7% among the girls and 26.7% among the boys). The journal article reports the results somewhat circumspectly, seeming to recognize that the statistical tests only support the conclusion that the differences were so small that chance could not be ruled out as the cause within the stated levels of probability. Somewhat in contrast, the press release and the editorial make more definitive statements based on the interpretation that the small difference actually constitutes no difference between the experimentals and controls. Reported in the journal article, but not discussed in the conclusions or in the editorial, the number of cigarettes smoked per day among male and female daily smokers combined was lower for the experimental group (9.6 cigarettes) than for the control group (10.4 cigarettes). This difference was reported as statistically significant, but de-emphasized, possibly because of the lack of statistical significance for many other outcome measures (Peterson, Kealey, Mann, Marek, & Sarason, 2000; Clayton, Scutchfield, & Wyatt, 2000).
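The scale of these aggregate comparisons can be made concrete with a standard two-proportion z-test. The sketch below is purely illustrative: the group sizes are hypothetical (the trial's actual unit of analysis was the school district, not the individual student), but it shows why a difference of a few tenths of a percentage point fails to reach conventional significance.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    # Pooled two-proportion z-test statistic for the difference p1 - p2.
    pooled = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Girls' 12th-grade daily smoking prevalence as reported in the article
# (24.4% experimental vs. 24.7% control), with a hypothetical 2,000
# girls per arm purely for illustration.
z = two_proportion_z(0.244, 2000, 0.247, 2000)
print(round(z, 2))  # |z| far below the ~1.96 needed for p < .05
```

Under these assumed group sizes the test statistic is an order of magnitude smaller than the significance threshold, which is the narrow sense in which "chance could not be ruled out."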
The Hutchinson study may very well reflect the epitome of what can be accomplished with experimental designs to research a complex social intervention. However, the limitations of the experimental design are ignored in the gold standard editorial. It is the purpose of the following analysis to illustrate those limitations. When the design limitations are considered, the conclusions found in the press release and the gold standard editorial are clearly overstated. This analysis also illustrates how the use of other less costly research designs (without the pretense of definitive findings) should be considered, even if a highly rigorous experimental design with tremendous financial support is possible.
THE NEED FOR SENSITIVITY TO COMPLEXITY
The authors of the "gold standard" editorial, applauding the experimental design, recognized the extent to which the Hutchinson study exceeded previous efforts to apply the standards of linear scientific rigor to social interventions. However, the authors also call for more "complex, robust models" of prevention rather than "simple models of main effect." The limitations of linear, simple cause-and-effect research designs for assessing that greater complexity appear to be overlooked. It also is not clear why the authors do not acknowledge that more complex, comprehensive models of prevention are already the "gold standard" of prevention, as most major authorities on tobacco prevention recommend comprehensive models involving multiple components (Centers for Disease Control and Prevention, 1999; National Cancer Policy Board, IOM & National Research Council, 2000; US Department of Health & Human Services, 2000). The authors do suggest that the field needs to reorient its research from the "main effects question (what works) to the moderated model question (what works, for whom, under what conditions, how and why)," perhaps providing an indirect hint that the Hutchinson design may not be the standard.
No recognized authority is arguing that school-based programs alone are sufficient to curtail the youth tobacco epidemic in the United States. Florida's success (arguably the most successful effort to reduce youth tobacco use) appears to provide evidence of the benefits of comprehensive programs (Centers for Disease Control and Prevention, 1999; Bauer, Johnson, Hopkins, & Brooks, 2000). Looking at any school-based program as the single solution and declaring it ineffective is analogous to a cancer researcher testing a single component of a multiple-component chemotherapy program and declaring it useless because it did not produce the desired effect, even though experience had already demonstrated that effectiveness occurred when it was combined with other components. It would have been more appropriate for the Hutchinson study researchers and supporters to have limited their findings to school-based social influences programs as stand-alone programs, while recognizing that most authorities were already recommending that school-based programs be implemented in conjunction with other programs.
Comprehensive tobacco prevention programs, which are given credit for tobacco prevention success in Florida (Centers for Disease Control and Prevention, 1999; Bauer, Johnson, Hopkins, & Brooks, 2000), involve multiple components, perhaps interacting synergistically. The inherent complexity of comprehensive programs presents problems for simple cause-and-effect, Newtonian linear research designs. Many of the physical sciences, in areas such as quantum physics, phase transitions, and fluid dynamics, have discarded linear designs as inadequate to address complexity (Gleick, 1987). Approaches to science that emphasize complexity have increasingly been applied to social phenomena (Eve, Horsfall, & Lee, 1997; Arrow, McGrath, & Berdahl, 2000; Marion, 1999). However, the implications of complexity for prevention research appear to be ignored when the Hutchinson study is considered "definitive" or a "gold standard." The need for research designs that more effectively address the complexity, and perhaps even the synergistic interaction, of multiple components of tobacco prevention programs is illustrated by examining the Hutchinson study in light of the issues raised in the complexity/chaos literature (Gleick, 1987).
The proclamations of the "definitive" nature of the Hutchinson study, found in the gold standard editorial and the Hutchinson Cancer Center press release, also contradict the emerging realities of the nonlinearity that occurs when even a third variable is introduced. Prigogine's (1997) characterization of the "end of certainty," in his epistemological analysis of science, reflects the realization that nature is composed of virtual chaos, or at the least unpredictability, when multiple variables interact. He concludes that unpredictable resonance, what others might refer to as synergy, makes universal conclusions or certainty unattainable.
Different contexts provide other challenges to the proclamation that the Hutchinson study is definitive. The Hutchinson study authors did not address the possibility that social influences approaches may have varying impact depending on the social context in which they are introduced. For example, the most primitive knowledge of water includes the realization that water behaves differently when a simple variable such as temperature changes. Knowledge of water's behavior in one context (at 100 degrees Fahrenheit) does not provide any predictability about water's behavior in other contexts (at -100 degrees Fahrenheit or 1,000 degrees Fahrenheit). The Hutchinson study does not address the possibility that education programs have varying impact in different social contexts, in the same way that water acts differently in different temperature contexts. Illustrating the problem, a major social context that changed toward the end of the Hutchinson study was the reduction in tobacco advertising that targeted young people, but the variety of social contexts is virtually infinite.
The lack of effort to study context, recommended by some authorities on evaluation research design (National Science Foundation, 1997; Patton, 1990), was particularly problematic because of potential confounding variables that were not addressed methodologically in the Hutchinson study. Randomized assignment to experimental and control groups, considered by many to be the optimal research design, does not necessarily address context. The Hutchinson study attempted to assess the impact of the social influences approach on prevention, but the principles of the social influences approach, embodied in the work of Bandura (1977) and Botvin (1992), are included in a wide range of school-based education curricula designed to address health problems, such as drug and alcohol education (Botvin & Botvin, 1992; Botvin, Baker, Dusenbury, Botvin, & Diaz, 1995), sex education (Montfort & Brick, 2000), HIV/AIDS education (St. Lawrence et al., 1995) and comprehensive health education (Centers for Disease Control and Prevention, 2001). Without assessing the extent to which the social influences approaches were already present, or were introduced during the study, within the Hutchinson control population, rival hypotheses that explain the lack of statistically significant differences become plausible. It is plausible that the social influences approaches can be generalized from one health issue to another; for example, refusal skills to resist peer pressure related to drugs or sex could also apply to tobacco. It is very possible that infusions of the social influences approaches related to other health issues were present in the Hutchinson controls, thereby reducing the difference in social influences training between the controls and the experimental groups.
Not assessing the extent to which social influences approaches were present in the controls is analogous to researchers employing an experimental design to assess changes in serum antibody levels for a measles vaccine without monitoring who already had measles or who contracted measles during the study.
Data reported in the Hutchinson study provide further evidence of experimental design limitations related to context. The gold standard editorial criticized the social influences approach for attributing "the causes of smoking almost exclusively within the individual." The editorial points to the supporting evidence of data in the study showing high variation in the ranges of smoking prevalence among the school districts, indicating that other variables influenced tobacco use. The 12th grade female prevalence ranged from 0% to 41.9% among the control school districts and from 15.5% to 34.2% among the experimental school districts. Whether the social influences approach really does place this almost exclusive emphasis on what is "within the individual" is very questionable. However, while faulting social influences theory for not considering the multiple other variables in the social environment that appeared to influence tobacco use, the gold standard authors failed to recognize that the Hutchinson study research design also did not address the potential confounding variables causing these broad ranges. Clearly, the high variation in prevalence among the school districts illustrates other potential variables at work, but the disparity in prevalence ranges between the controls and the experimentals should have been particularly addressed. The 0% prevalence in at least one school district screams for an explanation, and the inclusion of this group, representing an unknown variable potentially if not probably found only in the controls, should have raised questions. The 0% rate in at least one control school district would have reduced the overall control prevalence rate and thereby may have been responsible for making the difference between the experimental group and the controls statistically insignificant. Clearly, better understanding of context and complexity would have helped reduce this major limitation, which random assignment failed to address.
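The arithmetic behind this concern is easy to demonstrate. In the sketch below, the district-level rates are invented for illustration (only the 0% and 41.9% endpoints come from the article's reported control range), but they show how a single 0% outlier district can pull a pooled control rate down by several percentage points:

```python
# Hypothetical 12th-grade female prevalence rates (percent) for six
# control districts. Only the 0% and 41.9% endpoints are taken from
# the article; the interior values are invented for illustration.
control_districts = [0.0, 20.0, 25.0, 28.0, 30.0, 41.9]

def mean(rates):
    return sum(rates) / len(rates)

with_outlier = mean(control_districts)
without_outlier = mean(control_districts[1:])  # drop the 0% district
print(round(with_outlier, 1), round(without_outlier, 1))
```

With these assumed values, excluding the single 0% district raises the pooled control rate by nearly five percentage points, an effect much larger than the experimental-versus-control differences the study tested.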
IGNORING SMALL DIFFERENCES
In light of complexity approaches to science, the Hutchinson study conclusions are problematic because potentially important data are ignored. Critics of Newtonian linear approaches point to the common flaw of traditional research ignoring the "noise," or unaccounted-for variations in data, that occurs with experiments (Gleick, 1987). Some of the earliest developments of complexity science flowed from the study of small, frequently overlooked variations that in fact had very large impact; "fractals" emerged from the study of these previously overlooked phenomena. The discarded noise in the Hutchinson study (a 20% lack of fidelity to program implementation) is actually quite large to be overlooked, yet it was lauded as a relative success because the fidelity was higher than in other similar types of research. The Hutchinson study does deserve credit for achieving such a high level of fidelity, which is important for research but not necessarily considered good teaching. Teachers are frequently trained to adapt their programs to the needs of their students, so research that tries to measure the consistent implementation of teaching can contradict the tenets of good teaching, thus making relatively high rates of infidelity acceptable for this research. The adaptation and flexibility needed for effective program implementation conflict with the adoption and fidelity associated with diffusion of innovation research (Livingood & Woodhouse, 1992), requiring models of research other than the traditional simple linear research designs. The inherent contradiction between rigidly implemented optimal teaching interventions and the 20% deviation from implementation in the experimental group, which would be unacceptable for most linear physical science, should raise general questions about the utility of traditional experimental research design for researching complex social problems.
Other noise (small results deemed insignificant) ignored in the conclusions of the Hutchinson study was the data showing that there were in fact observed differences between the experimental and control groups in the direction of the desired effect. The groups receiving the Hutchinson social influences programs were fractions of a percent lower in daily smoking prevalence (statistically insignificant) than the controls. The difference in the number of cigarettes smoked per day was statistically significant in the desired direction, but neglected in the conclusions. Other noise was the fact that the controls actually started out with fewer people having tried tobacco: only 10.8% of the controls had tried tobacco in 3rd grade, compared to 11.8% of the experimentals. In effect, this small one-percentage-point difference could also be reported as the experimental group having tried tobacco at a 9% higher rate than the control group. The rate of parental smoking in the experimental group was also higher than in the controls (1.8% or 3.6%, depending on how it is reported). These "inconsequential" data, considered in light of recommendations for comprehensive, multiple-component programs, are more than issues of scientific epistemological rhetoric. When combined with other interventions as part of comprehensive programs, these small, inconsequential differences might contribute to a net or aggregated effect that might be statistically significant. Even more plausible, the synergistic effect of combining the social influences approaches with other components of a comprehensive program could produce an effect that the Hutchinson study methods could not possibly have measured, given the study's lack of data collection for other community-based components of a comprehensive program.
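The dual framing of that baseline difference (one percentage point absolute versus a 9% higher rate) is simple arithmetic, sketched here with the percentages reported in the article:

```python
# Baseline "tried tobacco by 3rd grade" rates reported in the article:
# controls 10.8%, experimentals 11.8%.
control = 10.8
experimental = 11.8

absolute_diff = experimental - control                     # percentage points
relative_diff = 100 * (experimental - control) / control   # percent of baseline

print(round(absolute_diff, 1))   # the "small 1%" framing
print(round(relative_diff, 1))   # the "9% higher rate" framing
```

The same pattern applies to the parental-smoking figures: whether a difference reads as negligible or substantial depends on whether it is expressed in absolute percentage points or relative to the baseline rate.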
Regardless of the complexity of social issues, the Hutchinson study will probably continue to be a marvel to those who believe Newtonian linear approaches to science are the ideal. However, scientific approaches that emphasize understanding complexity and context would appear to complement the recommendations for changes in the theoretical foundations of prevention research contained in the gold standard editorial. Mixed method approaches (National Science Foundation, 1997) combining quantitative and qualitative designs (Patton, 1990) to maximize the benefits of both may be much more suitable for understanding that complexity and context. Similar to the way an engineer studies the fluid mechanics of a wing in a wind tunnel to refine development of the wing (developing very specific applied knowledge rather than universal laws of physics), prevention researchers may need to limit their findings to the specific circumstances that they study.
Application of mixed method design to another controversial area of tobacco prevention research illustrates the concept of research leading to insight rather than to the laws of cause and effect associated with traditional experimental research. A recent study of the impact of possession law enforcement on youth tobacco use in Florida incorporated a mixed method design, with both the quantitative (Livingood, Woodhouse, Jopling-Sayre, & Wludyka, 2001) and qualitative (Woodhouse, Jopling-Sayre, & Livingood, 2001) studies being reported in the scientific journals. The study of the general population of youth (Livingood, Woodhouse, Jopling-Sayre, & Wludyka, 2001), looking more at the preventive effect of possession enforcement and using an adaptation of the youth use surveillance system in Florida, found statistically significant differences in youth tobacco use. A broad range of potential confounding variables was examined and excluded as likely causes of the effect. The accompanying qualitative study (Woodhouse, Jopling-Sayre, & Livingood, 2001) helped to confirm the differences in enforcement among the counties that were studied and helped clarify the dynamics of how enforcement may have been influencing tobacco use behavior (reducing the role modeling associated with young people smoking in high-visibility locations). However, this study made no pretense of being definitive, recognizing that unforeseen confounding variables may have been at work and that the context was very limited (relatively minor, non-criminal penalties implemented as part of a comprehensive youth tobacco prevention program). If anything, the study contributes to the literature by suggesting that law enforcement is not a simple construct that can be studied with simple linear, cause-and-effect designs, and it reinforced the concept that any observed effect had to be considered within the context of the broad, well-funded, comprehensive program that existed within Florida.
Similarly, experimental design researchers would do well to recognize the limitations of their findings.
Looking at the Hutchinson study with hindsight and in light of the evolution of science, the research design appears to: 1) be prohibitively costly ($15 million [National Cancer Institute, 2000]); 2) ignore both the recommendations for comprehensive programs and the inherent complexity of comprehensive programs; 3) lack efforts to understand context and to grasp the importance of how different contexts might influence the results; and 4) overlook observed differences that may be particularly important in light of the recognized need for these programs to be implemented in conjunction with other community-based efforts. The cost, and the National Cancer Institute's stated lack of plans to fund similar initiatives (National Cancer Institute, 2000), provide a superficial analogy to the Spruce Goose* (NIH being the only practical source of funding capable of supporting such an undertaking). For practical purposes and for better science, the long-term, extensive randomized assignment used by the Hutchinson study may be hailed as a scientific marvel, but it should also be used as a lesson in how alternative research designs can help improve our insights about the efficacy of prevention programs. Most importantly, even with extensive resources and what proponents consider an optimal application of design, experimental designs have serious limitations, and other research designs that are more sensitive to complexity and context should be considered when designing tobacco prevention research. Maintaining an emphasis on interventions that can be experimentally tested may also result in a focus on prevention efforts that are more easily controlled (usually clinically based, such as cessation programs), potentially resulting in the neglect of more population-based, primary prevention programs that involve complex, less controllable social phenomena.
Recalling that the tobacco industry maintained that experimental design was the only way to truly prove that tobacco use caused cancer or heart disease in humans (research that could not ethically or practically be conducted), we should be reluctant to hold experimental design up as the standard for gaining insights about what prevents tobacco use.
* End Note: The Spruce Goose, more formally known as the Hughes Flying Boat, was hailed as a technological marvel but never flew again after its maiden flight and has been relegated to museums or storage ever since (Evergreen Aviation Museum, 2001).
Arrow, H., McGrath, J.E., & Berdahl, J.L. (2000). Small Groups as Complex Systems. Thousand Oaks: Sage Publications.
Bandura, A. (1977). Social Learning Theory. Englewood Cliffs, NJ: Prentice Hall.
Bauer, U.E., Johnson, T.M., Hopkins, P.S., & Brooks, R.G. (2000). Changes in youth cigarette use and intentions following implementation of a tobacco control program. JAMA, 284, 723-728.
Botvin, G.J., & Botvin, E.M. (1992). Adolescent tobacco, alcohol and drug abuse: prevention strategies, empirical findings, and assessment issues. J Dev Behav Pediatr, 13, 290-301.
Botvin, G.J., Baker, E., Dusenbury, L., Botvin, E.M., & Diaz, T. (1995). Long-term follow-up results of a randomized drug abuse prevention trial in a white middle-class population. JAMA, 273, 1106-12.
Centers for Disease Control and Prevention. (1999). Best Practices for Comprehensive Tobacco Control Programs--August 1999. Atlanta, GA: U.S. Department of Health and Human Services, CDC, NCCDPHP, OSH.
Centers for Disease Control and Prevention. (1999) Tobacco use among middle and high school students--Florida, 1998 and 1999. Morbidity and Mortality Weekly Report, 48, 248-253.
Centers for Disease Control and Prevention. (2001). Comprehensive School Health Education. [Online]. http://www.cdc.gov/nccdphp/dash/cshedef.htm.
Clayton, R.R., Scutchfield, F.D., & Wyatt, S.W. (2000). Hutchinson Smoking Prevention Project: A new gold standard in prevention science requires new transdisciplinary thinking. J Natl Cancer Inst, 92, 1964-65.
Evergreen Aviation Museum. (2001). Spruce Goose: A Brief History [Online]. http://sprucegoose.org/spruceGoose.t?request=A%20Brief%20History.
Eve, R.A., Horsfall, S., Lee, M.E. (1997). Chaos, Complexity and Sociology. Thousand Oaks: Sage.
Gleick, J. (1987). Chaos: Making a New Science. New York: Penguin Books.
National Cancer Institute. (2000). Press Release: Questions and Answers about the Hutchinson Smoking Prevention Project: [Online].
Livingood, W.C., & Woodhouse, C.D. (1992). Keystone model for training to implement school health promotion programs. Health Values, 16(1), 10-16.
Livingood, W.C., Woodhouse, C.D., Jopling-Sayre, J., & Wludyka, P. (2001). Impact study of tobacco possession law enforcement in Florida. Health Education & Behavior, 28(6), 733-48.
Marion, R. (1999). The Edge of Organization: Chaos & Complexity Theories of Formal Social Systems. Thousand Oaks: Sage Publications.
Montfort, S., & Brick, P. (2000). Unequal Partners: Teaching about power and consent in adult-teen and other relationships. Morristown, NJ: Planned Parenthood of Greater Northern New Jersey, Inc.
National Cancer Policy Board, Institute of Medicine and National Research Council. (2000). State Programs Can Reduce Tobacco Use. Washington, DC: National Academy Press.
National Science Foundation. (1997). User-Friendly Guide to Mixed Method Evaluation. NSF 97-153. Arlington, VA: NSF.
Patton, M.Q. (1990). Qualitative Evaluation and Research Methods, 2nd Ed. Newbury Park CA: Sage.
Peterson, A.V. Jr., Kealey, K.A., Mann, S.L., Marek, P.M., & Sarason, I.G. (2000). Hutchinson Smoking Prevention Project: Long-term randomized trial in school-based tobacco use prevention--results on smoking. J Natl Cancer Inst, 92, 1979-91.
Prigogine, I. (1997). The End of Certainty. New York: Free Press.
US Department of Health & Human Services. (2000). Reducing Tobacco Use: A Report of the Surgeon General. Atlanta, GA: USDHHS, CDC, NCCDPHP, OSH.
St. Lawrence, J.S., Brasfield, T.L., Jefferson, K.W., Alleyne, E., O'Bannon, R.E., Shirley, A. (1995). Cognitive-behavioral intervention to reduce African-American adolescents' risk for HIV infection. Journal of Consulting and Clinical Psychology, 63(2), 221-37.
Woodhouse, C.D., Jopling-Sayre, J., & Livingood, W.C. (2001). Tobacco policy and the role of law enforcement in prevention: The value of understanding context. Qualitative Health Research, 11(5), 682-92.
William C. Livingood, Ph.D., is Director of the Division of Health, Policy & Evaluation Research at the Duval County Health Department and is a Clinical Associate Professor in the College of Medicine, Department of Pediatrics at the University of Florida. Carolyn D. Woodhouse, Ed.D., M.P.H., is a Professor & MPH Coordinator at East Stroudsburg University. Dr. Woodhouse is also a Research & Evaluation Consultant for the Duval County Health Department. Address all correspondence to Dr. Livingood at College of Medicine; Department of Pediatrics; University of Florida; 515 W 6th St.; Jacksonville, FL 32206; Phone: 904.665.2339; e-mail: William_Livingood@doh.state.fl.us.
Author: Carolyn D. Woodhouse. Publication: American Journal of Health Studies. Date: Jun 22, 2001.