Assessing the new criteria for newborn screening.
Newborn screening programs have been run by state health departments since the 1960s. (1) The first and paradigm condition for newborn screening is phenylketonuria, or PKU. PKU is caused by a hereditary defect in the metabolism of phenylalanine, an amino acid found in food. Without dietary treatment, many children with PKU develop profound mental retardation, and severe cognitive damage often occurs before the condition is clinically evident. Screening at birth enables the infant to begin a specialized diet within a few weeks of birth, thus preventing irreparable damage. Newborn screening for PKU has been a remarkable success. Over the years, state programs added new conditions to the screening panels, including congenital hypothyroidism, hemoglobinopathies (including sickle cell disease), and galactosemia, all of which have been part of most programs for several decades. However, in the past 5 to 10 years, there has been a rapid expansion in the number of conditions included on screening panels. (2) In large measure, this is due to the application of a technology called tandem mass spectrometry. Use of this tool enables screening for dozens of different conditions simultaneously by analyzing blood spots for characteristic changes in their biochemical profile. A majority of states now screen for more than 30 conditions, and some for more than 50. With this expansion has come a debate over the appropriate criteria for adding new tests. Is population screening for all of these conditions justified, or is this an example of the technological imperative?
An important background element of this discussion is the mode by which screening is conducted. These are public health programs at the state level that are mandated by state law in all but 3 places: Maryland, Wyoming, and the District of Columbia. (3) Elsewhere, the informed permission of parents is not sought for screening. In this regard, newborn screening is markedly different from other forms of testing, particularly genetic testing, for which informed consent is considered an important component of testing protocols. (4) Parents may refuse testing on religious or philosophical grounds in most states, but in at least one case a state has successfully removed a child from the custody of his parents in order to conduct newborn screening. (5) The original justification for this approach was that the benefits of screening are so dramatic for conditions like PKU that the state is within its parens patriae authority to mandate screening for newborns. But if mandatory PKU screening makes sense, does the same logic hold true for 20, 30 or 50 other conditions? Perhaps, if the benefits of screening are comparable. But the justification for mandatory screening weakens if the benefits of screening, or the data regarding potential benefits, are less robust.
I. TRADITIONAL CRITERIA FOR SCREENING
The classic paper on criteria for population screening was published by Wilson and Jungner in 1968. (6) See Table I. Their criteria were proposed for the full scope of population screening opportunities, but their ideas remain particularly important and influential in the context of newborn screening. Andermann and colleagues recently published a synthesis of population screening criteria that have emerged over the past 40 years. (7) Many of the original concepts in the Wilson and Jungner recommendations were revised to include contemporary values including quality assurance, equity, access, and scientific evidence of effectiveness. Nevertheless, the core principles remain closely aligned over the decades. Many states have developed their own criteria for their newborn screening programs that often are based heavily on the traditional Wilson and Jungner criteria. But even a cursory review of these criteria reveals that they are largely subjective. For example, what does it mean to say, "[t]he condition sought should be an important health problem?" If only a few children per year in a state population die from a preventable condition, does that constitute an important health problem from our contemporary perspective? This subjectivity traditionally led to a wide variation between states in the conditions targeted by newborn screening. Although there is more uniformity between states in recent years, for decades it was the case that some states screened for only 4-8 conditions while others screened for more than 20. On this basis alone, it is clear that while the traditional Wilson and Jungner criteria were widely adopted in the field, their interpretation has been so broad as to undermine the notion that meaningful criteria exist at all.
Variation in newborn screening panels is not a feature of US programs alone. Pollitt recently reviewed screening programs in Europe and concluded, "[t]here is great variation ... by various bodies in adjacent countries in Western Europe ..." (8) This is not due to the lack of criteria per se but, again, to their subjective nature. Pollitt notes that the United Kingdom National Screening Committee developed 19 criteria and it specified 87 items of information that need to be addressed under 35 headings as part of its evaluation of new tests. In general, European nations screen for substantially fewer conditions than do most U.S. states. So clearly the struggle to develop justifiable criteria for this important set of programs is global in scope.
Wilson and Jungner recognized the challenges of screening and offered this cautionary statement, "[t]he central idea of early disease detection and treatment is essentially simple. However the path to its successful achievement ... is far from simple although sometimes it may appear deceptively easy." Evidence over the intervening years supports their caution. Screening of populations, or of large population groups, often is not effective. On numerous occasions, clinicians have been enthusiastic about a screening approach and then, as the data are collected regarding efficacy, the programs are narrowed or abandoned. For many years, the hospital admission chest X-ray was routine as a way of detecting occult lung disease, until the evidence revealed that early detection of diseases like lung cancer was not helpful in terms of morbidity or mortality, and the films generated many alarms and interventions over findings that proved to be benign.
The US Preventive Services Task Force (USPSTF) undertakes a stringent evidence-based review of a broad range of screening and testing modalities. A review of their reports illustrates that some screening approaches appear highly beneficial while there is not clear evidence of efficacy for many familiar practices. For example, the USPSTF supports mammography every 1 to 2 years for women over the age of 40, but finds insufficient evidence to recommend either clinical breast examination alone or breast self-examination as effective approaches. (9) In a recent review, it found insufficient evidence to recommend for or against screening for prostate cancer in men less than 75 years old and it recommended against screening men older than 75 years. (10) The USPSTF also recommends screening for hypertension in adults (11) and "strongly recommends screening for cervical cancer in women who have been sexually active and have a cervix." (12) But it found insufficient evidence for screening infants for iron deficiency (13) or for elevated lead levels in asymptomatic young children. (14) The point is that many screening approaches commonly used by clinicians and familiar to the general public are not supported by the data. Some direct-to-consumer screening methods, such as full body MRI scans, have virtually no published data at all supporting their utility.
Screening is a very attractive approach to disease prevention. It seems like it should work but, in fact, the basic mathematics of testing large numbers of individuals to identify a few treatable cases is very challenging. Further, screening tests almost always generate numerous "false positive" results that are frightening and often expensive and risky to address. If the harms from evaluating false positive results (e.g., tissue biopsies) outweigh the benefits of the program, then screening programs can be more harmful on balance than no program at all. Therefore an effective program requires an accurate, relatively inexpensive screening test, an efficient outreach and engagement effort, and a treatment or prevention strategy that works. It bears emphasis that a successful program includes more than a good test and an effective treatment for the condition--it requires that all of the various elements of a program work together in sequence to maximize benefits and minimize harms.
Screening programs are not always highly effective even when good tests and treatments are available. For example, the literature clearly shows that fewer than 50% of children identified with sickle cell disease through newborn screening adhered to the prophylactic antibiotics or the vaccinations that are essential to reduce morbidity and mortality. (15) The screening program provides no direct benefit to untreated children. In this context, state health departments should ask whether it is more appropriate to spend limited resources on improvements in the sickle cell screening program or to use them to add to the panel a new set of screening tests for which data on efficacy are scant.
In our era of evidence-based medicine, we should strive for explicit criteria for each element of a screening program, and we should gather data before and during the adoption of a new screening program to assess whether criteria are met. (16) This level of rigor has not been a part of traditional newborn screening programs.
Evidence-based medicine is becoming increasingly important given the high cost of US medical care and the recognition that a substantial portion of clinical care has not been carefully evaluated. Newer on the block is the evidence-based evaluation of genetic tests. In the newborn screening context, direct tests of DNA are not widely utilized; nevertheless, the majority of conditions targeted are genetic in etiology. Tests generally target changes in blood biochemistry that are secondary to heritable or congenital conditions. The FDA is not taking an active role in regulating most genetic tests, enabling companies to market tests to clinicians and directly to consumers without the evaluation typical of drugs or devices. An established approach to test evaluation is captured in the acronym ACCE, standing for Analytic validity, Clinical validity, Clinical utility, and Ethical, legal and social implications. (17) Analytic validity in the context of newborn screening refers to the ability of the test to correctly characterize a target blood sample with known characteristics under typical laboratory conditions. Clinical validity refers to the ability of the test to correctly characterize the child as being affected or unaffected with the condition. Most screening tests are designed to be highly sensitive but not necessarily highly specific. This means that false positive results are common, and children who receive a false positive result will need to undergo additional testing to determine whether they are affected or not. For most newborn screening tests, the ratio of false positive to true positive results is 10 to 1 or higher, so a substantial portion of the monetary and psychological costs of newborn screening is created by the need to manage the false positive results. (18) Clinical utility refers to the usefulness of the test results to benefit the individual tested. Tests can be of high clinical validity but low clinical utility or vice versa.
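The arithmetic behind the false positive burden can be made concrete with a short sketch. The figures below are hypothetical and are not drawn from any cited study: a condition affecting 1 in 10,000 newborns, a test with 99% sensitivity and 99.9% specificity, and a cohort of 100,000 births. The function name and parameters are likewise illustrative assumptions. Even with a test this accurate, the rarity of the condition means that false positives outnumber true positives by roughly 10 to 1.

```python
def screening_outcomes(births, prevalence, sensitivity, specificity):
    """Expected counts for one round of population screening.

    All inputs are illustrative assumptions, not measured values.
    """
    affected = births * prevalence
    unaffected = births - affected
    true_pos = affected * sensitivity            # affected children flagged
    false_pos = unaffected * (1 - specificity)   # healthy children flagged
    return {
        "true_pos": true_pos,
        "false_pos": false_pos,
        # false positives generated per true case found
        "fp_per_tp": false_pos / true_pos,
        # positive predictive value: chance a flagged child is truly affected
        "ppv": true_pos / (true_pos + false_pos),
    }


# Hypothetical condition: 1 in 10,000 births, very accurate test.
result = screening_outcomes(
    births=100_000,
    prevalence=1 / 10_000,
    sensitivity=0.99,
    specificity=0.999,
)
print(f"true positives:  {result['true_pos']:.1f}")
print(f"false positives: {result['false_pos']:.1f}")
print(f"false positives per true positive: {result['fp_per_tp']:.1f}")
print(f"positive predictive value: {result['ppv']:.1%}")
```

Under these assumed figures, only about 9% of infants with a positive screen are actually affected. Lowering the assumed specificity or prevalence makes the ratio worse, which is why the management of false positives weighs so heavily in the evaluation of a program.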
In the context of NBS, each of the ACCE criteria is important and these considerations map to the Wilson and Jungner criteria to a significant extent. For example, Wilson and Jungner state that there "should be a suitable test or examination." In newer terminology, we would expect a test to have sufficient analytic validity and clinical validity, although how "sufficient" is defined may vary by test and condition.
Of the Wilson and Jungner criteria, it is primarily their second criterion that has emerged as the subject of most debate: "There should be an accepted treatment for patients with recognized disease." In our context, this means that there should be an established treatment or preventive intervention for a condition before it is eligible for inclusion on a newborn screening panel. Under the ACCE rubric, we would say that the test has to have demonstrated clinical utility. However, this criterion often is misinterpreted as requiring only that there exist an effective treatment for the condition targeted. In the context of a screening program, the important criterion is whether early detection results in improved clinical outcome for the child. If a treatment can be effectively delivered after clinical diagnosis of a child, that is, following the development of symptoms, then there is no rationale for screening of asymptomatic children. Of course, effective early intervention requires that there be an effective treatment but the two are not equivalent. Effective treatment is a necessary but not sufficient criterion for screening. For example, in the adult context, breast cancer is a treatable condition. Nevertheless, screening is not appropriate in younger women until the net benefits of early detection are demonstrated in this group.
II. CRITERIA OF THE AMERICAN COLLEGE OF MEDICAL GENETICS
In 2004, the American College of Medical Genetics undertook a project to recommend a uniform panel of conditions for newborn screening that could be adopted by state programs throughout the country. (19) This work was commissioned by the Maternal and Child Health Bureau of the federal Health Resources and Services Administration (HRSA). The effort involved an evaluation of the criteria for newborn screening, an analysis of the evidence for a wide variety of conditions, and recommendations on a uniform panel of conditions to be targeted by newborn bloodspot screening. The criteria developed can be divided into three main categories: clinical characteristics of the condition, the analytic characteristics of the screening test, and the diagnosis, treatment and management of the condition. The working group articulated a number of specific criteria within these categories and developed a scoring system to rate the relative fulfillment of each criterion. See Table II. The ACMG working group then surveyed a total of 289 individuals from a variety of backgrounds, including experts, clinicians, and consumers, to obtain scores for each of the 84 conditions under consideration. An expert panel also conducted a review of the literature to determine whether there was sufficient evidence to verify the responses of the survey respondents. The working group recommended 29 conditions for inclusion on a uniform panel.
Although the ACMG report has been criticized for its methods, (20) the basic criteria are consistent with the Wilson and Jungner tradition. Newer aspects of the ACMG approach include a strong emphasis on tests on a multiplex platform. A multiplex platform is one in which results for multiple conditions are produced from a single analysis. In the context of newborn screening, this is primarily relevant to a technology called tandem mass spectrometry (MS/MS) that simultaneously analyzes blood for numerous biochemical properties. Therefore information about dozens of conditions that cause biochemical alterations in blood will be revealed through MS/MS analysis even if the clinician is only interested in results on a small number of conditions. In the future, newborn screening may be conducted with another multiplex platform technology--DNA chips. Screening for one condition on a chip that contains thousands of genetic markers would produce results for many genetic markers for which information was not sought. This produces complex ethical dilemmas when clinicians are forced to respond to information that may or may not be useful for the patient. (21)
A second but related innovative aspect of the ACMG report is the emphasis on so-called secondary targets. The working group identified 25 conditions that fell below the cut-off for inclusion on the primary core panel but would be conditions identified in the screening process for the primary conditions. That is, the multiplex platform for testing identifies an additional 25 conditions that do not, in themselves, qualify for screening based on the criteria established. These additional 25 conditions are, in general, poorly understood and may be related in some cases to conditions that qualify for inclusion on the uniform panel. The ACMG working group recommended that state programs mandate the reporting of all secondary conditions to health care providers. The net result is that the ACMG recommended that states mandate a total of 54 conditions for screening on the uniform panel--29 core conditions and 25 secondary conditions. The ACMG did not clearly articulate how states and clinicians are supposed to manage the complex communication process inherent in the secondary conditions.
The ACMG working group identified opinion and literature that supported the contention that conditions were amenable to treatment or amelioration beyond what other groups have been willing to acknowledge. (22) It is widely recognized that the literature is very limited for many rare conditions and there is an almost complete absence of data from randomized controlled trials of different treatment modalities. Carefully controlled research is an inherent challenge with rare conditions because even referral centers will see only a handful of affected children dispersed over time. Further, even if there is evidence of an effective treatment, there is little or no evidence of how a population screening approach works to bring the benefits of early intervention to affected children and their families. Finally, there is no consensus on what benefits to children count as sufficient to justify population screening.
The ACMG approach provides a scoring system for relative benefits ("prevents ALL negative consequences/prevents MOST negative consequences," etc) but this does not adequately capture the complexity of these conditions. Dionisi-Vici and colleagues note that the literature suggests that early identification of some organic acidurias (conditions on the ACMG recommended panel) leads to decreased early mortality but "[p]rogressive neurocognitive deterioration is almost invariably present" despite treatment. (23) These are devastating conditions that are very difficult to manage and children typically suffer profound neurological impairments. Hypothetically, what if the evidence shows that screening and early intervention are effective in delaying the age of serious impairment from 15 months to 30 months? The magnitudes of these benefits do not approach the benefits of screening for traditional conditions on screening panels such as PKU, hypothyroidism, or sickle cell disease. How should these types of benefits be weighed and judged when making policy decisions? Are they sufficient to warrant screening of the entire newborn population? As the ability to detect conditions expands, the debate needs to continue over whether there is clear evidence of benefit and how much benefit justifies population screening.
In summary, the rapid expansion in the number of tests on NBS panels in the US can be attributed to several factors. First, there is the existence of a new technology, MS/MS, which provides results on a large number of conditions through a single analysis of a dried bloodspot. Second, the criteria used to judge candidate conditions for screening remain highly subjective. Third, there is some evidence from clinicians that contemporary treatment modalities are effective in reducing morbidity and/or mortality for these complex conditions. This information has been extrapolated to conclude that population screening for such conditions also will be effective. But policy decisions about screening are being made with essentially no evidence about efficacy derived from studies that actually entail a population screening approach. Two exceptions are worth noting. The first is newborn screening for cystic fibrosis, which was evaluated through a randomized controlled trial beginning in the 1980s in Wisconsin. (24) It has taken approximately 20 years of follow-up examinations of these children, along with other studies, to demonstrate that the benefits of screening are sufficient to warrant newborn screening. (25)
A second example is newborn screening for neuroblastoma. This is an uncommon cancerous tumor of young children that can be detected through a characteristic pattern of biochemicals in the blood secreted by the tumor. Despite international enthusiasm for newborn screening, two carefully designed trials of screening failed to show any benefit from early detection. (26) Neuroblastoma was a condition that appeared ideal for NBS prior to the actual evidence. This experience has not inhibited policy makers from using "colloquial evidence" in the ACMG report for decisions to rapidly expand newborn screening panels. (5)
This discussion illustrates that there is substantial work to be done on the articulation of criteria for newborn screening even under traditional approaches where direct benefit to the child is a primary objective. What does benefit to the child mean? How much benefit is enough to justify population screening in light of the risks of screening and the opportunity costs? How much evidence is sufficient to initiate a new program? In conjunction with these long-standing complexities, new, alternative criteria for newborn screening are under active consideration.
III. NEW CRITERIA FOR SCREENING
The emerging debate is focused on the question of whether screening might be justified even when benefits to the child are entirely unproven or unexpected. Duane Alexander, Chief of the Eunice Kennedy Shriver National Institute of Child Health and Human Development at the NIH, and Peter van Dyck, Chief of HRSA, advocate a change in thinking about the criteria for newborn screening: "With the potential of greatly expanded testing, many have begun to question one standard tenet of newborn screening ... that it's appropriate to screen only for conditions for which an effective treatment already exists. (27) The tenet served a useful purpose in early years, but it's now being challenged as outmoded, because it fails to consider other benefits." Later in the same article, they state: "The technology could be expanded to screen for additional disorders as mutational analysis or other multiplex technology becomes available, with decisions being based more on what not to screen for (perhaps Huntington disease) than on what to include." Alexander and van Dyck suggest a future world in which we might have an extraordinarily large number of tests on NBS panels, due in large measure to the technical ability to detect those conditions in blood spots and a potential change in attitude about the necessary criteria for screening.
What are the other benefits beyond those to the child directly that might be considered? There are four types of benefits that are proposed: 1) elimination of the "diagnostic odyssey," 2) the provision of reproductive risk information to parents, 3) fostering research with affected children, and 4) the developmental, psychological, and social benefits that occur from early disease detection. Bailey and colleagues have been articulate advocates of the fourth type of benefit in the context of children with conditions causing mental retardation or developmental delay. (28) They argue that, even in the absence of definitive treatments for many conditions associated with mental retardation and developmental delay, early intervention can lead to improved developmental and behavioral outcomes. Further, families benefit from the expanded availability of services and research about the conditions is fostered. I mention this fourth type of benefit first only to say that it is not different from traditional criteria. If early intervention leads to improved developmental outcomes, then those benefits are no different than benefits that might accrue from drug or dietary treatments that are tailored to the disease. These are potential benefits directly to the child and if they occur through early detection from a screening program, then such benefits are consistent with traditional criteria. Caveats here are that the benefits that occur through early intervention must still outweigh the potential harms and other costs of the screening program. Further, as noted above, the benefits from early intervention through screening must be greater than the benefits of interventions initiated following clinical detection of the condition. Third, the early interventions that provide benefit must be available and used by a meaningful number of the children detected through a screening program. 
All of these caveats are relevant to any treatment or prevention strategy employed after newborn screening. It should not be sufficient to demonstrate that early intervention is effective in selected groups under controlled study conditions--such benefits should be demonstrated in the "real world" through pilot studies of population screening for such conditions.
A. Eliminating the Diagnostic Odyssey
One central feature of newborn screening is the identification of children at birth rather than at an older age when symptoms develop. Many of the conditions targeted by newborn screening are metabolic disorders that tend to present clinically with vague or non-specific symptoms like vomiting, irritability, or lethargy. Symptoms may be episodic, so care providers may not see the child when the manifestations are at their worst. The rare nature of the conditions also makes it challenging for clinicians to quickly identify the disease. Many parents of affected children report that it took months or years for physicians to make the correct diagnosis, perhaps with many struggles and journeys along the way as parents become increasingly frustrated and frantic for answers. For example, a diagnosis of Duchenne Muscular Dystrophy (DMD) typically is made 2 years after symptoms of weakness are first noted by parents. (29) DMD has been included on a few state and national NBS panels but is not on the ACMG panel. It represents a condition for which screening is possible but no early interventions exist that change the course of the disease. In a Colorado study, the diagnosis of cystic fibrosis (CF) in the absence of newborn screening was delayed to an average of 14.5 months. (30)
There is no question that the diagnostic odyssey is an extremely difficult experience for parents and family members, as well as for clinicians. And it is highly likely that elimination of this odyssey would be a net benefit to the parents of affected children although this has not been well studied through prospective randomized trials. The existing literature for conditions like CF suggests that parents experience considerable burdens through a delay in diagnosis. (31) One countervailing concern is the loss of the period of time when parents believe their child is healthy. This might be considered a time of blissful ignorance before worrisome symptoms develop. While this phenomenon also needs to be better studied, any adverse consequences of this loss of innocence are likely to pale in contrast to the difficulties parents face in coming to a correct diagnosis.
Baily and Murray also wonder whether early diagnosis in the absence of effective treatments leads to a therapeutic odyssey. (32) That is, perhaps parents in this circumstance search the web and travel from one practitioner to another, wherever some hope is offered. If so, elimination of the diagnostic odyssey would be of little benefit to parents who would still be searching for an answer for their child.
Benefits from elimination of the diagnostic odyssey are in proportion to the typical delays in clinical diagnosis. If clinicians were more astute, or had algorithms in place to do targeted screening on children with early suggestive symptoms, the diagnostic odyssey could be reduced or eliminated. Targeted screening is a common approach when a subgroup of the population can be identified that is at increased risk for the disease. Mammography is recommended for older women, not all women, because the incidence of breast cancer is substantially higher in older women and the test is more sensitive. Even though missed cases of breast and colon cancer are among the top reasons for malpractice suits for internists, full population screening has not been justified on this basis. Unfortunately, the debate over newborn screening has not prompted any significant discussion of alternatives to population screening. Targeted screening approaches should be carefully considered for some conditions before the diagnostic odyssey is promoted as a partial justification for NBS.
The question in this context is how much weight to give to elimination of the diagnostic odyssey in policy decisions about NBS. I will raise two points here for further elaboration below. First, screening is not risk free. Many parents will be told their child has an abnormal laboratory test and needs further evaluation only to learn that the initial test was a false positive. This is an expected outcome of all screening programs for rare or uncommon conditions. Further, we know that a subset of parents will experience negative repercussions from false positive results that can persist long after the parents are told the child is healthy. In the context of a condition for which there is no treatment, the benefits of an elimination of the diagnostic odyssey for parents of affected children must be weighed against the negative impacts of receiving false positive results or ambiguous results on a larger group of parents.
The second consideration is that NBS is conducted without parental permission in all but two states (Maryland and Wyoming) and the District of Columbia. As noted, the traditional justification for mandating newborn screening is that the benefits to the child are so substantial that the state can override parental authority regarding testing. If the rationale for screening is the elimination of the diagnostic odyssey for parents, the justification for mandatory screening of the child cannot be sustained. For these reasons, elimination of the diagnostic odyssey is an important but secondary benefit of NBS. It should not be considered a primary justification for mandatory population screening.
B. Reproductive Information for Parents
Most of the conditions targeted by NBS are genetic in etiology and most exhibit an autosomal recessive pattern of inheritance. For recessive conditions, each parent is a carrier of a mutation and both parents must contribute a mutated copy of the gene to produce a child with the condition. Each pregnancy of two such carriers has a 25% chance of resulting in an affected child. One hallmark of conditions potentially appropriate for newborn screening is that the conditions are not evident at birth. Without screening, the correct diagnosis may be delayed for months or years. In the meantime, the parents may have additional affected children. Therefore, a potential advantage of newborn screening is that parents are alerted to their reproductive risk prior to the birth of a second affected child.
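The 25% figure follows from enumerating the four equally likely combinations of gene copies that two carrier parents can pass on. The sketch below simply counts those combinations; the allele labels and function name are illustrative assumptions, and the model assumes a single gene with standard Mendelian inheritance.

```python
from fractions import Fraction
from itertools import product

# Each carrier parent has one working copy ("A") and one mutated copy ("a").
# The labels are illustrative only.
CARRIER = ("A", "a")


def offspring_distribution(parent_one=CARRIER, parent_two=CARRIER):
    """Enumerate the equally likely allele pairings for one pregnancy."""
    outcomes = {"affected": 0, "carrier": 0, "non_carrier": 0}
    pairings = list(product(parent_one, parent_two))
    for from_one, from_two in pairings:
        mutated = (from_one, from_two).count("a")
        if mutated == 2:
            outcomes["affected"] += 1      # two mutated copies: has the condition
        elif mutated == 1:
            outcomes["carrier"] += 1       # one mutated copy: unaffected carrier
        else:
            outcomes["non_carrier"] += 1   # no mutated copies
    total = len(pairings)
    return {name: Fraction(count, total) for name, count in outcomes.items()}


distribution = offspring_distribution()
print(distribution["affected"])     # 1/4
print(distribution["carrier"])      # 1/2
print(distribution["non_carrier"])  # 1/4
```

Because each pregnancy is an independent draw from the same four combinations, the 25% risk applies again to every later pregnancy, which is why a delayed diagnosis in the first child leaves the parents exposed to the same risk without knowing it.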
While theoretically attractive, the magnitude of this benefit is unclear in the US population. A benefit would consist of sufficient information to allow a decision that parents would not have made otherwise. In this context, this means either deciding not to have additional children after the first affected child, or using prenatal diagnosis and pregnancy termination in the event of an affected fetus. A study in Australia published in 2000 indicated that two thirds of parents of children with CF detected through newborn screening had additional children and two thirds of those chose to have prenatal diagnosis for CF. (33) Many of those who did not have additional pregnancies indicated that the possibility of an additional affected child was a significant reason. Of those pursuing prenatal diagnosis, the majority either terminated an affected pregnancy or said they would do so. The US experience may be somewhat different. Mischler et al. found in a 1998 study of parents of children with CF identified through NBS that 70% of families had additional children, 26% of those families utilized prenatal diagnosis, but none of the 3 affected pregnancies were terminated. (34)
Overall, the data are limited on the behavioral responses of parents to reproductive information provided through newborn screening. We would expect responses to be influenced by the severity of the condition, the availability of counseling services and prenatal diagnosis, and cultural values. It is probably fair to conclude that NBS provides valuable information for many parents about reproductive risk and that many parents utilize this information for making reproductive decisions.
The relevant question for our purposes is the weight to be afforded this potential benefit of newborn screening. One significant disadvantage of providing reproductive risk information through NBS is that this requires one affected child to be born before parents are alerted to their risk. If risk detection is a priority, then carrier detection in adults prior to pregnancy or during pregnancy would be more timely. The American College of Obstetricians and Gynecologists recommended that carrier identification for CF and hemoglobinopathies be offered to adults with at-risk heritage prior to or during pregnancy. (35) Currently, rates of carrier testing during pregnancy for these conditions are not consistently high due both to inconsistent offers by clinicians and limited demand by couples. In the future, the need for newborn screening may decline if prenatal carrier screening followed by prenatal diagnosis for conditions like CF is utilized by a high proportion of the population. (36) Why screen every newborn if most parents know the status of their fetus before delivery?
For most other hereditary conditions targeted by newborn screening programs, there has not been a concerted effort to develop prenatal screening approaches. There are significant technical challenges in this regard. As noted, newborn screening technologies often work by identifying an abnormal accumulation of biochemicals in the infant's bloodstream. This approach generally is not useful for the detection of unaffected adult carriers before or during pregnancy because carriers do not have the same accumulation of biochemicals as affected individuals. If screening moves to a DNA-based platform, then prenatal screening of couples for the same conditions targeted by newborn screening may become feasible. Technical challenges aside, many couples who would not terminate a pregnancy will decline prenatal diagnosis with the knowledge that affected children will be identified at birth through NBS. Therefore, while there is some overlap between prenatal and neonatal screening, newborn screening will not be replaced by prenatal screening and diagnosis in the foreseeable future.
From an ethical perspective, providing genetic information relevant to the parents by testing their children raises obvious concerns. This discussion has focused on reproductive information but note that the same rationale could be offered for other types of genetic testing. Newborn screening is relatively efficient because it captures almost all babies in the population through well-established programs. If genetic testing for risk of, say, colon cancer is beneficial for adults, as the current literature suggests, why not screen infants for the relevant genes in order to detect at-risk parents? Our answer since the advent of modern genetic testing hinges on the recognition that genetic testing often provides information that has a powerful impact on individuals for both good and bad. The literature clearly shows that many adults do not want predictive genetic testing. Therefore genetic testing generally should be conducted only with the informed consent of the individual being tested, or of their parents in the case of pediatric testing. Newborn screening has been an exception to this rule. But in a situation where testing provides a significant benefit to the parents rather than the child, the parents themselves should consent to testing. In the case of adult onset diseases like colon cancer, genetic testing should be offered to adults themselves and provided with their informed consent even though this may be less efficient than screening all newborns on their behalf. The efficiency of testing newborns for the benefit of their parents does not justify this backdoor approach. Similarly with respect to reproductive information for parents, mandatory newborn screening is not ethically appropriate when parents could be tested directly or asked for their consent to screening of their child for information relevant to their future pregnancies.
Traditionally, the provision of reproductive risk information to parents has been considered a secondary benefit of NBS. That is, it has been considered an important benefit but not one that alone justifies newborn screening. Those who are the most forceful advocates of expanded newborn screening are not arguing otherwise. The new question is whether a number of these secondary benefits together are sufficient to justify screening in the absence of an anticipated benefit to the child. As we have seen, justification by the accumulation of secondary benefits is stymied by the lack of informed permission for newborn screening.
C. Research Promotion
The fourth proposed justification for newborn screening in the absence of clear benefit to the child is the value screening programs provide for the understanding of rare conditions. Research on many conditions potentially amenable to newborn screening is extremely difficult due to their low prevalence in the population. This means that even the most senior expert in the field will have only limited experience with any one condition. Research on the best available treatments will be frustrated by the inability to recruit enough children in the same time frame to compare competing approaches. Further, medical technology changes over time so it is difficult to know whether the outcomes of children evaluated with a condition in, say, the 1980s are still relevant to children with the condition today.
Inclusion of such rare conditions on screening programs provides a number of opportunities to better understand the diseases. First and foremost, affected children are identified shortly following birth using relatively consistent case definitions. These cases can be identified through health department records rather than through the records of health providers scattered throughout the health care system. Following these children from birth provides the chance to understand the natural history of the disease, or the course of the disease under the different treatment modalities commonly used. Second, screening identifies children across the range of severity of the condition. Often rare diseases are characterized by the signs and symptoms of individuals who are the most severely affected. These are the people who come to clinical attention most readily. However, many conditions exhibit a wide range of severity, including individuals who are so mildly affected that they would not be clinically identified through symptoms alone. Screening the entire population identifies these mild cases as well as the more classic cases. This can be a significant benefit to those trying to understand the condition and to those who might otherwise go undiagnosed with a treatable condition. But it also can be a burden to some individuals who are labeled as "having a disease" despite having such a mild form that they were never destined to be ill. Screening programs can identify many more individuals who are affected with a condition in a population than were recognized prior to screening. (37) A third advantage of screening from a research perspective is that enough children can be identified across the population to enable recruitment into clinical trials to formally test new treatments.
All of these potential advantages provided by screening with respect to research are unassailable, except for the fact that the research infrastructure does not currently exist to realize such benefits. There is no national registry or system in place to acquire data on children with conditions identified through NBS, nor systems to enable recruitment into controlled trials. Children are being identified with these rare conditions but for the most part the clinical data remain locked within their medical records in offices scattered across the country. Currently the Eunice Kennedy Shriver National Institute of Child Health and Human Development at the NIH is planning a national network of research centers that may provide exactly this type of research support for newborn screening. But this network will not be in place for several years. Therefore it is disingenuous to argue that new newborn screening tests are justified today based on the opportunities for research. Efforts to develop such an infrastructure should take priority over the expansion of NBS to include conditions for which there is not clear evidence of benefit to the children. (7)
IV. THE COSTS AND RISKS OF NEWBORN SCREENING
Policy decisions need to be made in the light of a full assessment of the benefits, costs, and risks of the options available. This discussion has focused on the various benefits potentially provided by newborn screening. The risks of NBS are primarily two-fold: false positive results and results of unknown clinical significance. False positive results occur with every screening program due to the inability of any test in a human system to function flawlessly. Tests will have a sensitivity that is less than 100% (the ability to identify those with the condition) and a specificity that is less than 100% (the ability to identify those without the condition). When the specificity of a test is not perfect, an unaffected individual will occasionally produce a positive test result--a false positive. Initial positive screening tests are usually followed by additional testing to separate the true positive from the false positive results. For uncommon conditions, it is typical for a test to yield 10 to 50 false positive results for every true positive. That is, the "positive predictive value" of newborn screening tests is roughly 10% or less. Again, this is a feature of virtually all screening programs and is not due to any particular limitation of tests in the context of newborn screening.
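The arithmetic behind the positive predictive value can be sketched in a few lines of Python. The first function converts the ratio of false positives per true positive cited above into a predictive value. The second derives the same quantity from prevalence, sensitivity, and specificity. The function names and the prevalence, sensitivity, and specificity figures in the example are hypothetical values chosen only for illustration; they are not drawn from any screening program.

```python
def ppv_from_ratio(false_positives_per_true_positive):
    """Positive predictive value given N false positives per true positive."""
    return 1 / (1 + false_positives_per_true_positive)


def ppv_from_test(prevalence, sensitivity, specificity):
    """Positive predictive value from the characteristics of a test.

    true positives  = prevalence * sensitivity
    false positives = (1 - prevalence) * (1 - specificity)
    """
    true_pos = prevalence * sensitivity
    false_pos = (1 - prevalence) * (1 - specificity)
    return true_pos / (true_pos + false_pos)


if __name__ == "__main__":
    # The range cited in the text: 10 to 50 false positives per true positive.
    print(f"{ppv_from_ratio(10):.1%}")  # 9.1%
    print(f"{ppv_from_ratio(50):.1%}")  # 2.0%

    # Hypothetical test: a condition affecting 1 in 10,000 newborns,
    # 99% sensitivity, 99.9% specificity (illustrative assumptions).
    print(f"{ppv_from_test(1 / 10_000, 0.99, 0.999):.1%}")
```

The second calculation shows why low predictive values are a feature of screening for rare conditions rather than a defect of any one test: even with the very high specificity assumed here, unaffected infants so outnumber affected infants that most positive results are false.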
Much of the cost of many screening programs arises from the need to contact individuals with initial positive screens and conduct confirmatory testing. But the cost of false positive results is not the primary concern. The primary concern is the psychological impact of this frightening information. Parents are understandably alarmed when they are initially informed that the newborn screening test is positive. For some parents, the distress this causes does not resolve completely when further testing determines that the child is healthy. A substantial number of parents will have residual anxiety about the health of their child that can last months or years. (19)
As noted, false positive results are an expected outcome of screening programs and it is incumbent on programs to reduce the burden of this phenomenon by maximizing the quality of the tests and by providing results in a sensitive, timely and accurate manner. The point of this discussion is that the relative burden of false positive results looms larger as the benefits of screening grow smaller. For programs like PKU screening, the benefits to the affected children far outweigh the burdens to parents who initially receive false positive results. But when screening provides limited or no benefits to the child, it is possible for the burdens of the screening program to outweigh the benefits. If the benefits of screening are primarily the secondary benefits of a reduction in the diagnostic odyssey, the provision of reproductive information for parents, and the promotion of research opportunities, it is quite possible that policy makers would consider the burdens of false positive results to be excessive and not justifiable.
A second risk of screening programs is the identification of individuals with mild versions of the condition or versions of unknown clinical significance. The severity of many conditions exists along a broad spectrum. This is due to the fact that mutations in genes associated with these diseases may reduce the function of the gene by variable amounts. There also may be other genes or environmental factors that can exacerbate or ameliorate the condition. Occasionally benign conditions can mimic more severe diseases on the test results. This was a poorly understood phenomenon during the early years of PKU screening. Hyperphenylalaninemia is a condition that looks like PKU on the initial test result but does not lead to mental retardation. But if children with hyperphenylalaninemia are treated with the dietary restrictions appropriate for PKU, they can suffer significant harm. When rare conditions are poorly understood, infants with benign or mild variants may not be distinguishable from infants with more severe variants, and if they are treated aggressively on the assumption that they are classic cases of severe disease, they may be harmed as a result.
The broader point is that screening programs may have significant negative impacts on some individuals due to false positive results and due to the stigma and overtreatment of individuals with benign or mild variants of the condition. These are unfortunate outcomes to be minimized but tolerated when screening provides substantial benefits to those affected with the disease. But when the benefits of screening are unsubstantiated, these potential negative impacts demand great caution in the introduction of new screening modalities. When there are no benefits of screening to the affected child, then these negative impacts may outweigh the benefits to parents through the elimination of the diagnostic odyssey, the provision of reproductive information, and/or the promotion of research.
The final harms or costs to newborn screening programs are the opportunity costs. As we have seen, well-established screening programs like sickle cell disease function at partial efficiency. Public health programs in the US are poorly funded in many states. Given this situation, more lives may be saved by devoting full attention and resources to programs that we know can be effective rather than adding more and more tests for poorly understood conditions. It also is worth emphasizing that the US ranks about 29th in the developed world for infant mortality. (38) This ranking is not due to the relative absence of newborn screening tests, where the US ranks at or near the top. Pouring additional resources and expertise into NBS will not improve this ranking. Again, should health departments be marshalling resources to progressively expand newborn screening, or should these resources be spent in other, more productive areas for the welfare of children?
Newborn screening programs impact every child born in the US and in the developed world. Accordingly, the discussion of criteria for screening should engage a wide spectrum of experts, scholars, advocates and lay individuals. But the analyses and opinions of these debaters are only as good as the data on which policy decisions must be made. Without a better understanding of all of the hypothetical benefits and risks discussed above, debates will continue endlessly as either worthless programs proliferate or valuable programs are stymied. The first priority must be the creation of a research infrastructure that will enable the pilot introduction of new tests with the accumulation of data to define efficacy, and to identify weak elements in programs that prove to be partially effective. A central criterion for the inclusion of new tests must be that sufficient data exist on which to make an informed decision about population screening.
With respect to the newer criteria discussed above, the challenge of moving from a status as a secondary criterion to a primary criterion is the absence of informed consent. Benefits afforded to parents do not justify mandatory screening of their children. If we wish to promote these benefits to parents as sufficient for screening, then this must be done in a context of an offer of screening to parents who will make an informed choice. If we value the efficiency of a mandatory approach to screening, then substantial benefit to the child must remain a central criterion for new programs.
(1) AAP Newborn Screening Task Force, Serving the Family from Birth to the Medical Home: Newborn Screening: A Blueprint for the Future A Call for a National Agenda on State Newborn Screening Programs, 106 PEDIATRICS 389 (2000).
(2) Bradford L. Therrell & John Adams, Newborn Screening in North America, 30 J. INHERITED METABOLIC DISEASE 447, 452 (2007).
(3) U.S. GEN. ACCOUNTING OFFICE, NEWBORN SCREENING: CHARACTERISTICS OF STATE PROGRAMS 22 (2003), www.gao.gov/cgi-bin/getrpt?GAO-03-449 (last visited Jan. 29, 2009).
(4) Nancy Press & Ellen Wright Clayton, Genetics and Public Health: Informed Consent Beyond the Clinical Encounter, in GENETICS AND PUBLIC HEALTH IN THE 21ST CENTURY: USING GENETIC INFORMATION TO IMPROVE HEALTH AND PREVENT DISEASE 505, 516 (Muin J. Khoury et al. eds., 2000) ("Parents need to be involved in making choices because they can function better if they know what is going on and because they bear the consequences most directly when something adverse happens to their children.").
(5) Jenifer Palmer, Omaha Court Case Widens From Screening Test to Baby's Meals, OMAHA WORLD-HERALD, Oct. 13, 2007, available at http://www.omaha.com/index.php?u_page=2798&u._sid=10157077.
(6) J. M. G. WILSON & G. JUNGNER, WORLD HEALTH ORG., PRINCIPLES AND PRACTICE OF SCREENING FOR DISEASE (1968), available at http://www.who.int/bulletin/volumes/86/4/07-050112bp.pdf.
(7) Anne Andermann et al., Revisiting Wilson and Jungner in the Genomic Age: A Review of Screening Criteria Over the Past 40 Years, 86 BULL. OF THE WHO 317 (2008).
(8) R. J. Pollitt, Introducing New Screens: Why Are We All Doing Different Things?, 30 J. INHERITED METABOLIC DISEASE 423, 423 (2007).
(9) U.S. Preventive Services Task Force, Screening for Breast Cancer, http://www.ahrq.gov/clinic/3rduspstf/breastcancer/brcanrr.htm (last visited Jan. 11, 2009).
(10) U.S. Preventive Services Task Force, Screening for Prostate Cancer, http://www.ahrq.gov/clinic/uspstf08/prostate/prostatesum.htm#contents (last visited Jan. 11, 2009).
(11) U.S. Preventive Services Task Force, Screening for High Blood Pressure, http://www.ahrq.gov/clinic/uspstf/uspshype.htm (last visited Jan. 11, 2009).
(12) U.S. Preventive Services Task Force, Screening for Cervical Cancer, http://www.ahrq.gov/clinic/uspstf/uspscerv.htm (last visited Jan. 11, 2009).
(13) U.S. Preventive Services Task Force, Screening for Iron Deficiency Anemia, http://www.ahrq.gov/clinic/uspstf/uspsiron.htm (last visited Jan. 11, 2009).
(14) U.S. Preventive Services Task Force, Screening for Elevated Blood Lead Levels in Children and Pregnant Women, http://www.ahrq.gov/clinic/uspstf/uspslead.htm (last visited Jan. 11, 2009) (concluding that "evidence is insufficient to recommend for or against routine screening for elevated blood lead levels in asymptomatic children aged 1 to 5 who are at increased risk.... [And recommending] against routine screening for elevated blood lead levels in asymptomatic children aged 1 to 5 years who are at average risk.").
(15) Centers for Disease Control & Prevention, Update: Newborn Screening for Sickle Cell Disease--California, Illinois, and New York, 1998, 49 MORBIDITY & MORTALITY WKLY. REP. 729, 729 (2000); accord Colin M. Sox et al., Provision of Pneumococcal Prophylaxis for Publicly Insured Children with Sickle Cell Disease, 290 JAMA 1057, 1061 (2003) (finding that children who should receive prophylaxis "were dispensed so little prophylactic medication").
(16) Jeffrey R. Botkin, Research for Newborn Screening: Developing a National Framework, 116 PEDIATRICS 862, 863 (2005).
(17) See generally James E. Haddow & Glenn E. Palomaki, ACCE: A Model Process for Evaluating Data on Emerging Genetic Tests, in HUMAN GENOME EPIDEMIOLOGY: A SCIENTIFIC FOUNDATION FOR USING GENETIC INFORMATION TO IMPROVE HEALTH AND PREVENT DISEASE 217 (Muin J. Khoury et al. eds., 2004).
(18) Beth A. Tarini et al., State Newborn Screening in the Tandem Mass Spectrometry Era: More Tests, More False-Positive Results, 118 PEDIATRICS 448, 450-54 (2006).
(19) Michael S. Watson et al., Newborn Screening: Toward a Uniform Screening Panel and System, 8 GENETICS IN MEDICINE 1S, 2S (2006).
(20) See Virginia A. Moyer et al., Expanding Newborn Screening: Process, Policy, and Priorities, HASTINGS CENTER REP., May-June 2008, at 32, 33 (criticizing the ACMG because "[t]he original ACMG process did not conform to contemporary standards of evidence-based decision-making"); see also Michael S. Watson et al., Newborn Screening: Toward a Uniform Screening Panel and System, 8 GENETICS IN MEDICINE 1S, 2S (2006).
(21) See generally Ellen Wright Clayton, Incidental Findings in Genetics Research Using Archived DNA, 36 J.L. MED. & ETHICS 286 (2008) (discussing ethical dilemmas that doctors face following DNA testing).
(22) See generally Pollitt, supra note 8.
(23) Carlo Dionisi-Vici et al., "Classical" Organic Acidurias, Propionic Aciduria, Methylmalonic Aciduria and Isovaleric Aciduria: Long-term Outcome and Effects of Expanded Newborn Screening Using Tandem Mass Spectrometry, 29 J. INHERITED METABOLIC DISEASE 383, 383 (2006).
(24) Philip M. Farrell et al., Early Diagnosis of Cystic Fibrosis Through Neonatal Screening Prevents Severe Malnutrition and Improves Long-Term Growth, 107 PEDIATRICS 1 (2001).
(25) See Scott D. Grosse et al., Newborn Screening for Cystic Fibrosis: Evaluation of Benefits and Risks and Recommendations for State Newborn Screening Programs, MORBIDITY & MORTALITY WKLY. REP.: RECOMMENDATIONS & REP., Oct. 15, 2004, at 1 passim.
(26) William G. Woods et al., Screening of Infants and Mortality Due to Neuroblastoma, 346 NEW ENG. J. MED. 1041, 1045 (2002) ("It is more likely that screening has no effect on mortality due to neuroblastoma."); Freimut H. Schilling et al., Neuroblastoma Screening at One Year of Age, 346 NEW ENG. J. MED. 1047, 1052 (2002) ("Our findings do not support mass screening for neuroblastoma at one year of age.").
(27) Duane Alexander & Peter C. van Dyck, A Vision of the Future of Newborn Screening, 117 PEDIATRICS S350, S352 (2006).
(28) Donald B. Bailey Jr., Debra Skinner, & Steven F. Warren, Newborn Screening for Developmental Disabilities: Reframing Presumptive Benefit, 95 AM. J. PUB. HEALTH 1889, 1889 (2005).
(29) Lainie Friedman Ross, Screening for Conditions That Do Not Meet the Wilson and Jungner Criteria: The Case of Duchenne Muscular Dystrophy, 140A AM. J. MED. GENETICS 914, 915-16 (2006).
(30) Frank J. Accurso, Marci K. Sontag, & Jeffrey S. Wagener, Complications Associated with Symptomatic Diagnosis in Infants with Cystic Fibrosis, 147 J. PEDIATRICS S37, S38-S39 (2005).
(31) See Philip M. Farrell et al., Bronchopulmonary Disease in Children with Cystic Fibrosis After Early or Delayed Diagnosis, 168 AM. J. RESPIRATORY & CRITICAL CARE MED. 1100 (2003).
(32) Mary Ann Baily & Thomas H. Murray, Ethics, Evidence, and Cost in Newborn Screening: Would Resources Spent on Screening be Better Spent Elsewhere?, 38 HASTINGS CENTER REP. 23, 28-29 (2008).
(33) Tracy Dudding et al., Reproductive Decisions After Neonatal Screening Identifies Cystic Fibrosis, 82 ARCHIVES DISEASE CHILDHOOD FETAL NEONATAL EDITION F124, F125 (2000).
(34) Elaine H. Mischler et al., Cystic Fibrosis Newborn Screening: Impact on Reproductive Behavior and Implications for Genetic Counseling, 102 PEDIATRICS 44, 44 (1998).
(35) American College of Obstetricians and Gynecologists Committee on Genetics, Update on Carrier Screening for Cystic Fibrosis, 106 OBSTETRICS & GYNECOLOGY 1465 (2005).
(36) Bridget Wilcken, Letter to the Editor, Community-Wide Screening for Cystic Fibrosis Carriers Could Replace Newborn Screening for the Diagnosis of Cystic Fibrosis, 44 J. PEDIATRICS & CHILD HEALTH 232, 232 (2008).
(37) Bridget Wilcken et al., Screening Newborns for Inborn Errors of Metabolism by Tandem Mass Spectrometry, 348 NEW ENG. J. MED. 2304, 2308 (2003).
(38) NAT'L CTR. FOR HEALTH STATISTICS, HEALTH, UNITED STATES, 2007 WITH CHARTBOOK ON TRENDS IN THE HEALTH OF AMERICANS 172 (2007), available at http://www.cdc.gov/nchs/data/hus/hus07.pdf.
Jeffrey R. Botkin, M.D., M.P.H., Professor of Pediatrics and Medical Ethics, Associate Vice President for Research, University of Utah School of Medicine.
TABLE I. WILSON AND JUNGNER SCREENING CRITERIA
1. The condition sought should be an important health problem
2. There should be an acceptable treatment for patients with recognized disease
3. Facilities for diagnosis and treatment should be available
4. There should be a recognizable latent or early symptomatic stage
5. There should be a suitable test or examination
6. The test should be acceptable to the population
7. The natural history of the condition, including development from latent to declared disease, should be adequately understood
8. There should be an agreed policy on whom to treat as patients
9. The cost of case-finding (including diagnosis and treatment of patients diagnosed) should be economically balanced in relation to possible expenditure on medical care as a whole
10. Case-finding should be a continuing process and not a "once and for all" project

TABLE II: ACMG CRITERIA FOR NEWBORN SCREENING
(Each criterion is followed by its categories and the score assigned to each.)

Incidence of condition
    >1:5,000        100
    >1:25,000       75
    >1:50,000       50
    >1:75,000       25
    <1:100,000      0

Signs & symptoms clinically identifiable in the first 48 hours
    Never           100
    <25% of cases   75
    <50% of cases   50
    <75% of cases   25
    Always          0

Burden of disease if untreated (natural history if untreated)
    Profound        100
    Severe          75
    Moderate        50
    Mild            25
    Minimal         0

Does a sensitive AND specific screening test currently exist?
    YES             200
    NO              0

Test characteristics (Yes = apply score; No = zero)
    Doable in neonatal blood spots OR by a simple, in-nursery physical method    100
    High throughput (>200/day/FTE)                                               50
    Overall analytical cost < $1 per test per condition                          50
    Multiple analytes relevant to one condition are detected in same run         50
    Other conditions identified by same analytes                                 50
    Multiple conditions detected by same test (multiplex platform)               200

Availability of treatment
    Treatment exists and is widely available in most communities    50
    Treatment exists but availability is limited                    25
    No treatment available or necessary                             0

Cost of treatment
    Inexpensive                            50
    Expensive (>$50,000/patient/year)      0

Potential efficacy of existing treatment
    To prevent ALL negative consequences     200
    To prevent MOST negative consequences    100
    To prevent SOME negative consequences    50
    Treatment efficacy not proven            0

Benefits of early intervention (INDIVIDUAL OUTCOME)
    Clear scientific evidence that intervention resulting from screening optimizes outcome        200
    Some scientific evidence that early intervention resulting from screening optimizes outcome   100
    No scientific evidence that early intervention resulting from screening optimizes outcome     0

Benefits of early identification (FAMILY & SOCIETY)
    Early identification maximizes benefits (education, understanding prevalence and natural history, cost effectiveness)    100
    Early intervention improves benefits    50
    No evidence of benefits                 0

Early diagnosis and treatment prevent mortality
    YES             100
    NO              0

Diagnostic confirmation
    Providers of diagnostic confirmation are widely available          100
    Limited availability of providers of diagnostic confirmation       50
    Diagnostic confirmation is available only in a few centers         0

Clinical management
    Providers of acute management are widely available          100
    Limited availability of providers of acute management       50
    Acute management is available only in a few centers         0

Simplicity of therapy
    Management at the primary care or family level       200
    Requires periodic involvement of a specialist        100
    Requires regular involvement of a specialist         0
|Author:||Botkin, Jeffrey R.|
|Date:||Jan 1, 2009|