Comparing aberration detection methods with simulated data.


We compared aberration detection methods requiring historical data to those that require little background by using simulated data. Methods that require less historical data are as sensitive and specific as those that require 3-5 years of data. These simulations can determine which method produces appropriate sensitivity and specificity.

**********

The Early Aberration Reporting System (EARS) was developed to allow analysis of public health surveillance data. Several alternative aberration detection methods are available to state and local health departments for syndromic surveillance. Before 2001, most statistical aberration detection methods required at least 5 years of background data (1-6). However, with the release of Bacillus anthracis in the U.S. mail shortly after the September 11, 2001, World Trade Center attacks, substantial interest has emerged in public health tools that could be rapidly implemented without requiring years of background data. Newly developed nonhistorical aberration detection methods can require as little as 1 week of data to begin analysis, although they have not been extensively evaluated against traditional historical methods (7,8).

The objective of our study was to determine the sensitivity, specificity, and time to detection of 3 methods that require <3 years of historical baseline data, C1-MILD (C1), C2-MEDIUM (C2), and C3-ULTRA (C3), and compare the results with those of 2 methods that require 5 years of historical baseline, the historical limits method (2) and the seasonally adjusted cumulative sum (CUSUM) (5), based on simulated data. Simulated data were used to avoid some of the interpretation difficulties that can come from making these comparisons on the basis of empirically observed, natural disease data. All 5 of these methods are components of EARS (7).

The Study

The methods C1, C2, and C3 were named according to their degree of sensitivity, with C1 being the least sensitive and C3 the most sensitive. All 3 methods are based on a positive 1-sided CUSUM calculation. For C1 and C2, the CUSUM threshold reduces to the mean plus 3 standard deviations (SD). The mean and SD for the C1 calculation are based on information from the past 7 days. The mean and SD for the C2 and C3 calculations are based on information from 7 days, ignoring the 2 most recent days. These methods take into consideration daily variation because the mean and SD used by the methods are based on a week's information. These methods also take seasonality into consideration because the mean and SD are calculated in the same season as the data value in question.
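The C1 and C2 rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the EARS implementation: the function name c1_c2_flags is hypothetical, the sample SD is assumed, and days without a full 7-day baseline are simply left unflagged. C3 is not included because the text does not give its threshold.

```python
from statistics import mean, stdev


def c1_c2_flags(counts, method="C1", window=7, n_sd=3.0):
    """Flag days whose count exceeds the baseline mean + n_sd * SD.

    C1 takes its baseline from the `window` days just before the current day.
    C2 uses a window of the same length but ignores the 2 most recent days.
    (Hypothetical helper; sample SD is an assumption.)
    """
    lag = 0 if method == "C1" else 2  # C2 skips the 2 most recent days
    flags = []
    for t in range(len(counts)):
        start = t - lag - window
        if start < 0:
            flags.append(False)  # not enough background data yet
            continue
        baseline = counts[start:t - lag]
        threshold = mean(baseline) + n_sd * stdev(baseline)
        flags.append(counts[t] > threshold)
    return flags


daily_counts = [5, 6, 4, 5, 7, 5, 6, 5, 5, 30]
print(c1_c2_flags(daily_counts, "C1"))
print(c1_c2_flags(daily_counts, "C2"))
```

Because C2 leaves a 2-day gap between the baseline and the current day, the first days of a slowly building outbreak are less likely to inflate its own threshold, which is consistent with C2 being described as more sensitive than C1.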

Since 1989, results from the historical limits method have been used to produce Figure 1 in the Morbidity and Mortality Weekly Report. This method compares the number of reported cases in the 4 most recent time periods for a given health outcome with historical incidence data on the same outcome from the preceding 5 years; the method is based on comparing the ratio of current reports with the historical mean and SD. The historical mean and SD are derived from 15 totals of 3 intervals (including the same 4 periods, the preceding 4 periods, and the subsequent 4 periods over the preceding 5 years of historical data).
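A minimal Python sketch of the historical limits comparison follows. It assumes the 15 historical 4-period totals have already been computed, and it assumes an upper limit of 1 + 2 SD/mean on the ratio scale; that 2-SD limit and the function name historical_limits_flag are assumptions for illustration and are not stated in this article.

```python
from statistics import mean, stdev


def historical_limits_flag(current_total, historical_totals, n_sd=2.0):
    """Compare the current 4-period total with 15 historical totals.

    historical_totals: 3 intervals (preceding, same, subsequent) for each of
    the 5 preceding years. Returns (ratio, flagged).
    The n_sd=2.0 limit is an assumed value, and the mean must be positive.
    """
    if len(historical_totals) != 15:
        raise ValueError("expected 15 historical totals (3 intervals x 5 years)")
    mu = mean(historical_totals)
    sigma = stdev(historical_totals)
    ratio = current_total / mu
    upper_limit = 1 + n_sd * sigma / mu  # assumed limit on the ratio scale
    return ratio, ratio > upper_limit


history = [100] * 5 + [110] * 5 + [90] * 5
print(historical_limits_flag(130, history))
print(historical_limits_flag(105, history))
```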

[FIGURE 1 OMITTED]

The seasonally adjusted CUSUM method is based on the positive 1-sided CUSUM where the count of interest is compared to the 5-year mean and the 5-year SD for that period. The seasonally adjusted CUSUM was originally applied to laboratory-based Salmonella serotype data.
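The positive 1-sided CUSUM idea can be illustrated with a short Python sketch. It assumes each count comes paired with the 5-year mean and SD for the same period; the reference value k, the decision threshold h, and the function name positive_cusum_flags are hypothetical choices, since the article does not specify them.

```python
def positive_cusum_flags(counts, hist_means, hist_sds, k=0.5, h=2.0):
    """Positive 1-sided CUSUM on standardized counts.

    Each count is standardized with the historical mean and SD for its
    period; only upward deviations accumulate. k and h are assumed values.
    """
    cusum = 0.0
    flags = []
    for x, mu, sd in zip(counts, hist_means, hist_sds):
        z = (x - mu) / sd if sd > 0 else 0.0  # skip periods with zero SD
        cusum = max(0.0, cusum + z - k)       # never drops below zero
        flags.append(cusum > h)
    return flags


print(positive_cusum_flags([10, 10, 14, 16, 10], [10] * 5, [2] * 5))
```

Because the running sum carries over from one period to the next, a flag can persist after counts return to baseline, as on the last day of this example.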

To calculate sensitivity, specificity, and time to detection, all 5 detection methods of EARS were used to independently analyze 56,000 sets of artificially generated case-count data based on 56 sets of parameters. These 56 sets of parameters each generated 1,000 iterations of 6 years of daily data, 1994-1999, by using a negative binomial distribution with superimposed outbreaks. Means and standard deviations were based on observed values from national and local public health systems and syndromic surveillance systems. Examples of the data included national and state pneumonia and influenza data and hospital influenzalike illness. Adjustments were made for days of the week, holidays, postholiday periods, seasonality, and trend. Any 6 years could be used, but the years 1994-1999 were used to set day of the week and holiday patterns and to avoid any problems that programs might have with the year 2000. Fifty (89%) of these datasets then had outbreaks superimposed throughout the data. Three types of outbreaks were used, each representing various types of naturally occurring events: log normal, a rapidly increasing outbreak; inverted log normal, a slowly starting outbreak; and a single-day spike. These types of outbreaks were combined with different SDs and incubation times to create 10 different types of outbreaks that had equal probability of being included in the simulated data. A year of final simulated data can be seen in the Figure, with original data and outbreaks that were added. As a result of these analyses, the statistically marked aberrations, or flags, produced by the 5 detection methods were evaluated for their specificity, sensitivity, and time to detection. These data can be obtained at http://www.bt.cdc.gov/surveillance/ears/datasets.asp.
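The general shape of such a simulation can be sketched in Python using only the standard library. This is a rough illustration, not the generator used in the study: it draws baseline counts from a negative binomial distribution (as a gamma-Poisson mixture) and superimposes one log normal outbreak, and it omits the adjustments for day of week, holidays, seasonality, and trend. All function names and the log normal parameters are hypothetical.

```python
import math
import random


def poisson(rng, lam):
    """Poisson draw by Knuth's multiplication method (suited to small lam)."""
    if lam <= 0:
        return 0
    limit = math.exp(-lam)
    k, product = 0, rng.random()
    while product > limit:
        k += 1
        product *= rng.random()
    return k


def negative_binomial(rng, mean, sd):
    """Negative binomial draw with the given mean and SD (needs SD^2 > mean)."""
    variance = sd * sd
    if variance <= mean:
        raise ValueError("negative binomial needs variance > mean")
    shape = mean * mean / (variance - mean)  # dispersion parameter
    scale = mean / shape                     # shape * scale == mean
    return poisson(rng, rng.gammavariate(shape, scale))


def simulate(n_days, mean, sd, outbreak_start, outbreak_cases, seed=0):
    """Return (baseline, added, total) daily counts with one outbreak."""
    rng = random.Random(seed)
    baseline = [negative_binomial(rng, mean, sd) for _ in range(n_days)]
    added = [0] * n_days
    for _ in range(outbreak_cases):
        # Log normal onset delay in days; mu=1.0, sigma=0.5 are assumed values.
        day = outbreak_start + int(rng.lognormvariate(1.0, 0.5))
        if day < n_days:
            added[day] += 1
    total = [b + a for b, a in zip(baseline, added)]
    return baseline, added, total


baseline, added, total = simulate(365, 10.0, 5.0, 100, 40, seed=1)
print(total[98:110])
```

Keeping the baseline and the added cases in separate lists makes it straightforward to know exactly which days belong to an outbreak when the detection methods are scored.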

In our study, sensitivity was defined as the number of outbreaks in which ≥1 day was flagged, divided by the total number of outbreaks in the data. An outbreak was defined as a period of consecutive days in which varying numbers of aberrant cases were added to the baseline number of cases. An outbreak had days before and after it when no aberrant cases were added to the baseline case counts. Specificity was defined as the total number of days that did not contain aberrant cases (and that were not flagged), divided by the total number of days that did not contain aberrant cases. Based on these definitions, actual values for sensitivity and specificity were calculated.
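These two definitions translate directly into code. The following Python sketch assumes flags is a list with one boolean per day and outbreaks is a list of inclusive (start, end) day indices; both representations, and the function name, are hypothetical conveniences for illustration.

```python
def sensitivity_specificity(flags, outbreaks):
    """Compute outbreak-level sensitivity and day-level specificity.

    flags: list of bool, one per day.
    outbreaks: list of (start, end) day indices, inclusive.
    Returns None for a measure whose denominator is zero.
    """
    outbreak_days = set()
    detected = 0
    for start, end in outbreaks:
        days = range(start, end + 1)
        outbreak_days.update(days)
        if any(flags[d] for d in days):  # at least 1 flagged day
            detected += 1
    quiet_days = [d for d in range(len(flags)) if d not in outbreak_days]
    unflagged_quiet = sum(1 for d in quiet_days if not flags[d])
    sensitivity = detected / len(outbreaks) if outbreaks else None
    specificity = unflagged_quiet / len(quiet_days) if quiet_days else None
    return sensitivity, specificity


flags = [False, True, False, False, True, False, False, False, False, False]
print(sensitivity_specificity(flags, [(3, 5), (7, 8)]))  # (0.5, 0.8)
```

In this example, 1 of 2 outbreaks has a flagged day, and 4 of the 5 days without aberrant cases are unflagged.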

Time to detection was defined as the number of complete days that occurred between the beginning of an outbreak and the first day the outbreak was flagged. For example, if a method flags an outbreak on the first day, its time to detection is 0. Likewise, if it flags on the second day, its time to detection is 1, and so on. Time to detection is an average of the times to detection for each outbreak and dataset. Only outbreaks that were flagged on at least 1 day were included in the average. Therefore, sensitivity is needed to completely interpret time to detection. We calculated 2-sided 95% confidence values, and they were relatively small and consistent.
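The time-to-detection definition can be sketched the same way. As before, the representation of flags (one boolean per day) and outbreaks (inclusive start and end day indices) and the function name are assumptions for illustration; undetected outbreaks are left out of the average, as described above.

```python
def mean_time_to_detection(flags, outbreaks):
    """Average number of complete days from outbreak start to first flag.

    Outbreaks with no flagged day are excluded from the average.
    Returns None when no outbreak was detected.
    """
    delays = []
    for start, end in outbreaks:
        for day in range(start, end + 1):
            if flags[day]:
                delays.append(day - start)  # 0 if flagged on the first day
                break
    return sum(delays) / len(delays) if delays else None


flags = [False, True, False, False, True, False, False, False, False, False]
print(mean_time_to_detection(flags, [(3, 5), (7, 8)]))  # 1.0
```

Only the first outbreak is detected here, on its second day, so the average is 1.0 even though half of the outbreaks were missed; this is why the measure has to be read alongside sensitivity.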

Overall, the CUSUM methods (the seasonally adjusted CUSUM, C1, C2, and C3) had similar times to detection, but their sensitivities varied (Table). Specifically, C1, C2, and C3 showed increasing sensitivity from 60% to 71% to 82%, respectively. The seasonally adjusted CUSUM and C3 methods had similar sensitivities, 82.5% and 82.3%, respectively, but C3 had a higher specificity (95.4% vs. 88.7%). The historical limits and C1 and C2 methods showed varying sensitivities (44%-71%), with C1 and C2 having the highest, but all demonstrated similar specificities (96%-97%).

When results were stratified by outbreak type, 1-day outbreaks (i.e., spikes) exhibited the lowest sensitivities. Analysis was broken down by dataset and outbreak type (online Appendix Tables 1 and 2, available at http://www.cdc.gov/ncidod/EID/vol11no02/04-0587_appl.htm and http://www.cdc.gov/ncidod/EID/vol11no02/04-0587_app2.htm).

For the 6 datasets that contained noise but no outbreaks, sensitivity and time to detection could not be calculated. The overall specificities for the seasonally adjusted CUSUM, historical limits, C1, C2, and C3 were 88.7%, 98.3%, 97.2%, 97.2%, and 95.2%, respectively. The specificity for these 6 datasets was consistent with general results. The historical limits method showed superior specificity in all but the last dataset.

Conclusions

These simulations demonstrate that the methods for aberration detection that require little baseline data, C1, C2, and C3, are as sensitive and specific as the historical limits and seasonally adjusted CUSUM methods. As expected, C1, C2, and C3 showed increasing sensitivities in accordance with their intended sensitivity levels (C1 being the least sensitive, C3 being the most), but with decreasing specificities as sensitivity increases. Seasonally adjusted CUSUM and the historical limits method also showed sensitivities and specificities as expected, with the seasonally adjusted CUSUM having the lower specificity and higher sensitivity. These findings emphasize the effectiveness of aberration detection methods without requiring long-term historical data as a baseline.

Since the 10 simulated outbreaks were randomly generated by using consistent rates, the sensitivity, specificity, and time to detection could be stratified by dataset and outbreak type. The results of these analyses were largely congruent with the expected findings, with some variations. The simulated datasets are designed for public health officials to select a dataset that best reflects their data of interest or the type of outbreak they are anticipating to determine which method provides them with the sensitivity and specificity they would find useful. The simulated datasets can also be used to make comparisons with other methods.

The aberration detection methods C1, C2, and C3 are used in several states, counties, and local public health departments. Public health departments are able to apply these methods to data sources that do not have long periods of baseline data. Public health departments are also able to apply 1 set of methods they understand to various types of diseases, covering different frequencies and seasonalities.

The C1, C2, and C3 methods have detected outbreaks of public health interest, including West Nile disease and the start of the influenza season. C1, C2, and C3 demonstrate consistency over the various situations represented in these simulations. Other aberration detection methods exist, as do other simulated datasets. The simulated datasets presented in this paper cover a larger variety of types of data that might be expected in public health. These simulated datasets also include enough past years of data so that methods that require 5 years of historical information can also be used in the comparisons. These simulations provide a method to fairly compare other methods among themselves and to the methods included in EARS.

The simulations were based on means and SDs to help determine which method performs better under which circumstances. When deciding which method to use, the potential user should base the decision on the sensitivity or specificity or the time to detection.

A potential limitation is that the method for calculating average times to detection disregards undetected outbreaks. Therefore, times to detection should not be considered without also taking into account the sensitivity. However, this method was preferred over the alternative of assigning arbitrary numbers of days to detection for outbreaks that were not detected since the alternative method could lead to misinterpretation of the data. Another limitation is that the artificial datasets may not fully reproduce the nuances of natural disease occurrences. While approximations, the simulated data were generated based on naturally observed data and included variations for trend over time, days of the week, seasons, and holidays. Therefore, while these comparisons represent relative sensitivities, specificities, and times to detection, we do not know whether results using naturally occurring data would be consistent.

The results of this study suggest that the EARS historical methods do not have a strong advantage when compared with nonhistorical methods; in fact, the lack of historical data does not impair the EARS outbreak detection methods. This study also demonstrates the effectiveness of artificial outbreak data in comparing and evaluating outbreak detection methods. As aberration detection methods are increasingly being used by state and local health departments to monitor for naturally occurring outbreaks and bioterror events, this study contributes to the quest to determine the most efficient method for analyzing surveillance data.

Ms. Hutwagner works with the Bioterrorism Preparedness and Response Program at the Centers for Disease Control and Prevention on developing aberration detection methods for their national "drop-in surveillance" system and ongoing syndromic surveillance. She has been implementing these methods at various sites in the United States and internationally.
Table. By method, overall sensitivity and specificity and time to detection

Type of method                   Name                        Sensitivity (%)  Specificity (%)  Time to detection (d)*

Historical methods               Seasonally adjusted CUSUM†       82.5             88.7               1.272
(at least 5 y historical data)   Historical limits‡               43.9             96.3               2.942

Nonhistorical methods            C1-MILD§                         60.1             97.0               1.122
(<3 y historical data)           C2-MEDIUM¶                       71.2             97.0               1.319
                                 C3-ULTRA**                       82.3             95.4               1.307

* Time to detection must be interpreted with sensitivity because time to detection does not include missed outbreaks.

† The seasonally adjusted CUSUM method sums the positive differences of the current value from the mean for a period similar to the current value over 5 years.

‡ The historical limits method compares the current sum of 4 time periods to the mean of the sum of 15 totals of 4 time periods surrounding the current point of interest over 5 years.

§ The C1-MILD method is based on CUSUM, but the calculations reduce to the current value being greater than the mean plus 3 standard deviations (SD), with the mean and SD based on the past 7 days.

¶ The C2-MEDIUM method is based on CUSUM, but the calculations reduce to the current value being greater than the mean plus 3 SD, with the mean and SD based on the past 7 days shifted by 2 days.

** The C3-ULTRA method is based on CUSUM, summing the positive difference of the current value from the mean for 3 days, with the mean and SD based on the past 7 days shifted by 2 days.


References

(1.) Teutsch SM, Churchill RE, editors. Principles and practice of public health surveillance. New York: Oxford University Press; 2000.

(2.) Stroup DF, Williamson GD, Herndon JL, Karon J. Detection of aberrations in the occurrence of notifiable diseases surveillance data. Stat Med. 1989;8:323-9.

(3.) Farrington CP, Andrews NJ, Beale AD, Catchpole MA. A statistical algorithm for the early detection of outbreaks of infectious disease. J R Stat Soc Ser A Stat Soc. 1996;159:547-63.

(4.) Simonsen L, Clarke JM, Stroup DF, Williamson GD, Arden NH, Cox NJ. A method for timely assessment of influenza-associated mortality in the United States. Epidemiology. 1997;8:390-5.

(5.) Hutwagner LC, Maloney EK, Bean NH, Slutsker L, Martin SM. Using laboratory-based surveillance data for prevention: an algorithm for detecting salmonella outbreaks. Emerg Infect Dis. 1997;3:395-400.

(6.) Stern L, Lightfoot D. Automated outbreak detection: a quantitative retrospective analysis. Epidemiol Infect. 1999;122:103-10.

(7.) Hutwagner L, Thompson W, Seeman GM, Treadwell T. The bioterrorism preparedness and response Early Aberration Reporting System (EARS). J Urban Health. 2003;80:i89-96.

(8.) Hutwagner L, Thompson W, Groseclose S, Williamson GD. An evaluation of alternative methods for detecting aberrations in public health surveillance data. American Statistical Association, Joint Statistical Meetings, Proceedings of the Biometrics Section. Indianapolis; 2000 Aug. p. 82-5.

Address for correspondence: Lori Hutwagner, Centers for Disease Control and Prevention, 1600 Clifton Rd, Mailstop C18, Atlanta, GA 30333, USA; fax: 404-639-0382; email: lhutwagner@cdc.gov

Lori Hutwagner, * Timothy Browne, * G. Matthew Seeman, * and Aaron T. Fleischauer *

* Centers for Disease Control and Prevention, Atlanta, Georgia, USA
COPYRIGHT 2005 U.S. National Center for Infectious Diseases
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2005, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Dispatches
Author:Fleischauer, Aaron T.
Publication:Emerging Infectious Diseases
Geographic Code:1USA
Date:Feb 1, 2005
Words:2527