Feature construction can improve diagnostic criteria for high-dimensional metabolic data in newborn screening for medium-chain Acyl-CoA dehydrogenase deficiency.

In neonatal screening, the spectrum of detectable diseases has been greatly expanded by the use of tandem mass spectrometry (TMS) [4], introduced in the 1990s. TMS allows the analysis of multiple amino acids and acylcarnitines in 1 analytical step. The analyte concentrations--often called simply analytes--serve the diagnosis of amino acid disorders, organic acidurias, and fatty-acid oxidation disorders, but their interpretation is complex and requires expertise. Diagnostic criteria published by different working groups are the bases for diagnosis.

Such diagnostic criteria are conventionally derived from knowledge of metabolic pathways. Potentially promising markers are selected manually and then tested against patient and control groups to maximize diagnostic accuracy. For some diseases, it has already been demonstrated that sensitivity and specificity can be improved by taking into account analyte combinations rather than single analytes alone. For example, for medium-chain acyl-CoA dehydrogenase deficiency (MCADD), the ratios of acylcarnitines C8:C10, C8:C6, C8:C2, and C8:C12 have proven useful (1-5).

The information available for diagnosis of a disease is normally spread over different analytes, each of which may be indicative of different aspects of the disease. Combining the information from different analytes by the computation of ratios, sums, or differences can therefore lead to higher discriminatory performance of the resulting new markers. Other reasons for potential improvement may be hidden interactions or relations between analytes. Such information can be expressed explicitly and made available for diagnosis.

These effects can help to improve classification models constructed by routine data-mining algorithms. The input for such learning algorithms is a data set, in the form of a collection of instances with assigned class labels. In our study the class labels were control and MCADD. The instances are represented by a set of features, such as patient data or laboratory results.

Finding a good representation of the data for the learning algorithms, in the form of suitable features, is known to have a strong influence on the performance of the resulting classifiers. This task is addressed by feature construction and feature transformation techniques, which are directed at optimizing the data representation for subsequent data-driven learning algorithms (6-8). For some algorithms, information in the data may not be usable when it is not expressed explicitly. For example, in a recent publication on disease classification in newborn screening (9), the studied method failed to identify the analyte tyrosine as a relevant feature for diagnosis of phenylketonuria, whereas in Baumgartner et al. (10), decision-tree induction was able to do so. As shown by Chace et al. (11), tyrosine can effectively improve diagnostic accuracy when used in the phenylalanine:tyrosine ratio.

This study demonstrates that systematic, data-driven searches for suitable markers by construction and evaluation of new analyte combinations can further improve diagnostic criteria for newborn screening.

As an example, we investigated this approach for MCADD (12) using acylcarnitine concentrations measured by TMS for newborn screening and their arithmetic combinations. To demonstrate the suitability of feature construction with metabolic data of newborn screening, we compared the results of the study with previously published diagnostic criteria of other working groups and validated them with data from a 3rd-party screening laboratory.

Materials and Methods


We extracted the primary data set for the study from the neonatal screening data of the Heidelberg newborn screening center. The data were anonymized from routine newborn screening for which parents gave informed consent.

Sample preparation and analysis by electrospray ionization-TMS complied with the protocol described in Schulze et al. (13), with minor variations. The population served by the Heidelberg newborn screening center lives in the southwest of Germany. Beginning in December 2003, the Hamad Medical Corporation, the main hospital of Qatar, implemented a statewide neonatal screening program in cooperation with the neonatal screening laboratory situated in Heidelberg. German recommendations for neonatal screening dating from 2002 were applied for this program (14). We also included the data from this cooperation in the study population. In the screened population, the mean age at sampling was 2.9 days (SD 0.86). The data for the study were extracted from neonatal screening data collected from July 2002 to January 2006 for newborns 0 to 5 days old. To our knowledge, no MCADD case was missed in newborn screening during this period.

Patients were divided into 2 classes. The 1st class, MCADD, included the 30 children with proven MCADD identified through neonatal screening. The 2nd class, controls, comprised 2 groups. The 1st group included newborns not suffering from MCADD but showing increased concentrations of the primary marker octanoylcarnitine (C8) (0.28-1.7 [micro]mol/L; n = 332). The 2nd group represented the normal population with the data of randomly chosen, presumably healthy children (n = 1685). The newborns with increased C8 concentrations, although small in absolute number, were overrepresented within the data set compared with the full screening population. Thus, to optimize the specificity of resulting classifiers, we intentionally performed the study with many instances likely to be falsely classified as positives.

We randomly divided the selected data into 10 stratified partitions for cross-validation of the study (15). Retests of the same specimen were always assigned to the same partition. In each fold of the cross-validation, 9 partitions were used to create new markers and diagnostic criteria and 1 holdout partition for their validation. Results are given as sums or mean (SD) of the validation results.
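The partitioning described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the record layout `(specimen_id, class_label)`, the function name, and the round-robin assignment are assumptions made for the example. Keying the assignment on the specimen identifier keeps retests of one specimen in the same partition, and dealing each class separately keeps the partitions stratified.

```python
import random
from collections import defaultdict

def stratified_group_partitions(records, n_parts=10, seed=0):
    """Assign each specimen to one of n_parts partitions.

    records: list of (specimen_id, class_label) tuples; retests repeat the
    same specimen_id (hypothetical data layout for this sketch).
    Returns a dict mapping specimen_id -> partition index.
    """
    rng = random.Random(seed)
    by_class = defaultdict(set)
    for specimen_id, label in records:
        by_class[label].add(specimen_id)

    assignment = {}
    for label in sorted(by_class):
        ids = sorted(by_class[label])
        rng.shuffle(ids)
        # Deal the specimens of each class round-robin over the partitions.
        for i, specimen_id in enumerate(ids):
            assignment[specimen_id] = i % n_parts
    return assignment

# Hypothetical toy data: 30 MCADD specimens, 200 controls, 2 retests.
records = [(f"m{i}", "MCADD") for i in range(30)]
records += [(f"c{i}", "control") for i in range(200)]
records += [("m0", "MCADD"), ("c5", "control")]
parts = stratified_group_partitions(records)
```

In each fold, the specimens of one partition index form the holdout set and the remaining 9 partitions form the learning set.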


For the identification of new markers for MCADD diagnosis, we chose a 3-step data mining approach.

In step 1, we identified promising analyte combinations by a feature construction algorithm. The algorithm generates new numeric features from analytes, and each feature is described by an arithmetic expression composed of 2-4 analytes from Table 1 combined by the arithmetic operators +, -, and /.

We investigated 2 types of expressions, sums and divisions. In this context, a sum denoted a combination of analytes with the operators + or - (e.g., C8 + C10 or C8 - C10), and a division was an expression using analytes or sums as divisor and dividend [e.g., (C8 C10)/C2]. With these restrictions, features were constructed only with operands of the same unit, to achieve a meaningful physical dimension. To avoid undue complexity and decline in the comprehensibility of features, no other operators were used. For example, multiplication was excluded because of the difficulty in interpreting the squared unit of an acylcarnitine concentration.
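To make the two expression types concrete, the following sketch enumerates sums and divisions over a small analyte list. It is only an illustration under simplifying assumptions: the function names are hypothetical, the first term of a sum is always positive, and none of the filters or the exact search space of the published algorithm (see Text 1 in the Data Supplement) are reproduced, so the counts will not match the >628 000 expressions reported in the study.

```python
from itertools import combinations, product

def sums(analytes, max_terms):
    """Yield (analytes_used, expression) for signed sums such as 'C8 - C10'."""
    for k in range(1, max_terms + 1):
        for combo in combinations(analytes, k):
            for signs in product("+-", repeat=k - 1):
                expr = combo[0]
                for sign, name in zip(signs, combo[1:]):
                    expr += f" {sign} {name}"
                yield combo, expr

def candidate_features(analytes, max_analytes=4):
    """Yield expression strings that use 2..max_analytes distinct analytes."""
    all_sums = list(sums(analytes, max_analytes))
    # Type 1: sums of at least 2 analytes.
    for combo, expr in all_sums:
        if len(combo) >= 2:
            yield expr
    # Type 2: divisions with an analyte or a sum as dividend and divisor.
    for (num_c, num_e), (den_c, den_e) in product(all_sums, repeat=2):
        if set(num_c) & set(den_c):
            continue  # do not reuse an analyte within one feature
        if len(num_c) + len(den_c) > max_analytes:
            continue
        num = f"({num_e})" if len(num_c) > 1 else num_e
        den = f"({den_e})" if len(den_c) > 1 else den_e
        yield f"{num}/{den}"

features = list(candidate_features(["C2", "C8", "C10"], max_analytes=3))
```

Because sums only combine concentrations and a division has concentrations in both dividend and divisor, every generated feature is either a concentration or a dimensionless ratio.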

The algorithm searched >628 000 arithmetic expressions for features showing high discriminatory performance, i.e., the ability to separate the specimens known to be control or MCADD. The discriminatory performance was determined by use of [chi square] statistics, which are normally computed to test the statistical association of the class distinction (control vs MCADD) with a feature. Here, however, we used the [chi square] measures as a numerical method to rank features by their presumed discriminatory performance.
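As an illustration of ranking by a [chi square] measure, the sketch below computes the Pearson statistic for a 2 x 2 table (feature positive/negative vs MCADD/control) and sorts features by it. This is an assumption-laden simplification: the measure actually applied to numeric features is specified in Text 1 of the Data Supplement and may differ, and the feature names and counts here are invented for the example.

```python
def chi_square_2x2(tp, fp, fn, tn):
    """Pearson chi-square (no continuity correction) for a 2 x 2 table.

    tp/fn: MCADD cases with a positive/negative feature value,
    fp/tn: controls with a positive/negative feature value.
    """
    n = tp + fp + fn + tn
    denom = (tp + fp) * (fn + tn) * (tp + fn) * (fp + tn)
    if denom == 0:
        return 0.0  # a constant feature carries no class information
    return n * (tp * tn - fp * fn) ** 2 / denom

# Hypothetical counts for three features on the same 203 specimens.
counts = {
    "feature_a": (3, 0, 0, 200),   # separates the classes perfectly
    "feature_b": (3, 20, 0, 180),  # finds all cases, 20 false positives
    "feature_c": (1, 1, 1, 1),     # no association (toy table)
}
ranking = sorted(counts, key=lambda name: chi_square_2x2(*counts[name]), reverse=True)
```

For a perfectly separating binary feature the statistic equals the number of specimens, which is why the attainable maximum depends on the size of the evaluated partition.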

The algorithm applied 3 different filters to each feature to remove those with minor discriminatory performance and those that did not prove superior to possible "predecessor features" composed of a subset of their component analytes. The remaining features were added to the learning set. For further details about the feature construction algorithm, in particular the specification of searched expressions, the [chi square] measure, and the filters, see Text 1 in the Data Supplement that accompanies the online version of this article.

During step 2, the analyte concentrations and the newly constructed features were converted to binary values by application of a linear threshold function. For example, the analyte C8 might be combined with a threshold of 1 [micro]mol/L. The resulting binary feature C8 > 1 exhibited 2 possible values--true and false--which can be mapped to the classes MCADD and control. Features were by convention arithmetically transformed before applying the threshold, to guarantee that high values indicate MCADD (e.g., C4 - C8 is replaced by C8 - C4). By binarization, an adequate cutoff was embedded in the features, taking into account that false classification of a case as a control (false negative) is a much less tolerable risk than vice versa (false positive).

We investigated 2 variants of automated binarization. The 1st variant selected thresholds based on the range of values in the MCADD group of the learning set. A bootstrapping CI (BCI) for the minimum feature value in the MCADD group was computed, and the lower limit of this interval was used as threshold. Thus the low-value tail distribution of the MCADD cases and the chosen confidence probability determined the threshold and ensured a safety margin, offering protection against false-negative results on unknown cases. Implementation details of the binarization with BCI can be found in Text 2 in the online Data Supplement.
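A minimal sketch of the 1st variant follows, assuming the basic (non-Studentized pivotal) bootstrap interval that is named later in the discussion. The confidence level, the number of resamples, the function name, and the example values are assumptions for illustration; the authors' implementation details are in Text 2 of the online Data Supplement.

```python
import random

def bci_lower_limit_for_minimum(values, confidence=0.95, n_boot=2000, seed=0):
    """Lower limit of a basic bootstrap CI for the minimum of `values`."""
    rng = random.Random(seed)
    observed_min = min(values)
    # Minimum of each bootstrap resample (drawn with replacement).
    boot_minima = sorted(
        min(rng.choices(values, k=len(values))) for _ in range(n_boot)
    )
    alpha = 1.0 - confidence
    idx = min(n_boot - 1, int((1.0 - alpha / 2.0) * n_boot))
    upper_quantile = boot_minima[idx]
    # Basic interval: lower limit = 2 * estimate - upper bootstrap quantile.
    return 2.0 * observed_min - upper_quantile

# Hypothetical feature values of the MCADD cases in a learning set.
mcadd_values = [1.0, 1.1, 1.2, 1.3, 1.5, 2.0, 3.0, 4.0, 6.0, 9.0]
threshold = bci_lower_limit_for_minimum(mcadd_values)
binary_feature = [value > threshold for value in mcadd_values]
```

Bootstrap minima can never fall below the observed minimum, so the resulting threshold is at or below the smallest observed case value; the more the low tail of the cases scatters, the larger the safety margin becomes.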

The 2nd binarization variant used the 99% quantile of a presumably healthy reference group as threshold. The baseline values for the quantiles of the features in a normal population were estimated nonparametrically from a random sample of 10 000 healthy newborns extracted from the newborn screening data from July 2003 to December 2005. In the past, high percentiles of the controls have been used as cutoffs in newborn screening to avoid high false-positive rates whenever the small number of available cases did not allow setting cutoffs based on the distribution within the diseased cohort (13).
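The quantile variant can be sketched with a simple nonparametric estimate. The quantile definition used here (the smallest observed value with at least 99% of the reference values at or below it) and the toy reference sample are assumptions; the study does not state which empirical quantile estimator was used.

```python
import math

def empirical_quantile(values, q):
    """Smallest observed value with at least a fraction q of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]

# Hypothetical reference sample of 10 000 feature values (arbitrary units).
reference = list(range(1, 10001))
threshold = empirical_quantile(reference, 0.99)
flagged = sum(value > threshold for value in reference)
```

By construction about 1% of the reference group lies above the threshold, so a single binary feature of this kind cannot exceed roughly 99% specificity on a representative sample.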

As a result, each of the binarization approaches provided a learning set for the last data-mining step. Subsequent decision rules--logical combinations of binary features--were built from both learning sets using the rule induction algorithm JRip of the data-mining framework weka (16). The resulting classifiers were easily interpretable and structured like diagnostic criteria published in the literature.

For comparison of the new decision rules with the published diagnostic criteria, the latter were adapted to the data of the newborn screening center at Heidelberg. The markers were adopted as published, but because of missing interlaboratory comparability the reported absolute cutoff values for markers in the diagnostic criteria were replaced with cutoff values analogous to the thresholds in the 2 binarization variants.

We computed sensitivity, specificity, and positive predictive value as assessment criteria on the holdout partition for each cross-validation fold. These evaluation measures included newborns for whom more than 1 data record may be part of the test set. Class assignment for the newborns was therefore done by majority decision (see Text 3 in the online Data Supplement).
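The patient-wise evaluation can be sketched as below. The exact majority rule is given in Text 3 of the online Data Supplement; treating ties as positive is an assumption made here, as are the function names and the toy data.

```python
from collections import defaultdict

def majority_vote(record_predictions):
    """Collapse per-record predictions to one prediction per newborn.

    record_predictions: list of (newborn_id, predicted_positive) tuples.
    Ties count as positive (assumption, chosen to err toward sensitivity).
    """
    votes = defaultdict(list)
    for newborn_id, positive in record_predictions:
        votes[newborn_id].append(bool(positive))
    return {nid: 2 * sum(v) >= len(v) for nid, v in votes.items()}

def screening_metrics(predicted, truth):
    """Sensitivity, specificity, and positive predictive value per newborn."""
    tp = sum(1 for n in truth if predicted[n] and truth[n])
    fp = sum(1 for n in truth if predicted[n] and not truth[n])
    fn = sum(1 for n in truth if not predicted[n] and truth[n])
    tn = sum(1 for n in truth if not predicted[n] and not truth[n])

    def ratio(a, b):
        return a / b if b else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
    }

# Hypothetical holdout partition: newborn "a" has 3 records (retests).
preds = [("a", True), ("a", True), ("a", False),
         ("b", False), ("c", True), ("d", False)]
truth = {"a": True, "b": False, "c": False, "d": False}
metrics = screening_metrics(majority_vote(preds), truth)
```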

Finally, we performed an additional validation of the BCI variant for binarization for all folds with a secondary data set of MCADD cases and controls of the screening center in Hamburg, Germany, with a similar analyte spectrum.

Results

The feature construction algorithm selected a mean (SD) of 792.7 (250.18) of the more than 628 000 potential numeric features per fold. Table 2 shows the [chi square] values of a selection of the features, the primary marker C8, and 3 conventionally established ratios used in Chace et al. (1), Clayton et al. (2), Pourfarzam et al. (3), and Okun et al. (4). C8 reached a [chi square] value of 250 (22), whereas the binary features (C8 > [threshold.sub.99% quantile] and C8 > [threshold.sub.BCI]) reached only 19.2 (0.8) and 7.4 (6.4), respectively. Binarization generally means a loss of information and hence a decline in discriminatory performance. The excessive decrease for the binary feature of the 99% quantile variant is caused by the overrepresentation of controls with increased C8 values in both test and learning sets. The [chi square] value for the manually selected ratio C8:C10, the best-performing conventional ratio after binarization, was 156 (54), and for (C8:C10 > [threshold.sub.99% quantile]) and (C8:C10 > [threshold.sub.BCI]) it was only 132 (45) and 48 (8), respectively (Table 2). In contrast, the newly constructed features show mean [chi square] values between 105 and 264 (binary features of the quantile variant, 0.00-211.03; binary features of the BCI variant, 0.00-244.69).

In the following, we list the residual percentage of discriminatory performance for different feature groups achieved through the methods outlined above (see also Table 3). A residual value of 100% means that according to the [chi square] values the binary feature separates the classes control and MCADD with the same accuracy as the numeric feature.
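Written out, the residual discriminatory performance R of a binarized feature is the ratio defined in footnote (b) of Table 3. The worked values below divide the fold means listed for C8 in Table 2; this is only an approximation for illustration, because a ratio of fold means is not necessarily equal to the mean of per-fold ratios summarized in Table 3.

```latex
\[
  R = 100\,\% \times \frac{\chi^2_{\text{binary}}}{\chi^2_{\text{numeric}}}
\]
% Illustration with the mean values listed for C8 in Table 2:
\[
  R_{\text{quantile}}(\mathrm{C8}) \approx 100\,\% \times \frac{19.17}{250.34} \approx 7.7\,\%,
  \qquad
  R_{\text{BCI}}(\mathrm{C8}) \approx 100\,\% \times \frac{7.35}{250.34} \approx 2.9\,\%
\]
```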

Binarizations with the 99% quantile variant lead to a mean residual discriminatory performance of 28.10% (mean of the folds). Ninety-nine percent of the resulting binary features exceeded the achieved specificity of the binary feature C8 > [threshold.sub.99% quantile].

Of the features derived from BCI binarization, only 43% achieved higher specificity scores than the respective binary feature of C8. The mean residual discriminatory performance in the folds was 9.9%, range 0%-97%. In the BCI variant, large declines in discriminatory performance indicated high variation in the MCADD group used to calculate the BCI threshold. Outliers, especially at the lower range of the feature values, can lead to features with poor or even zero discriminatory performance.

The number of evaluated features exceeded the number of instances in the learning set by far, especially in relation to the instances of class MCADD. Therefore, constructed features might fit the learning set only coincidentally, and their discriminatory performance might not be generalizable for future instances, an effect called overfitting. In the present investigation, overfitting does not seem to have been a major problem, as demonstrated by the cross-validation: on average, 78% of all selected numeric features showed equal or higher [chi square] values than C8, and the lowest [chi square] value per fold on average reached 79% of the [chi square] value for C8, i.e., within 21% of the reference.


We performed a visual inspection of the numeric features to determine to what extent overfitting contributed to their discriminatory performance. We focused on the features with the best mean discriminatory performance computed from the numeric feature and the binary variants. The inspection revealed that these constructed features achieved their high discriminatory performance by reducing the number of controls in the proximity of the MCADD cases in comparison to C8. The new analyte combinations significantly reduced the controls in the proximity of the BCI and 99% quantile cutoff compared with C8 (Fig. 1) and therefore reduced the probability of false-positive classifications. In contrast, the plot of the conventionally established ratio C8:C10 shows much overlapping of MCADD cases and controls.


For each cutoff variant, we compared the conventionally established diagnostic criteria with the decision rules built from the learning sets. The results of these classifiers and the adapted, conventionally established diagnostic criteria are summarized in Table 4 for the BCI variant and Table 5 for the 99% quantile variant.

For the BCI variant, the rules generated by the JRip algorithm resulted in 96.67% sensitivity and 99.80% specificity in the cross-validation study. One fold led to the selection of a feature with an unsafe BCI-threshold in rule induction. For the quantile variant, 100% sensitivity and 99.9% specificity were achieved.

The rules generated without cross-validation were as follows: BCI, (C8 - C4 + C5:1 - C10) > 0.004 AND (C8 - C18:1)/(C14OH + C16OH) > -2.012; and quantile, (C8 - C10 - C5DC - C18:1OH) > 0.24. Both achieved 100% sensitivity and a specificity >99.9% when applied to all specimens analyzed in Heidelberg between July 2002 and January 2006 (including the cases from the study).
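As a small worked example, the quantile rule stated above can be applied to an acylcarnitine profile as follows. The function name, the dictionary layout, and the two profiles are hypothetical; concentrations are assumed to be in [micro]mol/L, and the cutoff is specific to the Heidelberg data, so it should not be transferred to other laboratories without adaptation.

```python
def quantile_rule(profile):
    """Evaluate (C8 - C10 - C5DC - C18:1OH) > 0.24 on one acylcarnitine profile."""
    value = profile["C8"] - profile["C10"] - profile["C5DC"] - profile["C18:1OH"]
    return value > 0.24

# Hypothetical profiles (invented values, micromol/L).
profile_high_c8 = {"C8": 2.0, "C10": 0.3, "C5DC": 0.05, "C18:1OH": 0.02}
profile_borderline = {"C8": 0.3, "C10": 0.2, "C5DC": 0.05, "C18:1OH": 0.02}

flag_high = quantile_rule(profile_high_c8)
flag_borderline = quantile_rule(profile_borderline)
```

In the second profile C8 alone lies in the increased range described for the control group, but the combined feature stays below the cutoff, which illustrates how subtracting further analytes can move such controls away from the MCADD cases.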

Two published diagnostic criteria did not reach 100% sensitivity with the quantile variant, because one of the thresholds used for the 99% quantile led to false negatives. Some specificities were very low, because any individual binary feature created with a linear threshold function using the 99% quantile can by definition reach only 99% specificity on a representative sample. For our data set enriched with acylcarnitine profiles with increased C8, the specificity was even lower. In contrast, the use of the BCI thresholds in the adapted conventionally established diagnostic criteria always gave 100% sensitivity but also led to specificities as low as 58%.


We acquired a sample of 5 MCADD and 168 controls, including 7 false-positive screened newborns with increased C8 concentrations, from the screening center in Hamburg, Germany. This independent data set was provided blinded, with no information about the final diagnosis, and was used to validate the new BCI approach for automated thresholding and its efficacy in implementing a safety margin.

Of 4443 different binary features created with the BCI variant in the cross-validation study, 1747 could be used in this external validation, given that some of the analytes used for feature construction in this study were not available in the external data. (Additionally, for one of the cases no concentration of C16:1 was available; for that reason, this case was excluded from computation of sensitivity for 21 features with C16:1 as component.) For validation, these 1747 binary features were calculated from the external data, resulting in a class assignment of each instance through each of these features. The data were then unblinded, and sensitivity and specificity were computed.

From the external data sample, 1743 features achieved 100% sensitivity, whereas 4 features misclassified 1 MCADD case and reached only 80% sensitivity. Specificities ranged from 0.00% to 98.81%.
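The figures reported for the external validation can be checked with simple arithmetic. The counts are taken from the text; the final line is an inference, not a statement from the study, that the best specificity corresponds to 2 of the 168 controls being flagged.

```python
n_features = 1747          # binary BCI features usable on the external data
n_full_sensitivity = 1743  # features that classified all 5 MCADD cases correctly
n_cases = 5
n_controls = 168

n_affected = n_features - n_full_sensitivity       # features missing 1 case
share_affected = 100 * n_affected / n_features     # percentage of all features
sensitivity_one_missed = 100 * (n_cases - 1) / n_cases

# Inference for illustration: 2 false positives among 168 controls.
best_specificity = 100 * (n_controls - 2) / n_controls
```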

Discussion

Any method for identifying or building discriminators for diagnostic tasks, such as the t-test or logistic regression analysis, relies on adequate data and adequate data representation. Regardless of how well the data representation is adapted to the diagnostic task, relevant information may be hidden in the data. Feature transformation techniques enable researchers to discover such information.

In this investigation, we demonstrated the usefulness of feature construction for improving the data representation for diagnostic criteria in newborn screening for MCADD. The numeric features, analyte combinations created automatically from intermediary metabolites, showed superior discriminatory performance compared with the initial analytes and manually derived analyte ratios. The decision rules built with the improved representation demonstrated the potential of feature construction and overall exceeded the previously reported diagnostic criteria in terms of diagnostic accuracy.

With the use of BCIs for minimum values in the MCADD group, we introduced an approach to automatically select cutoffs with safety margins. According to Carpenter and Bithell (17), the non-Studentized pivotal method for BCIs can be viewed as a suitable and safe method only to find CIs for the median or 50% quantile. We are using this method to compute CIs for the minimum value or 0% quantile and thus cannot expect to identify correct CIs. Nevertheless this approach enables us to select cutoffs based on the number and the estimated distribution of available cases. In this study, the application of BCI cutoffs was empirically reassessed by cross-validation and by validation on external data with good results: of the cutoffs by BCI as derived during cross-validation, only 2.47% led to misclassifications of MCADD cases in the holdout partition, and only 0.23% of the features were affected in the validation with external data.

The safety level for correct classification of future MCADD cases can be controlled by choosing the confidence probability for the minimal value of a feature in the MCADD group. Unfortunately the problem of selecting the right cutoff values resists a solution beyond doubt when data are very sparse. BCIs for the minimum value are applicable only when a sufficiently large representative sample of the relevant cases can be provided.

In the quantile variant, the distribution of the MCADD cases is ignored, so the chosen quantile for binarization will lead to false negatives for features with insufficient initial discriminatory performance. Thus, any thresholds of the quantile variant should be manually reviewed and changed to reasonable cutoffs. This precaution is also recommended for the BCI variant, but here the thresholds should give a safer starting point for setting cutoffs when a sufficient number of cases has been used for learning. Still, the BCI variant shows one disadvantage compared with the quantile variant: safety margins may be too large, and therefore promising features actually may be discarded because of low residual discriminatory performance.

Our study did not aim for automated application of the classifiers or markers for routine newborn screening. We intentionally applied only methods that allow for easy interpretation and comprehensibility of the constructed markers and decision rules, so that they can be used manually in the clinical validation. The use of up to 4 analytes in a feature and 2 features in a discriminator may be problematic because of the increased complexity of comprehending such diagnostic criteria and the increased risk of measurement errors influencing the features. Nevertheless, if a marker is chosen in accordance with the knowledge of the pathway of the metabolites, and the analytical stability of the analytes is guaranteed, there is no reason against using the marker as evidence to formulate a diagnosis. Examination of the reasons for good individual performance of particular analyte combinations may suggest new hypotheses about the secondary effects of deficient pathways and assist in improving the understanding of the diseases under consideration.

In conclusion, feature construction appears to be a promising method to improve diagnostic accuracy in screening for MCADD with TMS. Our results also indicate that the use of feature construction techniques for high-dimensional metabolic data has the potential to improve diagnostic criteria for other diseases screened with TMS and to generate further hypotheses regarding these diseases.

Grant/funding support: None declared.

Financial disclosures: None declared.

Acknowledgments: We thank the editor and the 2 reviewers. Their comments and suggestions on the report and the study helped us to clarify and strengthen several arguments.

Received October 17, 2006; accepted April 30, 2007. Previously published online at DOI: 10.1373/clinchem.2006.081802

References

(1.) Chace DH, Hillman SL, van Hove JLK, Naylor EW. Rapid diagnosis of MCAD deficiency: quantitative analysis of octanoylcarnitine and other acylcarnitines in newborn blood spots by tandem mass spectrometry. Clin Chem 1997;43:2106-13.

(2.) Clayton PT, Doig M, Ghafari S, Meaney C, Taylor C, Leonard JV, et al. Screening for medium chain acyl-CoA dehydrogenase deficiency using electrospray ionisation tandem mass spectrometry. Arch Dis Child 1998;79:109-15.

(3.) Pourfarzam M, Morris A, Appleton M, Craft A, Bartlett K. Neonatal screening for medium-chain acyl-CoA dehydrogenase deficiency. Lancet 2001;358:1063-4.

(4.) Okun JG, Kolker S, Schulze A, Kohlmuller D, Olgemoller K, Lindner M, et al. A method for quantitative acylcarnitine profiling in human skin fibroblasts using unlabelled palmitic acid: diagnosis of fatty oxidation disorders and differentiation between biochemical phenotypes of MCAD deficiency. Biochim Biophys Acta 2002;1584:91-8.

(5.) Van Hove JL, Zhang W, Kahler SG, Roe CR, Chen YT, Terada N, et al. Medium-chain acyl-CoA dehydrogenase (MCAD) deficiency: diagnosis by acylcarnitine analysis in blood. Am J Hum Genet 1993;52:958-66.

(6.) Freitas AA. Understanding the crucial role of attribute interaction in data mining. Artif Intell Rev 2001;16:177-99.

(7.) Hu YJ, Kibler D. Generation of attributes for learning algorithms. Proceedings of the 13th National Conference on Artificial Intelligence, 1996;806-11.

(8.) Markovitch S, Rosenstein D. Feature generation using general constructor functions. Mach Learn 2002;49:59-98.

(9.) Baumgartner C, Baumgartner D. Biomarker discovery, disease classification, and similarity query processing on high-throughput MS/MS data of inborn errors of metabolism. J Biomol Screen 2006;11:90-9.

(10.) Baumgartner C, Bohm C, Baumgartner D. Modelling of classification rules on metabolic patterns including machine learning and expert knowledge. J Biomed Inform 2005;38:89-98.

(11.) Chace DH, Sherwin JE, Hillman SL, Lorey F, Cunningham GC. Use of phenylalanine-to-tyrosine ratio determined by tandem mass spectrometry to improve newborn screening for phenylketonuria of early discharge specimen collected in the first 24 hours. Clin Chem 1998;44:2405-9.

(12.) Johns Hopkins University, Baltimore, MD. Online Mendelian Inheritance in Man. MIM No. 201450: 2006-08-25. (accessed April 24, 2007).

(13.) Schulze A, Lindner M, Kohlmuller D, Olgemoller K, Mayatepek E, Hoffmann GF. Expanded newborn screening for inborn errors of metabolism by electrospray ionization-tandem mass spectrometry: results, outcome and implications. Pediatrics 2003;111:1399-406.

(14.) Interdisziplinäre Screeningkommission der Deutschen Gesellschaft für Kinderheilkunde und Jugendmedizin: Richtlinien zur Organisation und Durchführung des Neugeborenenscreenings auf angeborene Stoffwechselstörungen und Endokrinopathien in Deutschland. Monatsschr Kinderheilkd 2000;1424-40.

(15.) Kohavi R. A study of cross-validation and bootstrap for accuracy estimation and model selection. Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, 1995;1137-43.

(16.) Witten IH, Frank E. Data Mining: Practical Machine Learning Tools and Techniques, 2nd ed. San Francisco: Morgan Kaufmann, 2005:525pp.

(17.) Carpenter J, Bithell J. Bootstrap confidence intervals: when, which, what? A practical guide for medical statisticians. Stat Med 2000;19:1141-64.

[4] Nonstandard abbreviations: TMS, tandem mass spectrometry; MCADD, medium-chain acyl-CoA dehydrogenase deficiency; BCI, bootstrapping CI.


[1] Division of Metabolic Diseases, Department of General Pediatrics, University Children's Hospital, Heidelberg, Germany.

[2] Department of Pediatrics and Institute of Clinical Chemistry, Hamburg University Medical Center, Hamburg, Germany.

[3] Department of Medical Informatics, University of Heidelberg Medical Center, Heidelberg, Germany.

* Address correspondence to this author at: University Children's Hospital, Division of Metabolic Diseases, Im Neuenheimer Feld 150, D-69120 Heidelberg, Germany. Fax 49-6221-564069; e-mail
Table 1. Overview of acylcarnitines (symbol) used for
feature construction.

Acylcarnitine                        Symbol

Free carnitine                       C0
Acetylcarnitine                      C2
Propionylcarnitine                   C3
Butyrylcarnitine                     C4
Isovalerylcarnitine                  C5
Hexanoylcarnitine                    C6
Octanoylcarnitine                    C8
Decanoylcarnitine                    C10
Dodecanoylcarnitine                  C12
Myristoylcarnitine                   C14
Hexadecanoylcarnitine                C16
Octadecanoylcarnitine                C18
Tiglylcarnitine                      C5:1
Octenoylcarnitine                    C8:1
Decenoylcarnitine                    C10:1
Myristoleylcarnitine                 C14:1
Hexadecenoylcarnitine                C16:1
Octadecenoylcarnitine                C18:1
Hydroxyisovalerylcarnitine           C5OH
Hydroxytetradecadienoylcarnitine     C14OH
Hydroxypalmitoylcarnitine            C16OH
Hydroxypalmitoleylcarnitine          C16:1OH
Hydroxyoleylcarnitine                C18:1OH
Methylmalonylcarnitine               C4DC
Glutarylcarnitine                    C5DC
Methylglutarylcarnitine              C6DC

Table 2. Discriminatory performance of selected features. (a)

Feature                       No. of times      [chi square] value  [chi square] value    [chi square] value
                              selected in the   of numeric          of binary feature,    of binary feature,
                              cross-validation  feature (b)         quantile 99% (b)      BCI (b)

C8 (c)                        0                 250.34 (22.01)      19.17 (0.84)          7.35 (6.37)
C8/C2 (c)                     0                 234.29 (39.86)      40.71 (8.10)          38.23 (19.36)
C8/C6 (c)                     0                 72.99 (30.59)       26.93 (35.15)         6.74 (4.5)
C8/C10 (c)                    0                 156.4 (54.08)       132.19 (44.93)        48.34 (8.02)
C8 - C4                       10                243.74 (36.86)      39.72 (4.42)          5.02 (3.64)
C8 - C4DC                     10                247.03 (31.45)      40.76 (6.19)          15.47 (13.46)
C8 - C5DC                     10                248.36 (27.71)      28.02 (2.20)          16.86 (12.92)
C8 - C6DC                     10                246.20 (30.16)      25.67 (2.17)          19.31 (16.76)
C8 - C10                      10                254.36 (16.37)      91.62 (14.01)         48.66 (27.89)
C8 - C12                      10                251.01 (26.86)      36.47 (6.65)          10.15 (8.85)
C8 - C14                      10                244.88 (33.70)      47.27 (5.56)          2.65 (2.32)
C8 - C14:1                    10                252.54 (22.04)      40.21 (7.50)          22.77 (19.95)
C8 - C16:1                    10                250.34 (22.01)      36.78 (4.56)          2.74 (2.44)
C8 - C18:1OH                  10                248.36 (27.71)      26.35 (2.95)          13.64 (12.01)
C8 - C10 - C18:1OH            10                253.87 (12.73)      105.62 (16.31)        75.88 (38.92)
C8 - C5 - C10                 10                249.52 (21.78)      132.92 (23.63)        59.7 (24.05)
C8 - C4 - C5:1 - C10          6                 247.49 (22.89)      146.89 (16.32)        174.84 (74.85)
(C8 - C18:1)/C16:1OH          10                247.43 (21.79)      111.07 (23.56)        9.94 (4.12)
(C8 - C18:1)/(C14OH - C16OH)  5                 244.71 (29.89)      98.10 (27.15)         7.74 (1.72)

(a) The table shows all 2-component features selected in 1 of the
10-folds and those 3- and 4-component features that are subsequently
used in the figures or have been picked in the rule induction process.

(b) The stated [chi square] values, given as mean (SD), were computed
with 10-fold cross-validation. A maximum of 256.53 could be reached as
mean [chi square] value.

(c) Not selected by the feature construction algorithm. These features
are stated for comparison, as they have been used in conventionally
established diagnostic criteria.

Table 3. Summary of constructed features. (a)

Feature variant                 n                 Superior to C8, %   Residual discriminatory   Residual discriminatory
                                                                      performance,              performance,
                                                                      mean, % (b)               range, % (b)

Numeric features                792.7 (250.18)    6.26 (14.78)
  Features with arity 2         17.2 (3.94)       2.98 (7.88)
  Features with arity 3         140.9 (34.29)     6.19 (15.55)
  Features with arity 4         634.6 (217.12)    6.39 (14.89)
Binary features BCI             792.7 (250.18)    43.28 (11.35)       9.88 (4.10)               0.00-97.05 (0.00-7.79)
  Features with arity 2         17.2 (3.94)       42.15 (25.26)       5.63 (3.66)               0.13-27.00 (0.41-21.77)
  Features with arity 3         140.9 (34.29)     39.63 (12.81)       7.73 (3.51)               0.00-78.76 (0.00-17.15)
  Features with arity 4         634.6 (217.12)    44.05 (11.81)       10.44 (4.32)              0.00-97.05 (0.00-7.79)
Binary features 99% quantile    792.7 (250.18)    99.07 (0.86)        28.11 (3.74)              0.00-90.04 (0.00-12.07)
  Features with arity 2         17.2 (3.94)       98.04 (3.33)        16.97 (3.06)              8.03-45.90 (1.00-13.23)
  Features with arity 3         140.9 (34.29)     99.89 (0.35)        26.21 (4.08)              7.78-73.90 (2.89-14.17)
  Features with arity 4         634.6 (217.12)    98.91 (1.09)        28.85 (3.66)              0.00-90.04 (0.00-12.07)

(a) The table states the results of the 10-fold cross-validation,
given as mean (SD).

(b) Residual discriminatory performance is the χ² value of the binary
feature expressed as a percentage of the χ² value of the
corresponding numeric feature.
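
The definition in footnote (b) can be illustrated with a minimal Python sketch. It assumes the binary feature is scored with the Pearson chi-square statistic of a 2x2 table (cases vs controls, feature positive vs negative). How the chi-square value of the numeric feature was obtained is not stated in this section, so the sketch simply takes it as an input. The function names are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic of a 2x2 contingency table.

    a: cases with positive feature,    b: cases with negative feature,
    c: controls with positive feature, d: controls with negative feature.
    """
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0  # degenerate table: the feature does not separate anything
    return n * (a * d - b * c) ** 2 / denom


def residual_performance(chi2_binary, chi2_numeric):
    """Chi-square of the binary feature as a percentage of the numeric one."""
    return 100.0 * chi2_binary / chi2_numeric


# Hypothetical values, for illustration only (not taken from the study).
chi2_bin = chi_square_2x2(10, 0, 0, 10)
print(chi2_bin, residual_performance(25.0, 250.0))
```

A low residual value in Table 3 thus means that binarizing the feature with the given cutoff discards most of the discriminatory information carried by the numeric feature.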

Table 4. Performance indicators for adapted diagnostic criteria and
new classifiers: BCI variant. (a)

Origin | Adapted criteria or classifier | Sensitivity (b) | Specificity (b) | Positive predictive value (b)
--- | --- | --- | --- | ---
Chace et al. (1) | C8 >0.12 AND C8/C10 >1.246 AND C8/C2 >0.017 | 100.00 (100.00) | 95.835 (99.328) | 42.86 (1.18)
Clayton et al. (2) (c) | C8 >0.12 AND age <14 days | 100.00 (100.00) | 58.106 (84.828) | 16.36 (0.05)
Pourfarzam et al. (3) | C8 >0.12 AND C8/C6 >1.069 | 100.00 (100.00) | 68.914 (87.296) | 31.03 (0.06)
Okun et al. (4) | C8 >0.12 AND C8/C2 >0.017 OR C8 >0.12 AND C8/C6 >1.069 | 100.00 (100.00) | 93.14 (87.198) | 16.36 (0.06)
Historical newborn screening Heidelberg | C8 >0.12 AND C8/C10 >1.246 OR C8 >1.0 | 100.00 (100.00) | 91.274 (95.275) | 42.86 (0.17)
Classifier BCI variant | (C8 - C10 - C5DC - C18:1OH) >0.24 | 96.67 (100.00) | 99.802 (99.968) | 87.88 (20.00)

(a) Performance indicators were computed patient-wise by use of
majority decision (see Text 3 in the online Data Supplement) and
with 10-fold cross-validation of the data set. They are stated
as mean (SD). The stated BCI cutoff values were computed with the
MCADD data of the whole data set.

(b) Values in parentheses were computed by applying the rules and
BCI cutoffs extracted from the overall data set
(no cross-validation) to the data of all newborns screened
between July 2002 and January 2006 at the newborn screening center
Heidelberg (n = 397 195), which also includes the data used for
learning.

(c) The further condition in Clayton et al. (2) "no increases in
acylcarnitines with count (C) 10 and with count (C) 6" was not
considered, because of the difficulty of establishing adequate
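
As a rough illustration of how the entries of Table 4 are evaluated, the following Python sketch applies the adapted Chace et al. criterion and the BCI-variant classifier to a handful of samples and derives the three performance indicators. The cutoffs are those listed in Table 4. The sample values are invented for illustration, the function names are hypothetical, and the third indicator is taken here to be the positive predictive value. Cross-validation and the patient-wise majority decision are left out.

```python
def chace_rule(s):
    """Adapted Chace et al. criterion with the BCI-variant cutoffs of Table 4."""
    return s["C8"] > 0.12 and s["C8"] / s["C10"] > 1.246 and s["C8"] / s["C2"] > 0.017


def bci_classifier(s):
    """Constructed feature (C8 - C10 - C5DC - C18:1OH) with cutoff 0.24."""
    return s["C8"] - s["C10"] - s["C5DC"] - s["C18:1OH"] > 0.24


def performance(rule, samples, labels):
    """Return (sensitivity, specificity, positive predictive value) in percent.

    Assumes at least one case and one control among the labels.
    """
    tp = fp = tn = fn = 0
    for sample, is_case in zip(samples, labels):
        predicted = rule(sample)
        if predicted and is_case:
            tp += 1
        elif predicted:
            fp += 1
        elif is_case:
            fn += 1
        else:
            tn += 1
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    ppv = 100.0 * tp / (tp + fp) if tp + fp else float("nan")
    return sensitivity, specificity, ppv


# Invented analyte concentrations: one MCADD case, three controls.
samples = [
    {"C8": 2.00, "C10": 0.30, "C2": 20.0, "C5DC": 0.05, "C18:1OH": 0.02},
    {"C8": 0.05, "C10": 0.06, "C2": 25.0, "C5DC": 0.04, "C18:1OH": 0.01},
    {"C8": 0.20, "C10": 0.10, "C2": 10.0, "C5DC": 0.05, "C18:1OH": 0.02},
    {"C8": 0.10, "C10": 0.10, "C2": 30.0, "C5DC": 0.04, "C18:1OH": 0.01},
]
labels = [True, False, False, False]

print(performance(chace_rule, samples, labels))
print(performance(bci_classifier, samples, labels))
```

In this toy data the third sample has a mildly increased C8 that satisfies the ratio-based rule but not the constructed feature, which is the kind of false-positive result that lowers specificity and positive predictive value.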

Table 5. Performance indicators for adapted diagnostic criteria and
new classifiers: quantile variant. (a)

Origin | Adapted criteria or classifier | Sensitivity (b) | Specificity (b) | Positive predictive value (b)
--- | --- | --- | --- | ---
Chace et al. (1) | C8 >0.228 AND C8/C10 >2.589 AND C8/C2 >0.016 | 93.33 (93.75) | 99.058 (99.829) | 59.57 (4.22)
Clayton et al. (2) (c) | C8 >0.228 AND age <14 days | 100.00 (100.00) | 83.094 (98.331) | 8.09 (0.48)
Pourfarzam et al. (3) | C8 >0.228 AND C8/C6 >5.685 | 23.33 (21.88) | 98.959 (99.786) | 25.00 (0.82)
Okun et al. (4) | C8 >0.228 AND C8/C2 >0.016 OR C8 >0.228 AND C8/C6 >5.685 | 100.00 (100.00) | 90.183 (99.178) | 13.16 (0.97)
Historical newborn screening Heidelberg | C8 >0.228 AND C8/C10 >2.589 OR C8 >1.0 | 100.00 (100.00) | 98.463 (99.722) | 49.18 (2.81)
Classifier quantile variant | (C8 - C4 - C5:1 - C10) >0.004 AND (C8 - C18:1)/(C14OH + C16OH) | 100.00 (100.00) | 99.901 (99.936) | 93.75 (9.88)

(a) Performance indicators were computed patient-wise by use of
majority decision (see Text 3 in the online Data Supplement) and
with 10-fold cross-validation of the data set. They are stated
as mean (SD).

(b) Values in parentheses were computed from the data of all
newborns screened between July 2002 and January 2006 at the
newborn screening center Heidelberg (n = 397 195), which also
includes the data used for learning.

(c) The further condition in Clayton et al. (2) "no increases in
acylcarnitines with count (C) 10 and with count (C) 6" was not
considered, because of the difficulty of establishing adequate
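
Footnote (a) of Tables 4 and 5 states that the indicators were computed patient-wise by majority decision. The actual procedure is described in Text 3 of the online Data Supplement, which is not reproduced here, so the following Python sketch is only one plausible reading: each patient may have several samples, each sample is classified separately, and the patient is called positive if more than half of the samples are positive. Treating a tie as negative is an assumption, and the function names are hypothetical.

```python
def majority_positive(calls):
    """True if more than half of the sample-level calls are positive.

    A tie is treated as negative; this tie rule is an assumption.
    """
    positives = sum(1 for call in calls if call)
    return positives > len(calls) / 2


def patient_wise(sample_calls):
    """Collapse (patient_id, is_positive) pairs into one decision per patient."""
    by_patient = {}
    for patient_id, call in sample_calls:
        by_patient.setdefault(patient_id, []).append(call)
    return {pid: majority_positive(calls) for pid, calls in by_patient.items()}


# Invented sample-level classifier outputs for three patients.
calls = [
    ("p1", True), ("p1", True), ("p1", False),
    ("p2", False),
    ("p3", True), ("p3", False),
]
print(patient_wise(calls))
```

The patient-level decisions, rather than the individual sample calls, would then be counted as true or false positives and negatives when computing sensitivity, specificity, and positive predictive value.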
COPYRIGHT 2007 American Association for Clinical Chemistry, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2007 Gale, Cengage Learning. All rights reserved.

Article Details
Title Annotation:Informatics and Statistics
Author:Ho, Sirikit; Lukacs, Zoltan; Hoffmann, Georg F.; Lindner, Martin; Wetter, Thomas
Publication:Clinical Chemistry
Date:Jul 1, 2007