# Discordance analysis characteristics as a new method to compare the diagnostic accuracy of tests: example of complexed versus total prostate-specific antigen.

ROC analysis is widely used to characterize and compare the
diagnostic accuracy of markers or different assays (1). The area under
the curve (AUC) [8] is used to describe the overall diagnostic value of
the test in question (1-3), but the use of ROC analysis has several
limitations (1,4). The essential limitation of the ROC analysis is
related to the fact that the diagnostic validity is estimated regarding
the test performance across the whole range of measured values (1). The
full area under the ROC curve has been criticized because it gives equal
weight to all false-positive rates (5).

Depending on the clinical use of a marker, only one portion of the ROC curve, the local performance of a test within a restricted range of false-positive rates, is often of higher importance than the overall performance represented by the AUC. Furthermore, ROC calculations always analyze the whole data set. Therefore, samples in nonrelevant ranges such as very low or high concentrations influence the analysis of diagnostic performance in the relevant concentration range and may complicate appropriate interpretation of the resulting graph and data, particularly when comparing the diagnostic accuracy of two tests.

To overcome these limitations of ROC curves, alternative indices of ROC curves have been proposed, e.g., partial areas using particular specificity intervals (5,6). However, these approaches have not been widely used because of their methodologic complexity. Current practice, therefore, is to try to overcome this disadvantage of ROC analysis by analyzing the data using only a subgroup with a limited concentration range. In this respect, comparison between the clinical utility of total prostate-specific antigen (tPSA) and complexed PSA (cPSA) is a good example (7-9). When the procedure that defines a subgroup by a range of one test (A) and compares the diagnostic performance of both tests A and B is used, artifacts can result at the lower and upper ends of the concentration range of the subgroup. For example, if the lower end of the sample population is defined only by test A, then patients who are false negative for test A but true positive for test B are excluded from analysis (to the disadvantage of test B). Similar selection artifacts exist for the upper end of the concentration range of the subgroup's sample population. Misleading interpretations may result from these selection artifacts. This may be, at least partially, the reason for the contradictory conclusions of different working groups analyzing the diagnostic performance of cPSA vs tPSA (7-12).

In this report, we present a new, easy-to-use approach for routine analysis, which we have called discordance analysis characteristics (DAC), that helps to avoid the above-mentioned disadvantages. The analysis is based on a generalization of the McNemar test so that for a given pair of cutoff values only those patients are analyzed who are categorized differently by both tests. We demonstrate the potential usefulness of the method, using the example of cPSA vs tPSA with the data from a previously published multicenter study (7).

Materials and Methods

STUDY GROUPS, SAMPLES, AND ASSAYS

For elaboration of the DAC method, we used the tPSA and cPSA data from a multicenter study (7). Details concerning the study groups, blood sample collection and storage, and analytical methods were given in the original report (7). Briefly, the study included 700 white men enrolled in screening and/or case-finding studies with tPSA concentrations between 0 and 6 µg/L. All 700 men were untreated and underwent transrectal ultrasound-guided 6- to 10-sector needle biopsies of the prostate. A total of 283 patients were diagnosed as having prostate cancer (PCa), whereas in 417 men no evidence of prostate cancer (non-PCa) was found in prostate biopsies. PSA concentrations were measured by the Bayer Immuno 1 PSA and cPSA assays (Bayer Diagnostics) as described previously (7,13,14).

PRINCIPLE OF DAC ANALYSIS

The basic approach of the DAC method can be exemplified by use of a scatterplot of the tPSA and cPSA values (Fig. 1) of the multicenter study (7). In general, continuous-like data from a continuous distribution are assumed, which is the case in the analysis of laboratory analytes. When a pair of cutoff values ([CO.sub.A] and [CO.sub.B]) for tests A and B, respectively, is used, four quadrants (Q1-Q4) result. The cutoff pairs are chosen so that quadrants 2 and 4 contain the same number of true positives (PCa cases), i.e., [CO.sub.A] and [CO.sub.B] deliver identical sensitivity. This criterion follows recommendations for the diagnostic evaluation of tumor markers (15). Quadrants 1 and 3 contain cases categorized equivalently by test A and test B: "negative" in quadrant 1 (i.e., below the cutoff) and "positive" in quadrant 3 (i.e., above the cutoff).

The first step in the DAC approach is the selection of the samples used for analysis. Quadrants 2 and 4 contain the cases that are relevant for the comparison of the two tests because they were categorized discordantly. Quadrant 2 contains cases with negative results for test A and positive results for test B, whereas quadrant 4 contains cases with positive results for test A and negative results for test B. Thus, for each pair of cutoffs, the (Q2 + Q4) subpopulation of Q2 samples plus Q4 samples includes those samples that cause possible differences in diagnostic accuracy between the tests.

In the second step, we analyze the properties of the three local subpopulations selected in step 1: Q2 samples, Q4 samples, and/or (Q2 + Q4) samples. For that purpose, the true-positive ([TP.sub.A] and [TP.sub.B]), true-negative ([TN.sub.A] and [TN.sub.B]), false-positive ([FP.sub.A] and [FP.sub.B]), and false-negative ([FN.sub.A] and [FN.sub.B]) test results are counted for both tests A and B. There are several possibilities for analyzing these counts. For the analysis of (Q2 + Q4) samples, we use a specificity-resembling parameter: the DAC specificity in (Q2 + Q4) for test A is defined as [DAC-SPEC.sub.A] = [TN.sub.A]/([TN.sub.A] + [FP.sub.A]), and accordingly for [DAC-SPEC.sub.B]. The comparative analysis of Q2 samples vs Q4 samples is performed with a parameter resembling positive predictive value (PPV): the DAC-PPV for test A is defined as [DAC-PPV.sub.A] = [TP.sub.A]/([TP.sub.A] + [FP.sub.A]) using the cases in Q4. The definition for test B is set accordingly using only Q2 cases.
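The counting in steps 1 and 2 can be illustrated with a minimal Python sketch. It is not the authors' program: the function name `dac_counts`, the toy data, and the convention that a value strictly above its cutoff counts as positive are assumptions made for illustration only.

```python
def dac_counts(a, b, diseased, co_a, co_b):
    """Return discordant-quadrant counts and DAC parameters for one cutoff pair.

    a, b       : paired measurements of tests A and B (same patient order)
    diseased   : True for PCa patients, False for non-PCa patients
    co_a, co_b : cutoffs; a value strictly above its cutoff is called positive
                 (how values equal to the cutoff are handled is an assumption)
    """
    q2_d = q2_n = q4_d = q4_n = 0
    for x_a, x_b, d in zip(a, b, diseased):
        pos_a, pos_b = x_a > co_a, x_b > co_b
        if pos_b and not pos_a:        # Q2: A negative, B positive
            if d:
                q2_d += 1
            else:
                q2_n += 1
        elif pos_a and not pos_b:      # Q4: A positive, B negative
            if d:
                q4_d += 1
            else:
                q4_n += 1

    def ratio(num, den):
        # Undefined (None) when the relevant subpopulation is empty.
        return num / den if den else None

    return {
        "TP_A": q4_d, "FP_A": q4_n, "TN_A": q2_n, "FN_A": q2_d,
        "TP_B": q2_d, "FP_B": q2_n, "TN_B": q4_n, "FN_B": q4_d,
        "DAC_SPEC_A": ratio(q2_n, q2_n + q4_n),   # non-diseased cases in Q2 + Q4
        "DAC_SPEC_B": ratio(q4_n, q2_n + q4_n),
        "DAC_PPV_A": ratio(q4_d, q4_d + q4_n),    # cases in Q4 only
        "DAC_PPV_B": ratio(q2_d, q2_d + q2_n),    # cases in Q2 only
    }


# Invented toy data (nine patients), not taken from the study.
a = [1.0, 2.0, 3.0, 4.0, 2.5, 3.5, 1.5, 3.2, 3.1]
b = [0.8, 2.6, 1.9, 3.0, 1.0, 2.8, 2.4, 1.5, 1.8]
diseased = [False, True, False, True, False, True, False, False, True]

result = dac_counts(a, b, diseased, co_a=2.8, co_b=2.0)
```

In this toy example Q2 and Q4 each contain one PCa case, so the cutoff pair satisfies the equal-sensitivity criterion, and the non-PCa cases split 1:2 between Q2 and Q4.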

[FIGURE 1 OMITTED]

Higher values of DAC specificity or DAC-PPV for one test indicate its superior diagnostic accuracy.

It should be noted that the sum of [DAC-SPEC.sub.A] and [DAC-SPEC.sub.B] always equals 1. Accordingly, it is true that [DAC-PPV.sub.A] + [DAC-NPV.sub.B] = 1 and [DAC-PPV.sub.B] + [DAC-NPV.sub.A] = 1. These effects are attributable to the equivalencies [TP.sub.A] = [FN.sub.B], [TP.sub.B] = [FN.sub.A], [FP.sub.A] = [TN.sub.B], and [FP.sub.B] = [TN.sub.A]. Therefore, only one test needs to be graphed for DAC specificities. Similarly, only one of the parameters PPV or negative predictive value (NPV) must be analyzed.
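These identities follow in one line each from the equivalencies among the discordant counts. The derivation below is a sketch; it assumes that DAC-NPV is defined by analogy with DAC-PPV as TN/(TN + FN) among the discordant cases that a test calls negative, a definition the text implies but does not write out.

```latex
\begin{align*}
\text{DAC-SPEC}_A + \text{DAC-SPEC}_B
  &= \frac{TN_A}{TN_A + FP_A} + \frac{TN_B}{TN_B + FP_B}
   = \frac{TN_A}{TN_A + FP_A} + \frac{FP_A}{FP_A + TN_A} = 1, \\[4pt]
\text{DAC-PPV}_A + \text{DAC-NPV}_B
  &= \frac{TP_A}{TP_A + FP_A} + \frac{TN_B}{TN_B + FN_B}
   = \frac{TP_A}{TP_A + FP_A} + \frac{FP_A}{FP_A + TP_A} = 1.
\end{align*}
```

The first line uses $TN_B = FP_A$ and $FP_B = TN_A$; the second uses $TN_B = FP_A$ and $FN_B = TP_A$. The identity for test B follows by exchanging the indices.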

In a third step, these calculations are done for all cutoff pairs (i = 1 ... n), where n is the number of all possible cutoff pairs regarding the criterion mentioned above. The parameters DAC-SPEC and DAC-PPV are graphed over [CO.sub.A,i] and [CO.sub.B,i] by use of two x axes. Alternatively, sensitivity could be used as the x axis.
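The third step can be sketched as follows. The pairing rule used here (matching the sorted values of the PCa cases rank by rank) is only one simple way to obtain cutoff pairs with equal numbers of true positives; it is an assumption and not necessarily how the authors' program proceeds. All function names and the toy data are hypothetical.

```python
def equal_sensitivity_cutoff_pairs(a, b, diseased):
    """Pair cutoffs of A and B that leave the same number of PCa cases above them."""
    a_d = sorted(x for x, d in zip(a, diseased) if d)
    b_d = sorted(y for y, d in zip(b, diseased) if d)
    pairs = []
    for co_a, co_b in zip(a_d, b_d):
        tp_a = sum(x > co_a for x in a_d)
        tp_b = sum(y > co_b for y in b_d)
        # With tied values the counts can differ; such pairs are skipped.
        if tp_a == tp_b and (co_a, co_b) not in pairs:
            pairs.append((co_a, co_b))
    return pairs


def dac_spec_b(a, b, diseased, co_a, co_b):
    """DAC-SPEC of test B among the non-diseased cases of Q2 + Q4."""
    tn_b = fp_b = 0
    for x, y, d in zip(a, b, diseased):
        if d:
            continue
        pos_a, pos_b = x > co_a, y > co_b
        if pos_a and not pos_b:      # Q4: true negative for B
            tn_b += 1
        elif pos_b and not pos_a:    # Q2: false positive for B
            fp_b += 1
    n = tn_b + fp_b
    return tn_b / n if n else None   # undefined without discordant non-PCa cases


# Invented toy data (nine patients), not taken from the study.
a = [1.0, 2.0, 3.0, 4.0, 2.5, 3.5, 1.5, 3.2, 3.1]
b = [0.8, 2.6, 1.9, 3.0, 1.0, 2.8, 2.4, 1.5, 1.8]
diseased = [False, True, False, True, False, True, False, False, True]

pairs = equal_sensitivity_cutoff_pairs(a, b, diseased)
# One (CO_A, CO_B, DAC-SPEC_B) triple per cutoff pair, ready for graphing.
curve = [(co_a, co_b, dac_spec_b(a, b, diseased, co_a, co_b)) for co_a, co_b in pairs]
```

Cutoff pairs for which no non-PCa case is discordant yield `None`; in a real data set such gaps occur mainly in ranges of low sample density.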

To estimate the significance of different diagnostic accuracies, we suggest calculating the differences between [DAC-SPEC.sub.B] and [DAC-SPEC.sub.A] and between [DAC-PPV.sub.B] and [DAC-PPV.sub.A] for each pair of cutoffs. The pointwise confidence intervals (CIs) can then be calculated using the methods related to the difference of two proportions with formulas given by Altman (16). In the case of DAC-SPEC, paired observations must be considered, whereas non-paired proportions for DAC-PPV can be assumed. For both DAC-SPEC and DAC-PPV, differences >0 would be indicative of superior diagnostic accuracy for test B, and lower limits of the CIs >0 would indicate significance of the result.
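As a hedged illustration, the pointwise intervals can be computed with the usual large-sample formulas for the difference of two proportions. We assume here that these correspond to the formulas of ref. 16, which are not reproduced in this report; the function names are hypothetical. For DAC-SPEC, every non-PCa case in (Q2 + Q4) is a true negative for exactly one of the two tests, so all pairs are discordant and the paired formula simplifies accordingly.

```python
import math

Z95 = 1.959963984540054  # standard normal quantile for a two-sided 95% CI


def dac_spec_difference_ci(tn_a, tn_b, z=Z95):
    """DAC-SPEC_B minus DAC-SPEC_A with a CI for paired proportions.

    tn_a, tn_b: true negatives of tests A and B among the non-PCa cases
    of Q2 + Q4 (so tn_a + tn_b is the number of such cases).
    """
    n = tn_a + tn_b
    if n == 0:
        return None
    diff = (tn_b - tn_a) / n
    # Paired-proportion standard error; all n pairs are discordant here.
    se = math.sqrt(n - (tn_b - tn_a) ** 2 / n) / n
    return diff, diff - z * se, diff + z * se


def dac_ppv_difference_ci(tp_a, fp_a, tp_b, fp_b, z=Z95):
    """DAC-PPV_B minus DAC-PPV_A with a CI for two independent proportions.

    Test A counts come from Q4 and test B counts from Q2, i.e., from
    different patients, which is why unpaired proportions are assumed.
    """
    n_a, n_b = tp_a + fp_a, tp_b + fp_b
    if n_a == 0 or n_b == 0:
        return None
    p_a, p_b = tp_a / n_a, tp_b / n_b
    diff = p_b - p_a
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


# Invented counts for one cutoff pair.
spec_diff, spec_lo, spec_hi = dac_spec_difference_ci(tn_a=20, tn_b=60)
ppv_diff, ppv_lo, ppv_hi = dac_ppv_difference_ci(tp_a=10, fp_a=30, tp_b=30, fp_b=20)
```

With these invented counts both lower limits are above 0, which under the rule stated above would indicate a significant advantage of test B at this cutoff pair.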

This approach, leading to one or several graphs characterizing the discordant test results, is called the DAC method. A suitable computer program has been developed. (Copies of the program can be obtained from Dr. Keller at thomas.keller@acomed.de or www.acomedstatistics.com/dac-method.html.)

Sometimes it may be appropriate to consider a summary measure. We propose the use of medians of DAC-SPEC and DAC-PPV values and the medians of ratios of these parameters: [R.sub.DAC-SPEC] = median([DAC-SPEC.sub.B,i]/[DAC-SPEC.sub.A,i]) and [R.sub.DAC-PPV] = median([DAC-PPV.sub.B,i]/[DAC-PPV.sub.A,i]), respectively. For example, a value >1 for the latter medians would indicate the better diagnostic accuracy of marker B. However, like the AUC of ROC curves, these overall measures do not provide information about the local diagnostic performance.
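A minimal sketch of the ratio summary is given below. The function name and the example values are invented; cutoff pairs at which a parameter is undefined or the denominator is 0 are skipped, which is an assumption because the text does not say how such pairs are treated.

```python
from statistics import median


def median_ratio(values_b, values_a):
    """Median over all cutoff pairs i of values_b[i] / values_a[i]."""
    ratios = [
        vb / va
        for vb, va in zip(values_b, values_a)
        if va is not None and vb is not None and va > 0
    ]
    return median(ratios) if ratios else None


# Invented DAC-SPEC values for three cutoff pairs (each pair sums to 1).
dac_spec_b = [0.6, 0.75, 0.8]
dac_spec_a = [0.4, 0.25, 0.2]

r_dac_spec = median_ratio(dac_spec_b, dac_spec_a)  # median of about 1.5, 3.0, 4.0
```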

STATISTICAL ANALYSIS

DAC-SPEC, DAC-PPV, and the pointwise CIs of their differences were calculated as described above (16). An assay was estimated as superior if the related DAC-SPEC and DAC-PPV values were higher than those of the comparative assay. For graphical presentations, raw data (counts) were smoothed by use of a triangular smoothing function (17).
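The smoothing step can be sketched as a moving average with triangular weights applied to the series of counts along the cutoff pairs. The window half-width and the renormalization of weights at the two ends of the series are assumptions; neither is specified here, and the function name is hypothetical.

```python
def triangular_smooth(values, half_width=2):
    """Weighted moving average with weights half_width + 1 - |k|, k = -h..h.

    Near the ends of the series only the available neighbors are used and
    the weights are renormalized (an assumed edge treatment).
    """
    n = len(values)
    smoothed = []
    for i in range(n):
        num = den = 0.0
        for k in range(-half_width, half_width + 1):
            j = i + k
            if 0 <= j < n:
                w = half_width + 1 - abs(k)
                num += w * values[j]
                den += w
        smoothed.append(num / den)
    return smoothed


# An isolated spike in a series of counts is spread over its neighbors.
spike = triangular_smooth([0, 0, 9, 0, 0], half_width=2)
flat = triangular_smooth([2.0] * 5, half_width=2)
```

Because the raw counts rather than the ratios are smoothed, DAC-SPEC and DAC-PPV would be recomputed from the smoothed counts afterward.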

All calculations and graphs for ROC analysis were made with an Excel[TM] (version XP for Windows; Microsoft Corporation) software program (www.acomedstatistics.com/roc-tools.html). Differences in ROC curves were estimated according to DeLong et al. (3). CIs for the AUC were calculated according to Hanley and McNeil (2). The significances of the overall parameters [medians of DAC-SPEC and DAC-PPV and medians of their ratios ([R.sub.DAC-SPEC] and [R.sub.DAC-PPV])] were considered on the basis of the 95% CIs of their medians calculated by bootstrapping (18), using 10 000 bootstrap replicates. The method was programmed using the statistical computer program R (19,20).
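The bootstrap step can be illustrated with a generic percentile interval. The original calculations were programmed in R; the Python sketch below is not that program, and it assumes that patients are resampled with replacement and that the summary statistic is recomputed on each replicate, which this report does not state explicitly. The function name and example data are hypothetical.

```python
import random
from statistics import median


def bootstrap_percentile_ci(records, statistic, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI of statistic(records).

    records   : one entry per patient (e.g., a tuple of tPSA, cPSA, diagnosis)
    statistic : function mapping a list of records to a number, or to None
                when the statistic is undefined for that replicate
    """
    rng = random.Random(seed)
    n = len(records)
    replicates = []
    for _ in range(n_boot):
        sample = [records[rng.randrange(n)] for _ in range(n)]
        value = statistic(sample)
        if value is not None:
            replicates.append(value)
    if not replicates:
        return None
    replicates.sort()
    lo_idx = int((alpha / 2) * len(replicates))
    hi_idx = min(len(replicates) - 1, int((1 - alpha / 2) * len(replicates)))
    return replicates[lo_idx], replicates[hi_idx]


# Simplified example: CI of a plain median of invented values.
values = [float(v) for v in range(1, 102)]
low, high = bootstrap_percentile_ci(values, median, n_boot=2000, seed=1)
```

For the DAC summary measures, `statistic` would be a function that rebuilds the cutoff pairs on the resampled patients and returns, for example, the median DAC-SPEC or the median ratio.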

Results

ROC curves for cPSA and tPSA, including the whole data set, are shown in Fig. 2. The AUC for cPSA is significantly greater than the AUC for tPSA: 0.691 (95% CI, 0.655-0.725) vs 0.668 (95% CI, 0.631-0.702); P <0.0005.

To estimate the diagnostic performance only in the interesting range 2-4 µg/L, a subgroup analysis seems appropriate. Scatterplots for the subgroup of patients with tPSA concentrations in the range 2-4 µg/L and corresponding cPSA concentrations in the range 1.51-3.19 µg/L are shown in panels A and B, respectively, of Fig. 3. The graphs demonstrate that, particularly at the edges of the selected concentration range, different patients are included in such a subgroup analysis. As shown in the ROC curves (Fig. 3, C and D), cPSA-based selection of patients leads to a significant difference between AUCs (P <0.02), whereas tPSA-based selection fails to show this difference (P = 0.15). Furthermore, the absolute values of the AUC obtained by the different selection procedures differ (tPSA, 0.55 vs 0.48; cPSA, 0.58 vs 0.53) as do the positions and shapes of the ROC curves.

The results obtained with the DAC method (Fig. 4) show the DAC-SPEC and DAC-PPV values as well as the calculated differences of DAC-SPEC and DAC-PPV graphed over the cutoffs of both analyses. DAC-SPEC and DAC-PPV values were significantly higher for cPSA in a wide range of tPSA between ~2.5 and 5.8 µg/L, corresponding to cPSA values of ~1.9 to 4.8 µg/L, as indicated by the positive values for the lower CI.

[FIGURE 2 OMITTED]

The median [DAC-SPEC.sub.cPSA] value of 0.78 (95% CI, 0.61-0.87) differed significantly from the median [DAC-SPEC.sub.tPSA] (0.22; 95% CI, 0.13-0.39). Results were similar for the DAC-PPV of 0.63 (95% CI, 0.53-0.82) for cPSA vs 0.31 (95% CI, 0.27-0.50) for tPSA. The CIs of the medians of both pairs of DAC-SPEC for cPSA and tPSA and of DAC-PPV, respectively, did not overlap and indicated a significant difference between the medians. The medians of the ratios [R.sub.DAC-SPEC] and [R.sub.DAC-PPV], explained in the Materials and Methods, were calculated to be [R.sub.DAC-SPEC] = 3.57 (95% CI, 1.53-6.67) and [R.sub.DAC-PPV] = 2.01 (95% CI, 1.30-2.69) and differed significantly from 1.

Discussion

SUBGROUP ANALYSES IN CLINICAL STUDIES WITH cPSA AND tPSA

One recent approach to enhance the clinical utility of PSA assays is the use of cPSA forms. However, discrepant results have been described (7-12,14,21,22). To compare the diagnostic accuracies in different ranges of tPSA concentrations, selected tPSA ranges were frequently chosen as study populations. We clearly demonstrated in this study that the results of a comparison between both assays may be influenced by the inclusion criteria for the subgroups analyzed (Fig. 3).

Conclusions about the diagnostic validity of cPSA vs tPSA have generally been based on ROC analysis taking into account the AUC values and partly the comparison of sensitivity and specificity at certain cutoffs. Only a few studies exist for the low tPSA range <4 µg/L (7-9,11). In a multicenter study including more than 500 men with tPSA <4 µg/L, the differences between cPSA and tPSA in differentiating men with PCa and men without PCa were not clearly demonstrated (7). Although a significantly larger AUC for cPSA in the tPSA range 2.5-4 µg/L was found, differences in the specificities of cPSA vs tPSA at the selected sensitivities of 80%, 85%, 90%, and 95% were not statistically significant for all sensitivity values. Two similar multicenter studies of men with tPSA concentrations <4 µg/L described improved detection of PCa by cPSA based on differences between the AUCs (8,11). In addition to the different clinical settings used in these studies, these discrepancies may be attributable, at least partially, to selection artifacts at the edges of narrow tPSA ranges as demonstrated in Fig. 3. Therefore, in regard to the conventional strategy of comparative ROC analysis, the results of various studies for the evaluation of the diagnostic impact of cPSA should be considered with caution.

These uncertainties in analyzing the data and interpreting the study results were the starting point for us to develop the DAC method. As described in the Results and demonstrated in Fig. 4, DAC analysis allows description of the overall and local differences in the clinical utility of both tests and suggests a significant advantage of cPSA over tPSA, as indicated by higher values for DAC-SPEC and DAC-PPV, respectively. The results are caused by the lower number of FP samples for cPSA compared with a higher number of FP samples for tPSA among the patients with discordant test results. The clinical impact of these results will be discussed in a separate report.

COMPARING ROC AND DAC

The comparison of the results of ROC and DAC analyses of our reevaluated data and the corresponding conclusions make it necessary to discuss the utility of both methods.

The major disadvantage of ROC analysis was described in the introductory paragraph and is caused by the property of the ROC approach that gives equal weight to all FP rates (5). Therefore, when comparing two tests, the performances of both assays near the cutoffs are difficult to describe by use of ROC curves of the whole data set. To overcome this problem, it is current practice to perform ROC analysis on subgroups of the data set defined by a limited concentration range of one of the markers in question. However, this approach is subject to severe biases resulting from selection effects when the subgroups are defined by a concentration range of only one of the assays in question, as can be seen in Fig. 3. In conclusion, the diagnostic performance of one assay alone around a cutoff cannot be described in a representative way, nor is it possible to get an error-free comparative analysis of the relative performance of two diagnostic tests.

[FIGURE 3 OMITTED]

In contrast, the initial step of the DAC method is an error-free, clearly defined selection of local subpopulations. The method focuses on the discordant test results. It considers exactly those cases that solely are responsible for differences in diagnostic accuracy. Selection artifacts are thus avoided. The DAC approach may make the comparative ROC subgroup analyses unnecessary.

[FIGURE 4 OMITTED]

Whereas it is current practice to combine the results of several subgroup analyses of different ranges to describe the performance in selected ranges, the DAC approach leads to meaningful and easy-to-read data and graphs for the comparison of tests within only one analysis. The concentration ranges with different diagnostic accuracies can immediately be identified. In terms of hypothesis testing, the null hypothesis of no difference can be tested at prospectively chosen cutoffs or ranges of cutoffs. Furthermore, comparison of the results of different studies can be simplified because the result of DAC analysis (e.g., DAC-SPEC and DAC-PPV) is not influenced by any subgroup selection. This is in contrast to a ROC analysis, in which subgroup selections with different ranges around a given cutoff would lead to different values for sensitivity and specificity.

In our example, the differences between cPSA and tPSA are smaller at the upper end of the concentration range (close to 6 µg/L) of the sample population. This is attributable to the inclusion criterion of the study samples (tPSA <6 µg/L), which leads to underrepresentation of FP values in Q4 compared with Q2 in this concentration range.

In addition to this pointwise analysis, the DAC method gives a valid overall picture. Medians of DAC-SPEC and DAC-PPV or the median of the ratios [R.sub.DAC-SPEC] and [R.sub.DAC-PPV] characterize the overall performance, whereas their CIs estimate the corresponding significance level.

Unlike the ROC analysis, which is based on and limited to the calculation of sensitivities and specificities, the DAC method is primarily a selection tool for subpopulations responsible for differences in the diagnostic accuracy of tests. The DAC method paves the way for a new possibility of separation of study populations: properties of Q2 samples can be evaluated vs the properties of Q4 samples, which may provide data of clinical relevance. Here we focus on the test results, such as TP and TN values, but it would also be possible to perform DAC analysis on variables such as age or tumor stage. These approaches allow deeper insights into causes or consequences of different test results and will be presented in a separate report.

Regarding the parameters DAC-SPEC and DAC-PPV analyzed here, one has to take note of their interesting properties, which are attributable to the equivalencies described in the Materials and Methods and lead to a simplification of analysis. The DAC-SPEC values of both tests add up to 1, and DAC-PPV and DAC-NPV depend on each other.

The DAC-PPV used here is strongly related to the physician's decision-making because it refers to the proportion of people with a positive test who have the target disorder (5,23). The prevalence dependency of this parameter must be taken into account, however. For example, low prevalences in screening settings would lead to lower DAC-PPV values. This should affect the DAC-PPV values of both tests in a quite similar manner. However, the aim of the DAC method is not to calculate DAC-PPVs as absolute values but to compare them to assess differences in diagnostic accuracy. The hypothesis test regarding differences in DAC-PPV does not depend on prevalence. Furthermore, the ratio of the two values strongly reduces the prevalence dependency.

As can be seen in Fig. 1, the DAC method is quite easy to use. We programmed a calculating tool that can be used as an add-in within Excel (Microsoft Corp.). (Copies of the program can be obtained from Dr. Keller at thomas.keller@acomed.de or www.acomed-statistics.com/dac-method.html.)

In practice, there are two difficulties to be solved before applying DAC analysis: First, it is necessary to define the corresponding pairs of cutoffs. In this report, we described the determination using the criterion of equal numbers of TP test results. However, particularly in the case of low numbers of these cases, difficulties occur because of a strong influence of the local distribution of these cases in the scatterplot. Therefore, similar criteria, such as equal numbers of TN values or equal numbers of cases, can be applied. The regression approach according to Passing and Bablok (24) applied to all patients or only to positive cases is also helpful. It should be mentioned that with the data set analyzed here all four possibilities lead to similar results.

The second difficulty results from the low absolute numbers in quadrants Q2 and Q4 in ranges of low sample density. This would give rise to excessive scatter and fluctuation of the resulting curves. Although the overall result (medians compared by bootstrapping) remains unaffected, the graph is hard to read. For this reason, appropriate smoothing procedures should be used before graphing of the results. For the DAC method, simple smoothing procedures such as use of weighted means are sufficient.

In summary, the DAC method represents an adequate analytical tool for comparing the diagnostic performance of two assays. Its positive features are the ability to assess in detail the local performance of two tests, e.g., close to clinically relevant cutoffs, without compromising the overall picture, and the avoidance of subgroup selection artifacts. The DAC method could be used in parallel with ROC analysis of the complete sample population and could replace comparative ROC analyses of subgroups.

This study was funded by Bayer Vital GmbH (Fernwald, Germany), by a grant for Thomas Keller to develop the DAC program. All other authors except Hermann Butz, who is an employee of Bayer, did not receive any consulting fee and/or funds from Bayer. The work was also supported in part by the SONNENFELD-Stiftung, Deutsche Forschungsgemeinschaft (Ju365/5-1), and Deutsche Krebshilfe (70-3295-ST1). We gratefully thank Silke Klotzek and Sabine Becker for excellent technical support.

Received July 1, 2004; accepted November 5, 2004.

Previously published online at DOI: 10.1373/clinchem.2004.039552

References

(1.) Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine [Review]. Clin Chem 1993;39:561-77.

(2.) Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29-36.

(3.) DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837-45.

(4.) Obuchowski NA, Lieber ML, Wians FH. ROC curves in Clinical Chemistry: uses, misuses, and possible solutions. Clin Chem 2004;50:1118-25.

(5.) McClish DK. Analyzing a portion of the ROC curve. Med Decis Making 1989;9:190-5.

(6.) Obuchowski NA, McClish DK. Sample size determination for diagnostic accuracy studies involving binormal ROC curve indices. Stat Med 1997;16:1529-42.

(7.) Lein M, Kwiatkowski M, Semjonow A, Luboldt H-J, Hammerer P, Stephan C, et al. A multicenter clinical trial on the use of complexed prostate specific antigen in low prostate specific antigen concentrations. J Urol 2003;170:1175-9.

(8.) Horninger W, Cheli CD, Babaian RJ, Fritsche HA, Lepor H, Taneja SS, et al. Complexed prostate-specific antigen for early detection of prostate cancer in men with serum prostate-specific antigen levels of 2 to 4 nanograms per milliliter. Urology 2002;60:31-5.

(9.) Okihara K, Fritsche HA, Ayala A, Johnston DA, Allard WJ, Babaian RJ. Can complexed prostate specific antigen and prostatic volume enhance prostate cancer detection in men with total prostate specific antigen between 2.5 and 4.0 ng/ml. J Urol 2001;165:1930-6.

(10.) Brawer MK, Meyer GE, Letran JL, Bankson ER, Morris DL, Yeung KK, et al. Measurement of complexed PSA improves specificity for early detection of prostate cancer. Urology 1998;52:372-8.

(11.) Partin AW, Brawer MK, Bartsch G, Horninger W, Taneja SS, Lepor H, et al. Complexed prostate specific antigen improves specificity for prostate cancer detection: results of a prospective multicenter clinical trial. J Urol 2003;170:1787-91.

(12.) Okihara K, Cheli CD, Partin AW, Fritche HA, Chan DW, Sokoll U, et al. Comparative analysis of complexed prostate specific antigen, free prostate specific antigen and their ratio in detecting prostate cancer. J Urol 2002;167:2017-23.

(13.) Jung K, Elgeti U, Lein M, Brux B, Sinha P, Rudolph B, et al. Ratio of free or complexed to total prostate specific antigen: which ratio should be determined to improve the differentiation between benign prostatic hyperplasia and prostate cancer? Clin Chem 2000;46:55-62.

(14.) Allard WJ, Zhou Z, Yeung KK. Novel immunoassay for the measurement of complexed prostate-specific antigen in serum. Clin Chem 1998;44:1216-23.

(15.) Sturgeon C, Dati F, Duffy MJ, Hasholzner U, Klapdor R, Lamerz R, et al. European Group on Tumour Markers. Tumour marker recommendations. http://egtm.web.med.uni-muenchen.de/detail/1.htm (accessed September 2004).

(16.) Altman DG. Practical statistics for medical research. London: Chapman & Hall, 1991:611pp.

(17.) Härdle W, Simar L. Applied multivariate analysis. Berlin: Springer, 2003:486pp.

(18.) Efron B, Tibshirani R. An introduction to the bootstrap. New York: Chapman & Hall, 1993:436pp.

(19.) R Development Core Team. The R manuals, version 1.9.1. http://www.r-project.org (accessed September 2004).

(20.) Ihaka R, Gentleman R. R: a language for data analysis and graphics. J Comp Graph Stat 1996;5:299-314.

(21.) Parsons JK, Partin AW. Applying complexed prostate-specific antigen to clinical practice. Urology 2004;63:815-8.

(22.) Okihara K, Ukimura O, Nakamura T, Mizutani Y, Naya Y, Uchida M, et al. Can complexed prostate specific antigen enhance prostate cancer detection in Japanese men? Eur Urol 2004;46:57-64.

(23.) Biggerstaff BJ. Comparing diagnostic tests: a simple graphic using likelihood ratios. Stat Med 2000;19:649-63.

(24.) Passing H, Bablok W. A new biometrical procedure for testing the equality of measurements from two different analytical methods. Application of linear regression procedures for method comparison studies in clinical chemistry, part I. J Clin Chem Clin Biochem 1983;21:709-20.

THOMAS KELLER, [1] HERMANN BUTZ, [2] MICHAEL LEIN, [3] MACIEJ KWIATKOWSKI, [4] AXEL SEMJONOW, [5] HANS-JOACHIM LUBOLDT, [6] PETER HAMMERER, [7] CARSTEN STEPHAN, [3] and KLAUS JUNG [3] *

[1] ACOMED Statistik, Leipzig, Germany.

[2] Bayer Vital GmbH, Leverkusen, Germany.

[3] Department of Urology, University Hospital Charite, Humboldt University Berlin, Berlin, Germany.

[4] Department of Urology, Kantonsspital Aarau, Aarau, Switzerland.

[5] Department of Urology, University Hospital Münster, Münster, Germany.

[6] Department of Urology, University of Duisburg-Essen, Essen, Germany.

[7] Department of Urology, Städtisches Klinikum Braunschweig, Braunschweig, Germany.

[8] Nonstandard abbreviations: AUC, area under the curve; tPSA, total prostate-specific antigen; cPSA, complexed prostate-specific antigen; DAC, discordance analysis characteristics; PCa, prostate cancer; CO, cutoff; TP, true positive; TN, true negative; FP, false positive; FN, false negative; PPV, positive predictive value; NPV, negative predictive value; and CI, confidence interval.

* Address correspondence to this author at: Department of Urology, University Hospital Charite, Humboldt University Berlin, Schumannstrasse 20/21, D-10098 Berlin, Germany. Fax 49-30-450-515904; e-mail klaus.jung@charite.de.

Depending on the clinical use of a marker, only one portion of the ROC curve, the local performance of a test within a restricted range of false-positive rates, is often of higher importance than the overall performance represented by the AUC. Furthermore, ROC calculations always analyze the whole data set. Therefore, samples in nonrelevant ranges such as very low or high concentrations influence the analysis of diagnostic performance in the relevant concentration range and may complicate appropriate interpretation of the resulting graph and data, particularly when comparing the diagnostic accuracy of two tests.

To overcome these limitations of ROC curves, alternative indices of ROC curves have been proposed, e.g., partial areas using particular specificity intervals (5,6). However, these approaches have not been widely used because of their methodologic complexity. Current practice, therefore, is to try to overcome this disadvantage of ROC analysis by analyzing the data using only a subgroup with a limited concentration range. In this respect, comparison between the clinical utility of total prostate-specific antigen (tPSA) and complexed PSA (cPSA) is a good example (7-9). When the procedure that defines a subgroup by a range of one test (A) and compares the diagnostic performance of both tests A and B is used, artifacts can result at the lower and upper ends of the concentration range of the subgroup. For example, if the lower end of the sample population is defined only by test A, then patients who are false negative for test A but true positive for test B are excluded from analysis (to the disadvantage of test B). Similar selection artifacts exist for the upper end of the concentration range of the subgroup's sample population. Misleading interpretations may result from these selection artifacts. This may be, at least partially, the reason for the contradictory conclusions of different working groups analyzing the diagnostic performance of cPSA vs tPSA (7-12).

In this report, we present a new, easy-to-use approach for routine analysis, which we have called discordance analysis characteristics (DAC), that helps to avoid the above-mentioned disadvantages. The analysis is based on a generalization of the McNemar test so that for a given pair of cutoff values only those patients are analyzed who are categorized differently by both tests. We demonstrate the potential usefulness of the method, using the example of cPSA vs tPSA with the data from a previously published multicenter study (7).

Materials and Methods

STUDY GROUPS, SAMPLES, AND ASSAYS

For elaboration of the DAC method, we used the tPSA and cPSA data from a multicenter study (7). Details concerning the study groups, blood sample collection and storage, and analytical methods were given in the original report (7). Briefly, the study included 700 white men enrolled in screening studies and or case-finding studies with tPSA concentrations between 0 and 6 [micro]g/L. All 700 men were untreated and underwent transrectal ultrasound-guided 6- to 10-sector needle biopsies of the prostate. A total of 283 patients were diagnosed as having prostate cancer (PCa), whereas in 417 men no evidence of prostate cancer (non-PCa) was found in prostate biopsies. PSA concentrations were measured by the Bayer Immuno 1 PSA and cPSA assays (Bayer Diagnostics) as described previously (7,13,14).

PRINCIPLE OF DAC ANALYSIS

The basic approach of the DAC method can be exemplified by use of a scatterplot of the tPSA and cPSA values (Fig. 1) of the multicenter study (7). In general, continuous-like data within a continuous distribution are assumed, which is the case in the analysis of laboratory analytes. When a pair of cutoff values ([CO.sub.A] and [CO.sub.B]) for tests A and B, respectively, are used, four quadrants (Q1-Q4) result. The cutoff pairs for the definition of the quadrants use the criterion that quadrants 2 and 4 should contain the same number of true positives (PCa cases), i.e., [CO.sub.A] and [CO.sub.B] deliver identical sensitivity. This criterion refers to recommendations for the diagnostic evaluation of tumor markers (15). Quadrants 1 and 3 contain cases categorized equivalently by test A and test B: "negative" in quadrant 1 (i.e., below the cutoff) and "positive" in quadrant 3 (i.e., above the cutoff). As the first step in the DAC approach, the selection of the samples used for analysis, quadrants 2 and 4 contain those cases that are relevant for the comparison of both tests because they were categorized discordantly by both tests. Quadrant 2 contains cases with negative results for test A and positive results for test B, whereas quadrant 4 contains cases with positive results for test A and negative results for test B. Thus, for each pair of cutoffs, the (Q2 + Q4) subpopulation of Q2 samples plus Q4 samples includes those samples that cause possible differences in diagnostic accuracy between the tests. In the second step, we analyze the properties of the three local subpopulations selected in step 1: Q2 samples, Q4 samples, and/or (Q2 + Q4) samples. For that purpose, the true-positive ([TP.sub.A] and [TP.sub.B]), true-negative ([TN.sub.A] and [TN.sub.B]), false-positive ([FP.sub.A] and [FP.sub.B]), and false-negative ([FN.sub.A] and [FN.sub.B]) test results are counted for both tests A and B. There are several possibilities for analyzing these counts. 
For the analysis of (Q2 + Q4) samples, we use a specificity-resembling parameter: the DAC specificities in (Q2 + Q4) for test A are defined as [DAC-SPEC.sub.A] = [TN.sub.A]/([TN.sub.A] + [FP.sub.A]) and accordingly for [DAC-SPEC.sub.B]. The comparative analysis of Q2 samples vs Q4 samples is performed with a parameter resembling positive predictive value (PPV): the DAC-PPV for test A is defined as [DAC-PPV.sub.A] = [TP.sub.A]/([TP.sub.A] + [FP.sub.A]) using the cases in Q4. The definition for test B is set accordingly using only Q2 cases.
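The counting in steps 1 and 2 can be illustrated for a single cutoff pair. The following is a minimal Python sketch, not the authors' program: the function and variable names are hypothetical, and it assumes that a result strictly above the cutoff counts as "positive" and a result at or below it as "negative".

```python
from typing import NamedTuple, Sequence


class DacPoint(NamedTuple):
    dac_spec_a: float
    dac_spec_b: float
    dac_ppv_a: float
    dac_ppv_b: float


def _ratio(num: int, den: int) -> float:
    # Return NaN instead of raising when a quadrant is empty.
    return num / den if den else float("nan")


def dac_point(a: Sequence[float], b: Sequence[float],
              diseased: Sequence[bool], co_a: float, co_b: float) -> DacPoint:
    """DAC-SPEC and DAC-PPV for one cutoff pair (assumed convention: > cutoff is positive)."""
    # Q2: test A negative, test B positive. Q4: test A positive, test B negative.
    q2 = [d for x, y, d in zip(a, b, diseased) if x <= co_a and y > co_b]
    q4 = [d for x, y, d in zip(a, b, diseased) if x > co_a and y <= co_b]

    tp_b = sum(q2)            # diseased cases in Q2 (also FN for test A)
    fp_b = len(q2) - tp_b     # nondiseased cases in Q2 (also TN for test A)
    tp_a = sum(q4)            # diseased cases in Q4 (also FN for test B)
    fp_a = len(q4) - tp_a     # nondiseased cases in Q4 (also TN for test B)
    tn_a, tn_b = fp_b, fp_a

    return DacPoint(
        dac_spec_a=_ratio(tn_a, tn_a + fp_a),   # over (Q2 + Q4)
        dac_spec_b=_ratio(tn_b, tn_b + fp_b),   # over (Q2 + Q4)
        dac_ppv_a=_ratio(tp_a, tp_a + fp_a),    # Q4 cases only
        dac_ppv_b=_ratio(tp_b, tp_b + fp_b),    # Q2 cases only
    )


# Toy data: Q2 holds 1 diseased + 1 nondiseased case, Q4 holds 1 diseased + 3 nondiseased.
a = [1, 1, 3, 3, 3, 3, 1, 3]
b = [3, 3, 1, 1, 1, 1, 1, 3]
diseased = [True, False, True, False, False, False, False, True]
point = dac_point(a, b, diseased, co_a=2, co_b=2)
```

In this toy example both discordant quadrants contain one PCa case, so the cutoff pair satisfies the equal-sensitivity criterion, and test B has the higher DAC-SPEC and DAC-PPV because fewer nondiseased cases are positive for B only.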

[FIGURE 1 OMITTED]

HIGHER VALUES OF DAC SPECIFICITY OR DAC-PPV FOR ONE TEST INDICATE ITS SUPERIOR DIAGNOSTIC ACCURACY

It should be noted that the sum of [DAC-SPEC.sub.A] and [DAC-SPEC.sub.B] always equals 1. Likewise, [DAC-PPV.sub.A] + [DAC-NPV.sub.B] = 1 and [DAC-PPV.sub.B] + [DAC-NPV.sub.A] = 1. These identities follow from the equivalencies [TP.sub.A] = [FN.sub.B], [TP.sub.B] = [FN.sub.A], [FP.sub.A] = [TN.sub.B], and [FP.sub.B] = [TN.sub.A], which hold because every case in (Q2 + Q4) is classified correctly by exactly one of the two tests. Therefore, only one test needs to be graphed for DAC specificities. Similarly, only one of the parameters PPV or negative predictive value (NPV) must be analyzed.
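Written out, the two identities are one-line substitutions. The derivation below assumes that DAC-NPV is defined analogously to DAC-PPV, i.e., [DAC-NPV.sub.B] = [TN.sub.B]/([TN.sub.B] + [FN.sub.B]) on the Q4 cases; the text does not state this definition explicitly.

```latex
\begin{align*}
\text{DAC-SPEC}_A + \text{DAC-SPEC}_B
  &= \frac{TN_A}{TN_A + FP_A} + \frac{TN_B}{TN_B + FP_B} \\
  &= \frac{TN_A}{TN_A + TN_B} + \frac{TN_B}{TN_B + TN_A} = 1
  && (FP_A = TN_B,\ FP_B = TN_A) \\[4pt]
\text{DAC-PPV}_A + \text{DAC-NPV}_B
  &= \frac{TP_A}{TP_A + FP_A} + \frac{TN_B}{TN_B + FN_B} \\
  &= \frac{TP_A}{TP_A + FP_A} + \frac{FP_A}{FP_A + TP_A} = 1
  && (TN_B = FP_A,\ FN_B = TP_A)
\end{align*}
```

The identity for [DAC-PPV.sub.B] + [DAC-NPV.sub.A] follows in the same way by exchanging the roles of A and B.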

In a third step, these calculations are done for all cutoff pairs (i = 1 ... n), where n is the number of all possible cutoff pairs regarding the criterion mentioned above. The parameters DAC-SPEC and DAC-PPV are graphed over [CO.sub.A,i] and [CO.sub.B,i] by use of two x axes. Alternatively, sensitivity could be used as the x axis.
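One way to enumerate cutoff pairs that satisfy the equal-sensitivity criterion is to walk through the ranked values of the PCa cases for each test. The sketch below is an assumption about how this could be done, not a description of the authors' program; the function name and the convention that "positive" means strictly above the cutoff are hypothetical. Ranks at which either test has tied values are skipped, because no cutoff then yields exactly that number of positives.

```python
from typing import List, Sequence, Tuple


def equal_sensitivity_cutoff_pairs(
    a_pos: Sequence[float], b_pos: Sequence[float]
) -> List[Tuple[float, float, float]]:
    """Return (co_a, co_b, sensitivity) triples computed from diseased cases only.

    a_pos and b_pos hold the test A and test B values of the same diseased cases.
    """
    va = sorted(a_pos, reverse=True)
    vb = sorted(b_pos, reverse=True)
    n = len(va)
    pairs = []
    for k in range(1, n):
        # Using the (k+1)-th largest value as cutoff leaves exactly k cases
        # strictly above it, provided it is not tied with the k-th largest.
        if va[k - 1] > va[k] and vb[k - 1] > vb[k]:
            pairs.append((va[k], vb[k], k / n))
    return pairs


pairs = equal_sensitivity_cutoff_pairs([1.0, 2.0, 3.0, 4.0], [0.8, 1.5, 2.5, 2.5])
```

Equal sensitivity implies that Q2 and Q4 contain the same number of PCa cases, because the cases in Q3 are positive for both tests and cancel out of the comparison.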

To estimate the significance of different diagnostic accuracies, we suggest calculating the differences between [DAC-SPEC.sub.B] and [DAC-SPEC.sub.A] and between [DAC-PPV.sub.B] and [DAC-PPV.sub.A] for each pair of cutoffs. The pointwise confidence intervals (CIs) can then be calculated using the methods related to the difference of two proportions with formulas given by Altman (16). In the case of DAC-SPEC, paired observations must be considered, whereas non-paired proportions for DAC-PPV can be assumed. For both DAC-SPEC and DAC-PPV, differences >0 would be indicative of superior diagnostic accuracy for test B, and lower limits of the CIs >0 would indicate significance of the result.
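The two kinds of pointwise CI can be sketched as follows. This is a minimal illustration using the standard normal-approximation formulas for the difference of two proportions (paired and unpaired), which are assumed here to correspond to the formulas cited from Altman (16); the function names are hypothetical. For DAC-SPEC, every nondiseased case in (Q2 + Q4) is a TN for exactly one test, so the number of discordant pairs equals the number of such cases.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% CI


def dac_spec_diff_ci(tn_a: int, tn_b: int, z: float = Z95):
    """Paired CI for DAC-SPEC_B - DAC-SPEC_A over nondiseased cases in (Q2 + Q4)."""
    n = tn_a + tn_b
    diff = (tn_b - tn_a) / n
    # Paired-proportion standard error; all n pairs are discordant here.
    se = math.sqrt(n - (tn_b - tn_a) ** 2 / n) / n
    return diff, diff - z * se, diff + z * se


def dac_ppv_diff_ci(tp_a: int, fp_a: int, tp_b: int, fp_b: int, z: float = Z95):
    """Unpaired CI for DAC-PPV_B - DAC-PPV_A (Q2 cases vs Q4 cases)."""
    n_a, n_b = tp_a + fp_a, tp_b + fp_b
    p_a, p_b = tp_a / n_a, tp_b / n_b
    diff = p_b - p_a
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, diff - z * se, diff + z * se


spec_diff, spec_lo, spec_hi = dac_spec_diff_ci(tn_a=10, tn_b=30)
ppv_diff, ppv_lo, ppv_hi = dac_ppv_diff_ci(tp_a=10, fp_a=30, tp_b=10, fp_b=10)
```

With these invented counts the lower limit for the DAC-SPEC difference lies above 0, whereas the interval for the DAC-PPV difference includes 0. Normal approximations of this kind become unreliable when the quadrant counts are small.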

This approach, leading to one or several graphs characterizing the discordant test results, is called the DAC method. A suitable computer program has been developed. (Copies of the program can be obtained from Dr. Keller at thomas.keller@acomed.de or www.acomedstatistics.com/dac-method.html.)

Sometimes it may be appropriate to consider a summary measure. We propose the use of medians of DAC-SPEC and DAC-PPV values and the medians of ratios of these parameters: [R.sub.DAC-SPEC] = median ([DAC-SPEC.sub.B,i]/[DAC-SPEC.sub.A,i]) and [R.sub.DAC-PPV] = median ([DAC-PPV.sub.B,i]/[DAC-PPV.sub.A,i]), respectively. For example, a median ratio >1 would indicate better diagnostic accuracy of marker B. However, like the AUC of ROC curves, these overall measures do not provide information about the local diagnostic performance.
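As a small illustration of these summary measures, the following Python sketch computes the medians and the median ratios from pointwise values. The input lists are invented for demonstration, the function name is hypothetical, and cutoff pairs with a zero denominator are simply skipped, which is an assumption rather than a rule given in the text.

```python
from statistics import median
from typing import Dict, Sequence


def dac_summary(spec_a: Sequence[float], spec_b: Sequence[float],
                ppv_a: Sequence[float], ppv_b: Sequence[float]) -> Dict[str, float]:
    """Median DAC values and median ratios (B over A) across all cutoff pairs."""
    return {
        "median_spec_a": median(spec_a),
        "median_spec_b": median(spec_b),
        "median_ppv_a": median(ppv_a),
        "median_ppv_b": median(ppv_b),
        # Ratios are formed per cutoff pair i, then summarized by their median.
        "r_dac_spec": median([b / a for a, b in zip(spec_a, spec_b) if a > 0]),
        "r_dac_ppv": median([b / a for a, b in zip(ppv_a, ppv_b) if a > 0]),
    }


summary = dac_summary(
    spec_a=[0.2, 0.25, 0.4], spec_b=[0.8, 0.75, 0.6],
    ppv_a=[0.25, 0.5, 0.5], ppv_b=[0.5, 0.5, 0.75],
)
```

Note that the median of the ratios is generally not the same as the ratio of the medians, so the two should be reported separately.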

STATISTICAL ANALYSIS

DAC-SPEC, DAC-PPV, and the pointwise CIs of their differences were calculated as described above (16). An assay was estimated as superior if the related DAC-SPEC and DAC-PPV values were higher than those of the comparative assay. For graphical presentations, raw data (counts) were smoothed by use of a triangular smoothing function (17).
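A triangular smoothing function can be implemented as a weighted moving average whose weights fall off linearly with distance from the center. The sketch below is one plausible form, not necessarily the one used in this study; the window half-width is a hypothetical parameter that the text does not specify.

```python
from typing import List, Sequence


def triangular_smooth(values: Sequence[float], half_width: int = 2) -> List[float]:
    """Weighted moving average with triangular weights; the window shrinks at the edges."""
    n = len(values)
    smoothed = []
    for i in range(n):
        num = 0.0
        den = 0.0
        for j in range(max(0, i - half_width), min(n, i + half_width + 1)):
            weight = half_width + 1 - abs(i - j)  # largest at the center, 1 at the rim
            num += weight * values[j]
            den += weight
        smoothed.append(num / den)
    return smoothed


spike = triangular_smooth([0, 0, 9, 0, 0], half_width=2)
```

A single spike of 9 is spread over its neighbors as 1.5, 2.25, 3.0, 2.25, 1.5, which shows how isolated fluctuations in sparse regions of the scatterplot are damped.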

All calculations and graphs for ROC analysis were made with an Excel[TM] (version XP for Windows; Microsoft Corporation) software program (www.acomedstatistics.com/roc-tools.html). Differences in ROC curves were estimated according to DeLong et al. (3). CIs for the AUC were calculated according to Hanley and McNeil (2). The significances of the overall parameters [medians of DAC-SPEC and DAC-PPV and medians of their ratios ([R.sub.DAC-SPEC] and [R.sub.DAC-PPV])] were considered on the basis of the 95% CIs of their medians calculated by bootstrapping (18), using 10 000 bootstrap replicates. The method was programmed using the statistical computer program R (19,20).
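The bootstrap CI of a median can be sketched with a simple percentile bootstrap. The authors used R; the Python version below is only an illustration, and it makes two assumptions that the text does not confirm: that the percentile method was used, and that the resampling unit is the pointwise DAC value. Resampling patients and recomputing the whole DAC curve for each replicate would respect the dependence between neighboring cutoffs and may be closer to what was actually done.

```python
import random
from statistics import median
from typing import Sequence, Tuple


def bootstrap_median_ci(values: Sequence[float], n_boot: int = 10_000,
                        alpha: float = 0.05, seed: int = 0) -> Tuple[float, float, float]:
    """Percentile-bootstrap CI for the median; returns (median, lower, upper)."""
    rng = random.Random(seed)          # fixed seed for reproducibility
    n = len(values)
    boot_medians = sorted(
        median(rng.choices(values, k=n))   # resample with replacement
        for _ in range(n_boot)
    )
    lower = boot_medians[int((alpha / 2) * n_boot)]
    upper = boot_medians[int((1 - alpha / 2) * n_boot) - 1]
    return median(values), lower, upper


values = [0.6, 0.7, 0.75, 0.78, 0.8, 0.82, 0.9]  # invented DAC-SPEC values
med, lo, hi = bootstrap_median_ci(values, n_boot=2000)
```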

Results

ROC curves for cPSA and tPSA, including the whole data set, are shown in Fig. 2. The AUC for cPSA is significantly greater than the AUC for tPSA: 0.691 (95% CI, 0.655-0.725) vs 0.668 (95% CI, 0.631-0.702); P <0.0005.

To estimate the diagnostic performance only in the interesting range 2-4 [micro]g/L, a subgroup analysis seems appropriate. Scatterplots for the subgroup of patients with tPSA concentrations in the range 2-4 [micro]g/L and corresponding cPSA concentrations in the range 1.51-3.19 [micro]g/L are shown in panels A and B, respectively, of Fig. 3. The graphs demonstrate that, particularly at the edges of the selected concentration range, different patients are included in such a subgroup analysis. As shown in the ROC curves (Fig. 3, C and D), cPSA-based selection of patients leads to a significant difference between AUCs (P <0.02), whereas tPSA-based selection fails to show this difference (P = 0.15). Furthermore, the absolute values of the AUC obtained by the different selection procedures differ (tPSA, 0.55 vs 0.48; cPSA, 0.58 vs 0.53) as do the positions and shapes of the ROC curves.

The results obtained with the DAC method (Fig. 4) show the DAC-SPEC and DAC-PPV values as well as the calculated differences of DAC-SPEC and DAC-PPV graphed over the cutoffs of both analyses. DAC-SPEC and DAC-PPV values were significantly higher for cPSA in a wide range of tPSA between ~2.5 and 5.8 [micro]g/L, corresponding to cPSA values of ~1.9 to 4.8 [micro]g/L, as indicated by the positive values for the lower CI.

[FIGURE 2 OMITTED]

The median [DAC-SPEC.sub.cPSA] value of 0.78 (95% CI, 0.61-0.87) differed significantly from the [DAC-SPEC.sub.tPSA] (0.22; 95% CI, 0.13-0.39). Results were similar for the DAC-PPV of 0.63 (95% CI, 0.53-0.82) for cPSA vs 0.31 (95% CI, 0.27-0.50) for tPSA. The CIs of the medians of both pairs of DAC-SPEC for cPSA and tPSA and of DAC-PPV, respectively, did not overlap and indicated a significant difference between the medians. The medians of the ratios [R.sub.DAC-SPEC] and [R.sub.DAC-PPV], explained in the Materials and Methods, were calculated to be [R.sub.DAC-SPEC] = 3.57 (95% CI, 1.53-6.67) and [R.sub.DAC-PPV] = 2.01 (95% CI, 1.30-2.69) and differed significantly from 1.

Discussion

SUBGROUP ANALYSES IN CLINICAL STUDIES WITH cPSA AND tPSA

One recent approach to enhance the clinical utility of PSA assays is the use of cPSA forms. However, discrepant results have been described (7-12,14,21,22). To compare the diagnostic accuracies in different ranges of tPSA concentrations, selected tPSA ranges were frequently chosen as study populations. We clearly demonstrated in this study that the results of a comparison between both assays may be influenced by the inclusion criteria for the subgroups analyzed (Fig. 3).

Conclusions about the diagnostic validity of cPSA vs tPSA have generally been based on ROC analysis taking into account the AUC values and partly the comparison of sensitivity and specificity at certain cutoffs. Only a few studies exist for the low tPSA range <4 [micro]g/L (7-9,11). In a multicenter study including more than 500 men with tPSA <4 [micro]g/L, the differences between cPSA and tPSA in differentiating men with PCa and men without PCa were not clearly demonstrated (7). Although a significantly larger AUC for cPSA in the tPSA range 2.5-4 [micro]g/L was found, differences in the specificities of cPSA vs tPSA at the selected sensitivities of 80%, 85%, 90%, and 95% were not statistically significant for all sensitivity values. Two similar multicenter studies of men with tPSA concentrations <4 [micro]g/L described improved detection of PCa by cPSA based on differences between the AUCs (8,11). In addition to the different clinical settings used in these studies, one reason for these discrepancies may be, at least partially, attributed to selection artifacts at the edges of narrow tPSA ranges as demonstrated in Fig. 3. Therefore, in regard to the conventional strategy of comparative ROC analysis, the results of various studies for the evaluation of the diagnostic impact of cPSA should be considered with caution.

These uncertainties in analyzing the data and interpreting the study results were the starting point for us to develop the DAC method. As described in the Results and demonstrated in Fig. 4, DAC analysis allows description of the overall and local differences in the clinical utility of both tests and suggests a significant advantage of cPSA over tPSA, as indicated by higher values for DAC-SPEC and DAC-PPV, respectively. The results are caused by the lower number of FP samples for cPSA compared with a higher number of FP samples for tPSA among the patients with discordant test results. The clinical impact of these results will be discussed in a separate report.

COMPARING ROC AND DAC

The comparison of the results of ROC and DAC analyses of our reevaluated data and the corresponding conclusions make it necessary to discuss the utility of both methods.

The major disadvantage of ROC analysis was described in the introductory paragraph and is caused by the property of the ROC approach that gives equal weight to all FP rates (5). Therefore, when comparing two tests, the performances of both assays near the cutoffs are difficult to describe by use of ROC curves of the whole data set. To overcome this problem, it is current practice to perform ROC analysis on subgroups of the data set defined by a limited concentration range of one of the markers in question. However, this approach is subject to severe biases resulting from selection effects when the subgroups are defined by a concentration range of only one of the assays in question, as can be seen in Fig. 3. In conclusion, the diagnostic performance of one assay alone around a cutoff cannot be described in a representative way, nor is it possible to get an error-free comparative analysis of the relative performance of two diagnostic tests.

[FIGURE 3 OMITTED]

In contrast, the initial step of the DAC method is an error-free, clearly defined selection of local subpopulations. The method focuses on the discordant test results. It considers exactly those cases that solely are responsible for differences in diagnostic accuracy. Selection artifacts are thus avoided. The DAC approach may make the comparative ROC subgroup analyses unnecessary.

[FIGURE 4 OMITTED]

Whereas it is current practice to combine the results of several subgroup analyses of different ranges to describe the performance in selected ranges, the DAC approach leads to meaningful and easy-to-read data and graphs for the comparison of tests within only one analysis. The concentration ranges with different diagnostic accuracies can immediately be identified. In terms of hypothesis testing, the null hypothesis of no difference can be tested at prospectively chosen cutoffs or ranges of cutoffs. Furthermore, comparison of the results of different studies can be simplified because the result of DAC analysis (e.g., DAC-SPEC and DAC-PPV) is not influenced by any subgroup selection. This is in contrast to a ROC analysis, in which subgroup selections with different ranges around a given cutoff would lead to different values for sensitivity and specificity.

In our example, the differences between cPSA and tPSA are smaller at the upper end of the concentration range (close to 6 [micro]g/L) of the sample population. This is attributable to the inclusion criterion of the study samples (tPSA <6 [micro]g/L), which leads to underrepresentation of FP values in Q4 compared with Q2 in this concentration range.

In addition to this pointwise analysis, the DAC method gives a valid overall picture. Medians of DAC-SPEC and DAC-PPV or the median of the ratios [R.sub.DAC-SPEC] and [R.sub.DAC-PPV] characterize the overall performance, whereas their CIs estimate the corresponding significance level.

Unlike the ROC analysis, which is based on and limited to the calculation of sensitivities and specificities, the DAC method is primarily a selection tool for subpopulations responsible for differences in the diagnostic accuracy of tests. The DAC method paves the way for a new possibility of separation of study populations: properties of Q2 samples can be evaluated vs the properties of Q4 samples, which may provide data of clinical relevance. Here we focus on the test results, such as TP and TN values, but it would also be possible to perform DAC analysis on variables such as age or tumor stage. These approaches allow deeper insights into causes or consequences of different test results and will be presented in a separate report.

Regarding the parameters DAC-SPEC and DAC-PPV analyzed here, one has to take note of their interesting properties, which are attributable to the equivalencies described in the Materials and Methods and lead to a simplification of analysis. The DAC-SPEC values of both tests add up to 1, and DAC-PPV and DAC-NPV depend on each other.

The DAC-PPV used here is strongly related to the physician's decision-making because it refers to the proportion of people with a positive test who have the target disorder (5,23). The prevalence dependency of this parameter must be taken into account, however. For example, low prevalences in screening settings would lead to lower DAC-PPV values. This should affect the DAC-PPV values of both tests in a quite similar manner. However, the aim of the DAC method is not to calculate DAC-PPVs as absolute values but to compare them to assess differences in diagnostic accuracy. The hypothesis test regarding differences in DAC-PPV does not depend on prevalence. Furthermore, the ratio of the two values strongly reduces the prevalence dependency.

As can be seen in Fig. 1, the DAC method is quite easy to use. We programmed a calculating tool that can be used as an add-in within Excel (Microsoft Corp.). (Copies of the program can be obtained from Dr. Keller at thomas.keller@acomed.de or www.acomed-statistics.com/dac-method.html.)

In practice, there are two difficulties to be solved before applying DAC analysis: First, it is necessary to define the corresponding pairs of cutoffs. In this report, we described the determination using the criterion of equal numbers of TP test results. However, particularly in the case of low numbers of these cases, difficulties occur because of a strong influence of the local distribution of these cases in the scatterplot. Therefore, similar criteria, such as equal numbers of TN values or equal numbers of cases, can be applied. The regression approach according to Passing and Bablok (24) applied to all patients or only to positive cases is also helpful. It should be mentioned that with the data set analyzed here all four possibilities lead to similar results.
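To illustrate the regression alternative, the sketch below implements a simplified form of the Passing and Bablok estimator: the slope is the shifted median of all pairwise slopes, and the intercept is the median of the residual offsets. It omits the CIs and the linearity test of the full procedure, the function names are hypothetical, and it is meant only to show how a fitted line could map a cutoff of test A to a corresponding cutoff of test B.

```python
from statistics import median
from typing import Sequence, Tuple


def passing_bablok(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Simplified Passing-Bablok fit; returns (slope, intercept)."""
    slopes = []
    n = len(x)
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx, dy = x[j] - x[i], y[j] - y[i]
            if dx == 0 and dy == 0:
                continue                      # identical points carry no slope
            if dx == 0:
                slopes.append(float("inf") if dy > 0 else float("-inf"))
                continue
            s = dy / dx
            if s != -1:                       # slopes of exactly -1 are excluded
                slopes.append(s)
    slopes.sort()
    big_n = len(slopes)
    k = sum(1 for s in slopes if s < -1)      # offset that shifts the median
    if big_n % 2:
        slope = slopes[(big_n - 1) // 2 + k]
    else:
        slope = 0.5 * (slopes[big_n // 2 - 1 + k] + slopes[big_n // 2 + k])
    intercept = median([yi - slope * xi for xi, yi in zip(x, y)])
    return slope, intercept


def corresponding_cutoff(co_a: float, slope: float, intercept: float) -> float:
    """Map a cutoff for test A (x) to the matching cutoff for test B (y)."""
    return intercept + slope * co_a


slope, intercept = passing_bablok([1, 2, 3, 4, 5], [3, 5, 7, 9, 11])
co_b = corresponding_cutoff(2.5, slope, intercept)
```

Cutoff pairs obtained this way are not guaranteed to give exactly equal sensitivity, so the resulting pairs should be checked against the criterion they are meant to replace.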

The second difficulty results from the low absolute numbers in quadrants Q2 and Q4 in ranges of low sample density. This would give rise to excessive scatter and fluctuation of the resulting curves. Although the overall result (medians compared by bootstrapping) remains unaffected, the graph is hard to read. For this reason, appropriate smoothing procedures should be used before graphing of the results. For the DAC method, simple smoothing procedures such as use of weighted means are sufficient.

In summary, the DAC method represents an adequate analytical tool for comparing the diagnostic performance of two assays. Its positive features are the ability to assess in detail the local performance of two tests, e.g., close to clinically relevant cutoffs, without compromising the overall picture, and the avoidance of subgroup selection artifacts. The DAC method could be used in parallel with ROC analysis of the complete sample population and could replace comparative ROC analyses of subgroups.

This study was funded by Bayer Vital GmbH (Fernwald, Germany), by a grant for Thomas Keller to develop the DAC program. All other authors except Hermann Butz, who is an employee of Bayer, did not receive any consulting fee and/or funds from Bayer. The work was also supported in part by the SONNENFELD-Stiftung, Deutsche Forschungsgemeinschaft (Ju365/5-1), and Deutsche Krebshilfe (70-3295-ST1). We gratefully thank Silke Klotzek and Sabine Becker for excellent technical support.

Received July 1, 2004; accepted November 5, 2004.

Previously published online at DOI: 10.1373/clinchem.2004.039552

References

(1.) Zweig MH, Campbell G. Receiver-operating characteristic (ROC) plots: a fundamental evaluation tool in clinical medicine [Review]. Clin Chem 1993;39:561-77.

(2.) Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982; 143:29-36.

(3.) DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837-45.

(4.) Obuchowski NA, Lieber ML, Wians FH. ROC curves in Clinical Chemistry: uses, misuses, and possible solutions. Clin Chem 2004; 50:1118-25.

(5.) McClish DK. Analyzing a portion of the ROC curve. Med Decis Making 1989;9:190-5.

(6.) Obuchowski NA, McClish DK. Sample size determination for diagnostic accuracy studies involving binormal ROC curve indices. Stat Med 1997;16:1529-42.

(7.) Lein M, Kwiatkowski M, Semjonow A, Luboldt H-J, Hammerer P, Stephan C, et al. A multicenter clinical trial on the use of complexed prostate specific antigen in low prostate specific antigen concentrations. J Urol 2003;170:1175-9.

(8.) Horninger W, Cheli CD, Babaian RJ, Fritsche HA, Lepor H, Taneja SS, et al. Complexed prostate-specific antigen for early detection of prostate cancer in men with serum prostate-specific antigen levels of 2 to 4 nanograms per milliliter. Urology 2002;60:31-5.

(9.) Okihara K, Fritsche HA, Ayala A, Johnston DA, Allard WJ, Babaian RJ. Can complexed prostate specific antigen and prostatic volume enhance prostate cancer detection in men with total prostate specific antigen between 2.5 and 4.0 ng/ml. J Urol 2001;165:1930-6.

(10.) Brawer MK, Meyer GE, Letran JL, Bankson ER, Morris DL, Yeung KK, et al. Measurement of complexed PSA improves specificity for early detection of prostate cancer. Urology 1998;52:372-8.

(11.) Partin AW, Brawer MK, Bartsch G, Horninger W, Taneja SS, Lepor H, et al. Complexed prostate specific antigen improves specificity for prostate cancer detection: results of a prospective multicenter clinical trial. J Urol 2003;170:1787-91.

(12.) Okihara K, Cheli CD, Partin AW, Fritsche HA, Chan DW, Sokoll LJ, et al. Comparative analysis of complexed prostate specific antigen, free prostate specific antigen and their ratio in detecting prostate cancer. J Urol 2002;167:2017-23.

(13.) Jung K, Elgeti U, Lein M, Brux B, Sinha P, Rudolph B, et al. Ratio of free or complexed to total prostate specific antigen: which ratio should be determined to improve the differentiation between benign prostatic hyperplasia and prostate cancer? Clin Chem 2000;46:55-62.

(14.) Allard WJ, Zhou Z, Yeung KK. Novel immunoassay for the measurement of complexed prostate-specific antigen in serum. Clin Chem 1998;44:1216-23.

(15.) Sturgeon C, Dati F, Duffy MJ, Hasholzner U, Klapdor R, Lamerz R, et al. European Group on Tumour Markers. Tumour marker recommendations. http://egtm.web.med.uni-muenchen.de/detail/1.htm (accessed September 2004).

(16.) Altman DG. Practical statistics for medical research. London: Chapman & Hall, 1991:611pp.

(17.) Härdle W, Simar L. Applied multivariate analysis. Berlin: Springer, 2003:486pp.

(18.) Efron B, Tibshirani R. An introduction to the bootstrap. New York: Chapman & Hall, 1993:436pp.

(19.) R Development Core Team. The R manuals, version 1.9.1. http://www.r-project.org (accessed September 2004).

(20.) Ihaka R, Gentleman R. R: a language for data analysis and graphics. J Comp Graph Stat 1996;5:299-314.

(21.) Parsons JK, Partin AW. Applying complexed prostate-specific antigen to clinical practice. Urology 2004;63:815-8.

(22.) Okihara K, Ukimura O, Nakamura T, Mizutani Y, Naya Y, Uchida M, et al. Can complexed prostate specific antigen enhance prostate cancer detection in Japanese men? Eur Urol 2004;46:57-64.

(23.) Biggerstaff BJ. Comparing diagnostic tests: a simple graphic using likelihood ratios. Stat Med 2000;19:649-63.

(24.) Passing H, Bablok W. A new biometrical procedure for testing the equality of measurements from two different analytical methods. Application of linear regression procedures for method comparison studies in clinical chemistry, part I. J Clin Chem Clin Biochem 1983;21:709-20.

THOMAS KELLER, [1] HERMANN BUTZ, [2] MICHAEL LEIN, [3] MACIEJ KWIATKOWSKI, [4] AXEL SEMJONOW, [5] HANS-JOACHIM LUBOLDT, [6] PETER HAMMERER, [7] CARSTEN STEPHAN, [3] and KLAUS JUNG [3] *

[1] ACOMED Statistik, Leipzig, Germany.

[2] Bayer Vital GmbH, Leverkusen, Germany.

[3] Department of Urology, University Hospital Charite, Humboldt University Berlin, Berlin, Germany.

[4] Department of Urology, Kantonsspital Aarau, Aarau, Switzerland.

[5] Department of Urology, University Hospital Munster, Munster, Germany.

[6] Department of Urology, University of Duisburg-Essen, Essen, Germany.

[7] Department of Urology, Städtisches Klinikum Braunschweig, Braunschweig, Germany.

[8] Nonstandard abbreviations: AUC, area under the curve; tPSA, total prostate-specific antigen; cPSA, complexed prostate-specific antigen; DAC, discordance analysis characteristics; PCa, prostate cancer; CO, cutoff; TP, true positive; TN, true negative; FP, false positive; FN, false negative; PPV, positive predictive value; NPV, negative predictive value; and CI, confidence interval.

* Address correspondence to this author at: Department of Urology, University Hospital Charite, Humboldt University Berlin, Schumannstrasse 20/21, D-10098 Berlin, Germany. Fax 49-30-450-515904; e-mail klaus.jung@charite.de.


Publication: Clinical Chemistry, Mar 1, 2005
