A new approach for clinical biological assay comparison and standardization: application of principal component analysis to a multicenter study of twenty-one carcinoembryonic antigen immunoassay kits.
The measurement of carcinoembryonic antigen (CEA) constitutes a major model for pointing out discrepancies and defects in standardization (6-10). This marker, described in 1965 by Gold and Freedman (11), is a glycoprotein whose functions have not been elucidated (12). Its increase is associated with the progression of gastrointestinal tumors and other cancers (13,14). Used worldwide for the diagnosis and follow-up of cancer patients (15), CEA is one of the most frequently measured tumor markers in France. The importance of the risk of error in CEA measurement must not be ignored by clinicians (16), particularly for results close to the commonly used cutoff value of 5 µg/L.
Taking into account these considerations, a joint group composed of representatives of several scientific societies (the Societe Francaise de Biologie Clinique, the Commission de Radioanalyse et Techniques Associees, and the Federation Nationale des Centres de Recherches et de Lutte contre le Cancer) and a federation of manufacturers, the Syndicat de L'Industrie du Diagnostic in Vitro (SIDV), decided to evaluate the actual degree of heterogeneity of the results given by the different CEA immunoassay kits distributed in France. For the first time, nearly all of the commercial kits available on a national market for the assay of one given marker were tested simultaneously on a large panel of serum samples: 21 kits (distributed by 14 companies) were evaluated by the measurement of CEA in sera from 80 patients.
Statistical analysis of so many data points raises a problem. Classical regression analysis used for evaluating results obtained by one method vs those of a reference method (or at least a commonly used one), and which only compares kits two-by-two, was inadequate, even if improvements in this approach have been proposed (17,18). The two-by-two kit comparison is particularly unsuitable in the case of CEA immunoassays for which there are no methods that measure the absolute quantity of CEA and no recognized standard methods to compare the results. A "mathematical visualization" technique was thus necessary to provide an easy-to-understand representation of our large data set. Consequently, we used principal component analysis (PCA), which has been shown to be a very powerful tool for displaying relationships between different factors (particularly in social and agronomic sciences), but which has never been applied, to our knowledge, in the field of medical analysis evaluation.
A brief explanation of the principle of PCA (19, 20), as applied to our study, follows. PCA is a method that reduces the number of variables to a small number of principal components. These components summarize the information in the original variables and are linear combinations of them. Suppose we had tested the 21 kits by assaying only two sera. On a graph with two axes (on a plane, i.e., a two-dimensional space), we could represent each kit by a point whose coordinates are the CEA concentration of the first serum on one axis and the CEA concentration of the second serum on the second axis. (Each kit can also be represented by an associated vector whose origin is the intercept of the axes and whose extremity is one of the above-stated points.) Kits giving quite similar results would be represented by points clustered into the same group, and discrepant kits would give outlying points. If we had three sera, we could use a third axis, orthogonal to the first two axes, in a three-dimensional space. However, two or three sera are not a statistically representative sample, and the addition of more sera implies a space with more than two or three dimensions. Because we have a panel of 80 sera, which definitely constitutes a representative sample, we need an 80-dimensional space, mathematically possible but impossible to display. In this multidimensional space, we can still calculate, for each kit, an associated vector whose coordinates are, on 80 axes, the 80 initial components, i.e., the 80 results of the assay of the 80 sera with each kit. We then need to reduce the number of dimensions to obtain a displayable representation on a plane.
To this aim, in this multidimensional space, a two-dimensional subspace can be determined. This plane is defined by two axes chosen to conserve the maximum of the initial data (these axes must be independent, i.e., orthogonal). In this way, the two axes are determined in the two orthogonal directions representing the maximum dispersion of the data. The projections of each kit-associated vector on this plane are then calculated. The projected vectors will conserve the maximum of the distances between the initial vectors.
On these new axes, the coordinates of the projected vectors are their principal components. Clearly, in this representation, the projected points (i.e., the extremities of the projected vectors) will be clustered for kits giving similar results and separated for discrepant kits. The dispersion of the results for the sera, after elimination of the influence of their absolute CEA concentrations (explained below), can be analyzed in a similar manner, thus making it possible to identify outlying sera as well.
In both cases, we could also calculate a third axis, orthogonal to the first two, and obtain a third principal component. We could then calculate the subsequent axes (up to n - 1 axes, where n is the number of dimensions) and components, which obviously cannot be displayed. In fact, two or three axes bearing the first two or three components would summarize the data and should be easily interpretable. Additional components are mostly "noise". The ability to reduce the data to a very small number of vectors is known (21).
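The geometric idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses NumPy, the function name `project_kits` and the toy table are hypothetical, and the raw values are used directly (without the log transformation and bicentering applied in the actual study). Each kit is treated as a point whose coordinates are its results for every serum, and the points are projected onto the orthogonal axes of maximum dispersion.

```python
import numpy as np

def project_kits(results, n_axes=2):
    """Project kits (columns) onto the first principal axes of serum space.

    results: array of shape (n_sera, n_kits); each kit is a point whose
    coordinates are its results for every serum.
    Returns (coords, inertia_share), with coords of shape (n_kits, n_axes).
    """
    points = np.asarray(results, dtype=float).T      # one row per kit
    centered = points - points.mean(axis=0)          # origin at the centroid of the kits
    # SVD of the centered cloud: the rows of vt are orthogonal axes,
    # ordered by decreasing dispersion of the projected points.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    coords = centered @ vt[:n_axes].T                # principal components of each kit
    inertia_share = s**2 / np.sum(s**2)              # fraction of total dispersion per axis
    return coords, inertia_share[:n_axes]

# Hypothetical toy data: 5 sera (rows) x 4 kits (columns); the last kit is discrepant.
toy = np.array([
    [10.0, 10.5,  9.8, 10.0],
    [20.0, 19.5, 20.4, 35.0],
    [ 5.0,  5.2,  4.9,  5.0],
    [40.0, 41.0, 39.5, 40.0],
    [ 8.0,  7.9,  8.1, 15.0],
])
coords, share = project_kits(toy)
```

In this toy example the first three kits fall close together on the first axis, whereas the fourth kit lies far from them, which is the kind of separation the two-dimensional plot is meant to reveal.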
It is not guaranteed that the principal components will correlate with some experimental factor such as incubation time or temperature, antibody affinity, antibody heterogeneity, patient status, or degree of glycosylation, but such is possible and illustrates the exceptional analytical power of PCA.
Materials and Methods
The panel of 80 sera consisted of 77 samples obtained from patients suffering from breast (n = 51), colorectal (n = 16), pancreas (n = 3), liver (n = 2), stomach (n = 2), lung (n = 1), thyroid (n = 1), or ovary (n = 1) cancer. Three samples were obtained from non-cancer patients. To verify whether the between-assay variability could be decreased by the use of a common calibrator, each laboratory assayed a purified and concentrated 5000 µg/L CEA solution (referred to here as the purified common calibrator), kindly provided by Immunotech (Marseilles, France), which initially had been prepared for the Bureau Communautaire de Reference.
BLIND ASSAY PROCEDURE
The manufacturers involved in the study assayed the panel of sera using their own reagent kits. All tubes of serum were first coded to ensure that the sera could not be identified by the manufacturers. The results obtained with each kit were then coded by the SIDV to ensure that, after the data were gathered, no one (except the SIDV) could identify the kits; this was the condition imposed by the manufacturers for participating in this study.
To test the 21 kits, 21 identical panels of the 80 serum samples were formed. Each serum, identified by its serum code, was divided into 21 aliquot parts in one of our laboratories, the sampling site. The encoding software was created and kept at a second site, the coding site. This software was used to print out 80 boards of 21 labels. Each board bore one of the serum codes. Each label had a letter and a unique randomized number. The 80 boards were then sent to the sampling site.
At the sampling site, the labels from each board were placed at random on the tubes of the corresponding serum sample. The tubes of this sample were then distributed equally among the 21 panels, so that all tubes bearing the same letter were assembled in the same panel. Each tube was thus identified by its unique randomized number code and one letter code common to the 80 tubes of each panel. It was therefore impossible, before the final decoding, to establish a connection between the serum samples in the different panels.
The 21 panels of 80 tubes were frozen and sent to the SIDV, where letter codes were assigned to the kits (the SIDV kept these letter codes secret and destroyed them at the end of the study). The panels were then dispatched to the manufacturers of the designated kits. Each manufacturer assayed the serum samples once or in duplicate (for 14 kits). The purified common calibrator was assayed once for 5 kits, in duplicate for 12 kits, and in triplicate for 2 kits; no results were given for 2 kits. Each manufacturer was free to choose the dilutions of the purified common calibrator so that they fell within the concentration range for the kit in question.
The results were returned, via the SIDV, to the coding site, where the number codes of the sera were decoded and the results tabulated for the corresponding serum-kit couples. The kits remained identified by their letter codes only.
ANALYSIS OF THE DATA
The data (i.e., single results or the mean of duplicates) were collected in an 80 × 21 array with the individuals (sera, n = 80) in rows and the variables (kits, p = 21) in columns. Ten missing values were replaced by the mean result for the serum in question.
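The replacement of missing values can be illustrated as follows. This is a minimal sketch, assuming that the "mean result for the serum" is the mean of the results actually obtained for that serum with the other kits; the function name and the small table are hypothetical.

```python
from statistics import mean

def fill_missing_with_serum_mean(table):
    """Replace None entries in each row (serum) by the mean of that row's observed values."""
    filled = []
    for row in table:
        observed = [v for v in row if v is not None]
        if not observed:
            raise ValueError("a serum with no results cannot be imputed")
        row_mean = mean(observed)
        filled.append([row_mean if v is None else v for v in row])
    return filled

# Hypothetical excerpt: 3 sera (rows) x 4 kits (columns); None marks a missing result.
raw = [
    [4.0, 6.0, None, 5.0],
    [20.0, 22.0, 24.0, 26.0],
    [None, 3.0, 3.0, 3.0],
]
filled = fill_missing_with_serum_mean(raw)
```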
As a first approach, we examined the raw data by calculating the general mean value of each kit and the mean value, SD, and CV of each serum. Results given by the purified common calibrator were also analyzed and regression lines (measured vs theoretical values) calculated.
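The descriptive statistics of this first approach can be sketched as shown below. The values are hypothetical, the function names are illustrative, and the use of the sample SD (n - 1 denominator) and of ordinary least squares for the calibrator line are assumptions, since the text does not specify them.

```python
from statistics import mean, stdev

def serum_cv_percent(values):
    """Interkit CV (%) of one serum: sample SD divided by the mean."""
    return 100.0 * stdev(values) / mean(values)

def slope_and_intercept(theoretical, measured):
    """Ordinary least-squares line: measured = slope * theoretical + intercept."""
    mx, my = mean(theoretical), mean(measured)
    sxx = sum((x - mx) ** 2 for x in theoretical)
    sxy = sum((x - mx) * (y - my) for x, y in zip(theoretical, measured))
    slope = sxy / sxx
    return slope, my - slope * mx

# Hypothetical results for one serum measured with four kits.
one_serum = [4.0, 5.0, 6.0, 5.0]
cv = serum_cv_percent(one_serum)

# Hypothetical dilutions of a common calibrator (theoretical vs measured values).
dilutions = [5.0, 10.0, 25.0, 50.0]
readings = [2.5, 5.0, 12.5, 25.0]
slope, intercept = slope_and_intercept(dilutions, readings)
```

With these invented readings the kit recovers half of the theoretical concentration at every dilution, so the line is perfectly linear with a slope of 0.5: good linearity and a slope far from 1 are not mutually exclusive.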
The PCA study was then performed using the ADE 3.6 software (a gift from D. Chessel and S. Doledec, Program Library for the Analysis of Environmental Data, URA CNRS 1451, Universite Lyon 1, Villeurbanne, France).
To reduce and homogenize the variances, the raw data were first log transformed. The array was then bicentered. Bicentering, which leads to a mean value equal to zero in the columns and in the rows, is very useful when the data table is homogeneous (only one measurable dimension), which was the case for our data. This calculation consisted of subtracting from each logarithmic value the general mean of the array ($\mu$), the isolated effect of the serum ($\alpha_i$, in the rows), and the isolated effect of the kit ($\beta_j$, in the columns). Each data point was thus reduced to the residual ($\varepsilon_{ij}$) of the analysis of variance with two factors without interaction. This residual represents the composed effect of each serum-kit couple.
This mathematical treatment, i.e., logarithmic transformation and bicentering, is a crucial point and has a very important consequence (22): it eliminates (a) the effect of the calibration on the results (i.e., the differences because of the use of different calibrators in different kits); and (b) the absolute CEA concentration of each serum. In other words, it extracts the only remaining heterogeneity attributable to the differences in the reactivity between the 80 × 21 couples of CEA samples and kits.
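The consequence stated above can be checked numerically. The following sketch, which assumes NumPy and uses a hypothetical random table, log-transforms and bicenters a sera × kits array, then shows that multiplying every kit by its own calibration factor and every serum by its own concentration factor leaves the residuals unchanged. The function name `log_bicenter` is illustrative only.

```python
import numpy as np

def log_bicenter(results):
    """Log-transform a sera x kits table, then remove the general mean,
    the serum (row) effects, and the kit (column) effects."""
    logs = np.log(np.asarray(results, dtype=float))
    mu = logs.mean()                                  # general mean of the array
    alpha = logs.mean(axis=1, keepdims=True) - mu     # isolated effect of each serum
    beta = logs.mean(axis=0, keepdims=True) - mu      # isolated effect of each kit
    return logs - mu - alpha - beta                   # residuals of the two-factor model

# Hypothetical table: 6 sera (rows) x 4 kits (columns).
rng = np.random.default_rng(0)
table = rng.uniform(2.0, 50.0, size=(6, 4))

# Hypothetical multiplicative distortions: one factor per kit (calibration)
# and one factor per serum (absolute concentration).
kit_factors = np.array([1.0, 0.5, 1.3, 2.0])
serum_factors = np.array([1.0, 3.0, 0.2, 10.0, 1.5, 0.7])
rescaled = table * kit_factors * serum_factors[:, None]

eps = log_bicenter(table)
eps_rescaled = log_bicenter(rescaled)
```

Because a multiplicative factor becomes an additive constant after the log transformation, it is absorbed entirely by the row or column effect, and only the serum-kit interaction remains in the residuals.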
[FIGURE 1 OMITTED]
To analyze the term $\varepsilon_{ij}$, we chose to use the model with "additive main effects and multiplicative interaction (AMMI)", also called "factor analysis of variance" (24-26), proposed by Mandel (24) and Gollob (25): $\varepsilon_{ij} = \lambda \gamma_i \delta_j + \varepsilon'_{ij}$.
In this model, the term $\lambda \gamma_i \delta_j$ represents the multiplicative interaction between serum "i" and kit "j" and is given by the PCA of the set of $\varepsilon_{ij}$.
The variability (called "inertia") of the entire set of 21 points included in the 80-dimensional space was evaluated, and the axes (bearing the principal components) were determined using matrix calculations. The first axis was determined in the direction that allowed the representation of the maximum share of the total inertia, i.e., the one for which the inertia of the values projected on this axis was the highest possible; the second axis was determined in a direction orthogonal to the first axis, giving access to the maximum of the remaining inertia. Additional axes were determined in the same manner. For each axis, the percentage of projected inertia was calculated vs the total inertia of the set and indicated the importance of the axis in the total variability of the set.
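One common way to carry out these matrix calculations is the singular value decomposition of the bicentered residual table; the sketch below assumes this route and NumPy, and is not necessarily the procedure implemented in the ADE software. The names are hypothetical, and the residual table is simulated as a single multiplicative interaction plus a small bicentered noise term. The first singular triplet plays the role of the multiplicative term (the singular value as the scale factor, the left and right singular vectors as the serum and kit scores), and the squared singular values give the percentage of inertia carried by each axis.

```python
import numpy as np

def ammi_pca(eps):
    """PCA of a bicentered sera x kits residual table via SVD.

    Returns kit coordinates, serum coordinates, the percentage of inertia
    per axis, and the first multiplicative term of the decomposition.
    """
    eps = np.asarray(eps, dtype=float)
    u, s, vt = np.linalg.svd(eps, full_matrices=False)
    inertia_pct = 100.0 * s**2 / np.sum(s**2)         # share of total inertia per axis
    kit_coords = vt.T * s                             # kits projected on each axis
    serum_coords = u * s                              # sera projected on each axis
    first_term = s[0] * np.outer(u[:, 0], vt[0])      # rank-one multiplicative interaction
    return kit_coords, serum_coords, inertia_pct, first_term

# Hypothetical interaction scores (each sums to zero, so the table is bicentered).
gamma = np.array([1.0, -1.0, 0.5, -0.5, 0.0])         # 5 sera
delta = np.array([2.0, -1.0, -1.0, 0.0])              # 4 kits

# Small simulated noise, bicentered so that rows and columns keep a zero mean.
rng = np.random.default_rng(1)
noise = rng.normal(scale=0.01, size=(5, 4))
noise -= noise.mean(axis=0, keepdims=True)
noise -= noise.mean(axis=1, keepdims=True)

eps = np.outer(gamma, delta) + noise
kit_coords, serum_coords, inertia_pct, first_term = ammi_pca(eps)
```

Plotting the first two columns of `kit_coords` (or of `serum_coords`) gives the kind of two-dimensional representation used for the kits and the sera.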
[FIGURE 2 OMITTED]
In the AMMI model (24-26), it is usually possible to statistically verify the requisite number of components (i.e., the number of dimensions). In our study, this was not possible for two essential reasons: (a) the interaction could not be tested by the use of the residual variance ($\varepsilon'_{ij}$), and (b) other procedures (23) could not be used because of the excessively large number of kits and sera (exceeding the dimensions of the proposed tables).
The PCA representation of the log-transformed and bicentered data underscored the similarities and the differences between kits (and between sera, which were treated in exactly the same manner).
EVIDENCE FOR OVERALL DISPERSION OF THE RESULTS
Preliminary examination of the results confirmed their suspected heterogeneity. The general mean values for the kits varied from 21.8 µg/L (kit X) to 34.1 µg/L (kit S). The discrepancies between six kits taken as examples are illustrated in Fig. 1. The results for kit B are close to the mean values, but the results for kits I, X, and D increasingly differ from those for kit B. The results for kit D are very far from the mean results. On the other hand, the results for kits K and R (general mean value, 33.7 and 34.1 µg/L, respectively) are very close to each other and seem, perhaps, to differ from those for kit B (general mean value, 26.6 µg/L) only by a proportionality factor.
Inspection of the results for individual sera clearly confirmed the discrepancies. The results for four sera are represented in Fig. 2. The interkit reproducibility, expressed as the CV, ranged from 16% for serum 26 to 49% for serum 10 (Fig. 2A). For serum 10, with a mean value of ~19 µg/L, the results with kits F and X, under or near the cutoff value of 5 µg/L, would have led to an error in diagnosis. The results of the assays of serum 20 and serum 58, whose mean results were close to the cutoff value, underscore how serious this problem is (Fig. 2B). For serum 20, even with a rather low CV (19%), the results varied from 3.5 µg/L to twice that concentration, and indicated for the patient either a favorable or an unfavorable prognosis, depending on the kit used. In the case of serum 58, the discrepancies were more obvious: the mean value of 6.1 µg/L, which was above the cutoff value, apparently indicated a poor prognosis for the patient; nine kits, however, gave reassuring results because they were below the cutoff value. In addition, the same kits did not yield the largest discrepancies for all samples: for example, the largest discrepancies for serum 58 were in kits F and D, and the largest discrepancies for serum 10 were in kits L and X. It must be noted that, even if this kind of representation highlights some degree of heterogeneity, it cannot be used to give an entire and global view of the assays of the 80 sera by the 21 kits.
[FIGURE 3 OMITTED]
In most cases, correlation between theoretical values and measured values of the purified common calibrator showed good linearity of the regression lines, but their slopes differed, ranging from 0.48 to 1.04, depending on the kit. An attempt to equalize the results of the assays by dividing the values by the slopes failed to reduce the heterogeneity.
DATA TRANSFORMATION AND REDUCTION
The differences between the 21 kits are also seen in the histograms and Gaussian curves of the log-transformed results in Fig. 3, which shows particularly marked differences between kit D and the other kits. These differences are even more pronounced in Fig. 4, which is a view of the data after they were bicentered, the areas of the circles (positive values) and the squares (negative values) being proportional to the residuals $\varepsilon_{ij}$. Kit D again appears to be markedly different from the others; kit E can also be singled out.
[FIGURE 4 OMITTED]
[FIGURE 5 OMITTED]
PCA REPRESENTATION OF THE DATA
PCA allows a finer analysis of the heterogeneity. The closer the points were, the more similar the corresponding kits were. In contrast, a point separated from the others represented a kit giving discrepant results.
In the first representation obtained for the kits (Fig. 5), the first two axes determining the plane account for 61.1% of the total inertia. A third axis would add 15% more inertia, but this representation is more difficult to display. The two-dimensional scheme confirms that kit D differs strongly from the others. Its point is so distant from the others that all the data processing (including the PCA of the sera) was performed again without the data from this kit, which is responsible by itself for a large part of the variability of this first representation.
In the new representation (Fig. 6) without kit D, the first two axes account for 61.8% of the new total inertia, and the array is rearranged. Four groups of reagents can now be clearly identified: a main group of 13 (A, B, C, H, J, K, L, N, Q, R, S, V, and, somewhat separated, Y), a group of 4 (G, T, M, and I), a group of 2 (F and X), and finally 1 isolated kit (E).
We pointed out above (Fig. 1) that the results for kit B differed from those for kits K and R, perhaps only by a proportionality factor. Interestingly, it can be noted that, on the PCA plot (Fig. 6), kits K, R, and B fall into the same group (the main group). This strongly suggests that the differences between the results for kit B vs those for kits K and R are essentially caused by a difference in calibration and that this difference has been eliminated by the log transformation and bicentering of the data.
The PCA biplot for the sera is presented in Fig. 7. The differences attributable to their absolute CEA concentration have also been eliminated by the log transformation and the bicentering of the data. The majority of the sera belong to the same cluster of points, suggesting that these sera react similarly; however, some of them, sera 19, 10, 67, 11, 21, and 71, are distinct from this main cloud and can be identified as outliers. These six sera appear to react differently with the set of kits than the main group of sera. Interestingly, the results for serum 10 already appeared as highly variable in the preliminary examination of the results (Fig. 2).
The analyses of variance and the PCA representations of the 20 kits (without kit D) and the 80 sera are summarized in Fig. 8. Each result (each serum-kit couple, represented by a circle or a square, as in Fig. 4) is now positioned by the projection of the kit on the first axis of Fig. 6 and of the serum on the first axis of Fig. 7. A major group, composed of the most homogeneous kits and sera, is easily distinguishable. Kit Y is slightly isolated from this group. The second group of kits (G, T, M, and I), the isolated kits (E, X, and F), and the most distant sera (19, 10, 67, and 71) appear clearly separated.
The preliminary examination of the data showed their high degree of variability: differences between the mean value of the results for each kit, differences between the profiles of the kits with respect to all the sera, very large differences in the CEA concentrations of the samples when measured by the different kits, and differences in the calibration curves when a common calibrator was used. Nevertheless, using this rather classical approach, we were not able to determine exactly which kits and sera were responsible for the dispersion.
[FIGURE 6 OMITTED]
After logarithmic transformation and bicentering, differences between kits (and also between sera) were more striking. The first PCA biplot then showed that the kits could be distributed into a main group and one isolated kit. This analysis confirmed that this kit, kit D, was very different from the others, and its data had to be excluded from further calculations of the PCA representations. Among the remaining 20 kits, 13 kits were found to form a main group, meaning that their results were practically equivalent or proportional. Another kit, kit E, now appeared to be very different from the others.
PCA can be a very useful tool to identify outliers caused by human mistakes (data transcription errors) or instruments out of control; however, although the hypothesis of human blunder cannot be completely eliminated, precautions were taken so that we can be reasonably confident in the conclusions of this work.
The PCA of the sera revealed that 74 samples were similar and 6 were different. For these outlying sera, different results were obtained with the majority of the kits used. No relationship between the results and the nature of the pathology of the patients could be identified, neither with respect to localization of the tumor nor with respect to disease progression, metastatic status, or therapy. These clinical variables are independent data and cannot explain the discrepancies.
[FIGURE 7 OMITTED]
It can be concluded that 8 kits and 6 samples were the main causes of most of the heterogeneity we observed in our assay of 80 sera, using 21 kits. In addition, the conclusions of the mathematical analysis of the kits did not vary when the outlying samples were removed from the data set.
With regard to the calibrators, Bormer (6) showed that neither cross-testing of the CEA calibrators of various kits nor the use of the International Reference Preparation OMS 73/601 reduced the discrepancies. In our study, we were not able to equalize the results of the set of kits by the use of our purified common calibrator. In addition, among the 10 kits calibrated against the OMS 73/601 preparation, only 6 kits belonged to the main group of 13 determined by the PCA (data obtained under the control of the SIDV), thus confirming that a common calibrator alone is not sufficient to improve the results.
Moreover, with regard to this main group of 13 kits, we could have expected that the proportional differences in their results would have been reduced by the use of the purified common calibrator. Surprisingly, even in this group, correcting the results according to the slopes of this calibrator failed to equalize the assays. This could be explained by the choice of our somewhat artificial calibrator, and a more natural calibrator obtained by mixing sera from patients might be more efficient. Along these lines, one solution was to multiply the individual results by the ratio of the mean results for the whole group of 13 kits to the mean result of each of the kits used; this allowed a rather good homogenization of the results (data not shown). In this way, the whole serum panel would in fact become the calibrator and perhaps the best one we can use.
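The rescaling described in this paragraph can be sketched as follows. The table is hypothetical (three kits that differ only by a proportionality factor), the function name is illustrative, and the group mean is taken here as the mean of the kit means, which is an assumption about how the ratio was formed.

```python
from statistics import mean

def rescale_to_group_mean(table):
    """table: rows = sera, columns = kits belonging to one homogeneous group.

    Each kit's results are multiplied by the ratio of the mean result of the
    whole group to the mean result of that kit.
    """
    n_kits = len(table[0])
    kit_means = [mean(row[j] for row in table) for j in range(n_kits)]
    group_mean = mean(kit_means)
    factors = [group_mean / m for m in kit_means]
    rescaled = [[v * f for v, f in zip(row, factors)] for row in table]
    return rescaled, factors

# Hypothetical group: the three kits give proportional results (factors 1, 0.8, 1.25).
true_values = [4.0, 10.0, 40.0]
kit_scale = [1.0, 0.8, 1.25]
table = [[b * c for c in kit_scale] for b in true_values]

rescaled, factors = rescale_to_group_mean(table)
```

For kits that differ only by a proportionality factor, this correction makes the columns coincide; it does not establish that the common values are the true concentrations.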
No particular link with any of the other known characteristics of the kits (information obtained under the control of the SIDV) was found to explain the differences between the groups. In the main group, five kits use at least one polyclonal antibody and five use a monoclonal system. Antibodies directed against three epitopes on the CEA molecule, Gold I, IV, and V (27,28), are used most often, but seem to be distributed randomly among the kits. The fact remains that CEA immunoassays need to be studied in more detail. A clear identification of the CEA epitopes recognized (28,29) is a key point, given the existence of mutants of the CEA molecule (30) and variations in glycosylation (31). Other characteristics of the assay must be taken into consideration: for example, the number of monoclonal antibodies used to capture the antigen, the kinetic constants of interaction, the buffer, the coating procedure, and the nature of the solid phase and of the detection system. An attempt to select antibodies directed against particular antigenic sites has been successful for the immunoassay of thyroglobulin (32) and luteinizing hormone (33, 34).
[FIGURE 8 OMITTED]
In conclusion, although we were unable to explain the differences between the results obtained using different kits and to attribute these differences to a particular characteristic of the kits, PCA representation allowed us to compare simultaneously, in a large serum panel, almost all of the reagents available on a national market proposed for the measurement of an analyte. To our knowledge, this is the first attempt to apply PCA in the field of medical laboratory analysis. The coding procedure and the collaboration between scientists and manufacturers avoided any biases and disagreements.
PCA, which has generally been used to identify and represent the differences and similarities between populations, can also be used to compare reagent kits (and serum samples). It is a powerful method, able to analyze the results of a large comparative assay, to provide a comprehensive and clear presentation of a large data array, and to graphically display homologies and differences between reagent kits. Nevertheless, it does not allow the determination of whether the results given by a kit are the true values, but only that these results differ from those given by another kit. A common calibrator should no longer be seen as the primary solution to reduce the dispersion of the results, nor should a search for common calibrators be the main focus for manufacturers and researchers. What is essential is to identify those kits giving similar results; only in this situation might a common calibrator significantly reduce the dispersion. In fact, we found that, instead of our artificial common calibrator, the whole panel of patient sera itself was the best calibrator. For any study on standardization of reagent kits, we conclude that a panel of sera from patients would probably be the best reference system. This model, using both PCA and a serum panel, could be extended to any of these studies. It is thus of great interest that this mathematical method become known to biologists and manufacturers. It would complement the classical two-by-two comparisons, whose exclusive use is difficult to apply when many kits are considered.
A biological observatory could be created with the objective of determining the degree of heterogeneity between commercially available kits for the assay of important blood markers and monitoring their eventual changes. By applying the PCA to the assays of large serum panels, this observatory could help manufacturers during the development of new kits by showing them the position of the new reagents with respect to the existing ones. This approach could also be of great interest to legal authorities in charge of quality control and delivery of marketing authorizations. They could also check the stability of the authorized reagents by performing a periodic global survey.
We thank Francis Paolucci and Gerard Derzko for their help in the design of the study and Andrew Kramar for critical comments. The following companies kindly collaborated on this project: Abbott, Amersham Kodak, BioMerieux, Bio-Rad, Boehringer-Mannheim, Chiron, Cis Bio International, Hoechst-Behring, Merck-Clevenot, EG&G, Realef, Produits Roche, Serono Diagnostics, and Sorin Biomedica.
Received December 17, 1998; accepted March 23, 1999.
(1.) Pilo A, Zuccheli GC, Cohen R, Bizollon CA, Capelli G, Cianetti A, et al. Comparison of immunoassays for tumor markers CA 19-9, CA 15-3 and CA 125: data from an international quality assessment scheme. Tumori 1995;81:117-24.
(2.) Soletormos G, Schioler V, Nielsen D, Skovsgaard T, Dombernowsky P. Interpretation of results for tumor markers on the basis of analytical imprecision and biological variation. Clin Chem 1993;39:2077-83.
(3.) Kenemans P. Multicenter technical and clinical evaluation of a fully automated enzyme immunoassay for CA 125. Clin Chem 1992;38:1466-71.
(4.) Weykamp CW, Penders TJ, Muskiet FAJ, Van der Slik W. Effect of calibration on dispersion of glycohemoglobin values determined by 111 laboratories using 21 methods. Clin Chem 1994;40:138-44.
(5.) Painter PC. Discordant hCG results in pregnancy: a method in crisis. Clin Casebook 1989;27:20-4.
(6.) Bormer OP. Standardization, specificity and diagnostic sensitivity of four immunoassays for carcinoembryonic antigen. Clin Chem 1991;37:231-6.
(7.) Bormer OP. Major disagreements between immunoassays of carcinoembryonic antigen may be caused by nonspecific cross-reacting antigen 2 (NCA-2). Clin Chem 1991;37:1736-9.
(8.) Davidson H, Pledger DR, Belfield A. Lack of comparability between CEA analysis using three different methods. Ann Clin Biochem 1985;22:94-7.
(9.) Turpeinen U, Haglund C, Roberts P, Stenman UH. Comparability of three assays for carcinoembryonic antigen [Letter]. Clin Chem 1992;38:1506-8.
(10.) Morrisey NE, Quadry SF, Kinders R, Brigham C, Rose S, Blend MJ. Modified method for determining carcinoembryonic antigen in the presence of human anti-murine antibodies. Clin Chem 1993;39:522-9.
(11.) Gold P, Freedman SO. Demonstration of tumor-specific antigens in human colonic carcinoma by immunological tolerance and absorption techniques. J Exp Med 1965;121:439-62.
(12.) Lensch HG, Hera SA, Drzeniek Z, Hummel K, Markos Z, Puszta I, Wagener C. Escherichia coli of human origin binds to carcinoembryonic antigen (CEA) and non-specific cross reacting antigen (NCA). FEBS Lett 1990;261:405-9.
(13.) Marx J. Many gene changes found in cancer news. Science 1989;246:1386-8.
(14.) Klemenz R, Hoffmann S, Werenskiold AK. Serum- and oncoprotein-mediated induction of a gene with sequence similarity to the gene encoding carcinoembryonic antigen. Proc Natl Acad Sci U S A 1989;86:5708-12.
(15.) American Society of Clinical Oncology. Clinical practice guidelines for the use of tumor markers in breast and colorectal cancer [Special Article]. J Clin Oncol 1996;14:2843-77.
(16.) Moertel CG, Fleming TR, MacDonald JS, Haller DG, Laurie JA. An evaluation of the carcinoembryonic antigen (CEA) test for monitoring patients with resected colon cancer. JAMA 1993;270:943-7.
(17.) Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986;1:307-10.
(18.) Bland JM, Altman DG. Comparing methods of measurement: why plotting difference against standard method is misleading. Lancet 1995;346:1086-7.
(19.) Seber GAF. Multivariate observations. New York: John Wiley & Sons, 1984:712 pp.
(20.) Sharma S. Applied multivariate techniques. New York: John Wiley & Sons, 1996:493 pp.
(21.) Wernimont G. Evaluating laboratory performance of spectrophotometers. Anal Chem 1967;39:554-62.
(22.) Kasmierczak JB. Analyse logarithmique, deux exemples d'application. Rev Stat Appl 1985;33:13-24.
(23.) Tukey JW. One degree of freedom for non-additivity. Biometrics 1949;5:232-42.
(24.) Mandel J. A new analysis of variance model for non-additive data. Technometrics 1971;13:1-8.
(25.) Gollob HF. A statistical model which combines features of factor analytic and analysis of variance techniques. Psychometrika 1968;33:73-115.
(26.) Goodman LA, Haberman SJ. The analysis of non-additivity in two-way analysis of variance. J Am Stat Assoc 1990;85:139-45.
(27.) Hammarstrom S, Shively JE, Paxton RJ, Beatty BG, Larson A, Ghosh R, et al. Antigenic sites in carcinoembryonic antigen. Cancer Res 1989;49:4852-8.
(28.) Bormer OP, Thrane-Steen K. Epitope group specificity of six immunoassays for carcinoembryonic antigen. Tumor Biol 1991;12:9-15.
(29.) Hass GM, Boiling TJ, Kinders RJ, Henslee JG, Mandecki W, Dorwin SA, Shively JE. Preparation of synthetic polypeptide domains of carcinoembryonic antigen and their use in epitope mapping. Cancer Res 1991;51:1876-82.
(30.) Udenfriend S, Micanovic R, Kodukula K. Structural requirements of a nascent protein for processing to a PI-G anchored form: studies in intact cells and cell-free systems. Cell Biol Int Rep 1991;15:739-59.
(31.) Miura M, Fukuyama Y, Hirano T, Hirano M, Matsuzaki H, Oka H. Sugar chain multiformity of human carcinoembryonic antigen: difference between normal and tumor associated subfractions. Clin Chem 1990;35:583-4.
(32.) Piechaczyk M, Baldet L, Pau B, Bastide J-M. Novel immunoradiometric assay of thyroglobulin in serum with use of monoclonal antibodies selected for lack of cross-reactivity with autoantibodies. Clin Chem 1989;35:422-4.
(33.) Costagliola S, Niccoli P, Florentino M, Carayon P. European collaborative study on luteinizing hormone assay. 1. Epitope specificity of luteinizing hormone monoclonal antibodies and surface mapping of pituitary and urinary luteinizing hormone. J Endocrinol Investig 1994;17:397-406.
(34.) Costagliola S, Niccoli P, Florentino M, Carayon P. European collaborative study on luteinizing hormone assay. 2. Discrepancy among assay kits is related to variation both in standard curve calibration and epitope specificity of kit monoclonal antibodies. J Endocrinol Investig 1994;17:407-16.
JEAN-CLAUDE RYMER, ROBERT SABATIER, ALAIN DAVER, JACQUES BOURLEAUD, MARCEL ASSICOT, JACQUELINE BREMOND, JACQUELINE RAPIN, SHARON LYNN SALHI, BRUNO THIRION, ANNE VASSAULT, JACQUES INGRAND, and BERNARD PAU, [2,12] * for the joint group convened by the SOCIETE FRANCAISE DE BIOLOGIE CLINIQUE (SFBC), the COMMISSION DE RADIOANALYSE ET TECHNIQUES ASSOCIEES (CORATA), the FEDERATION NATIONALE DES CENTRES DE LUTTE CONTRE LE CANCER (FNCLCC), and the SYNDICAT DE L'INDUSTRIE DU DIAGNOSTIC IN VITRO (SIDV)
 Laboratoire de Biochimie, CHU Henri Mondor, Avenue De Lattre de Tassigny, 94000 Creteil, France.
 CNRS UMR 9921, Faculte de Pharmacie, Avenue Charles Flahault, 34060 Montpellier Cedex 2, France.
 Centre Paul Papin, Laboratoire de Radio-Immunologie, 2 Rue Moll, 49036 Angers Cedex, France.
 Agence Francaise de Securite Sanitaire des Produits de Sante, Site de Montpellier-Vendargues, 13 Rue de la Garenne, 34740 Vendargues, France.
 Departement de Biologie Clinique, Institut Gustave Roussy, 94805 Villejuif, France.
 Laboratoires Abbott, 12 Rue de la Couture Silic 203, 94518 Rungis Cedex, France.
 Syndicat de l'Industrie du Diagnostic in Vitro, 6 Rue de la Tremoille, 75008 Paris, France.
 Laboratoire d'Immunologie et Biotechnologie, Faculte de Pharmacie, Avenue Charles Flahault, 34060 Montpellier Cedex 2, France.
 Cis Bio International, BP 21, 91192 Gif sur Yvette Cedex, France.
 Biochimie A, Hopital Necker, 149 Rue de Sevres, 75743 Paris Cedex 15, France.
 Laboratoire de Radioanalyse, Hopital Cochin, 27 Rue du Faubourg St. Jacques, 75674 Paris Cedex 14, France.
 Committee of Immunoanalysis IFCC, Faculte de Pharmacie, Avenue Charles Flahault, 34060 Montpellier Cedex 2, France.
* Author for correspondence. Fax 33 (0)4 67 54 86 10; e-mail firstname.lastname@example.org.
 Nonstandard abbreviations: CEA, carcinoembryonic antigen; SIDV, Syndicat de L'Industrie du Diagnostic in Vitro; PCA, principal component analysis; and AMMI, additive main effects and multiplicative interaction.
 In our case, the raw results are first transformed into the residuals of the analysis of variance (see below).
 If we set $X_{ij}$ as the logarithm of the measured CEA concentration in serum $i$ with kit $j$, then: $\varepsilon_{ij} = X_{ij} - (\mu + \alpha_i + \beta_j)$.
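As an illustration of this residual, the following is a minimal sketch assuming a complete serum-by-kit table with one determination per cell, natural logarithms, and least-squares estimates of the effects (row and column means). The function name `additive_residuals` and the `demo` values are hypothetical and are not taken from the study.

```python
import numpy as np

def additive_residuals(conc):
    """Residuals of the additive two-way model fitted to log concentrations.

    conc: array of shape (n_sera, n_kits) of strictly positive CEA values.
    """
    x = np.log(np.asarray(conc, dtype=float))    # X_ij (natural log assumed)
    mu = x.mean()                                # grand mean
    alpha = x.mean(axis=1, keepdims=True) - mu   # serum effects alpha_i
    beta = x.mean(axis=0, keepdims=True) - mu    # kit effects beta_j
    return x - (mu + alpha + beta)               # epsilon_ij

# Hypothetical concentrations (3 sera x 4 kits), for illustration only.
demo = np.array([[2.0, 2.5, 1.8, 2.2],
                 [5.0, 6.5, 4.1, 5.6],
                 [20.0, 31.0, 15.0, 24.0]])
eps = additive_residuals(demo)
```

With these estimates every row and every column of `eps` sums to zero, so the residual table carries only what the serum and kit main effects leave unexplained.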
 Because each determination was performed only once, $\varepsilon_{ij}$ cannot be split into two terms, the interaction term $\gamma_{ij}$ and the error $\varepsilon'_{ij}$. Tukey (23) proposed intermediate models containing a multiplicative interaction term: $\varepsilon_{ij} = \lambda \alpha_i \beta_j + \varepsilon''_{ij}$.
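A minimal sketch of how the coefficient of such a multiplicative term could be estimated, assuming the least-squares estimate $\hat\lambda = \sum_{ij} \alpha_i \beta_j \varepsilon_{ij} / (\sum_i \alpha_i^2 \sum_j \beta_j^2)$ with the main effects taken as row and column means. The function name `tukey_lambda` and the synthetic table are hypothetical.

```python
import numpy as np

def tukey_lambda(x):
    """Least-squares lambda for eps_ij = lambda * alpha_i * beta_j + remainder."""
    x = np.asarray(x, dtype=float)
    mu = x.mean()
    alpha = x.mean(axis=1) - mu                    # row (serum) effects
    beta = x.mean(axis=0) - mu                     # column (kit) effects
    eps = x - mu - alpha[:, None] - beta[None, :]  # additive residuals
    ab = np.outer(alpha, beta)                     # alpha_i * beta_j
    lam = (ab * eps).sum() / (ab ** 2).sum()       # undefined if all effects are zero
    return lam, eps - lam * ab

# Synthetic table built with an exact multiplicative interaction (lambda = 0.5).
a = np.array([-1.0, 0.0, 1.0])
b = np.array([-0.5, 0.0, 0.2, 0.3])
table = 1.0 + a[:, None] + b[None, :] + 0.5 * np.outer(a, b)
lam_hat, rest = tukey_lambda(table)
```

On the synthetic table the interaction is recovered exactly and the remainder vanishes; on real data the remainder is what the single multiplicative term fails to capture.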
 The calculations are as follows. In the multiplicative interaction model $\varepsilon_{ij} = \lambda \gamma_i \delta_j + \varepsilon'''_{ij}$, $\lambda$ is a coefficient and $\gamma_i$ and $\delta_j$ are the coordinates of one component. In an $r$-dimensional space, with $k = 1$ to $r$ and $1 \le r \le \min(n - 1, p - 1)$: $\varepsilon_{ij} = \lambda^{(1)} \gamma_i^{(1)} \delta_j^{(1)} + \lambda^{(2)} \gamma_i^{(2)} \delta_j^{(2)} + \dots + \lambda^{(r)} \gamma_i^{(r)} \delta_j^{(r)} + \varepsilon'''_{ij}$, that is, $\varepsilon_{ij} = \sum_{k=1}^{r} \lambda^{(k)} \gamma_i^{(k)} \delta_j^{(k)} + \varepsilon'''_{ij}$. The terms $\lambda^{(k)}$, $\gamma_i^{(k)}$, and $\delta_j^{(k)}$ are found by matrix calculations on the $\varepsilon_{ij}$ array.
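The text does not say which matrix calculation is used. One common choice, shown here as an assumption, is the singular value decomposition of the residual array, with the singular values playing the role of the coefficients and the unit-norm singular vectors the role of the coordinates. The function name `multiplicative_terms` and the random residuals are hypothetical.

```python
import numpy as np

def multiplicative_terms(eps, r):
    """First r multiplicative terms of a residual array via SVD (assumed method)."""
    eps = np.asarray(eps, dtype=float)
    u, s, vt = np.linalg.svd(eps, full_matrices=False)
    lam = s[:r]                                # lambda^(k), in decreasing order
    gamma = u[:, :r]                           # column k holds gamma_i^(k)
    delta = vt[:r, :].T                        # column k holds delta_j^(k)
    remainder = eps - (gamma * lam) @ delta.T  # part left out by the r terms
    return lam, gamma, delta, remainder

# Hypothetical double-centred residuals (6 sera x 4 kits).
rng = np.random.default_rng(0)
raw = rng.normal(size=(6, 4))
eps = raw - raw.mean(axis=1, keepdims=True) - raw.mean(axis=0, keepdims=True) + raw.mean()
lam, gamma, delta, remainder = multiplicative_terms(eps, r=3)
```

A double-centred array with n rows and p columns has rank at most min(n - 1, p - 1), which is why the remainder vanishes here when r = 3; a smaller r keeps only the dominant components.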
 The total inertia of a set of points is given by: $I = \sum_i m_i d_i^2$, where $m_i$ is the "weight" of each point and $d_i$ its distance from the origin O. In our case, the homogeneity of the data led us to set all weights equal to one, so that the total inertia equals the sum of the squares of the previous residuals $\varepsilon_{ij}$: $I = \sum_{i,j} \varepsilon_{ij}^2$.
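A minimal sketch of the unit-weight inertia and of how it could be apportioned among components, assuming the components come from a singular value decomposition (in which case the squared singular values add up to the total inertia). The function name `inertia_shares` and the residual values are hypothetical.

```python
import numpy as np

def inertia_shares(eps):
    """Total inertia (unit weights) and the share carried by each component."""
    eps = np.asarray(eps, dtype=float)
    total = (eps ** 2).sum()                  # I = sum of squared residuals
    s = np.linalg.svd(eps, compute_uv=False)  # singular values only
    return total, s ** 2 / total              # undefined if all residuals are zero

# Hypothetical double-centred residuals (3 sera x 3 kits).
eps = np.array([[0.3, -0.1, -0.2],
                [-0.4, 0.3, 0.1],
                [0.1, -0.2, 0.1]])
total, shares = inertia_shares(eps)
```

The shares sum to one, so the cumulative share of the first few components indicates how much of the residual structure a two-dimensional plot retains.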
 When the results of the study were shown to all the manufacturers, each received a copy of the entire set of transcribed raw data for the assay manufactured by that company. Each kit was still identified only by its letter code and each serum only by its number code. Each manufacturer knew the letter code for its kit and was asked to verify the results. None of the manufacturers, including the manufacturers of kits D and E, ever contested the results of the study.
|Title Annotation:||Laboratory Management|
|Author:||Rymer, Jean-Claude; Sabatier, Robert; Daver, Alain; Bourleaud, Jacques; Assicot, Marcel; Bremond, Ja|
|Date:||Jun 1, 1999|