Commutability assessment of potential reference materials using a multicenter split-patient-sample between-field-methods (twin-study) design: study within the framework of the Dutch project "Calibration 2000".
In view of these considerations, reference systems are needed to substantiate the claims of accurate results (1, 2). The introduction of the Directive for In Vitro Diagnostic Medical Devices (3) in the European Union has had a major impact on the further development of such systems.
In a reference system, analytical results should be traceable to the International System of Units (SI). This traceability consists of an unbroken chain of comparisons, each with its stated uncertainties. Certified Reference Materials (CRMs) form part of this chain. Ideally, CRMs help to produce results that are commutable, i.e., numerically the same when different measurement procedures are applied, for all kinds of clinical conditions (2). The proposed ISO/CEN metrology standard (4) gives details for metrologic traceability and asks that the manufacturers of calibrators play a prominent role. We are convinced that a prominent role must also be played by the profession. The profession assesses, controls, and if possible, harmonizes the commercial systems between as well as within laboratories. For this activity, the profession should make use of reference materials that are commutable rather than system specific. This concept forms the basis for the Dutch project "Calibration 2000" (5-7). This project aims at harmonization of laboratory data via calibration by development of commutable, matrix-based secondary reference materials.
The NCCLS EP14 protocol (8) for evaluating possible matrix effects of processed samples or possible calibrators requires the simultaneous analysis of, preferably, 20 selected fresh patient sera together with the candidate calibrators to be studied by both a particular field method and, preferably, a reference method. The results obtained for the patient sera and the candidate calibrators with the comparative method are plotted on the x axis and the evaluated method on the y axis. The scatter of the results of the patient sera around the regression line, expressed as the prediction interval for the standardized residuals of the patient results, will be the measure for evaluating the characteristics of the preparations under investigation.
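The evaluation described above can be sketched in a few lines of Python. This is a simplified reading of the description given here, not the EP14 text itself: it assumes ordinary least-squares regression of the evaluated method (y) on the comparative method (x) and a two-sided prediction interval for a single new observation. The function names and the caller-supplied critical t value (`t_crit`) are hypothetical.

```python
import math

def ols_fit(x, y):
    """Least-squares line y = a + b*x through the patient results.

    Returns intercept a, slope b, the standard error of regression s_yx,
    the mean of x, and the sum of squared deviations of x.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    s_yx = math.sqrt(rss / (n - 2))
    return a, b, s_yx, mean_x, sxx

def within_prediction_interval(x, y, x0, y0, t_crit):
    """True if a candidate calibrator (x0, y0) lies inside the patient
    prediction interval. t_crit is the two-sided critical t value for
    n - 2 degrees of freedom, supplied by the caller (assumption)."""
    a, b, s_yx, mean_x, sxx = ols_fit(x, y)
    n = len(x)
    half_width = t_crit * s_yx * math.sqrt(1 + 1 / n + (x0 - mean_x) ** 2 / sxx)
    return abs(y0 - (a + b * x0)) <= half_width
```

A calibrator falling outside the interval would, under these assumptions, be flagged as behaving differently from the patient sera for that method pair.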
We considered implementation of the NCCLS EP14 protocol to be very demanding and costly, especially when several analytes and several field methods are involved. A practical alternative is presented, the so-called twin-study design, which in essence is a multicenter, split-patient-sample, between-field-methods protocol. The procedure is illustrated with the commutability assessment of potential reference materials (PRMs) in the analysis of HDL-cholesterol (HDL-C).
Materials and Methods
The study consisted of the simultaneous analysis of patient sera and candidate calibrators for the following analytes: total cholesterol, HDL-C, LDL-cholesterol, triglycerides, and apolipoprotein A1 and B. Analysis had to be carried out in the same analytical run for patient sera and PRMs.
Laboratories usually participating in the Dutch EQAS for general clinical chemistry were first asked about their interest in participating in the study. They were also asked for details of their measurement methodologies. Eighty-six laboratories were thus included. The study protocol consisted of the exchange of 12 fresh patient sera between each of two laboratories forming a laboratory couple; 43 laboratory couples were formed. Each laboratory was asked to select six fresh patient sera on the basis of various HDL-C concentrations, preferably spanning the relevant concentration interval for HDL-C. After these samples were split into two portions, one portion from each sample was transported the same day to the partner laboratory, which in turn proceeded in the same way for its patient specimens. The interchanged fresh patient samples were then analyzed (within 24 h of the initial analysis) in the same analytical run with the PRMs, which were sent beforehand to each participant on dry ice.
For the patient samples, only the results of the second-day analysis were reported to the coordinating center. The laboratories forming each laboratory couple were selected on the basis of a modest geographic distance between them, so that reanalysis could be carried out within 24 h, and on the basis of differences in analytical techniques for the analysis of HDL-C.
ANALYTICAL METHODS USED
Of the study participants, 84% used one of three direct HDL-C methods: 41% used an α-cyclodextrin sulfate (α-Cyclo) method (Roche); 41% used the N-Geneous method (Roche); and 2% used an immunoinhibition method (Wako). The remaining 16% used a precipitation method: 8% used a dextran sulfate-magnesium (PrDexMg) method; 6% used a phosphotungstic acid-magnesium (PrPTA) method; and 2% used a polyethylene glycol-dextran sulfate method. The selection procedure produced the following analytical combinations and numbers of laboratory couples: N-Geneous/α-Cyclo (n = 15 couples); N-Geneous/N-Geneous (n = 8); α-Cyclo/α-Cyclo (n = 6); N-Geneous/PrDexMg (n = 4); α-Cyclo/PrPTA (n = 4); and other combinations (n = 6).
The PRMs, described in detail in the companion article by Cobbaert et al. (9) in this issue of the Journal, were as follows: three frozen human serum pools (low, medium, high) prepared exactly according to the NCCLS C37 protocol (pools C37L, C37M, and C37H) (10); three pooled frozen human serum preparations originating from residuals of patient sera and selected on the basis of HDL-C concentration (FroL, FroM, and FroH); and three lyophilized human serum preparations (LyoL, LyoM, and LyoH). Selection, preparation, and lyophilization (for the Lyo PRMs) of the Fro and Lyo PRMs were carried out as described previously (11). Lyophilization took place in the presence of sucrose (200 g/L final concentration). All PRMs were stored centrally at -80 °C until dispatch on dry ice to the participants. Nominal HDL-C concentrations were as follows: 1.07, 1.25, and 1.83 mmol/L for C37L, C37M, and C37H, respectively; 0.93, 1.13, and 1.55 mmol/L for FroL, FroM, and FroH, respectively; and 1.09, 1.70, and 1.89 mmol/L for LyoL, LyoM, and LyoH, respectively. All PRMs were analyzed for value assignment according to the procedure described by Cobbaert et al. (9).
STATISTICAL DATA ANALYSIS
The 43 sets of returned results were first screened for the presence of possible gross errors. Where gross errors were found, the respective results were excluded from further statistical evaluation.
Whereas the NCCLS EP14 protocol in most cases uses univariate linear regression analysis to study the behavior of patient and test samples, it was reasoned that in our case application of a bivariate distribution-free statistical approach was more appropriate because of the absence of an error-free reference method. Therefore, bivariate regression analysis according to Passing and Bablok (12,13) was used throughout.
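For readers unfamiliar with the technique, the core of a Passing and Bablok fit can be illustrated as follows. This is a minimal sketch of the point estimates only (shifted median of pairwise slopes, median-based intercept), written from the general description of the procedure; it omits the confidence intervals and the linearity test, and the function name is hypothetical rather than taken from the study's software.

```python
import math
from statistics import median

def passing_bablok(x, y):
    """Return (intercept, slope) of the Passing-Bablok line for paired results."""
    n = len(x)
    slopes = []
    for i in range(n - 1):
        for j in range(i + 1, n):
            dx = x[j] - x[i]
            dy = y[j] - y[i]
            if dx == 0 and dy == 0:
                continue  # identical points carry no slope information
            if dx == 0:
                # vertical pair: slope is +/- infinity according to the sign of dy
                slopes.append(math.inf if dy > 0 else -math.inf)
                continue
            s = dy / dx
            if s == -1:
                continue  # slopes of exactly -1 are excluded by the procedure
            slopes.append(s)
    slopes.sort()
    m = len(slopes)
    k = sum(1 for s in slopes if s < -1)  # offset that shifts the median
    if m % 2 == 1:
        slope = slopes[(m + 1) // 2 + k - 1]
    else:
        slope = 0.5 * (slopes[m // 2 + k - 1] + slopes[m // 2 + k])
    # intercept: median of the residual offsets y - slope * x
    intercept = median(yi - slope * xi for xi, yi in zip(x, y))
    return intercept, slope
```

Because the slope is a median of pairwise slopes, a single grossly deviating patient result shifts the fitted line far less than it would in least-squares regression.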
The regression residuals of the PRMs were expressed as the absolute values D of the perpendicular distances of each PRM to the respective patient regression line and were normalized by expressing them as multiples of the state-of-the-art within-laboratory SD ($SD_{SA}$). Because of the design of the Dutch EQAS, this $SD_{SA}$ is one of the statistical outcomes for each of the analytes covered in this scheme and is targeted on a value of 0.04 mmol/L at 1.0 mmol/L HDL-C. A concentration-dependent correction of this $SD_{SA}$ was carried out by use of a square root approximation of the precision profile of the relevant within-laboratory variation (14). The decision limit for accepting a test material as commutable was set at 3 $SD_{SA}$. The data set was also treated by ANOVA, in which the variation components that were attributable to the measurement of the PRMs were computed. The aggregated variance of the patient samples was computed by the formula:
$$CV_{Patients} = \frac{\sqrt{\sum_{i=1}^{n} S_{y|x,i}^{2} / n}}{\text{Overall patient mean}} \times 100\%$$

where $\sum_{i=1}^{n} S_{y|x,i}^{2}$ are the squared standard errors of regression in the regression analysis of the patient samples, summed over n laboratory couples.

The total variation for each respective PRM aggregated over all laboratory couples was calculated by:

$$CV_{PRM} = \frac{\sqrt{\sum_{i=1}^{n} D_{i}^{2} / n}}{\text{Mean of PRM}} \times 100\%$$

Normalization was carried out by dividing the perpendicular D values by the mean result of each laboratory couple for the respective PRMs.

Finally, the extra contribution to the total uncertainty by each PRM, $CV_{Netto}$, was calculated by:

$$CV_{Netto} = \sqrt{CV_{PRM}^{2} - CV_{Patients}^{2}}$$
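The calculations in this section can be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: the exact form of the square-root precision profile (SD proportional to the square root of the concentration), the concentration at which the SD is evaluated (here the mean of the two PRM results of a couple), and the handling of a negative difference under the final square root (clamped to zero here, which would be consistent with the zero entries in Table 1). All function names are hypothetical.

```python
import math

SD_SA_AT_1_MMOL = 0.04  # mmol/L at 1.0 mmol/L HDL-C (from the text)
DECISION_LIMIT = 3.0    # multiples of SD_SA (from the text)

def perpendicular_distance(x0, y0, intercept, slope):
    """Absolute perpendicular distance D from (x0, y0) to y = intercept + slope*x."""
    return abs(slope * x0 - y0 + intercept) / math.sqrt(slope ** 2 + 1)

def sd_sa(conc):
    """ASSUMED square-root precision profile: SD scales with sqrt(concentration)."""
    return SD_SA_AT_1_MMOL * math.sqrt(conc / 1.0)

def normalized_residual(x0, y0, intercept, slope):
    """D expressed as a multiple of SD_SA.

    Assumption: SD_SA is evaluated at the mean of the couple's two PRM results.
    """
    conc = (x0 + y0) / 2
    return perpendicular_distance(x0, y0, intercept, slope) / sd_sa(conc)

def cv_patients(s_yx_values, overall_patient_mean):
    """Pooled patient CV (%) from the standard errors of regression of n couples."""
    n = len(s_yx_values)
    return math.sqrt(sum(s ** 2 for s in s_yx_values) / n) / overall_patient_mean * 100

def cv_prm(d_values, prm_mean):
    """Total CV (%) of one PRM from its perpendicular distances over n couples."""
    n = len(d_values)
    return math.sqrt(sum(d ** 2 for d in d_values) / n) / prm_mean * 100

def cv_netto(cv_prm_value, cv_patients_value):
    """Extra CV (%) contributed by the PRM; a negative difference is clamped to 0 (assumption)."""
    diff = cv_prm_value ** 2 - cv_patients_value ** 2
    return math.sqrt(diff) if diff > 0 else 0.0
```

Under these assumptions, a PRM result pair would be flagged as noncommutable for a given couple when `normalized_residual(...)` exceeds `DECISION_LIMIT`.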
Results
Shown in Fig. 1 are examples of the results for HDL-C returned by some laboratory couples. These examples give an indication of some of the errors that can be expected in this kind of experiment with participating laboratories operating on a voluntary basis and under relatively unsupervised experimental field conditions. It was observed that, despite guidelines on sample handling, some participants diluted the frozen, ready-to-use samples with the amount of water to be used for reconstitution of the lyophilized samples. This is seen, for example, with the Fro samples assayed by laboratory 129 (Fig. 1, middle panel in the right-hand column). Obviously aberrant cases were excluded from further data treatment. In all, the results of 42 laboratory couples were included. Visual inspection revealed inferior behavior of some PRMs in various laboratory couples, as illustrated, for example, by laboratory couple 93/107 with the Lyo PRMs (Fig. 1, center panel). Fig. 2 shows the distribution of the $SD_{SA}$-normalized regression line residuals for each PRM in relation to the various method combinations. As seen in Fig. 2, the Lyo PRM samples in general performed worse, primarily because of the outlying behavior of the analytical combination N-Geneous/PrDexMg. Averaged over the three concentrations and the 126 residuals concerned, 1.6% of the C37 residuals were outside the 3 $SD_{SA}$ limit. For the Fro and Lyo PRMs, these values were 2.4% and 11.1%, respectively.
The $CV_{Netto}$ values for each PRM are shown in Table 1, both for the total data set and for the various analytical method combinations. In general, the $CV_{Netto}$ appeared to be most favorable for the C37 PRMs and to a lesser extent for the Fro PRMs, whereas the Lyo PRMs performed the worst, especially for the method combination N-Geneous/PrDexMg. A comparison of the information in Fig. 2 with that in Table 1 revealed qualitative agreement between the two approaches. Averaged over all method combinations and over all three PRM concentrations, mean $CV_{Netto}$ values for the C37, Fro, and Lyo PRMs were 2.9%, 4.3%, and 5.3%, respectively; the C37 PRM thus was the most promising candidate for a future secondary reference material for the analysis of HDL-C.
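The three mean values quoted above can be reproduced from the "Overall" column of Table 1. The short check below assumes that the means are simple unweighted averages of the low, medium, and high concentration entries, which is how the numbers work out; the variable names are illustrative only.

```python
# Overall CV_Netto values (%) from Table 1, ordered low, medium, high
overall_cv_netto = {
    "C37": [3.6, 3.2, 1.9],
    "Fro": [5.9, 3.5, 3.5],
    "Lyo": [5.8, 4.0, 6.2],
}

# Unweighted mean per PRM type, rounded to one decimal as in the text
means = {prm: round(sum(values) / len(values), 1)
         for prm, values in overall_cv_netto.items()}

print(means)  # {'C37': 2.9, 'Fro': 4.3, 'Lyo': 5.3}
```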
[FIGURE 1 OMITTED]
[FIGURE 2 OMITTED]
Discussion
There are two potential approaches for creating harmonized results that are traceable to the required accuracy base. In the first approach, laboratories operate by sending a certain number of patient samples to a certified reference laboratory together with results for these samples obtained over several days. This approach is, in fact, operational in the CDC Reference Laboratory Network for the measurement of total cholesterol for laboratories that wish to participate in the network (15). In the second approach, proper use of CRMs, as described earlier, plays the most prominent role. Both approaches have their advantages and disadvantages. The first approach, somewhat similar to the exchange of patient samples described here between partnered laboratories, obviates the use of expensive reference material but relies heavily on the strict compliance of participating laboratories with agreed-on procedures and may become a burden for the reference laboratory if many laboratories wish to participate. It is therefore desirable for CRMs to be developed that are commutable over all the methods used for a specific analyte.
Possible matrix effects in PRMs can be evaluated by comparison of the scatter of the results for these samples with the scatter of results for patient specimens around the regression line. The regression method applied in the NCCLS EP14 protocol uses the assumption that, because a reference method (x axis) is used, there is no error in the comparative method. Therefore, the residual scatter for patient results, taken in the y axis direction, is influenced by two factors: imprecision and nonspecificity of the method under evaluation. The use of replicate measurements can reduce the contribution of imprecision, and the remaining scatter points primarily to the influence of nonspecificity attributable to interference from known or unknown substances (matrix effect). Because our primary aim was not only to evaluate the characteristics of a PRM in combination with the various known methods for analyzing HDL-C but also to study the possibility of using the same reference material for all other lipid and lipoprotein analytes (9), we realized that proper application of the NCCLS EP14 protocol would require a high investment in time and money. As organizers of an EQAS, we have intensive contact with its participants. Making use of the existing logistic environment, we thought that a concerted action, such as the twin-study design described here, might demonstrate the possibility for a practical alternative to the NCCLS EP14 protocol. Instead of performing replicate measurements in one analytical setting for the methods to be evaluated, we used the replicates formed by the aggregated results of the participating laboratories. Evaluation of individual laboratory couples will therefore show larger scatter of results than would be seen if each laboratory were instructed to report replicate results. This implies that both the imprecision and a potential matrix effect are being "seen" to the maximum degree, which we consider an additional advantage of the approach used in this study.
The absence of a reference method with presumed minimal error led us to use the bivariate regression analysis of Passing and Bablok (12,13). In addition to being bivariate, this regression technique is rather insensitive to extreme outlying data points. In view of these considerations, we think that the present approach, which involves a sufficiently large population of participating laboratories with different analytical methods, is a viable way of getting information on the commutability characteristics of PRMs and, in that sense, may possibly be regarded as a practical addendum to the NCCLS EP14 protocol. We realize that our multicenter approach, because of its relatively unsupervised experimental conditions, inevitably introduces clerical and/or logistic errors for which the data set has to be screened and corrected before data analysis can be carried out.
The NCCLS EP14 protocol uses a 95% confidence interval around the patient regression line, depending on the inherent scatter of the patient results. In view of the multicenter approach used in this study, we thought it better not to use this individual measure and introduced a more general criterion, $SD_{SA}$, for normalization of the residuals of the PRM results. In this way all included laboratory couple data sets were evaluated against a common standard. In cases with a relatively large patient scatter, this approach will lead to less liberal weighting of the particular PRM, as illustrated in Fig. 2, in which one of the data points for the C37M PRM that exceeded the 3 $SD_{SA}$ limit was caused by the results for laboratory couple 240/33, as depicted in Fig. 1. In addition, the C37H data point for this laboratory couple was the highest data point in the population of normalized regression residuals for the method combination N-Geneous/α-Cyclo in Fig. 2.
In a previous study (11) in which different materials for use in an EQAS were evaluated for their effects on the accuracy of the total cholesterol assay, it was concluded that the detrimental effect of lyophilization on the serum matrix could be minimized by suitable cryoprotection with sucrose. In the case of HDL-C, used in the present study, we have to conclude that sucrose does not provide this protecting effect. Taken over all three concentrations of the Lyo PRMs, in most of the cases in which the normalized residuals exceeded the 3 $SD_{SA}$ limit, a precipitation method was involved. During recent years, we have seen a large increase in the use of direct assays for HDL-C at the expense of the precipitation methods, from 10% in 1996 to 85% at present. In light of the preceding discussion, this is a promising development with respect to further harmonization of HDL-C analysis results.
We had two reasons for introducing overall descriptive statistics. The first reason is that the expression $CV_{Netto}$ gives quantitative insight into the density distribution of the normalized residuals, which can only be deduced from Fig. 2 in a qualitative way. It is easy, for example, to grasp the overall worse performance of the Lyo PRMs, for which the mean $CV_{Netto}$ value was 5.3% compared with 2.9% for the C37 PRMs. However, it is much more difficult to visualize from Fig. 2 the difference between the performance of C37L and C37H, with $CV_{Netto}$ values of 3.6% and 1.9%, respectively. The second reason is that $CV_{Netto}$ allows extrapolation from the study of commutability characteristics to the situation in which a population of laboratories effectively uses a common calibrator. $CV_{Netto}$ may be interpreted as the extra contribution by the PRM to total measurement uncertainty. The basic assumption is that a perfect calibrator does not contribute to the intrinsic measurement uncertainty. In the ideal case, the $CV_{Netto}$ value should therefore be zero. Alternatively, a $CV_{Netto}$ value of, e.g., 2.5% implies that the respective PRM introduces an extra measurement error of 2.5%. In the imaginary case of any other inherent errors being absent, this consequently may be translated to an expected value for the between-laboratory CV of 2.5%. If the state-of-the-art within-laboratory CV is 4%, as seen for HDL-C in The Netherlands, then an additional 2.5% between-laboratory variation component contributes to a total variation of $\sqrt{4^{2} + 2.5^{2}} = 4.7\%$.
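The closing arithmetic combines two independent variation components in quadrature. A minimal sketch of that combination is given below; the function name is hypothetical, and the calculation assumes, as the text does, that the two components are independent.

```python
import math

def total_cv(within_lab_cv, cv_netto):
    """Combine two independent CV components (%) by root-sum-of-squares."""
    return math.sqrt(within_lab_cv ** 2 + cv_netto ** 2)

# Example from the text: 4% within-laboratory CV and a 2.5% PRM contribution
print(round(total_cv(4.0, 2.5), 1))  # 4.7
```

Applying the same function to other $CV_{Netto}$ values shows how quickly a less commutable material would inflate the expected total variation.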
We realize that the identification of a material as a potential candidate for a successful secondary reference material does not imply that the material is already fit to be used as such. Validation of the stability, value assignment with traceability to the relevant accuracy base accompanied by stated uncertainty levels, and the guarantee that future production lots have the same quality are a few of the prerequisites needed to meet the specifications of an accepted reference material. We think that we have taken the first step in this process by the characterization of candidate PRMs. In the meantime, we think that it is already possible to use the available C37 material in routine exchanges in the Dutch EQA system. The present twin-study approach has been used in the total setting of a commutability and harmonization study of PRMs by Cobbaert et al. (9) to be used for the standardization not only for HDL-C, but also across the other lipid and lipoprotein analyses in The Netherlands.
(1.) Tietz NW. Accuracy in clinical chemistry-does anybody care? Clin Chem 1994;40:859-61.
(2.) Muller MM. Implementation of reference systems in laboratory medicine. Clin Chem 2000;46:1907-9.
(3.) EU Lex. Directive 98/79 EC on in vitro diagnostic medical devices. Off J L 1998;331:1-37.
(4.) International Organization for Standardization, European Committee for Standardization. In vitro diagnostic medical devices--measurement of quantities in samples of biological origin-metrological traceability of values assigned to calibrators and control material. ISO/TC 212/WG2 N65 prEN 17511. Geneva: International Organization for Standardization, 2000.
(5.) Jansen RTP. Kalibratie 2000. Ned Tijdschr Klin Chem 1998;23:261-4.
(6.) Jansen RTP. The quest for comparability: calibration 2000. Accred Qual Ass 2000;5:363-6.
(7.) Jansen RTP, Kuypers AWHM, Baadenhuijsen H, van den Besselaar AMHP, Cobbaert CM, Gratama JW, et al. Kalibratie 2000. Ned Tijdschr Klin Chem 2000;25:153-8.
(8.) National Committee for Clinical Laboratory Standards. Evaluation of matrix effects; proposed guideline. NCCLS Document EP14. Wayne, PA: NCCLS, 1998.
(9.) Cobbaert C, Weykamp C, Baadenhuijsen H, Kuypers A, Lindemans J, Jansen R. Selection, preparation, and characterization of commutable frozen human serum pools as potential secondary reference materials for lipid and apolipoprotein measurements: study within the framework of the Dutch project "Calibration 2000". Clin Chem 2002;48:1526-38.
(10.) National Committee for Clinical Laboratory Standards. Preparation and validation of commutable frozen human serum pools as secondary reference materials for cholesterol measurements procedures; approved guideline. NCCLS Document C37. Wayne, PA: NCCLS, 1999.
(11.) Baadenhuijsen H, Demacker PNM, Hessels M, Boerma GJM, Penders TJ, Weykamp C, et al. Testing the accuracy of total cholesterol assays in an external quality-control program: effect of adding sucrose to lyophilized control sera compared with use of fresh or frozen serum. Clin Chem 1995;41:724-30.
(12.) Passing H, Bablok W. A new biometrical procedure for testing the equality of measurements from two different analytical methods. Part I. J Clin Chem Clin Biochem 1983;21:709-20.
(13.) Passing H, Bablok W. A new biometrical procedure for testing the equality of measurements from two different analytical methods. Part II. J Clin Chem Clin Biochem 1984;22:431-45.
(14.) Steigstra H, Jansen RTP, Baadenhuijsen H. Combi scheme: new combined internal/external quality assessment scheme in The Netherlands. Clin Chem 1991;37:1196-204.
(15.) Myers GL, Kimberly MM, Waymack PP, Smith SJ, Cooper GR, Sampson EJ. A reference method laboratory network for cholesterol: a model for standardization and improvement of clinical laboratory measurements. Clin Chem 2000;46:1762-72.
Henk Baadenhuijsen,* Herman Steigstra, Christa Cobbaert, [2,3] Aldy Kuypers, Cas Weykamp, and Rob Jansen
[1] Dutch Foundation for Quality Assessment in Clinical Laboratories (SKZL), University Hospital Nijmegen, NL 6500 HB Nijmegen, The Netherlands.
[2] Amphia Hospital, 4819 EV Breda, The Netherlands.
[3] Lipid Reference Laboratory, University Hospital Rotterdam, 3000 CA Rotterdam, The Netherlands.
[4] Queen Beatrix Hospital, 7100 GG Winterswijk, The Netherlands.
[5] St. Anna Hospital, 5660 AB Geldrop, The Netherlands.
Nonstandard abbreviations: EQAS, external quality assessment scheme; CRM, Certified Reference Material; PRM, potential reference material; HDL-C, HDL-cholesterol; α-Cyclo, α-cyclodextrin sulfate; PrDexMg, dextran sulfate-magnesium precipitation; PrPTA, phosphotungstic acid-magnesium precipitation; and $SD_{SA}$, state-of-the-art within-laboratory SD.
* Address correspondence to this author at: University Medical Center Nijmegen, Department of Clinical Chemistry/ 116 SKZL, PO Box 9101, NL 6500 HB Nijmegen, The Netherlands. Fax 31-24-356-0686; e-mail hbaadenhuijsen@skzl.nl.
Received September 4, 2001; accepted May 8, 2002.
Table 1. Overall and method-combination-related values for $CV_{Netto}$ for the three types and three concentrations of PRM.

$CV_{Netto}$,(a) %

| PRM | Overall (n = 42) | N-Geneous/N-Geneous (n = 8) | N-Geneous/α-Cyclo (n = 14) | α-Cyclo/α-Cyclo (n = 6) | N-Geneous/PrDexMg (n = 4) | α-Cyclo/PrPTA (n = 4) | Other combinations (n = 6) |
| --- | --- | --- | --- | --- | --- | --- | --- |
| C37L | 3.6 | 4.2 | 3.5 | 4.7 | 2.7 | 4.1 | 1.3 |
| C37M | 3.2 | 4.4 | 3.1 | 0 | 1.8 | 2.2 | 4.3 |
| C37H | 1.9 | 1.0 | 1.4 | 0 | 4.8 | 2.9 | 0.7 |
| FroL | 5.9 | 8.2 | 6.1 | 2.8 | 1.9 | 6.3 | 5.4 |
| FroM | 3.5 | 1.3 | 4.5 | 3.4 | 2.3 | 6.4 | 1.4 |
| FroH | 3.5 | 2.2 | 4.5 | 2.1 | 0.8 | 4.9 | 3.3 |
| LyoL | 5.8 | 3.1 | 6.7 | 6.1 | 11 | 0 | 3.3 |
| LyoM | 4.0 | 2.4 | 2.0 | 1.6 | 9.8 | 4.7 | 5.4 |
| LyoH | 6.2 | 4.8 | 3.7 | 3.3 | 15 | 6.8 | 6.4 |

(a) n, number of laboratory couples.
Author: Baadenhuijsen, Henk; Steigstra, Herman; Cobbaert, Christa; Kuypers, Aldy; Weykamp, Cas; Jansen, Rob
Date: Sep 1, 2002