Adding value to proficiency testing programs.
The enactment of CLIA 88 unraveled the comfortable existence of a few large PT providers who primarily served hospital laboratories. In the early 1990s, thousands of previously unregulated physician office laboratories were added to the PT pool. Many of these laboratories eventually subscribed to newly established PT providers whose missions were more closely aligned to the participants' medical organizations, e.g., the American Academy of Family Physicians and the American Society of Internal Medicine. As of 1998, there were 20 different CLIA-certified PT programs (4). The presumably decreased profitability of the PT business has led to a commoditization of PT products and less effort expended in the design and manufacture of PT specimens for analytes that are not directly regulated by CLIA.
Another result, coincident with the enactment of CLIA 88, was the development of PT programs that serve a narrow spectrum of users, such as those using instruments of a specific manufacturer. Finally, the requirement to subscribe to CLIA-approved PT programs has eroded the use of voluntary external quality assessment programs that provided specimens far more frequently, such as the Murex program, which provides unknowns every 2 weeks, and Toxi-Lab, which provides an unknown every 2 months.
CLIA 88 not only prescribed maximum limits on deviations of PT results from peer means but also increased the number of unknowns analyzed by the participating laboratory, from the usual two specimens to five. The increased number of unknowns allows the PT provider to characterize analytical performance more accurately and thus to identify the poorly performing laboratory more reliably. The rules used by the PT provider to characterize unsatisfactory performance evolved from extremely empiric rules that required results to be within ±2 SD of the group mean to the current requirement that at least four of five results be within the CLIA error limits. Under CLIA 88, if two or more results are outside the error limits, performance is deemed unsatisfactory and the participating laboratory is at risk of various sanctions.
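The four-of-five rule described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name, the percentage-only error limit, and the specimen values are hypothetical, and actual PT providers grade against peer-group targets using their own procedures.

```python
def pt_event_satisfactory(results, targets, allowable_pct, min_acceptable=4):
    """Grade one PT event (assumed: five unknowns, percentage error limit).

    Returns (satisfactory, n_within), where n_within is the number of
    results whose deviation from the target is within the allowable error.
    """
    if len(results) != len(targets):
        raise ValueError("results and targets must have the same length")
    n_within = sum(
        abs(r - t) <= allowable_pct / 100.0 * abs(t)
        for r, t in zip(results, targets)
    )
    return n_within >= min_acceptable, n_within


# Hypothetical theophylline targets and results (mg/L), graded at 25%.
targets = [5.0, 10.0, 15.0, 20.0, 25.0]
one_miss = [5.5, 10.4, 19.5, 20.5, 24.0]   # third result deviates by 30%
two_misses = [6.5, 10.4, 19.5, 20.5, 24.0]  # first result also deviates by 30%

print(pt_event_satisfactory(one_miss, targets, 25.0))
print(pt_event_satisfactory(two_misses, targets, 25.0))
```

A single result outside the limits still leaves the event satisfactory; a second one makes it unsatisfactory.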
In this issue of Clinical Chemistry, Richard Jenny and Kathryn Jackson-Tarentino (5) of the New York State Department of Health report an analysis of causes of PT failures in their therapeutic drug program. Most of the participant failures, roughly 60%, are attributable to analytical error. Although modern analytical instruments are inherently capable of producing results that are accurate and precise enough to meet clinical requirements, the authors conclude that many of today's quality-control (QC) practices are not optimized to detect the presence of significant error. They recommend that laboratories use QC procedures that limit their instruments' analytical error to that specified by the instrument's stable performance. The New York State PT program applies more stringent criteria for allowable error than those in the CLIA requirements for therapeutic drugs (e.g., 15% vs CLIA's 25% for theophylline). For a participating laboratory to be confident of keeping PT results within these limits, instruments must perform according to manufacturers' specifications and analytical variation must be tightly controlled.
In the study by Jenny and Jackson-Tarentino (5), approximately one-half of the PT participants used QC procedures whose allowable deviations exceeded the allowable deviations of the PT program. This is disappointing because the tools to develop efficient QC procedures that can control errors to 15% have been available for many years. These procedures have been made so simple that they can be implemented easily, often without sophisticated statistical calculations.
The first step in developing a testing process that will perform well on PT is to select an instrument whose analytical method can be controlled within the limits of allowable error with commonly used QC procedures such as the multirule procedure. The inherent error of the method must be less than the allowable error, and there must be some "working room" that allows the QC procedure to detect errors before they exceed allowable error. Westgard and Burnett (6) have provided the following criterion to accomplish this. In the absence of bias, the standard deviation must be less than 25% of allowable error:
Standard deviation < Allowable error / 4     (Eq. 1)
For example, with an allowable error of 15%, the CV of a theophylline assay must be less than 15%/4, or 3.75%, for QC to control the assay satisfactorily. If a shift of 2.35 SD should occur, only 1.65 SD of working room remains before the allowable error limit, and 5% of the results would have errors exceeding allowable error. Multirule QC with two or three controls has an ~50-70% chance of detecting such an error when it occurs (7). Thus, instrument and method selection is at least partly about selecting methods whose manufacturers claim performance that meets this criterion.
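The arithmetic behind these figures can be checked with a few lines of Python. This is a minimal sketch that assumes normally distributed analytical error and places the allowable error limit at exactly 4 SD, the boundary case of the criterion; the function names are ours, not part of any QC package.

```python
from statistics import NormalDist


def max_cv_pct(allowable_error_pct, multiplier=4.0):
    """Largest CV (%) that satisfies SD < allowable error / 4 with no bias."""
    return allowable_error_pct / multiplier


def fraction_exceeding(shift_sd, limit_sd=4.0):
    """Fraction of results beyond +/- limit_sd after the mean shifts by shift_sd.

    Both quantities are expressed in multiples of the method SD, and the
    error distribution is assumed to be Gaussian.
    """
    nd = NormalDist()
    upper_tail = 1.0 - nd.cdf(limit_sd - shift_sd)
    lower_tail = nd.cdf(-limit_sd - shift_sd)
    return upper_tail + lower_tail


print(max_cv_pct(15.0))                    # CV limit for a 15% allowable error
print(round(fraction_exceeding(2.35), 4))  # fraction beyond the limit after the shift
print(fraction_exceeding(0.0))             # fraction beyond the limit when stable
```

With a stable method at the limit of the criterion, almost no results exceed allowable error; after a 2.35 SD shift, about 1 result in 20 does.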
In the next step, the user must evaluate the instrument to demonstrate that its performance in the user's laboratory is consistent with the manufacturer's claim. When this condition is met, the user can implement the method and set up a QC procedure that is a good compromise of cost and ability to detect significant increases in analytical error. On an ongoing basis, the user must maintain the precision of the instrument and the limits of the QC procedures within the requirements of Eq. 1. Jenny and Jackson-Tarentino (5) show that many PT participants use QC limits that are much wider than those consistent with the performance of their instruments, reducing sensitivity so much that only gross analytical errors can be detected.
When bias is present, the allowable standard deviation of Eq. 1 must be reduced to maintain total error (the combination of bias and random error) below allowable error. Westgard (8) has developed simple graphical tools to determine the acceptability of any combination of bias and imprecision (SD) when allowable error is known. Sometimes, the inherent precision and accuracy (also called trueness) of a method are not good enough for simple multirule QC with two controls to provide adequate error detection. These situations require control procedures with more replicates or control procedures with higher false rejection rates. Although most instruments have good methods, some methods will be marginal and will require more expensive QC procedures and extra attention to maintenance of performance.
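As a rough numerical companion to the graphical tools, the sketch below shows how bias consumes part of the error budget. It assumes a simple linear model in which bias plus four SD must stay below allowable error, which reduces to Eq. 1 when bias is zero; this model and the function name are our assumptions for illustration and are not a substitute for the decision chart in (8).

```python
def max_sd_with_bias(allowable_error, bias, multiplier=4.0):
    """Largest SD compatible with |bias| + multiplier * SD < allowable error.

    All arguments share the same units (e.g., percent). Returns 0.0 when
    the bias alone already uses up the allowable error.
    """
    room = allowable_error - abs(bias)
    if room <= 0:
        return 0.0
    return room / multiplier


# Hypothetical theophylline method graded against a 15% allowable error.
for bias_pct in (0.0, 3.0, 7.0, 15.0):
    limit = max_sd_with_bias(15.0, bias_pct)
    print(f"bias {bias_pct:4.1f}% -> CV must be below {limit:.2f}%")
```

Under this assumed model, a 3% bias lowers the tolerable CV from 3.75% to 3.00%, and a method whose bias approaches the allowable error cannot be controlled by any realistic precision.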
The user must control bias and precision on a long-term basis. Precision is readily demonstrated by monthly QC statistics. If the user participates in a regional QC program, bias can be detected by the difference between the user's mean values for controls and the grand means of participants in the regional QC program. This approach is an effective means of detecting significant bias as long as a sufficiently large number of laboratories participate in the regional program and if the matrix of the QC material behaves similarly to that of the materials used by the PT provider.
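The monthly statistics mentioned above are simple to compute. The following sketch assumes a list of one month's results for a single control level and a peer-group mean reported by a regional QC program; the function name and all values are hypothetical.

```python
from statistics import mean, stdev


def monthly_qc_summary(control_values, peer_group_mean):
    """Summarize one month of results for one control level.

    Returns the laboratory mean, SD, CV (%), and bias (%) relative to the
    peer-group mean from a regional QC program.
    """
    lab_mean = mean(control_values)
    lab_sd = stdev(control_values)  # sample SD (n - 1 denominator)
    return {
        "mean": lab_mean,
        "sd": lab_sd,
        "cv_pct": 100.0 * lab_sd / lab_mean,
        "bias_pct": 100.0 * (lab_mean - peer_group_mean) / peer_group_mean,
    }


# Hypothetical control results (mg/L) and peer-group mean.
summary = monthly_qc_summary([10.0, 10.2, 9.8, 10.1, 9.9], peer_group_mean=9.7)
for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

The CV can then be compared with the limit from Eq. 1, and the bias with the portion of allowable error the laboratory is prepared to spend on systematic error.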
Because of the regulatory overtones associated with PT, especially for regulated analytes, PT programs seldom offer their customers any interpretation that goes beyond the mandated pass/fail reporting. However, users can detect problems by interpreting their own data, even when none of the five PT results exceeds allowable error. We have developed a multirule system to detect bias and increased random error in sets of five PT results (9). The system consists of a screening rule, one follow-up rule to detect systematic error, and two other rules to detect random error.
We have been using this multirule approach in several laboratories for the last 6 years (9) and have formally evaluated its application to 16 months of immunoassay testing in two different laboratories (10). Significant sources of both random and systematic error have been discovered and corrected using this technique. We recommend that the technique be used to detect error conditions that have not yet led to unsatisfactory results but have caused analytical errors that are almost large enough to cause unsatisfactory results (near misses). The identification of such conditions, followed by investigation and correction, would thus help prevent errors.
It is interesting that the inquiry report used by the New York State Department of Health to follow up unsatisfactory performance is sent separately from the regular proficiency test report. Most of the information provided, such as magnitude of error and identification of the error as a random or systematic error or a result of nonlinearity, can be included in the regular report. This information would be immediately useful to the participating laboratory. We propose that all PT organizations provide such information whenever the laboratory demonstrates unsatisfactory performance. Whether similar workups of near misses should also be included in the PT report must depend on the PT user's requirements.
The level of information offered by the New York State Department of Health appears unrivalled. New York laboratories can more easily initiate standardized process investigation and process improvement. We hope that other PT providers will follow the lead of Jenny and Jackson-Tarentino.
(1.) US Department of Health and Human Services. Medicare, Medicaid and CLIA programs: regulations implementing the Clinical Laboratory Improvement Amendments of 1988 (CLIA), final rule. Fed Regist 1992;57:7002-186.
(2.) National Committee for Clinical Laboratory Standards. A quality system model for health care; approved guideline. NCCLS Document GP26-A. Wayne PA: NCCLS, 1999.
(3.) Fraser CG, Petersen HP. Analytical performance characteristics should be judged against objective quality specifications [Editorial]. Clin Chem 1999; 45:321-3.
(4.) Health Care Financing Administration: 1998's currently approved proficiency testing programs for CLIA. http://www.hcfa.gov/medicaid/clia/ptlist98.htm.
(5.) Jenny RW, Jackson-Tarentino KY. Causes of unsatisfactory performance in proficiency testing. Clin Chem 1999;45:89-99.
(6.) Westgard JO, Burnett RW. Precision requirements for cost-effective operation of analytical processes. Clin Chem 1990;36:1629-32.
(7.) Cembrowski GS, Carey RN. Laboratory management: QC & QA. Chicago, IL: ASCP Press, 1989:62.
(8.) Westgard JO. A method evaluation decision chart (MEDx chart) for judging method performance. Clin Lab Sci 1995;8:277-83. (available in pdf format at http://www.westgard.com/medx.htm).
(9.) Cembrowski GS, Anderson PG, Crampton CA, Coupland R, Carey RN. Pump up your PT IQ. Med Lab Observer 1996;28(1):46-51.
(10.) Cembrowski GS, Crampton CA, Byrd J, Carey RN. Detection and classification of proficiency testing errors in HCFA-regulated analytes: applications to ligand assays. J Clin Immunoassay 1994;17:210-4.
George S. Cembrowski  * R. Neill Carey 
 Capital Health Authority 4B1.24 Walter C. Mackenzie Centre 8440 112th St. Edmonton, Alberta Canada T6G 2B7
 Peninsula Regional Medical Center 100 East Carroll St. Salisbury, MD 21801
* Author for correspondence. Fax 780-407-8599; e-mail firstname.lastname@example.org.
|Author:||Cembrowski, George S.; Carey, R. Neill|
|Date:||Jan 1, 2000|