An overview of method validation--part 1.
What is method validation?
As per ISO 8402:1992, validation is defined as "confirmation by examination and provision of objective evidence that the particular requirements for a specified intended use are fulfilled." Method validation can be interpreted as the process of defining an analytical requirement, and confirming that the method under consideration has performance capabilities consistent with what the application requires. In short, it will be necessary to evaluate the method's performance capabilities.
It is implicit in the method validation process that the studies to determine the method performance parameters are carried out using equipment that is within specification, working correctly and adequately calibrated. Likewise, the operator carrying out the studies must be competent in the field of work under study and have sufficient knowledge related to the work to be able to make appropriate decisions from the observation made as the study progresses. Method validation is very closely tied to method development; indeed, it is often not possible to determine exactly where method development finishes and validation begins. Many of the method performance parameters that are associated with method validation are, in fact, usually evaluated as part of method development.
Why is method validation necessary?
Thousands of laboratories around the world are performing millions of analytical measurements every day, evaluating goods for trade purposes, supporting health care, testing the quality of drinking water, analyzing the elemental composition of an alloy to confirm its suitability for use in aircraft construction, or carrying out forensic analysis of body fluids in criminal investigations. Virtually every aspect of society is supported in some way or other by analytical measurement. The cost of carrying out these measurements is high, and additional costs arise when decisions are made on the basis of the results. Clearly it is important to determine the correct result and be able to show that it is correct.
If the result of a test cannot be trusted, then it has little value, and the test might as well have not been carried out. When a customer commissions analytical work from a laboratory, it is assumed that the laboratory has a degree of expert knowledge that the customers do not have themselves. The customer expects to be able to trust results reported, and usually only challenges them when a dispute arises. Thus, the laboratory and its staff have a clear responsibility to justify the customer's trust by providing the right answer to the analytical part of the problem; in other words, results that have demonstrable "fitness for purpose." Implicit in this is that the tests carried out are appropriate for the analytical part of the problem that the customer wishes solved, and that the final report presents the analytical data in such a way that the customer can readily understand it and draw appropriate conclusions. Method validation enables chemists to demonstrate that a method is "fit for purpose."
For an analytical result to be fit for its intended purpose, it must be sufficiently reliable that any decision based on it can be taken with confidence. Thus, the method performance must be validated and the uncertainty on the result, at a given level of confidence, estimated. Uncertainty should be evaluated and quoted in a way that is widely recognized, internally consistent and easy to interpret.
Regardless of how good a method is and how skillfully it is used, an analytical problem can be solved by the analysis of samples only if those samples are appropriate to the problem. Taking appropriate samples is a skilled job, requiring an understanding of the problem and its related chemistry. A laboratory, as part of its customer care, should, wherever possible, offer advice to the customer over the taking of samples. Clearly there will be occasions when the laboratory cannot itself take or influence the taking of the samples. On these occasions, results of analysis will need to be reported on the basis of the samples received, and the report should make this distinction clear.
When should methods be validated?
A method should be validated when it is necessary to verify that its performance parameters are adequate for use for a particular analytical problem. For example:
* A new method is developed for a particular problem;
* an established method is revised to incorporate improvements or is extended to a new problem;
* quality control indicates an established method is changing with time;
* an established method is used in a different laboratory, or with different analysts or different instrumentation; and
* equivalence between two methods needs to be demonstrated, e.g., a new method and a standard.
The extent of validation or revalidation required will depend on the nature of the changes made in reapplying a method to different laboratories, instrumentation, operators and the circumstances in which the method is going to be used. Some degree of validation is always appropriate even when using apparently well characterized standard or published methods.
How should methods be validated?
Who carries out method validation?
The laboratory using a method is responsible for ensuring that it is adequately validated, and if necessary, for carrying out further work to supplement existing data.
If a method is being developed to have wide-ranging use, perhaps as a published standard procedure, then collaborative study involving a group of laboratories is probably the preferred way of carrying out method validation. However, it is not always a suitable option for industrial laboratories. When it is inconvenient or impossible for a laboratory to enter into collaborative study, a number of questions are raised, including:
* Can laboratories validate methods on their own and if so how?
* Will methods validated in this way be recognized by other laboratories?
* What sort of recognition can be expected for in-house methods used in a regulatory environment?
Working in isolation inevitably reduces the amount of validation data that can be gathered for a method. Principally, it restricts the information that can be gathered on inter-laboratory comparability. This information is not always required, so this may not be a problem. If necessary, it may be feasible to get some idea of how results from a given method compare with those obtained elsewhere by measuring certified reference materials, or by comparing the method against one for which validation has already been carried out.
Whether or not methods validated in a single laboratory will be acceptable for regulatory purposes depends on any guidelines covering the area of measurement concerned. It should normally be possible to get a clear statement of policy from the regulatory body.
The following sections describe the different performance parameters of a method and what they show.
Confirmation of identity and selectivity/specificity
In general, analytical methods can be said to consist of a measurement stage, which may or may not be preceded by an isolation stage. It is necessary to establish that the signal produced at the measurement stage, or other measured property, which has been attributed to the analyte, is due only to the analyte and not to the presence of something chemically or physically similar, or arising as a coincidence. This is confirmation of identity. Whether or not other compounds interfere with the measurement of the analyte will depend on the effectiveness of the isolation stage and the selectivity/specificity of the measurement stage. Selectivity and specificity are measures which assess the reliability of measurements in the presence of interferences. Specificity is generally considered to be 100% selectivity, but agreement on this is not universal. Where the measurement stage is non-specific, it is possible to state that certain analytes do not interfere, having first checked that this is the case. It is far more difficult to state that nothing interferes, since there is always the possibility of encountering some hitherto unrecognized interference. There will be cases where chemical interferences can be identified for a particular method, but the chances of encountering them in practice may be remote. The analyst has to decide at what point it is reasonable to stop looking for interferences. These parameters are applicable to both qualitative and quantitative analysis.
The selectivity of a method is usually investigated by studying its ability to measure the analyte of interest in test portions to which specific interferences have been deliberately introduced. Where it is unclear whether or not interferences are already present, the selectivity of the method can be investigated by studying its ability to measure the analyte compared to other independent methods/techniques.
Limit of detection
Where measurements are made at low analyte or property levels, e.g., in trace analysis, it is important to know the lowest concentration of the analyte, or the lowest property value, that can be confidently detected by the method. The importance of determining this limit, and the problems associated with doing so, arise from the fact that the probability of detection does not suddenly change from zero to unity as some threshold is crossed. The problems have been investigated statistically in some detail, and a range of decision criteria has been proposed. Additional confusion arises because there is currently no universal agreement on the terminology applicable. The terms "limit of detection" and "detection limit" are not generally accepted, although they are used in some sectoral documents. ISO uses as a general term "minimum detectable value of the net state variable," which for chemistry translates as "minimum detectable net concentration." IUPAC is cautious in the use of "detection limit," preferring "minimum detectable (true) value."
For validation purposes, it is normally sufficient to provide an indication of the level at which detection becomes problematic. For this purpose, the "blank + 3s" approach will usually suffice. Where the work is in support of regulatory or specification compliance, a more exact approach such as that described by IUPAC and various others is likely to be appropriate. It is recommended that users quote whichever convention they have used when stating a detection limit. The mean and the standard deviation of the sample blank are likely to be dependent on the matrix of the sample blank. Limit of detection will therefore be matrix dependent. Similarly, where such criteria are used for critical decisions, the relevant precision values will need to be re-determined regularly in line with actual operating performance.
Limit of quantification
The limit of quantification (LoQ) is strictly the lowest concentration of the analyte that can be determined with an acceptable level of repeatability precision and trueness. It is also defined by various conventions as the analyte concentration corresponding to the sample blank value plus five, six or ten standard deviations of the blank mean. It is sometimes known as the "limit of determination." LoQ is an indicative value and should not normally be used in decision making. Neither LoD nor LoQ represents a level at which quantitation is impossible; it is simply that, in the region of the LoD, the size of the associated uncertainties approaches comparability with the actual result.
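The "blank + 3s" and "blank + 10s" conventions described above can be sketched in a few lines of code. The following is a minimal illustration in Python; the blank readings are invented for the example, and real work would use the convention mandated by the relevant sector.

```python
# Hypothetical replicate measurements of a sample blank (instrument units);
# values are invented for illustration only.
blank = [0.21, 0.18, 0.23, 0.20, 0.19, 0.22, 0.17, 0.21, 0.20, 0.19]

n = len(blank)
mean_blank = sum(blank) / n
# Sample standard deviation (n - 1 denominator) of the blank readings.
s_blank = (sum((x - mean_blank) ** 2 for x in blank) / (n - 1)) ** 0.5

lod = mean_blank + 3 * s_blank   # "blank + 3s" detection limit convention
loq = mean_blank + 10 * s_blank  # "blank + 10s" quantification convention

print(f"blank mean = {mean_blank:.3f}, s = {s_blank:.4f}")
print(f"LoD (blank + 3s)  = {lod:.3f}")
print(f"LoQ (blank + 10s) = {loq:.3f}")
```

Because the blank mean and standard deviation are matrix dependent, these figures would need to be re-determined for each matrix and re-checked against actual operating performance.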
Working and linear ranges
For any quantitative method, it is necessary to determine the range of analyte concentrations or property values over which the method may be applied. This refers to the range of concentrations or property values in the solutions actually measured, rather than in the original samples. At the lower end of the concentration range, the limiting factors are the values of the limits of detection and/or quantitation. At the upper end of the range, limitations will be imposed by various effects depending on the instrument response system. Within the working range there may exist a linear response range, within which the signal response will have a linear relationship to analyte concentration or property value. The extent of this range may be established during the evaluation of the working range. Evaluation of the working and linear ranges will also be useful for planning the degree of calibration required when using the method on a day-to-day basis. It is advisable to investigate the variance across the working range. Within the linear range, one calibration point may be sufficient to establish the slope of the calibration line; elsewhere in the working range, multi-point calibration (preferably six or more points) will be necessary. The relationship of instrument response to concentration does not have to be perfectly linear for a method to be effective, but the curve should be repeatable from day to day. Working and linear ranges may differ for different matrices according to the effect of interferences arising from the matrix.
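A multi-point calibration of the kind described above amounts to fitting a straight line through response-versus-concentration data and inspecting the residuals for curvature. The following is a minimal sketch using ordinary least squares; the six-point calibration data and the unknown's response are invented for illustration.

```python
# Hypothetical six-point calibration: concentrations (e.g., mg/l) and the
# corresponding instrument responses; data invented for illustration.
conc = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
resp = [0.02, 1.98, 4.05, 5.97, 8.01, 10.03]

n = len(conc)
mean_x = sum(conc) / n
mean_y = sum(resp) / n

# Ordinary least-squares slope and intercept of the calibration line.
sxx = sum((x - mean_x) ** 2 for x in conc)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(conc, resp))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residuals: systematic curvature here would indicate the linear range
# has been exceeded and a narrower working range (or a curve) is needed.
residuals = [y - (slope * x + intercept) for x, y in zip(conc, resp)]

# Quantify an unknown from its measured response via the fitted line.
unknown_response = 5.0
unknown_conc = (unknown_response - intercept) / slope
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}")
print(f"unknown concentration = {unknown_conc:.3f}")
```

The slope of this line is also the method's sensitivity over the linear range, which is why evaluating the working range and the calibration strategy usually go hand in hand.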
Accuracy

Accuracy expresses the closeness of a result to a true value. Method validation seeks to quantify the likely accuracy of results by assessing both systematic and random effects on results. Accuracy is, therefore, normally studied as two components, trueness and precision. Trueness is an expression of how close the mean of a set of results is to the true value, and is normally expressed in terms of bias. Precision is a measure of how close results are to one another, and is usually expressed by measures such as standard deviation, which describe the spread of the results. In addition, an increasingly common expression of accuracy is measurement uncertainty, which provides a single-figure expression of accuracy.
Practical assessment of trueness relies on comparison of mean results from a method with known values; that is, trueness is assessed against a reference value (a true value or conventional true value). Two basic techniques are available: checking against the reference value of a characterized material, or checking against another characterized method. Reference values are ideally traceable to international standards. Certified reference materials (CRMs) are generally accepted as providing traceable values; the reference value is then the certified value of the CRM. To check trueness using a reference material, the mean and standard deviation of a series of replicate tests are determined and compared with the characterized value for the reference material.
Validation needs to fit the purpose, so the choice of reference material may be affected by the end use; the reference material must be appropriate to that use. For regulatory work, a relevant certified reference material should be used. For long-term in-house work, a stable in-house material or certified reference material should be used. For short-term or non-critical work, a prepared standard is often sufficient. To check against an alternative method, results from the two methods for the same sample or samples are compared. The samples may be CRMs, in-house standards or simply typical samples. There are advantages to using CRMs, since these have known stability and homogeneity, and additionally give an indication of bias with respect to international standards. On the other hand, CRMs are costly and may not be representative of typical samples.
Interpreting bias measurement
Method bias arises from systematic errors inherent to the method, whichever laboratory uses it. Laboratory bias arises from additional systematic errors peculiar to the laboratory and its interpretation of the method. In isolation, a laboratory can only estimate the combined bias. However, in checking bias, it is important to be aware of the conventions in force for the purpose at hand. For most purposes, acceptability of bias should be decided on the basis of overall bias measured against appropriate materials or reference methods, taking into account the precision of the method, any uncertainties in reference material values, and the accuracy required by the end use. Statistical significance tests are recommended.
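One common significance test of the kind recommended above is a Student's t-test of the replicate mean against a CRM's certified value. The sketch below uses invented replicate results and a certified value of 50.0 mg/kg; the 95% critical value is hard-coded for 5 degrees of freedom, and a real assessment would also fold in the uncertainty of the certified value.

```python
import math

# Hypothetical replicate results (mg/kg) on a CRM whose certified value
# is 50.0 mg/kg; data invented for illustration.
certified = 50.0
results = [49.2, 49.8, 48.9, 49.5, 50.1, 49.4]

n = len(results)
mean = sum(results) / n
s = math.sqrt(sum((x - mean) ** 2 for x in results) / (n - 1))

bias = mean - certified
# Student's t statistic for H0: no bias (true mean equals certified value).
t_stat = bias / (s / math.sqrt(n))

# Two-tailed critical value, 95 % confidence, n - 1 = 5 degrees of freedom.
t_crit = 2.571
significant = abs(t_stat) > t_crit
print(f"bias = {bias:+.3f} mg/kg, t = {t_stat:.2f}, significant: {significant}")
```

Here the observed bias is statistically significant; whether it is also practically unacceptable still depends on the accuracy required by the end use.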
Precision is normally determined for specific circumstances, which in practice can be varied. The two most common precision measures are repeatability and reproducibility. Repeatability (the smallest expected precision) gives an idea of the sort of variability to be expected when a method is performed by a single analyst on one piece of equipment over a short time scale, i.e., the sort of variability to be expected between results when a sample is analyzed in duplicate. If a sample is to be analyzed by a number of laboratories for comparative purposes, then a more meaningful precision measure is reproducibility (the largest measure of precision normally encountered). Precision is usually stated in terms of standard deviation or relative standard deviation.
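A repeatability estimate of the kind described above is just the standard deviation (and relative standard deviation) of replicates obtained by one analyst, on one instrument, over a short time. A minimal sketch, with invented replicate results:

```python
import statistics

# Hypothetical repeatability study: eight replicates by one analyst on one
# instrument over a short time scale; values invented for illustration.
results = [10.2, 10.4, 10.1, 10.3, 10.2, 10.5, 10.3, 10.2]

mean = statistics.mean(results)
s_r = statistics.stdev(results)   # repeatability standard deviation, s_r
rsd = 100 * s_r / mean            # relative standard deviation, %

print(f"mean = {mean:.3f}, s_r = {s_r:.4f}, RSD = {rsd:.2f} %")
```

A reproducibility standard deviation would be computed the same way, but from results generated in different laboratories, and would normally be appreciably larger.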
Measurement uncertainty

Measurement uncertainty is a single parameter (usually a standard deviation or confidence interval) expressing the range of values possible on the basis of the measurement result. A measurement uncertainty estimate takes account of all recognized effects operating on the result; the uncertainties associated with each effect are combined according to well-established procedures. Where the contribution of individual effects is important, for example in calibration laboratories, it will be necessary to consider each individual contribution separately.
An uncertainty estimate for analytical chemistry should take into account:
* The overall, long term precision of the method;
* bias and its uncertainty, including the statistical uncertainty involved in the bias measurements, and the reference material or method uncertainty;
* calibration uncertainties (as most equipment calibration uncertainties will be negligibly small by comparison with overall precision and uncertainty in the bias, this needs only to be verified); and
* any significant effects operating in addition to the above. For example, temperature or time ranges permitted by the method may not be fully exercised in validation studies, and their effect may need to be added. Such effects can be usefully quantified by robustness studies or related studies, which establish the size of a given effect on the result.
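Combining the contributions listed above typically means summing independent standard uncertainties in quadrature, following the usual GUM-style procedure. The sketch below uses invented component values; the labels mirror the four bullets, and a coverage factor of k = 2 is assumed for an expanded uncertainty at approximately 95% confidence.

```python
import math

# Hypothetical standard-uncertainty contributions, all in the same units
# as the result; values invented for illustration.
u_precision   = 0.12  # overall long-term precision of the method
u_bias        = 0.08  # bias and its uncertainty, incl. reference value
u_calibration = 0.01  # equipment calibration (often negligibly small)
u_other       = 0.05  # e.g., temperature effect from a robustness study

# Combine independent standard uncertainties in quadrature
# (root sum of squares).
u_combined = math.sqrt(u_precision**2 + u_bias**2
                       + u_calibration**2 + u_other**2)

# Expanded uncertainty at ~95 % confidence, assuming coverage factor k = 2.
U_expanded = 2 * u_combined
print(f"u_c = {u_combined:.4f}, U (k=2) = {U_expanded:.4f}")
```

Note how the small calibration term barely affects the combined figure, which is why, as stated above, it usually only needs to be verified as negligible rather than evaluated in detail.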
Sensitivity

Sensitivity is effectively the gradient of the response curve, that is, the change in instrument response which corresponds to a change in analyte concentration. Where the response has been established as linear with respect to concentration, that is, within the linear range of the method, and the intercept of the response curve has been determined, sensitivity is a useful parameter to calculate and use in formulae for quantitation. Sensitivity is sometimes used to refer to the limit of detection, but this use is not generally approved.
Ruggedness or robustness
In any method there will be certain stages, which, if not carried out sufficiently carefully, will have a severe effect on method performance, and may result in the method not working at all. These stages should be identified, usually as a part of method development, and if possible, their influence on method performance evaluated using ruggedness tests, sometimes also called robustness tests. This involves making deliberate variations to the method, and investigating the subsequent effect on performance. It is then possible to identify the variables in the method which have the most significant effect and ensure that, when using the method, they are closely controlled. Where there is a need to improve the method further, improvements can probably be made by concentrating on those parts of the method known to be critical. Ruggedness is normally evaluated during method development, typically by the originating laboratory, before collaborating with other laboratories. Ruggedness tests are normally applied to investigate the effect on either precision or accuracy.
Recovery

Analytical methods do not always measure all of the analyte of interest present in the sample. Analytes may be present in a variety of forms, not all of which are of interest to the analyst. The method may thus be deliberately designed to determine only a particular form of the analyte. However, a failure to determine all of the analyte present may reflect an inherent problem in the method. Either way, it is necessary to assess the efficiency of the method in detecting all of the analyte present.
Because it is not usually known how much of a particular analyte is present in a test portion, it is difficult to be certain how successful the method has been at extracting it from the matrix. One way to determine the efficiency of the extraction is to spike a test portion with the analyte at various concentrations, then extract the fortified test portions and measure the analyte concentration. The inherent problem with this is that analyte introduced in such a way will probably not be held as strongly as that which is naturally present in the test portion matrix, and so the technique will give an unrealistically high impression of the extraction efficiency. It is, however, the most common way of determining the recovery efficiency, and it is recognized as an acceptable way of doing so. However, the drawback of the technique should be borne in mind.
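The spiking approach described above reduces to a simple calculation: recovery is the fraction of the added analyte that is found again after extraction. A minimal sketch with invented concentrations:

```python
# Hypothetical spike-recovery check; concentrations in mg/kg, invented
# for illustration.
native       = 2.10  # analyte found in the unspiked test portion
spike_added  = 5.00  # known amount of analyte added to the test portion
spiked_found = 6.85  # analyte found in the fortified test portion

# Recovery: fraction of the added analyte recovered, expressed in percent.
recovery_pct = 100 * (spiked_found - native) / spike_added
print(f"recovery = {recovery_pct:.1f} %")
```

As the text cautions, a figure obtained this way is likely to be optimistic, since spiked analyte is usually less strongly bound to the matrix than natively incorporated analyte.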
Title annotation: Tech Service
Date: Dec 1, 2003