Ensuring the quality, reliability and precision of measurement processes through traceability.
Pressures of increased global competition on the manufacturers of goods and providers of services have resulted in a noticeable shift towards greater emphasis on quality control and its management worldwide. It is not surprising that in the USA, firms such as Ford and General Motors have adopted advertising themes of "Quality is Job One" and "The New Symbol of Quality" respectively in attempting not only to increase market share but also, more importantly, to overcome the threat from global competitors.
Explicitly or implicitly, we associate a level of quality, be it specified or perceived, with any given product. The inability of businesses to compete often results from a lack of attention to details, one of which is measurement precision and its control, in assuring product quality. As high-technology manufacturing assumes increasing importance, one of the major problems encountered by high-technology management is that of managing the precision of product components and their assembly. In turn, these developments mean that the industrial measuring process must ensure precision, often to as much as a millionth of an inch. In accordance with Belanger, precision is here defined as the degree of agreement among independent measurements of a quantity under strictly specified conditions. For dimensional metrology, management's ultimate tool for technologically ensuring precision, and the product reliability depending on this precision, is the use of gauge blocks, including the all-important master set of some 81 individual blocks.
The need for precision in calibration and measurement is well established[2-7]. Inadequate measurement capability can have serious consequences for the quality and performance of output generated by processes using such measurements, be they in industry, commerce or science. It can, in many instances, be the cause for disputes between firms, their suppliers, and customers. Moreover, the inability to measure the property in question with precision often results in having to over-design products and components in order to meet tight specifications. Otherwise, inadequate measurement capability may result not only in the acceptance of products with defects, but also in the discarding of products produced to exact specification. The burden of implicit costs that inadequate measurement capability places on a firm limits its capability to compete effectively in today's marketplace. It is for these reasons that, in the area of dimensional metrology, quality specialists constantly strive to be at the cutting edge of precision-measurement technology in order to ensure processes that generate accurate measurements.
The degree to which a measurement meets the test of universality, the national or other generally accepted standard, depends on how well the measurement instrument, tool, or process is calibrated. In most nations, there is an established agency that sets the generally accepted standard for precision in measurement. In the USA, it is the National Institute of Standards and Technology (NIST) that establishes such a generally accepted standard. NIST provides a wide range of calibration services along with Measurement Assurance Program (MAP) services in helping the US scientific community, industry and commerce ensure accurate and uniform physical, chemical, and other measurements.
US high-technology manufacturers often use NIST as their source of calibration. In seeking the highest level of precision in measurement, their primary laboratories are directly traceable to NIST. While primary laboratories are those that are directly traceable to NIST, secondary laboratories are those that are traceable to some primary laboratory. Traceability is defined as the ability to relate individual or nationally accepted systems of measurement through a chain of comparisons.
According to Cameron, the basic motivation for a traceability requirement is to give assurance that the ambiguity introduced by errors of measurement only negligibly affects the decisions, the quality of manufactured items, or the performance of processes of which the measurements are an integral part. Moreover, traceability to a nationally established agency such as NIST provides knowledge of the region of doubt associated with each measurement relative to the nationally accepted standard. However, even though nationally recognized agencies provide a secure base, not all stakeholders in the scientific community, industry, or commerce are able to use them as their calibration source owing to considerations of cost, time and convenience. Those stakeholders (secondary laboratories) unable to use the nationally recognized agency (such as NIST in the USA) are faced with the difficult task of deciding which calibration source (primary laboratory) to use for their calibration needs and to rely on for their traceability. This question establishes the foundation for the research.
The research question
The primary issue explored and tested in this investigation is whether traceability to the same source ensures statistical equality between high-precision laboratories making measurements on the same set of gauge blocks or, for that matter, any tool or instrument used for calibration. Does the choice of making some primary laboratory A versus primary laboratory B its calibration source make a difference for a secondary laboratory attempting to ensure the highest level of precision in its measurement process and, ultimately, the quality of its products?
The question of traceability with its precision- and reliability-related implications is important not only from a conceptual viewpoint, but also from an empirical perspective. The fact remains that stakeholders in dimensional metrology, whether manufacturers or others, seem heavily influenced by the label of "traceability". They derive from traceability to a nationally accepted source such as NIST in the USA, or from some other high precision laboratory, a sense of security regarding the level of precision it ensures for their measurement process. This research explores whether such a view is justified, or whether it constitutes a misconceived perception. In attempting to test hypotheses that address this issue, the study compares measurements of US primary laboratories with those of NIST. Gauge blocks - a widely used calibration tool and standard - are used as the medium for comparisons between laboratories.
Three sets of hypotheses are tested in this empirical investigation. The first set of hypotheses tests for between-laboratory differences based on measurements over time on a given set of gauge blocks. The second set of hypotheses tests for between-laboratory differences based on measurements across gauge blocks of different size from a given set at specific points in time. The one hypothesis in the third set serves as a general, yet important hypothesis on traceability.
Set A: Hypotheses comparing laboratories on their measurements on individual blocks over time
[H.sub.A1]. High-calibration primary laboratories do not differ in their mean of measurements on a given block over time.
[H.sub.A2]. High-calibration primary laboratories display the same level of dispersion (as measured by variance) in their measurements on a given block over time.
[H.sub.A3]. The level of precision high-calibration primary laboratories achieve in their measurements on individual blocks over time is the same for blocks of any size (0.100 to 4.000in. blocks inclusive).
Set B: Hypotheses comparing laboratories on their measurements across blocks at specific points in time
[H.sub.B1]. High-calibration primary laboratories do not differ in their mean of measurements across blocks of a set at any given point in time (year).
[H.sub.B2]. High-calibration primary laboratories display the same level of dispersion (as measured by variance) in their measurements across blocks of the same set at any given point in time (year).
Set C: A general hypothesis on traceability
[H.sub.C1]. Traceability to the same source implies statistical equality between high-calibration primary laboratories in their measurements, and in their levels of precision.
The data sets
Two comprehensive sets of data are used in this research. In each of the two data sets, measurements are made over time on a given set of gauge blocks by NIST and by one or more high-precision primary laboratories of US manufacturers. The data were obtained through contact with the research laboratories of these manufacturers.
Data set 1
This data set consists of measurements on ten systematically chosen blocks of representative sizes from the Grand Master Set of rectangular grade AA steel gauge blocks. Single measurements were made on these same ten blocks every year from 1972 to 1991 inclusive (the only exception being 1976, when the blocks were not measured) by the primary laboratory of Timken Research, The Timken Company - a high-precision US manufacturing company - and by NIST. These ten blocks, ranging in length from 0.100in. to 4.000in., were selected by Timken Research Laboratory in 1972 as representative blocks to be used for calibration and for establishing traceability to NIST over time.
Data set 2
The second data set consists of single measurements made by each of two primary laboratories and NIST every other year, from 1978 to 1986 inclusive, on all blocks of the same 81-piece Webber Croblox chromium carbide grade AA/1 Master Set of rectangular gauge blocks. The two primary laboratories are those of US manufacturers and are referred to as LAB1 and LAB2 respectively in data set 2.
Measurement attributes and methodology
The gauge blocks are calibrated in the standard international inch (25.4mm). They are referred to by size (length) in inches. All measurements of length on the blocks are expressed as deviations in microinches from the block's nominal value. A nominal value is the value assigned by a standard laboratory at the time of manufacture (e.g. 1.000in. block). The measurements, whether at NIST or any other primary laboratory, were made using either the interferometric measurement process as described by Beers, or the comparator as outlined by Beers and Tucker[10,11]. The measurement process requires that laboratories maintain, among other strictly prescribed environmental conditions, the temperature at 20 [degrees] Celsius. Similar control of environmental conditions during measurements allows for comparisons between the laboratories.
The statistical analysis system (SAS) was used for the analysis of data. Specific p-values are provided in the tables where statistical tests are applied to the data in order to establish the exact level of significance of an underlying phenomenon. A p [less than or equal to] 0.05 level is used in establishing significance of a test result. Significance at the p [less than or equal to] 0.10 level is indicated by a "?". The underlying assumptions as stated by Neter and Wasserman in the use of t- and F-tests were tested and satisfied, thereby ensuring the validity of results. The measurements on ten blocks of representative sizes ranging from 0.100in. to 4.000in. inclusive in length selected by Timken were used in testing the hypotheses when using data set 1. Fourteen blocks of representative sizes ranging from 0.050in. to 4.000in. inclusive were systematically selected for analysis from the 81-piece chromium carbide gauge block set when using data set 2 to test the hypotheses included in Set A.
Analysis of results for [H.sub.A1]
To test for between-laboratory differences in measurements on given blocks over time, the difference in their mean on each block was computed for the two data sets. Next, the t-test was administered on each block size to see if the difference between laboratory means was non-zero. In data set 1, Timken fared well in its comparison with NIST. The difference in its mean from that of NIST on any given block was within +1.1 microinches. Even so, Figure 1 illustrates the inequality that can occur between laboratories in their measurements on the very same block.
For four of the ten blocks in data set 1, the difference between laboratory means was found to be statistically significant (non-zero) at the p [less than or equal to] 0.05 level. Table I contains some of the results of tests conducted on gauge blocks from the two data sets.
[TABULAR DATA FOR TABLE I OMITTED]
The between-laboratory differences were much more pronounced for measurements in data set 2, with the difference in pairwise comparisons as high as 7.62 microinches on the 3.000in. block. In 14 of the 42 pairwise comparisons of laboratory means, the difference was found to be significant at the p [less than or equal to] 0.05 level for the two-tailed t-test. In another six comparisons, the difference was significant for a one-tailed test. The results of the tests on the two data sets essentially reject [H.sub.A1]. High precision laboratories were found to differ significantly in their means on a number of blocks.
Interestingly, in data set 2, not only was there a significant difference between laboratory means on a number of blocks, but also the mean for LAB1 was found to be consistently below that of NIST for all blocks [greater than or equal to] 0.150in. in length. This was also true for LAB2 when compared with NIST for all but one block [greater than or equal to] 0.150in. in length. The measurements (and means) of LAB2, though, were much closer to those of NIST than were the measurements of LAB1. What this reveals is the fact that the bias owing to the laboratory is greater for measurements made on comparatively larger blocks. As a result of this finding, greater care is warranted by industry in the use and calibration of larger blocks.
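As a rough illustration of the per-block comparison of means described in this section, the following Python sketch computes a two-sample t statistic for a single block. The equal-variance (pooled) form is an assumption, since the text does not state which variant of the t-test was run in SAS. The function name and the sample values are hypothetical and are not data from the study.

```python
import math
import statistics

def pooled_t_statistic(lab_a, lab_b):
    """Two-sample t statistic (equal-variance form) and its degrees of freedom."""
    n_a, n_b = len(lab_a), len(lab_b)
    mean_diff = statistics.mean(lab_a) - statistics.mean(lab_b)
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * statistics.variance(lab_a)
                  + (n_b - 1) * statistics.variance(lab_b)) / (n_a + n_b - 2)
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return mean_diff / std_err, n_a + n_b - 2

# Hypothetical deviations from nominal (microinches) on one block over five years.
lab_x = [1.0, 2.0, 3.0, 4.0, 5.0]
lab_y = [2.0, 3.0, 4.0, 5.0, 6.0]

t_stat, dof = pooled_t_statistic(lab_x, lab_y)
print(f"t = {t_stat:.3f}, df = {dof}")
```

The statistic would then be compared with the critical value of the t distribution for the stated degrees of freedom at the chosen significance level.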
Analysis of results for [H.sub.A2]
To test for between-laboratory differences in their measure of dispersion (variance) in measurements on given blocks over time, two-tailed F-tests were conducted on pairwise comparisons in each of the two data sets. In data set 1, Timken again fared well in its comparison with NIST in that the difference from NIST in its measure of variance on any given block was within [+ or -]0.42 microinch. For three of the ten blocks in this set, the difference in variance between Timken and NIST was statistically significant at the p [less than or equal to] 0.05 level. Interestingly, the dispersion found in the measurements on given blocks for Timken was less than that for NIST for nine of the ten blocks in the data set. Overall, the measure of variance was fairly low and [less than or equal to] 1.0 microinch for nine of the ten blocks for each laboratory. The variance, though, increased with block size for each laboratory. Timken and NIST registered their highest variance of 2.1013 and 1.9791 microinches respectively on the largest 4.000in. block. Table II contains some of the results of tests conducted on both data sets.
While the incidence of significant between-laboratory differences in variance for data set 2 was about the same as in data set 1, the magnitude of such differences was much more pronounced (as high as [+ or -]12 microinches). The greatest difference between the laboratories was on the 4.000in. block (the largest block). Measures of variance for LAB1 on the 2.000, 3.000 and 4.000in. blocks were 9.3, 10.3 and 15.7 microinches respectively. As in data set 1, the variance increased with block size in the case of every laboratory. The dispersion in laboratory measurements on individual blocks was, in many cases, far greater than the [+ or -]2 microinches permitted by the American National Standards Institute (ANSI) specification B89.1.9M for blocks up to 4.000in. long. LAB1 displayed a higher variance than LAB2 and NIST for 12 of the 14 blocks. The findings, particularly those from data set 2, reject [H.sub.A2].
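The pairwise comparison of dispersion described in this section can be sketched as a ratio of sample variances. This is a minimal Python illustration only: the function name and the sample values are hypothetical, and placing the larger variance in the numerator is a common convention assumed here rather than something the text specifies.

```python
import statistics

def variance_ratio(lab_a, lab_b):
    """Return (F, df_numerator, df_denominator) with the larger variance on top."""
    var_a = statistics.variance(lab_a)
    var_b = statistics.variance(lab_b)
    if min(var_a, var_b) == 0:
        # A zero variance makes the ratio undefined.
        raise ValueError("F ratio is undefined when a sample variance is zero")
    if var_a >= var_b:
        return var_a / var_b, len(lab_a) - 1, len(lab_b) - 1
    return var_b / var_a, len(lab_b) - 1, len(lab_a) - 1

# Hypothetical deviations from nominal (microinches) on one block over five years.
lab_x = [0.0, 2.0, 4.0, 6.0, 8.0]
lab_y = [1.0, 2.0, 3.0, 4.0, 5.0]

f_stat, df_num, df_den = variance_ratio(lab_x, lab_y)
print(f"F = {f_stat:.2f} on ({df_num}, {df_den}) degrees of freedom")
```

With the larger variance on top, a two-tailed test at the 0.05 level compares the ratio with the upper 0.025 critical value of the F distribution for those degrees of freedom.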
Analysis of results for [H.sub.A3]
Measures of variance for measurements generated by each laboratory on individual blocks over time in each of the two data sets were used in testing [H.sub.A3]. Table III presents the measures of variance on individual blocks for each laboratory in data set 2.
[TABULAR DATA FOR TABLE II OMITTED]
For each laboratory in the two data sets, the measure of variance was regressed on block size to see if the magnitude of variance increases significantly (a significant positive slope) with block size. The slope of the regression line (regression coefficient) was then tested for significant difference from zero. In every case (the two laboratories in data set 1, and the three laboratories in data set 2) the slope was positive. For four of the five laboratories, the regression coefficient was found to be significantly non-zero at the p [less than or equal to] 0.01 level. In the case of the fifth laboratory, the positive slope was significantly non-zero at the p [less than or equal to] 0.02 level.
Table III. Measures of variance for laboratories in their measurements on individual blocks over time
Data set 2: variance
Block size    LAB1    LAB2    NIST
0.100          3.2     0.3     0.1
0.110          1.5     0.3     0.2
0.120          0.3     0.7     0.1
0.130          2.3     1.2     0.1
0.140          1.8     1.0     0.1
0.150          4.8     0.3     0.1
0.250          1.2     0.3     1.0
0.350          2.2     1.8     0.3
0.500          3.2     0.5     0.7
0.650          6.5     1.0     0.5
0.750          3.7     1.8     1.2
0.850          6.3     0.2     0.8
1.000          4.8     0.0     0.3
2.000          9.3     0.7     2.0
3.000         10.3     2.3     2.9
4.000         15.7     2.3     3.7
1. Block size is the nominal value in inches.
2. Variance is calculated for measurements (expressed as deviations from nominal in microinches) on each block taken every other year from 1978 to 1986 inclusive by the laboratory.
3. Each laboratory displayed a positive correlation (and slope for the line regressing variance on block size) significant at the p [less than or equal to] 0.01 level.
4. The coefficient of correlation between the measure of variance and block size was +0.943, +0.624 and +0.956 respectively for LAB1, LAB2 and NIST in data set 2. In data set 1, the correlation was +0.968 for Timken and +0.747 for NIST.
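The regression of variance on block size can be illustrated with the values in Table III. The Python sketch below computes the least-squares slope and the Pearson correlation for each laboratory. The function name is hypothetical, the figures are transcribed from the table as printed, and the sketch does not reproduce the significance tests run in SAS.

```python
import math

def slope_and_correlation(x, y):
    """Least-squares slope of y on x and the Pearson correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    s_yy = sum((yi - mean_y) ** 2 for yi in y)
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return s_xy / s_xx, s_xy / math.sqrt(s_xx * s_yy)

# Block sizes (in.) and variances transcribed from Table III.
block_size = [0.100, 0.110, 0.120, 0.130, 0.140, 0.150, 0.250, 0.350,
              0.500, 0.650, 0.750, 0.850, 1.000, 2.000, 3.000, 4.000]
variance = {
    "LAB1": [3.2, 1.5, 0.3, 2.3, 1.8, 4.8, 1.2, 2.2,
             3.2, 6.5, 3.7, 6.3, 4.8, 9.3, 10.3, 15.7],
    "LAB2": [0.3, 0.3, 0.7, 1.2, 1.0, 0.3, 0.3, 1.8,
             0.5, 1.0, 1.8, 0.2, 0.0, 0.7, 2.3, 2.3],
    "NIST": [0.1, 0.2, 0.1, 0.1, 0.1, 0.1, 1.0, 0.3,
             0.7, 0.5, 1.2, 0.8, 0.3, 2.0, 2.9, 3.7],
}

results = {lab: slope_and_correlation(block_size, v) for lab, v in variance.items()}
for lab, (slope, r) in results.items():
    print(f"{lab}: slope = {slope:.3f}, r = {r:+.3f}")
```

A positive slope and correlation for each laboratory is what the regression looks for. Testing whether the slope differs significantly from zero additionally requires its standard error and the t distribution.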
The coefficient of correlation between variance and block size was found to be significantly positive for all five laboratories. The results clearly demonstrate that these high-precision primary laboratories (including NIST) are not as precise in their measurements on larger blocks compared with smaller blocks. The variance in laboratory measurements on individual blocks increased significantly with block size. As such, [H.sub.A3] is rejected. Also, as shown in Table III, LAB1 displayed a significantly higher level of variance than did either LAB2 or NIST, especially for the larger blocks. The difference in the magnitude of variance displayed by LAB1 and LAB2 provides support for the argument that primary laboratories traceable to the same source can differ significantly in their measurements on the same blocks.
Analysis of results for [H.sub.B1]
For each data set, pairwise t-tests were conducted to see if between-laboratory differences in their means of measurements across blocks of different size were significant during given years. For data set 1, differences between the means of Timken and NIST were significant during only five of the 19 years. Moreover, the magnitude of the differences was within [+ or -]1 microinch for any given year. Figure 2 illustrates the high level of equality Timken achieved in its measurements when compared with those of NIST on the ten blocks during 1991.
For data set 2, the between-laboratory difference in means was found to be significant in eight of the 15 pairwise comparisons. Though the magnitude of the differences was within [+ or -]2 microinches for data set 2, such differences are considered substantial, since one would expect positive and negative deviations in the difference between laboratories to cancel out when computing the mean of the difference between laboratories across 81 blocks in any given year.
In particular, the measurements of LAB1, during each of the five years, were found to be consistently below those of LAB2 and NIST. Although there is not enough evidence to reject [H.sub.B1] outright, the results from data set 2 most certainly raise concern. Table IV contains the results of tests conducted on the two data sets during particular years.
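The argument that positive and negative differences should cancel across blocks can be made concrete with a paired comparison, in which the two laboratories' measurements on each block in a given year are differenced before averaging. The Python sketch below is one way to do this. The paired form is an assumption, as the text says only that pairwise t-tests were conducted, and the function name and values are hypothetical.

```python
import math
import statistics

def paired_t_statistic(lab_a, lab_b):
    """Mean per-block difference, its t statistic, and the degrees of freedom."""
    diffs = [a - b for a, b in zip(lab_a, lab_b)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference across blocks.
    std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return mean_diff, mean_diff / std_err, len(diffs) - 1

# Hypothetical deviations from nominal (microinches) on five blocks in one year.
lab_one = [0.0, 1.0, -1.0, 2.0, 3.0]
reference = [1.0, 3.0, 2.0, 4.0, 5.0]

mean_diff, t_stat, dof = paired_t_statistic(lab_one, reference)
print(f"mean difference = {mean_diff:.2f}, t = {t_stat:.3f}, df = {dof}")
```

If the differences were random in sign, the mean difference would sit near zero. A mean that stays on one side of zero across many blocks, as in this made-up example, points to a systematic offset between the two laboratories.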
Analysis of results for [H.sub.B2]
For each data set, pairwise F-tests were conducted to test for differences among the variance of measurements across blocks between laboratories during specific years. Results from data set 1 support the null hypothesis of equal variances, as no significant differences were found between Timken and NIST during any of the 19 years. Contrary to the results obtained from data set 1, the results of tests on data set 2 reject [H.sub.B2]. Table V contains the results of all pairwise comparisons of laboratories in data set 2.
[TABULAR DATA FOR TABLE IV OMITTED]
In 11 of the total 15 pairwise comparisons of variance, a significant difference was found at the p [less than or equal to] 0.05 level when administering a two-tailed or a one-tailed F-test. The greatest difference in variance of 4.5523 microinches was between LAB1 and NIST during 1982. The results obtained from data set 2 give cause for concern in that the measurements of high-precision primary laboratories were found to differ significantly.
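The counts of years and pairwise comparisons quoted in the analysis can be checked by simple enumeration. The Python sketch below does so on the reading that each comparison is one pair of laboratories in one year (or on one block); that reading is an assumption, although it matches the totals given in the text.

```python
from itertools import combinations

labs = ["LAB1", "LAB2", "NIST"]
lab_pairs = list(combinations(labs, 2))

# Data set 2: measured every other year from 1978 to 1986 inclusive.
years_set2 = list(range(1978, 1987, 2))
# Data set 1: measured yearly from 1972 to 1991 inclusive, except 1976.
years_set1 = [year for year in range(1972, 1992) if year != 1976]

comparisons_by_year = len(lab_pairs) * len(years_set2)
comparisons_by_block = len(lab_pairs) * 14  # fourteen blocks analysed for Set A

print(len(years_set1), comparisons_by_year, comparisons_by_block)  # prints: 19 15 42
```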
Analysis of results for [H.sub.C1]
All three laboratories of the US manufacturers included in the two data sets are traceable to NIST, and, as such, are primary laboratories. Given this fact, the acceptance or rejection of [H.sub.C1] depends on the results obtained in testing the earlier hypotheses included in hypotheses sets A and B.
Results obtained in testing the earlier hypotheses show that primary laboratories can differ significantly in their mean and variance for block measurements. As one example, measurements made on blocks during any given year by LAB1 in data set 2 were consistently below those of LAB2 and NIST in that same year. This indicates a systematic negative bias in the measurement process of LAB1 in general when compared with LAB2, and particularly NIST. Based on the findings of this research, [H.sub.C1] is rejected.
[TABULAR DATA FOR TABLE V OMITTED]
Discussion and conclusion
The term "traceability" itself seems to connote statistical equality. If primary laboratories, traceable to an agency such as NIST in the USA that sets the national standard for measurement, differ significantly in their measurements on the same block (and as the results show, in most cases they do), then secondary users of gauge blocks, or other such calibration standards, are faced with a dilemma. Who should they turn to for calibration if they are unable to use the direct services of the nationally accepted agency? As the results show, their choice then of some primary laboratory X versus Y does have precision-related implications for them. The issue raised in this investigation should be of concern not only for secondary users but also for primary laboratories themselves.
The use of MAP services provided by a generally recognized and accepted agency such as NIST is strongly recommended for primary as well as secondary laboratories seeking a high level of measurement precision. MAP services as provided by NIST focus on the quality of measurements being made rather than just the properties of the participant's instruments or standards. Having received MAP services, participants can then implement a MAP internally. Such a MAP should demonstrate to the user that the total uncertainty is sufficiently small to meet the user's requirements.
Secondary laboratories, in evaluating potential sources for calibration, can begin by enquiring as to the nature of the internal control process, if any, the source laboratory has in place to measure uncertainty. Further, they should enquire as to whether the primary laboratory being considered has obtained MAP services. Has it implemented a MAP? As Belanger explains, it is possible to achieve a high level of accuracy by using NIST's calibration services instead of NIST's MAP services, but experience has shown that some users of NIST's calibration services have had long-standing measurement problems that remained undiscovered until they participated in a MAP.
Traceability alone does not ensure precision in measurement, let alone the equality of measurement precision among laboratories traceable to the same source. Calibration capability serves as only one prerequisite for precision measurement. Measurements are affected by, among other items, the environment, operator and procedure. Moreover, how calibrated standards are handled when used influences the precision of measurements they generate. Sensitivity to control procedures, commitment to quality, and a low tolerance for uncertainty - more than traceability - can help enhance precision capability for the laboratories and, in turn, the quality of the firm's products.
While significant between-laboratory differences were found to exist, the Timken laboratory consistently performed better in its comparisons with NIST than did the other primary laboratories. Based on Timken's involvement with NIST, its commitment to quality, and attention to detail, NIST itself recognizes Timken's measurement capability to be at par with that of its own laboratory. As pointed out earlier, traceability is generally defined as the ability to relate individual or nationally accepted systems of measurements through a chain of comparisons. Such a view of traceability does not address the issue of frequency of comparisons. This issue warrants further investigation, as Timken made more frequent comparisons with NIST than did the other two primary laboratories in data set 2. The higher level of precision achieved at Timken might well have to do with their more frequent comparisons and greater involvement with NIST. More frequent feedback from the source setting the national standard should enable a laboratory to make the fine adjustments needed to ensure a high level of precision in calibration. The importance of upper management's involvement and the organization's quality culture must be emphasized.
The measurement quality and capability of a laboratory should be judged not by traceability alone, or through the single calibration of a given standard, but rather in the aggregate. Recognizing the general intolerance for imprecision today, it is hoped that, through this research, primary as well as secondary stakeholders are better able to comprehend the role and scope of traceability as they attempt to ensure measurement accuracy and precision.
1. Belanger, B., "Measurement assurance programs Part I: general introduction", NBS Special Publication 676-I, US Department of Commerce/NIST, Washington, DC, 1984.
2. Rajaraman, M.K., "Allocation of errors in hierarchical calibration or assembly", Journal of Quality Technology, Vol. 6 No. 1, 1974, pp. 42-5.
3. Hilliard, J.E. and Miller, J.R. III, "The effect of calibration on end-item performance in echelon systems", Journal of Quality Technology, Vol. 12 No. 2, 1980, pp. 61-70.
4. Schumacher, R.B.F., "Systematic measurement errors", Journal of Quality Technology, Vol. 13 No. 1, 1981, pp. 10-24.
5. Seamans, P., "Calibration cuts costs", Quality, Vol. 23, May 1984, pp. 17-8.
6. Carroll, R.J. and Spiegelman, C.H., "The effect of ignoring small measurement errors in precision instrument calibration", Journal of Quality Technology, Vol. 18 No. 3, 1986, pp. 170-3.
7. Babbar, S., "Measurement precision in quality management: an empirical investigation on using gauge blocks as a calibration standard", International Journal of Quality & Reliability Management, Vol. 10 No. 5, 1993, pp. 20-32.
8. Cameron, J.M., "Traceability?", Journal of Quality Technology, Vol. 7 No. 4, 1975, pp. 193-5.
9. Beers, J.S., "A gauge block measurement process using single wavelength interferometry", NBS Monograph 152, US Department of Commerce/NIST, Washington, DC, 1976.
10. Beers, J.S. and Tucker, C.D., "Intercomparison procedures for gauge blocks using electromechanical comparators", NBSIR 76-979, US Department of Commerce/NIST, Washington, DC, 1976.
11. Beers, J.S. and Tucker, C.D., "Measurement assurance for gauge blocks", NBS Monograph 163, US Department of Commerce/NIST, Washington, DC, 1979.
12. Neter, J. and Wasserman, W., Applied Linear Statistical Models, Richard Irwin, Homewood, IL, 1974.
13. "Precision gauge blocks for length measurement", ANSI/ASME B89.1.9M, The American Society of Mechanical Engineers, New York, NY, 1984, p. 10.
Publication: International Journal of Quality & Reliability Management
Date: Feb 1, 1995