Commentary on Michell, quantitative science and the definition of measurement in psychology.
The paper makes two key points which I accept entirely: that scientific measurement may be defined as the estimation or discovery of the ratio of some magnitude of a quantitative attribute to a unit of the same attribute; and that quantitative science involves two tasks, the investigation of the hypothesis that the relevant attribute is quantitative and the instrumental task of devising procedures to measure magnitudes of the attribute shown to be quantitative. The burden of the paper is that in psychometrics neither of these two tasks has been done: it is simply assumed that the variables are quantitative, and the measurement techniques fail by the criteria of scientific measurement. An example will clarify his arguments. In the case of extraversion it is assumed without prior proof, and despite the work of Jung, that this is a quantitative variable, and the tests contain no clear unit of measurement. Compare, to cite Michell, a cricket pitch, where the ratio of the length of the pitch to the unit is 22 (if yards are still permissible). Finally, Michell claims that where measures depart from the scientific model there is no justification for using them in the kinds of mathematical arguments which have played so large a part in developing the natural sciences. All this leads Michell to conclude that psychometrics is not scientific either in its theory or application.
While accepting these premises of scientific measurement, I shall argue that psychometrics is not as entirely flawed as Michell claims although some aspects of the field should be abandoned or modified considerably.
First I shall discuss the psychometrics of intelligence, as typified by the factor analytic work of Spearman, Burt and Cattell (Cattell, 1971). There are three points which need to be made. First, the original question which the factor analysis of abilities was designed to answer was essentially this: what accounts for the positive correlations between human abilities? It turns out that two g factors, crystallized and fluid intelligence, account for much of the covariance. The intelligence test score, measuring these factors, may be regarded as a good index of the level of difficulty which an individual has reached in cognitive problem solving. Secondly, measures of these factors have construct validity, that is, they behave as one might expect of intelligence tests. They predict academic and occupational success and differentiate between groups, as is well documented. All this means that instead of the vague term intelligence psychometrists can use these tests of g, despite their admitted defects as scientific measures, as operational measurements of intelligence which lead to useful predictions and theorizing. Sternberg (1982) summarizes much of this work.
The third point, and by far the most germane to this argument, is not mentioned by Michell. This concerns the fact that the leading psychometrists, such as Cattell and Eysenck, figures whom Cattell is keen to separate from mere itemetric moles, regard factors only as starting points for their investigations. Before the recent development of confirmatory analysis, factor analysis was a deliberately multivariate exploratory technique, used to indicate the variables to study. In the field of intelligence great efforts are being made to investigate the underlying nature of the g factors. Intelligence tests are not the end but the means of investigation. Thus, as Barrett (1996) has discussed, there is an emphasis on process rather than test refinement. This work includes studies of nerve conduction variability and velocity (Deary & Caryl, 1993), and Weiss (1995), for example, invokes gene biochemistry and quantum statistics to account for EEG and psychometric findings. Lehrl & Fischer (1990) conceptualize intelligence as information-processing capacity and have developed a measure of this, the BIP, the basic period of information processing, which fits the scientific criteria mentioned in this article. This is derived from the speed of reading letters, but investigations by the present author (Draycott & Kline, 1994) have shown it to be related to crystallized rather than fluid intelligence.
Thus in the field of intelligence I would argue that psychometric testing is leading to a real scientific understanding of the phenomenon and that, in time, we shall look upon intelligence tests much as the modern astronomer at Jodrell Bank looks at Galileo's telescope.
As regards personality tests there is more force in the arguments of Michell. The factor analysis of personality questionnaires has yielded, to the satisfaction of most investigators, two clear factors, extraversion and anxiety, and three others over which there is still some argument: tough-mindedness, conscientiousness and openness. As with intelligence, these factors correlate, but only moderately, with a variety of external criteria and are widely used in occupational psychology (see Kline, 1993).
Immediately a severe problem arises: what is the unit of measurement? A score on these tests consists of the number of items endorsed by an individual, but these items are clearly not units of measurement in any meaningful sense. At best each item can be considered to be a sample from the universe of items measuring that variable, the score on that universe being the true score. Thus the more items in the sample which are endorsed, the higher the fallible score and, by inference, the true score. This is the classical psychometric model. Of course the universe of items is notional and there is no method of ensuring that the items sample the intended universe, although high reliability ensures that the items are consistent and sample some universe.
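The classical model described above can be illustrated with a small simulation. This is only a sketch under stated assumptions and is not taken from Michell or from any of the works cited: each simulated person is given a hypothetical true score (the proportion of the notional universe of items he or she would endorse), each test item is an independent draw from that universe, and reliability is summarized by Cronbach's alpha. The function names and parameter values are illustrative.

```python
import random
import statistics


def simulate_scores(n_people=500, n_items=20, seed=0):
    """Simulate item endorsements under a simple true score model.

    Assumption: a person's true score is the probability of endorsing
    any item drawn from the notional universe of items.
    """
    rng = random.Random(seed)
    true_scores = [rng.uniform(0.2, 0.8) for _ in range(n_people)]
    responses = [
        [1 if rng.random() < t else 0 for _ in range(n_items)]
        for t in true_scores
    ]
    return true_scores, responses


def cronbach_alpha(responses):
    """Cronbach's alpha from a people-by-items matrix of 0/1 responses."""
    k = len(responses[0])
    item_vars = [
        statistics.pvariance([row[j] for row in responses]) for j in range(k)
    ]
    total_var = statistics.pvariance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


if __name__ == "__main__":
    true_scores, responses = simulate_scores()
    fallible = [sum(row) for row in responses]  # number of items endorsed
    print("alpha:", round(cronbach_alpha(responses), 2))
    print("r(true, fallible):", round(pearson(true_scores, fallible), 2))
```

In this sketch the fallible score tracks the true score, and a longer test gives a higher alpha. Nothing in it supplies a unit of measurement, however: the count of endorsed items orders people, and the simulation has simply assumed that the underlying attribute is quantitative, which is exactly the assumption Michell questions.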
Again the leading workers in the psychometrics of personality, especially Cattell (1981) and Eysenck (1967), regard these tests simply as starting points for the study of personality. By the experimental study of these factors the nature of extraversion and anxiety and their physiological bases are being explicated, as Barrett (1996) has summarized. Gray (1981) and Zuckerman (1991) have contributed further to the investigation of the processes underlying these factors. It should also be pointed out that biometric investigation of these personality factors, as is the case with intelligence, has demonstrated a considerable genetic component in their variance (Eysenck, 1994), a finding which suggests that, despite their imperfections as scientific measures, they are far from worthless.
However, it is in the field of applied and social psychology, for example in the construction of questionnaires to measure variables which are of interest to researchers in health and education, that the strictures of Michell bite hard and, in my view, render the work of little scientific value. As I have argued previously (Kline, 1993), locus of control exemplifies these problems. Here items which have face validity, e.g. 'When I get sick, I am to blame' and 'No matter what I do, I am likely to get sick', are factored, and items loading a particular factor are regarded as scales, named from the high-loading items. With such a scale the unit of measurement is unknown. Often, with only six items per scale, it is difficult to see what universe of items they might purport to represent. That they factor together indicates nothing more than that they mean the same thing. This type of blind factoring is bound to yield factors if enough items which are essentially paraphrases of each other are included in a test. With this methodology, there is literally no end to the factors which can be produced.
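The point about paraphrased items can be illustrated with a short simulation. It is a sketch under loud assumptions, not an analysis of any real locus of control scale: each hypothetical cluster of six items is generated as one shared response to a common meaning plus a little item-specific noise, and the clusters are generated independently of one another. No factor analysis is run; the code only computes the correlation pattern which a factor analysis would pick up. All names and parameter values are illustrative.

```python
import random
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def simulate_paraphrase_clusters(n_people=400, n_clusters=3,
                                 items_per_cluster=6, noise_sd=0.5, seed=1):
    """Return {(cluster, item): responses} for clusters of paraphrased items.

    Assumption: items in a cluster are near-paraphrases, so each response
    is the person's reaction to the shared meaning plus small noise.
    """
    rng = random.Random(seed)
    items = {}
    for c in range(n_clusters):
        shared = [rng.gauss(0, 1) for _ in range(n_people)]
        for i in range(items_per_cluster):
            items[(c, i)] = [s + rng.gauss(0, noise_sd) for s in shared]
    return items


def mean_correlations(items):
    """Mean inter-item correlation within clusters and between clusters."""
    within, between = [], []
    for (key_a, resp_a), (key_b, resp_b) in combinations(items.items(), 2):
        r = pearson(resp_a, resp_b)
        (within if key_a[0] == key_b[0] else between).append(r)
    return sum(within) / len(within), sum(between) / len(between)


if __name__ == "__main__":
    within, between = mean_correlations(simulate_paraphrase_clusters())
    print("mean r within clusters:", round(within, 2))
    print("mean r between clusters:", round(between, 2))
```

Under these assumptions items correlate highly within a cluster and negligibly across clusters, so each set of paraphrases would emerge as its own "factor". Raising `n_clusters` produces as many such factors as one cares to write paraphrases for, which is the sense in which the methodology has no end.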
These scales, and there are many such, are then used as variables in further factor analytic studies with other variables thus derived. This kind of psychometrics in which the scales are the variables, simply because their items load a factor, does seem to be measurement gone mad as described by Michell - a systematic thought disorder.
To conclude, it is true that psychometric measurement is not scientific in the sense defined by Michell. On the other hand, the true score model has yielded measures which are certainly better than no quantification at all and psychological variables are hard to fit to the strictly scientific measurement model. It is only when psychometric scores are regarded as end-products rather than as guides for scientific investigation that the full force of Michell's arguments obtains.
Barrett, B. (1996). Process models in individual differences research. In C. Cooper & V. Varma (Eds), Processes in Individual Differences. London: Routledge.
Cattell, R. B. (1971). Abilities, their Structure, Growth and Action. New York: Houghton Mifflin.
Cattell, R. B. (1981). Personality and Learning Theory. New York: Springer.
Deary, I. & Caryl, P. (1993). Intelligence, EEG and evoked potentials. In P. A. Vernon (Ed.), Biological Approaches to Human Intelligence. Norwood, NJ: Ablex.
Draycott, S. & Kline, P. (1994). Further investigations into the nature of BIP: A factor analysis of the BIP with primary abilities. Personality and Individual Differences, 17, 201-209.
Eysenck, H. J. (1967). The Biological Basis of Personality. Springfield, IL: Thomas.
Eysenck, H. J. (1994). Personality and intelligence: Psychometric and experimental approaches. In R. J. Sternberg & P. Ruzgis (Eds), Personality and Intelligence. Cambridge: Cambridge University Press.
Gray, J. A. (1981). A critique of Eysenck's theory of personality. In H. J. Eysenck (Ed.), A Model for Personality. Berlin: Springer-Verlag.
Kline, P. (1993). Handbook of Psychological Testing. London: Routledge.
Lehrl, S. & Fischer, P. (1990). A basic information psychological parameter (BIP) for the reconstruction of the concepts of intelligence. European Journal of Personality, 4, 259-286.
Sternberg, R. J. (Ed.) (1982). Handbook of Human Intelligence. Cambridge: Cambridge University Press.
Weiss, V. (1995). Memory span as the quantum of action of thought. Cahiers de Psychologie Cognitive, 14, 387-408.
Zuckerman, M. (1991). The Psychobiology of Personality. Cambridge: Cambridge University Press.
|Title Annotation:||response to article by Joel Michell in this issue, p. 355|
|Publication:||British Journal of Psychology|
|Date:||Aug 1, 1997|