
Discrepantly poor verbal skills in poor readers: a failure of learning or ability?

Many investigators have identified meaningful subtypes of disabled learners in terms of academic or cognitive function. Most typically, verbal skills are significantly weak when compared with spatial skills. Opinions differ as to why poor reading should often occur in the context of a verbal/spatial discrepancy. One hypothesis suggests that a primary, neuropsychological discrepancy between the verbal and spatial skill domains comparatively restricts the general development of verbal skills (Warrington, 1967). Another explanation has been that poor reading restricts the general development of verbal skills (Stanovich, 1993). A degree of interaction between these two putative processes might be expected, or alternatively it may be that the two phenomena are coincident but unrelated.

A recurring theme in investigations of children with reading problems has been whether children with a discrepancy between their reading age and that predicted from their IQs have characteristics which distinguish them from non-discrepant children with similar reading age (Fletcher et al. 1994; Shaywitz, Fletcher, Holahan & Shaywitz, 1992; Siegel, 1988; Stanovich & Siegel, 1994). From the few studies where the two groups have been explicitly compared, a consensus is emerging that no extra cognitive deficit exists in the discrepant group. Indeed, they have performed better than the non-discrepant group on some other tasks, including arithmetic (Mellanby, Anderson, Campbell & Westwood, 1996; Rutter & Yule, 1975), Performance subtests of the WISC (Wechsler Intelligence Scale for Children; Wechsler, 1949; Belmont & Birch, 1966) and oral language tests (vocabulary and syntax; Mellanby et al., 1996).

Discrepant cognitive profiles in poor readers have been reported using a wide variety of tests of reasoning and intelligence. Most, like the WISC-R (Wechsler Intelligence Scale for Children-Revised; Wechsler, 1974), the BAS (British Ability Scales; Elliott, 1983) and the CAT (Cognitive Abilities Test; Thorndike & Hagen, 1986), require a mixture of novel problem solving and the application of knowledge stores and routines, such as facts and arithmetic procedures. A theoretical distinction between these two sorts of intelligence has been convincingly drawn by Cattell (1963, 1971; Schmidt & Crano, 1974). He has argued that novel problem solving relies on 'fluid intelligence', which is the basic efficiency of the neural system. In contrast, learned knowledge and procedures rely on 'crystallized intelligence', which is dependent on cultural exposure. The theory indicates that fluid ability can be measured by using novel problems.

The distinction between fluid and crystallized intelligence is crucial to any examination of the link between discrepantly poor verbal skills and poor reading. The importance stems from the fact that even the reading aloud of single words is a learned skill and therefore part of crystallized intelligence. Furthermore, since reading is an important mechanism for acquiring facts and information, it relates directly to the acquisition of crystallized intelligence. Thus poor reading could significantly disable performance on conventional reasoning tests.

Another factor which affects verbal achievement on conventional tests of verbal reasoning is social experience. For example, the WISC-R Information subtests require knowledge of fire alarm procedures, postal processes, even orienteering. Indeed, the verbal section of the CAT (VCAT; Thorndike & Hagen, 1986), a well-established test of reasoning for school pupils, includes some items which depend profoundly on sophisticated social concepts. Here are two examples which are similar in style and content to two CAT verbal items:

(i) 'It is a melancholy ----- that even great men have their poor relations.'

a lie b pretence c truth d necessity e humility

(ii) 'Where there is -----, let me sow joy.'

a darkness b peace c understanding d sadness e death

It is not only the loading of conventional intelligence tests on selective learning and experience that causes methodological concerns; the varying formats of reasoning tests across the verbal and spatial domains also confound investigations. To consider the WISC-R as an example again, the Performance subtests utilize a specific presentation mode, which happens to be visual perceptual and spatial. The Verbal subtests also have a specific presentation mode, which is largely the aural presentation of verbal material. Furthermore, the response mode required by the Performance subtests is generally manual, whereas the response mode required by the Verbal subtests is speech. It is well established that poor readers can evince a range of cognitive deficits (Batchelor, Kixmiller & Dean, 1990, found 14 individual cognitive skills which predicted reading performance in learning-disabled children). Hence, discrepancies between verbal and spatial scores could result from different task demands, rather than differences in primary verbal and spatial reasoning. Because the formats of the two types of test differ, it is not possible to extricate the cognitive skill level from the effects of presentation and response mode.

A new reasoning test, the Verbal and Spatial Reasoning Test (VESPAR; Langdon & Warrington, 1995), has been developed to measure verbal and spatial fluid intelligence; the presentation and response modes for each section have been carefully matched. By requiring the examinee to deduce novel relations among stimuli which are designed to be easily accessible, it puts minimal demands on crystallized intelligence. The test comprises matched sets of verbal and spatial items, presented in the same format, hence allowing a direct comparison of verbal and spatial reasoning ability [ILLUSTRATION FOR FIGURE 1 OMITTED]. Verbal stimuli are all single high-frequency words (Thorndike & Lorge, 1936, category A or AA words). Stimuli are large and clear to reduce demands on visual acuity. The abstract spatial stimuli do not require fine visual perceptual or spatial discrimination (for example, no answers depend on deciding between similar shapes or on counting scattered dots). The multiple choice response format has been shown in another neuropsychological test to have the advantage of being relatively resistant to affective factors (Coughlan & Hollows, 1984). Stimuli and response alternatives are displayed for extended inspection, which reduces the effects of confounding deficits in sustained concentration and short-term memory. By supporting the encoding and response stages of reasoning to some extent, the VESPAR may provide a more precise measure of reasoning than traditional tests with their wide-ranging cognitive demands. Initially designed and standardized for adult neurological patients, the VESPAR has been selected because of its suitability for investigating the cognitive profile of poor readers.

In the present study, the VESPAR, the CAT and a single-word reading test were administered to 170 unselected 14-year-olds. Initially, it was shown that the VESPAR test can be successfully employed with children of this age group. The results were compared with results previously obtained (Langdon & Warrington, 1995) with a group of young adults. A comparison was made between the verbal and spatial VESPAR scores for three matched groups of children, selected from the larger group on the basis of their CAT scores: a CAT-discrepant group was identified as having VCAT scores at least 15 points (1.5 SD) below their NVCAT scores; and two constructed CAT non-discrepant groups, one group individually matched for VCAT with the discrepant group and one individually matched for NVCAT with the discrepant group.

The intention was to determine whether the CAT discrepancy (NVCAT-VCAT) was the result of poorer primary verbal ability (fluid intelligence) or poorer acquisition of verbal skills (crystallized intelligence). If the former were the case (poorer primary verbal ability), then the CAT-discrepant group would have a larger discrepancy on the VESPAR (spatial minus verbal score) than the VCAT-matched (non-discrepant) group. However, in the latter case (poorer acquisition of verbal skills), the VESPAR discrepancy would be expected to be the same in the two groups.

Method



A year group of 170 students (90 males and 80 females; mean age 14.58 years, SD 0.30) was assessed. It comprised the complete year 10 from a semi-rural comprehensive in central England. The first language was English and there were no statemented children in the sample. The school and the parents were fully informed of all aspects of the study, and the parents completed a consent form.


The following battery was administered.

1. The Burt Test (1921). This is a test of single word reading for correct pronunciation. It has been standardized and validated for this age group. It comprises 110 words, graded by difficulty and arranged in groups of 10. If 10 failures occur within a group, testing is discontinued at that point. It is no longer widely used and was chosen for this present work to ensure that the students were test naive.

2. The Cognitive Abilities Test (CAT). CAT is an established test of intellectual ability for school students. It is standardized and validated for this age group. In addition, it requires no manipulation of materials and is untimed, which minimizes the possible confounding influences of reduced dexterity and visual processing inefficiencies. Two sections were used in this study. First, the Verbal CAT (VCAT), which comprises analogy, sentence completion and classification tasks, presented in text format. Second, the Non-verbal CAT (NVCAT), which comprises analogy, completion and classification tasks, presented as abstract geometric designs. The possible range for the standardized age scores calculated is 70 to 130.

3. The Verbal and Spatial Reasoning Test (VESPAR). This new reasoning test comprises matched sets of 25 items in both verbal and spatial format, in each of the three modes of inductive reasoning: category, analogy, series completion. The requirements for stored knowledge and learned procedures have been minimized. To date, it has only been standardized for an adult population. The range of possible scores is from 6 (chance level) to 25.
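The discontinue rule described for the Burt test (words in groups of 10, testing stopped once all 10 words in a group are failed) can be sketched as below. This is only an illustration: the function name is hypothetical, and treating the error score as the number of words not credited is an assumption, since the scoring rule is not spelled out in the text.

```python
def burt_words_correct(responses, group_size=10):
    """Count correctly read words, applying the discontinue rule.

    `responses` is a list of booleans in test order (True = word
    pronounced correctly). Testing stops after the first complete
    group in which every word was failed.
    """
    correct = 0
    for start in range(0, len(responses), group_size):
        group = responses[start:start + group_size]
        correct += sum(group)
        if len(group) == group_size and not any(group):
            break  # 10 failures within one group: discontinue testing
    return correct


# Hypothetical pupil: reads the first 35 of the 110 words, then fails the rest.
responses = [True] * 35 + [False] * 75
words_correct = burt_words_correct(responses)
# Assumption: error score = words in the test minus words credited.
error_score = len(responses) - words_correct
```

Words that follow a fully failed group are never credited, because testing has already stopped at that point.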


All testing was carried out by the same investigator, a science graduate. Standardized test administration procedures were followed. The VCAT and NVCAT were completed in a classroom setting. The other tests were administered on an individual basis, in a randomized order.


The data were analysed in two different ways. First, the scores from the whole year group were examined, to ascertain the pattern of performance on both established and new tests. Second, a subset of CAT-discrepant pupils (d-pupils) was identified. D-pupils were selected on the basis of their NVCAT score being at least 15 points (about 1.5 SD) above their VCAT score. In order to investigate the cognitive characteristics of the d-pupils, two matched comparison groups were selected from the CAT non-discrepant pupils (nd-pupils). For the first comparison group, the d-pupils were matched individually with nd-pupils according to their VCAT scores. This was designated the 'v-matched group'. For the second comparison group, the d-pupils were matched individually with nd-pupils according to their NVCAT scores. This was designated the 'nv-matched group'. Forty pupils were designated as CAT-discrepant; however, suitable matches for two pupils were not available from the sample and these were discarded from the analysis at this stage. In this way, three groups were created with 38 different pupils in each group.
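The selection and matching steps above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the text does not say how close a score had to be to count as a match, so the `tolerance` parameter, the greedy nearest-score strategy, the field names and the example pupils are all assumptions.

```python
def select_discrepant(pupils, threshold=15):
    """Return pupils whose NVCAT exceeds their VCAT by at least `threshold`."""
    return [p for p in pupils if p["nvcat"] - p["vcat"] >= threshold]


def match_individually(targets, pool, score, tolerance=2):
    """Pair each target with the nearest unused pool member on `score`.

    Greedy matching without replacement; a target with no candidate
    within `tolerance` points is returned as unmatched.
    """
    available = list(pool)
    pairs, unmatched = [], []
    for target in targets:
        best = min(available,
                   key=lambda c: abs(c[score] - target[score]),
                   default=None)
        if best is not None and abs(best[score] - target[score]) <= tolerance:
            pairs.append((target, best))
            available.remove(best)  # each nd-pupil is used at most once
        else:
            unmatched.append(target)
    return pairs, unmatched


# Hypothetical standardized scores, for illustration only.
pupils = [
    {"id": 1, "vcat": 80, "nvcat": 100},
    {"id": 2, "vcat": 81, "nvcat": 84},
    {"id": 3, "vcat": 99, "nvcat": 101},
    {"id": 4, "vcat": 72, "nvcat": 110},
    {"id": 5, "vcat": 95, "nvcat": 96},
]
d_pupils = select_discrepant(pupils)
nd_pupils = [p for p in pupils if p not in d_pupils]
v_pairs, v_unmatched = match_individually(d_pupils, nd_pupils, "vcat")
nv_pairs, nv_unmatched = match_individually(d_pupils, nd_pupils, "nvcat")
```

In this toy data pupil 4 has no close partner on either score, which mirrors the two d-pupils who had to be dropped for want of a suitable match.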


Results

Score distributions of year group

The year group obtained a mean error score on the Burt test of 23.12 (SD 16.85). There were no gender differences on it (ANOVA: F = 0.969, p > .3). They obtained a mean score of 93.86 (SD 13.13) on the VCAT and 101.30 (SD 13.58) on the NVCAT. There were no gender differences on either section of the CAT (ANOVAs: VCAT F = 2.166, p > .1; NVCAT F = 0.319, p > .5). The distributions of the VCAT and NVCAT scores are given in Fig. 2, with the national control sample score distributions (Thorndike & Hagen, 1986) given for comparison. There is a tendency for the year group to be slightly less competent on the VCAT and slightly more competent on the NVCAT than the national control sample. In fact, this tendency was more pronounced than the figure suggests. Twelve pupils scored at a level below their final recorded score on the VCAT, but the CAT scoring protocol imposes a minimum score of 70. Thus their VCAT scores were artificially raised by this test artifact. The strength of the NVCAT performance, in comparison to the VCAT performance, by the year group is further illustrated in Fig. 3. From this scattergram, it is clear that a proportion of pupils were scoring at, or very close to, the minimum on the VCAT. However, only one pupil was close to the minimum on the NVCAT. Interestingly, many pupils scoring at a low level on the VCAT nevertheless achieved a competent score on the NVCAT. It is worth noting that, whilst four of these pupils obtained NVCAT scores which were also very weak, the remaining eight obtained NVCAT scores that were competent (NVCAT range 79-107).

There were no gender differences in the VESPAR scores (ANOVAs: verbal F = 0.165, p > .6; spatial F = 0.89, p > .3). The means and standard deviations obtained on the six sections of the VESPAR by the year group are given in Table 1, along with the same statistics for the youngest group of the adult standardization (18-40 years, Langdon & Warrington, 1995) for comparison. The distributions of the year group raw scores in all but the spatial series section of the VESPAR approximate normal distributions [ILLUSTRATION FOR FIGURE 4 OMITTED]. The intercorrelations among the test scores were computed and all were found to be highly significant (see Table 2).

Table 1. A comparison of the VESPAR scores of year 10 school
students (14-15 years; N = 170) from the current sample with the young
group (18-40 years; N = 64) from the adult standardization sample

VESPAR section         School pupils       Young adults(a)

Verbal odd one           13.0 (3.3)          17.5 (2.7)
Verbal analogy           14.5 (3.8)          18.9 (2.9)
Verbal series            14.9 (3.9)          18.0 (3.1)
Spatial odd one          15.7 (2.3)          17.1 (2.3)
Spatial analogy          17.3 (3.6)          19.7 (3.2)
Spatial series           21.3 (3.0)          21.9 (1.9)

a Langdon & Warrington, 1995; reprinted by permission of Erlbaum
(UK), Taylor & Francis, Hove, UK.

Note. The values are means (SD).

Individual discrepancy scores between VCAT and NVCAT, and between the verbal and spatial sections of the VESPAR, were calculated and a scattergram of the relationship between them plotted [ILLUSTRATION FOR FIGURE 5 OMITTED]. Both CAT and VESPAR discrepancies are largely within the positive range, that is, the spatial scores for each individual were generally higher than their verbal scores. However, despite the similar trend, the discrepancy scores of the two tests were not highly correlated, though the correlation was significant (r = .2588, p < .01).
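As a rough illustration of the computation just described, the sketch below forms a discrepancy score per pupil on each test (spatial or non-verbal score minus verbal score) and correlates the two sets. The field names and example values are hypothetical, and taking the VESPAR discrepancy from summed section scores is an assumption.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical pupils; the real data set is not reproduced here.
pupils = [
    {"vcat": 85, "nvcat": 102, "vespar_verbal": 40, "vespar_spatial": 55},
    {"vcat": 98, "nvcat": 101, "vespar_verbal": 47, "vespar_spatial": 52},
    {"vcat": 110, "nvcat": 104, "vespar_verbal": 58, "vespar_spatial": 57},
    {"vcat": 76, "nvcat": 95, "vespar_verbal": 35, "vespar_spatial": 49},
]

# Positive values mean the spatial / non-verbal score is the higher one.
cat_discrepancy = [p["nvcat"] - p["vcat"] for p in pupils]
vespar_discrepancy = [p["vespar_spatial"] - p["vespar_verbal"] for p in pupils]
r = pearson_r(cat_discrepancy, vespar_discrepancy)
```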

Variables predicting VCAT scores

It has been shown in Table 2 that the Burt test and both sections of the VESPAR were highly correlated with the VCAT score [TABULAR DATA FOR TABLE 2 OMITTED]. To determine the relative importance of the verbal and spatial sections of the VESPAR and the Burt test in determining the VCAT score, a three-step fixed-order multiple regression was performed on the whole sample, with the VCAT scores as the dependent variable. The verbal VESPAR contributed significantly to the variance in VCAT ([Mathematical Expression Omitted] of .54; p < .01). The spatial VESPAR made no extra contribution. The Burt reading score contributed significantly (additional [Mathematical Expression Omitted] of .18; p < .01).
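The logic of a fixed-order regression, in which each predictor is credited only with the variance it explains beyond the predictors entered before it, can be sketched as below. This is a hedged illustration and not the authors' analysis: the function names and the toy data are hypothetical. It relies on the standard result that the increment in explained variance at each step equals the squared correlation between the outcome and the part of the new predictor that is orthogonal to those already entered.

```python
def _centre(values):
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def fixed_order_increments(steps, outcome):
    """Return (name, increment in explained variance) for each step.

    `steps` is an ordered list of (name, scores) pairs; earlier entries
    are entered first. Each predictor is centred and orthogonalised
    against the predictors already entered (Gram-Schmidt).
    """
    y = _centre(outcome)
    total = _dot(y, y)
    basis, increments = [], []
    for name, scores in steps:
        x = _centre(scores)
        for q in basis:
            weight = _dot(x, q) / _dot(q, q)
            x = [xi - weight * qi for xi, qi in zip(x, q)]
        norm = _dot(x, x)
        if norm < 1e-12:  # predictor adds nothing new (collinear)
            increments.append((name, 0.0))
            continue
        increments.append((name, _dot(x, y) ** 2 / (norm * total)))
        basis.append(x)
    return increments


# Toy data in which the outcome is exactly first + second predictor.
first = [1, 2, 3, 4]
second = [1, -1, 1, -1]
outcome = [a + b for a, b in zip(first, second)]
result = fixed_order_increments([("first", first), ("second", second)], outcome)
```

Because the order of entry is fixed in advance, a predictor entered late (here, reading entered after both VESPAR sections) is only credited with variance that the earlier predictors could not account for.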

There were no significant gender differences on any test. Overall, the pattern of performance of the year group on both established and new tests was in reasonably good accord with previous findings. This indicated that a comparative analysis of matched groups derived from the year group would be valid.

Comparison of the three constructed groups

The means and standard deviations of the CAT and Burt scores obtained by the three comparison groups are given in Table 3 [TABULAR DATA FOR TABLE 3 OMITTED] [TABULAR DATA FOR TABLE 4 OMITTED]. The relationship between the groups' VCAT and NVCAT mean scores had been determined by the matching process (Table 3): the d- and v-matched groups scored similarly on VCAT and were both significantly worse than the nv-matched group. The scores on the Burt reading test were similar in the d- and v-matched groups and both were significantly worse than the nv-matched group.

The scores of the three constructed groups on the VESPAR are compared in Table 4 and Fig. 6. With the verbal VESPAR, the v-matched group scored significantly lower than the nv-matched group on each section (odd one, analogy, series). The d-group was also significantly lower than the nv-matched group on odd one and analogy, but was significantly better than the v-matched group on verbal odd one and verbal series. Overall, when the verbal subtest scores were summed, the total verbal score for the d-group was significantly better than for the v-matched group (though still significantly below nv-matched). With the spatial VESPAR, the d-group did not differ significantly from the nv-matched group on any of the subtests. The v-matched group was significantly worse than the nv-matched and the d-group on spatial analogy and spatial series, but not different on spatial odd one. Overall, therefore, the summed spatial scores were the same for the d-group and the nv-matched group and significantly lower for the v-matched group.

Differences between verbal and spatial scores

The discrepancy between verbal and non-verbal CAT scores was large in the d-group because that was the criterion for their selection. It was of theoretical interest to compare the difference scores between the spatial and verbal VESPAR scores of the pupils in the three matched groups. Table 5 shows that the d-group and v-matched group VESPAR discrepancies are not statistically significantly different, whilst both are more discrepant on the VESPAR than the nv-matched group.

Discussion



A year group of 170 students, with a mean age of 14.58 years, completed a cognitive battery comprising the VESPAR, the VCAT, the NVCAT and the Burt reading test. The VESPAR had previously only been standardized with adults (Langdon & Warrington, 1995). The score distributions of the student sample approximated normal curves on five sections of the VESPAR (the exception being the spatial series, described previously). The extent and location of the score distributions of at least five sections demonstrated that the VESPAR was able to capture and grade the full range of ability in this sample. This contrasts with the VCAT, where a 'floor' effect was in evidence. The statistical characteristics of the VESPAR suggest that it is an appropriate measure of reasoning for the mainstream education population in this age group. There was no significant gender difference on any test.

Whereas the year group was slightly weak on VCAT, compared to the national sample, they were rather strong on NVCAT. This pattern argues against any of our findings being attributable to a generalized low level of ability in our sample. Interestingly, the VESPAR scores were lower than those obtained for young adults, especially on the verbal section, which suggests that the development of abstract reasoning in the spatial domain may occur at an earlier age than verbal reasoning. Alternatively, it may simply be a further reflection of the weaker verbal skills in this experimental group. The weaker verbal scores on both the CAT and the VESPAR were further explored by the calculation and analysis of discrepancy scores. The discrepancies on the VESPAR were greater than those recorded in young adults. Because these discrepancies were in the same direction for both the VESPAR (which focuses on fluid intelligence) and the CAT (which requires both crystallized and fluid intelligence), it seems likely that at least some of the weakness in verbal functions is a primary property of the population and not directly attributable to learning or experience.

The impact of learned knowledge and procedures on VCAT scores was explored via a multiple regression. A clear contribution of literacy skills to VCAT scores was demonstrated. This is in line with a previous report by Mellanby et al. (1996). They showed by the technique of fixed-order multiple regression that, even when the variance attributable to non-verbal ability has been removed, a significant contribution to the variance in VCAT was made by vocabulary, oral analysis of syntax and reading. The mixed nature of the skills required by VCAT makes it difficult to evaluate VCAT scores and their relation to reading. In contrast, the verbal section of the VESPAR uses simple vocabulary in novel problems and is therefore less dependent on learned knowledge. Therefore, some tentative conclusions can be drawn from different performances on the CAT and VESPAR, in terms of the differential between ability and achievement.

The comparison of the d-group and nv-matched group showed, as would be expected, that the nv-matched group scored significantly higher on VCAT and verbal VESPAR; they were also better readers than the d-group. In contrast, the spatial VESPAR scores of the d-group and the nv-matched group were not significantly different. The comparison of the d-group with the v-matched group is of particular interest, because it speaks directly to the theoretical questions about discrepant pupils and is not merely related to general low ability overall. The most pertinent finding is that the overall verbal VESPAR scores of the d-group were significantly higher than those of the v-matched group, although their reading levels did not differ.

The finding that the d-group had higher verbal VESPAR scores than the v-matched group suggests that not all of the reading deficits of the d-group can be explained by low verbal fluid intelligence, but rather that they are reading below the level that might be predicted by the verbal VESPAR. Clearly, the poor reading skills of the d-group relate in part to factors independent of verbal fluid intelligence. There may also be a secondary effect of relatively poor verbal fluid intelligence leading to problems in learning to read and a resulting reduced exposure to print (Stanovich, 1993). Speculatively, it may be that the identification of pupils who have VCAT scores markedly lower than their NVCAT, but whose verbal VESPAR is not grossly discrepant from their spatial VESPAR, may allow help with reading to be more effectively targeted to those with the potential to obtain especial benefit.

In conclusion, this study has shown the VESPAR to be an appropriate and effective measure of reasoning in this age group. Our results suggest that it reduces the confounding factors of conventional tests for assessing this group: reliance on learned knowledge, which may be affected by reading experience, and varying presentation and response formats between verbal and spatial scales. The finding that the VESPAR spatial-verbal discrepancy was similar in the d-group and the v-matched group supported the hypothesis that overall the CAT discrepancy was due to poorer acquisition of verbal skills. However, some of the CAT-discrepant pupils were also highly discrepant on the VESPAR. In this subset of CAT-discrepant pupils, poor primary verbal ability would be a contributing factor.

References


Batchelor, E. S., Kixmiller, J. S. & Dean, R. S. (1990). Neuropsychological aspects of reading and spelling performance in children with learning disabilities. Developmental Neuropsychology, 6, 183-192.

Belmont, L. & Birch, H. G. (1966). The intellectual profile of retarded readers. Perceptual and Motor Skills, 22, 787-816.

Burt, C. (1921). Mental and Scholastic Tests. London: King.

Cattell, R. B. (1963). Theory of fluid and crystallized intelligence: A critical experiment. Journal of Educational Psychology, 54, 1-22.

Cattell, R. B. (1971). Abilities: Their Structure, Growth and Function. Boston: Houghton-Mifflin.

Coughlan, A. K. & Hollows, S. E. (1984). Use of memory tests in differentiating organic disorder from depression. British Journal of Psychiatry, 145, 164-167.

Elliott, C. D. (1983). The British Ability Scales Manual 2 Technical Handbook. Windsor, UK: NFER-NELSON.

Fletcher, J. M., Shaywitz, S. E., Shankweiler, D. P., Katz, L., Liberman, I. Y., Stuebing, K. K., Francis, D. J., Fowler, A. E. & Shaywitz, B. A. (1994). Cognitive profiles of reading disability: Comparisons of discrepancy and low achievement definitions. Journal of Educational Psychology, 86, 6-23.

Langdon, D. W. & Warrington, E. K. (1995). The VESPAR: A Verbal and Spatial Reasoning Test. Hove, UK: Erlbaum.

Mellanby, J. H., Anderson, R., Campbell, B. & Westwood, E. (1996). Cognitive determinants of verbal underachievement at secondary school level. British Journal of Educational Psychology, 66, 483-500.

Rutter, M. & Yule, W. (1975). The concept of specific reading retardation. Journal of Child Psychology and Psychiatry, 16, 181-197.

Schmidt, F. L. & Crano, W. D. (1974). A test of the theory of fluid and crystallized intelligence in middle and low socio-economic status children. Journal of Educational Psychology, 66, 255-261.

Shaywitz, B. A., Fletcher, J. M., Holahan, J. M. & Shaywitz, S. E. (1992). Discrepancy compared to low achievement definitions of reading disability: Results from the Connecticut longitudinal study. Journal of Learning Disabilities, 25, 639-648.

Siegel, L. (1988). Evidence that IQ scores are irrelevant to the definition and analysis of reading disability. Canadian Journal of Psychology, 42, 201-215.

Stanovich, K. E. (1993). Does reading make you smarter? Literacy and the development of verbal intelligence. Advances in Child Behaviour and Development, 24, 133-180.

Stanovich, K. E. & Siegel, L. S. (1994). Phenotypic performance profile of children with reading disabilities: A regression-based test of the phonological-core variable-difference model. Journal of Educational Psychology, 86, 24-53.

Thorndike, R. L. & Hagen, E. (1986). Cognitive Abilities Test. Windsor, UK: NFER-NELSON.

Thorndike, R. L. & Lorge, I. (1936). The Teacher's Word Book of 30,000 Words. New York: Columbia University.

Warrington, E. K. (1967). The incidence of verbal disability associated with backward reading. Neuropsychologia, 5, 1-5.

Wechsler, D. (1949). Wechsler Intelligence Scale for Children. New York: Psychological Corporation.

Wechsler, D. (1974). Wechsler Intelligence Scale for Children - Revised. San Diego, CA: Psychological Corporation.
COPYRIGHT 1998 British Psychological Society

Article Details
Author:Langdon, D.W.; Rosenblatt, N.; Mellanby, J.H.
Publication:British Journal of Psychology
Date:May 1, 1998