Assessing adult learner social role performance.
This study describes the development and validation of a performance rating scale and an interview protocol designed to assess individual performance in the adult learner social role. The genesis for this work is found in the 1950s social roles research of Robert J. Havighurst and the later research of Abney (1992/1993) and Kirkman (1994). The scale and interview protocol were developed using five specifically structured review panels. The first two panels were involved in the construction of the scale and protocol. The third panel examined the scale for language clarity and completeness of description. The fourth panel reviewed the relationship between the proposed protocol questions and the performance rating scale. The fifth and final panel qualitatively reviewed both the scale and protocol for overall clarity, functionality, and usability. Following field testing, data were examined using confirmatory factor analysis to assess content validity. The major finding of this study was that the scale and protocol serve the purpose for which they were designed.
Robert J. Havighurst's research, The Kansas City Study of Adult Life, as reported in Havighurst and Orr (1956), although dated, forms an important basis for exploring developmental tasks within an adult's social roles. A social role consists of a "pattern of behaviors and attitudes related to a specific function or position as defined and expected by society" (Abney, 1992/1993, p. 48). Social roles and the foundations upon which they were developed have undergone considerable change since Havighurst's exploration in the 1950s. Societal change (e.g., changes in family grouping, occupational opportunity, and knowledge base requirements) has contributed to the question of relevancy of the Havighurst studies in a contemporary setting. The purpose of this study was to expand upon the 1950s efforts of Robert J. Havighurst and others by creating and validating a performance rating scale and an interview protocol to investigate the adult learner social role.
Abney (1992/1993) undertook an exploratory study to revise and update Havighurst's social roles and developmental events. His study provides a basis for the continuing validation of social roles in a contemporary setting. Based upon review of literature and a pilot panel review, he initially identified 14 social roles and 104 developmental events. Using two separate nationwide panels, the data were refined to 13 social roles and 94 developmental events. A stratified quota sample (N=180) was classified into 18 categories based on age, gender, and socio-economic status level (SES). In his final data analysis, Abney compared his social role rank order findings with those of Havighurst's Kansas City Study and found substantial correlations, r(179)=.53 for age; r(179)=.65 for gender, r(179)=.61 for SES; all ps<.05, with Havighurst's original work. Results of Abney's work provide a conceptual basis to continue the examination of adult social roles. Although not part of Havighurst's investigation, nor part of Abney's 13 social roles, the social role of adult learner was recommended by the review panels within his study and included as a recommendation for further research in his final work.
Kirkman (1994) updated Havighurst's instrument and scales in order to examine the parent, spouse, and worker social role performance in accordance with currently described adult developmental events. Scales and interview schedules were developed for entry, intermediate, and advanced phases of each of the social roles. Two panels were used in the construction of the performance rating scales. A third panel examined the proposed items for language clarity and completeness. Interview schedules were developed for each role and examined by role-specific panels for item relevance, completeness, and relationship of interview schedule item to scale representation. Following field testing, three quota samples of 90 were used for each of the three roles. Data were examined based on age, gender, and socio-economic status level (SES). Kirkman found that among her sample, older workers scored at higher levels within the Worker social role than the middle or younger age levels. Upper-middle SES respondents scored higher than the lower-middle respondents within the Worker role. Within the Parent role, no interactions were found to exist between age, gender, and SES. Spouse/partner (retitled from Havighurst's Spouse role) data analysis revealed no identifiable patterns based on age, gender, or SES levels.
Based upon Abney's (1992/1993) exploratory research and Kirkman's (1994) process development, the adult learner performance rating scale development for this study began with a review of existing research and literature and of the recommendations of adult educators and researchers to determine the factors appropriate to this study. As a result of this effort, four factors (Perception, Involvement, Application and Use, and Learning Skills) from the domain of the adult learner were identified.
Validity and Reliability
Estimates of construct validity were derived by a series of review panels. Instrument reliability was examined during the test-retest phase of instrument development. The performance rating scale was developed using a series of five independent panels for review and validation. Panel members consisted of national representatives from the fields of adult education, measurement and statistics, psychology, English language, adult education program administrators, education leadership, private sector trainers and evaluators, and representatives of the population of interest. The sequence followed for the development of both the scale and interview protocol was as follows: review of literature, discussions with adult educators, researcher prepared draft, pilot panel review, validation panel, verification panel, confirmation panel, and affirmation panel.
Performance Rating Scale Pilot Panel
Pilot Panel review was accomplished using a Q-sort technique. When utilizing this technique, "An individual is given a set of items or statements, usually on cards, and asked to place them into specified categories so that each category contains some minimum of cards" (Gay, 1980, p. 121). Panel members were asked to place descriptive statements with the appropriate factor headings of Perception; Involvement; Application and Use; and Learning Skill Level. As a second step, each factor descriptor was rank ordered from low to high within the factor heading. Four of the six pilot panel members sorted the cards with 100% agreement in both descriptor placement and descriptor sequencing within factor categories. One panel member placed a single descriptor within an inappropriate factor heading, and the same panel member reversed the order of two factor descriptors within a factor heading. These variances were considered simple error, as no identifiable pattern was established.
Performance Rating Scale Validation Panel
The Q-sort technique and instructions for the validation panel replicated those for the Pilot Panel. Five panel members reported factor descriptor placement outside the desired factor category. A pattern was noted in panel response within the factor Application and Use. On a scale from 1 to 5, with 1 being low and 5 being high, panel members were asked to array the descriptors in ascending order. Two panel members reversed descriptors one and two, indicating that these two descriptors were only slightly differentiated. Following overall review, the scale was used without change; however, the perceived closeness of the descriptors was noted and incorporated into interviewer/scorer training.
Performance Rating Scale Verification Panel
The verification panel was asked to accomplish two separate tasks, in sequence. The first task involved the sequencing of factor descriptors. Upon being provided individual envelopes containing the factor title and definition, each panel member was asked to sequence the accompanying five factor descriptors from low to high. This final card sort was accomplished without error. The second task involved rating each factor descriptor for language clarity and completeness. Factor descriptors were rated using a Likert scale ranging from 1 (unclear) to 6 (very clear) for language clarity and a similar scale for completeness. Responses assessing language clarity ranged from 4.33 to 5.88 (M=5.25, SD=.417). Responses assessing descriptor completeness ranged from 4.44 to 5.88 (M=5.30, SD=.314).
Interview Protocol Pilot and Validation Panels
The interview protocol pilot panel was provided the four factors, each of which was accompanied by a set of five factor descriptors as developed in the scaling process. The panel members were asked to provide open-ended questions that could be used to rate interviewees within each cluster group. The validation panel completed a similar task. Panel responses were compiled as a question bank that was reviewed to eliminate duplicate questions. After review, 41 possible interview protocol questions remained. These potential questions were further refined, resulting in 18 primary questions and eight sub-questions, which were compiled for use as a draft protocol instrument.
Interview Protocol Verification Panel
The verification panel was asked to evaluate proposed protocol questions for language clarity and completeness. Proposed questions were rated using a Likert scale ranging from 1 (unclear) to 6 (very clear) for language clarity and a similar scale for completeness. Responses assessing language clarity ranged from 4.33 to 5.78 (M=5.24, SD=.380). Responses assessing descriptor completeness ranged from 4.44 to 5.78 (M=5.32, SD=.335).
The Confirmation/Affirmation Panels were created specifically to review the entire scale and protocol instruments, rather than the individual factors and accompanying descriptors. The first, a 15-member Confirmation Panel, consisted of research team members, advanced graduate students, and non-academic individuals from the population under study. Non-academic panel members were included as a means to represent the population of interest in the scale/protocol development process. Each was provided a set of factors with accompanying descriptors and a set of unnumbered, randomized potential instrument questions. They were asked to match the proposed questions with the appropriate set of factor descriptors. This review (N=260) resulted in the following by-factor percentage agreement within groups: Perception (99.90%); Involvement (99.82%); Application and Use (99.97%); and Learning Skill (99.88%). The Confirmation Panel's investigation of the relationship of interview protocol questions to the performance rating scale was considered to be an important step in determining if the instrument would measure what it was intended to measure.
The final panel review, prior to field testing, was conducted by the Affirmation Panel. This panel consisted of five practicing representatives from the field of Measurement and Research. Each was provided a proposed Performance Rating Scale and Interview Protocol. The panel members were asked to qualitatively assess each in terms of form, usability, bias, and ease of use. The researcher met individually with each panel member to gather feedback and recommended actions. The Performance Rating Scale and Interview Protocol were submitted for field testing with minor editorial changes.
Field testing was conducted using a 20% quota sample of 30 interviewees representing the population under study. Care was taken to ensure that the 30 participants represented the age, gender, and SES levels in accordance with the study design. The interviews from the 30 participants were rated and scored by five advanced graduate students comprising the research team. Inter-rater reliability was examined using a Pearson Product Moment Correlation to determine strength of relationships corrected for chance. Correlations ranged from .88 to .93, indicating a high level of inter-rater correlation. Intra-rater agreement was examined by comparing the individual rater's scores for the same interview repeated following a two-week delay. Overall scores were examined using the Pearson Product Moment Correlation. Correlations, by factor, ranged from .88 to .92.
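The reliability checks described above rest on the Pearson product-moment correlation between two sets of scores for the same interviews. The following is a minimal sketch of that calculation, not the research team's actual procedure; the function name `pearson_r` and the two score lists are hypothetical and serve only as illustration.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one list has no variance")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical overall scores from two raters for the same six interviews
# (illustrative values only, not data from the study).
rater_a = [12, 15, 9, 18, 14, 11]
rater_b = [13, 14, 10, 17, 15, 10]

r = pearson_r(rater_a, rater_b)
print(f"inter-rater r = {r:.2f}")
```

The same function would apply to intra-rater agreement or test-retest stability by passing a rater's first and second scorings of the same interviews.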
Test-Retest reliability estimates were employed as an indication of instrument stability. Fifteen individual participants were identified and five were interviewed and scored by the same male interviewer/scorer. A second set of five participants was identified and interviewed and scored by the same female interviewer/scorer. A third set of five was first interviewed and scored by a male interviewer/scorer with the second interview accomplished by a female interviewer/scorer. The same 15 participants were interviewed two weeks later replicating the interview procedure established for the first interview. A Pearson Product Moment Correlation technique was used to examine the overall scores. A test-retest correlation of .86 indicated sufficiently high instrument stability to allow its use for further data gathering purposes.
A formal written training guide was developed to standardize the training process. Subject areas included: the use of the demographic information sheet, consent form, the scale and protocol, including the use of prompts, probes, and the tape recorder. The training also included, but was not necessarily limited to, the interview setting, interviewee comfort levels, and interviewer neutrality. Practice interview and scoring sessions were extensively critiqued.
A quota sample consisting of 150 face-to-face interviews representative of the 1990 census data for the Greater Tampa Bay Area was used for this study. The U.S. Bureau of Census (1991) adult population of the Tampa and St Petersburg-Clearwater (Greater Tampa Bay) area consisted of 10% African-American, 7% Hispanic, 1% Asian, and .5% Native American Indian. To accomplish the proportionate stratification necessary for ethnicity in this study, the sample included 15 African-American subjects, 11 Hispanic, two Asian, and one Native American Indian. The quota sample was established based upon two genders, three age levels, and five SES levels. This resulted in 30 cells, with five interviews each. This allowed for representative distribution of research participants based on age, gender, and SES level with race/ethnicity dispersed throughout each cell.
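The cell and ethnicity counts above follow from simple arithmetic, sketched below. The census shares and design levels come from the text; rounding half up to the nearest whole person is an assumption made here, since the rounding rule is not stated. Shares are held as parts per thousand so the arithmetic stays in exact integers.

```python
# Design levels as reported in the text.
genders, age_levels, ses_levels = 2, 3, 5
interviews_per_cell = 5

cells = genders * age_levels * ses_levels   # number of quota cells
total = cells * interviews_per_cell         # total interviews

# Census shares from the text, in parts per thousand (10% -> 100, 0.5% -> 5).
census_share_permille = {
    "African-American": 100,
    "Hispanic": 70,
    "Asian": 10,
    "Native American Indian": 5,
}

# Assumed rule: round half up to a whole number of participants.
targets = {
    group: (share * total + 500) // 1000
    for group, share in census_share_permille.items()
}

print(cells, total)
for group, count in targets.items():
    print(f"{group}: {count}")
```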
Researchers have often used classical statistical methods, such as exploratory factor analysis (Kerlinger, 1986), to assess construct validity. Confirmatory factor analysis is a powerful method for investigating the construct validity of a measure (Schmitt & Stults, 1986). Confirmatory factor analysis provides an indication of overall fit and precise criteria for assessing convergent and discriminant validity. Embedded within the purpose of the study is the need to assess the construct validity of the four subscales of the Adult Learner Social Role Performance Interview Protocol.
Covariance matrices were analyzed using confirmatory factor analysis. The data were examined using maximum likelihood factor analysis in AMOS version 4.0 (Arbuckle, 1999). This technique assesses the degree to which an expected or hypothesized factor model can effectively reproduce the observed or sample item covariances. Confirmatory factor analysis, in contrast to exploratory factor analysis, begins with an a priori hypothesized model and deductively ascertains its feasibility, offering more definitive empirical evidence of the underlying factor structure of a scale. The model was evaluated three ways. First, departure of the data from the specified model was tested for significance by using a chi-square test (Joreskog & Sorbom, 1989). Joreskog and Sorbom (1989) and Bentler and Bonett (1980) advise against the sole use of the chi-square value in judging the overall fit of the model because of the sensitivity of the chi-square to sample size. Second, goodness-of-fit between the data and the specified model was estimated by employing the Comparative Fit Index (CFI) (Bentler, 1990) and the Tucker-Lewis Index (TLI) (Bentler & Bonett, 1980). Although numerous goodness-of-fit indices have been developed, the CFI was reported because it is less likely to be subject to bias when smaller samples are used (Bagozzi & Heatherton, 1994). Marsh, Balla, and McDonald (1988) have suggested the following criteria to judge fit indices for these purposes: good fit = CFI > .90; adequate but marginal fit = .80 to .89; poor fit = .60 to .79; very poor fit < .60. Third, the hypothesized loadings were examined for statistical significance at the p < .05 level. Although the chi-square for the one-factor model was significant, χ²(2) = 6.850, p = .033, the CFI and the TLI yielded acceptably high goodness-of-fit indices (.994 and .983, respectively). In addition, all subscales loaded significantly (p < .05) on the single-factor model, with loadings ranging from .938 to .960.
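The reported chi-square result can be checked by hand, and the CFI can be written out explicitly. The sketch below is illustrative only and is not the AMOS computation. It uses the fact that, for 2 degrees of freedom, the chi-square upper-tail probability is exp(-x/2), and it states the CFI in its usual form based on model and baseline chi-square values. The baseline (independence model) chi-square is not reported in the text, so the baseline values passed to `cfi` here are hypothetical placeholders.

```python
import math


def chi2_p_value_df2(chi2):
    """Upper-tail p-value of a chi-square statistic with exactly 2 df."""
    return math.exp(-chi2 / 2.0)


def cfi(chi2_model, df_model, chi2_base, df_base):
    """Comparative Fit Index from model and baseline chi-square values."""
    d_model = max(chi2_model - df_model, 0.0)
    d_base = max(chi2_base - df_base, 0.0)
    denom = max(d_model, d_base)
    if denom == 0.0:
        return 1.0  # neither model shows misfit beyond its degrees of freedom
    return 1.0 - d_model / denom


# Reported model chi-square: 6.850 on 2 degrees of freedom.
p_value = chi2_p_value_df2(6.850)
print(f"p = {p_value:.3f}")

# Hypothetical baseline chi-square and df, for illustration only.
example_cfi = cfi(6.850, 2, chi2_base=500.0, df_base=6)
print(f"CFI with hypothetical baseline = {example_cfi:.3f}")
```

Rounded to three decimals, the computed p-value agrees with the reported p = .033.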
The performance rating scale and interview protocol created for this study served the purpose for which they were designed. The continued development and use of performance rating scales and interview protocols similar to the ones used in this study will provide additional tools with which to empirically assess the adult learner within society. Increasing the understanding of the context of learning, that is, the setting in which it occurs, strengthens the concept that education is no longer mere preparation for adulthood, but rather a lifelong activity necessary for the well-being of society. The creation of instruments, the scale and protocol, directly contributes to the ability to identify and assess the role of the adult learner. The ability to assess the societal roles that are both expected and accepted within society provides insight concerning the training and education needed to fulfill those roles.
Abney, H. M. (1992/1993). The development and initial validation of a domain of adult social roles and developmental events within Havighurst's model. (Doctoral dissertation, University of South Florida, 1992). Dissertation Abstracts International, 53(07A), 2202.
Arbuckle, J. L. (1999). Amos 4.0 User's Guide. Chicago: SmallWaters Corporation.
Bagozzi, R. P., & Heatherton, T. F. (1994). A general approach to representing multifaceted personality constructs: Application to state self-esteem. Structural Equation Modeling, 1, 25-67.
Bentler, P.M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238-246.
Bentler, P. M., & Bonett, D. G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606.
Gay, L. R. (1980). Educational evaluation and measurement. Columbus, OH
Havighurst, R., & Orr, B. (1956). Adult education and adult needs. Chicago: Center for the Study of Liberal Education for Adults.
Joreskog, K. J., & Sorbom, D. (1989). LISREL 7: A guide to the program and applications (2nd ed.). Chicago: SPSS.
Kerlinger, F.N. (1986). Foundations of behavioral research (3rd ed.). New York: Holt, Rinehart and Winston.
Kirkman, M. S. (1994). The development and content validation of contemporary parent, spouse/partner, and worker social role performance rating scales and assessment instruments. (Doctoral dissertation, University of South Florida, 1994), Dissertation Abstracts International, 55(07A), 1793.
Marsh, H. W., Balla, J.R., & McDonald, R. P. (1988). Goodness-of-fit indexes in confirmatory factor analysis: The effects of sample size. Psychological Bulletin, 103, 391-410.
Schmitt, N., & Stults, D. N. (1986). Methodology review: Analysis of multitrait-multimethod matrices. Applied Psychological Measurement, 10, 1-22.
U.S. Bureau of the Census. (1991). Population of metropolitan areas by race and Hispanic origin: 1990. Washington, DC: Department of Commerce.
Witte is an Assistant Professor of Adult Education in the Educational Foundations, Leadership, and Technology Department. Guarino is an Assistant Professor of Measurement and Research in the Educational Foundations, Leadership, and Technology Department. James is a Professor of Adult Education in the Leadership Development Department.
Author: James, Waynne B.
Publication: Academic Exchange Quarterly
Date: Dec 22, 2001