Assessing adult learner social role performance.
Abstract
This study describes the development and validation of a performance rating scale and an interview protocol designed to assess individual performance in the adult learner social role. The genesis for this work is found in the 1950s social roles research of Robert J. Havighurst and the later research of Abney (1992/1993) and Kirkman (1994). The scale and interview protocol were developed using five specifically structured review panels. The first two panels were involved in the construction of the scales and protocol. The third panel examined the scale for language clarity and completeness of description. The fourth panel reviewed the relationship of the proposed protocol questions and the performance rating scale. The fifth and final panel qualitatively reviewed both the scale and protocol for overall clarity, functionality, and usability. Following field testing, data were examined using confirmatory factor analysis to assess content validity. The major finding of this study was that the scale and protocol serve the purpose for which they were designed.
Robert J. Havighurst's research, The Kansas City Study of Adult Life, as reported in Havighurst and Orr (1956), although dated, forms an important basis for exploring developmental tasks within an adult's social roles. A social role consists of a "pattern of behaviors and attitudes related to a specific function or position as defined and expected by society" (Abney, 1992/1993, p. 48). Social roles and the foundations upon which they were developed have undergone considerable change since Havighurst's exploration in the 1950s. Societal change (e.g., changes in family grouping, occupational opportunity, and knowledge base requirements) has contributed to the question of relevancy of the Havighurst studies in a contemporary setting. The purpose of this study was to expand upon the 1950s efforts of Robert J. Havighurst and others by creating and validating a performance rating scale and an interview protocol to investigate the adult learner social role.
Abney (1992/1993) undertook an exploratory study to revise and update Havighurst's social roles and developmental events. His study provides a basis for the continuing validation of social roles in a contemporary setting. Based upon review of literature and a pilot panel review, he initially identified 14 social roles and 104 developmental events. Using two separate nationwide panels, the data were refined to 13 social roles and 94 developmental events. A stratified quota sample (N=180) was classified into 18 categories based on age, gender, and socio-economic status level (SES). In his final data analysis, Abney compared his social role rank order findings with those of Havighurst's Kansas City Study and found substantial correlations with Havighurst's original work: r(179)=.53 for age, r(179)=.65 for gender, and r(179)=.61 for SES; all ps<.05. Results of Abney's work provide a conceptual basis to continue the examination of adult social roles. Although not part of Havighurst's investigation, nor part of Abney's 13 social roles, the social role of adult learner was recommended by the review panels within his study and included as a recommendation for further research in his final work.
Kirkman (1994) updated Havighurst's instrument and scales in order to examine the parent, spouse, and worker social role performance in accordance with currently described adult developmental events. Scales and interview schedules were developed for entry, intermediate, and advanced phases of each of the social roles. Two panels were used in the construction of the performance rating scales. A third panel examined the proposed items for language clarity and completeness. Interview schedules were developed for each role and examined by role-specific panels for item relevance, completeness, and relationship of interview schedule item to scale representation. Following field testing, three quota samples of 90 were used for each of the three roles. Data were examined based on age, gender, and socio-economic status level (SES). Kirkman found that among her sample, older workers scored at higher levels within the Worker social role than the middle or younger age levels. Upper-middle SES respondents scored higher than the lower-middle respondents within the Worker role. Within the parent role no interactions were found to exist between age, gender, and SES. Spouse/partner (retitled from Havighurst's Spouse role) data analysis revealed no identifiable patterns based on age, gender, or SES levels.
Based upon Abney's (1992/1993) exploratory research and Kirkman's (1994) process development, the adult learner performance rating scale development for this study began with a review of existing research and literature and recommendations of adult educators and researchers to determine the factors appropriate to this study. As a result of this effort, four factors (Perception, Involvement, Application and Use, and Learning Skills) from the domain of the adult learner were identified.
Validity and Reliability
Estimates of construct validity were derived by a series of review panels. Instrument reliability was examined during the test-retest phase of instrument development. The performance rating scale was developed using a series of five independent panels for review and validation. Panel members consisted of national representatives from the fields of adult education, measurement and statistics, psychology, English language, adult education program administrators, education leadership, private sector trainers and evaluators, and representatives of the population of interest. The sequence followed for the development of both the scale and interview protocol was as follows: review of literature, discussions with adult educators, researcher-prepared draft, pilot panel review, validation panel, verification panel, confirmation panel, and affirmation panel.
Performance Rating Scale Pilot Panel
Pilot Panel review was accomplished using a Q-sort technique. When utilizing this technique, "An individual is given a set of items or statements, usually on cards, and asked to place them into specified categories so that each category contains some minimum of cards" (Gay, 1980, p. 121). Panel members were asked to place descriptive statements with the appropriate factor headings of Perception; Involvement; Application and Use; and Learning Skill Level. As a second step, each factor descriptor was rank ordered from low to high within the factor heading. Four of the six pilot panel members completed the card sort with 100% agreement in both descriptor placement and descriptor sequencing within factor categories. One panel member placed a single descriptor within an inappropriate factor heading, and the same panel member reversed the order of two factor descriptors within a factor heading. These variances were considered simple error, as no identifiable pattern was established.
Performance Rating Scale Validation Panel
The Q-sort technique and instructions for the validation panel replicated those for the Pilot Panel. Five panel members reported factor descriptor placement outside the desired factor category. A pattern was noted in panel response within the factor Application and Use. On a scale from 1 to 5, with 1 being low and 5 being high, panel members were asked to array the descriptors in ascending order. Two panel members reversed descriptors one and two, indicating that the two descriptors were only slightly differentiated. Following overall review, the scale was used without change; however, the perceived closeness of the descriptors was noted and incorporated into interviewer/scorer training.
Performance Rating Scale Verification Panel
The verification panel was asked to accomplish two separate tasks, in sequence. The first task involved the sequencing of factor descriptors. Upon being provided individual envelopes containing the factor title and definition, each panel member was asked to sequence the accompanying five factor descriptors from low to high. This final card sort was accomplished without error. The second task involved rating each factor descriptor for language clarity and completeness. Factor descriptors were rated using a Likert scale ranging from 1 (unclear) to 6 (very clear) for language clarity and a similar scale for completeness. Responses assessing language clarity ranged from 4.33 to 5.88 (M=5.25, SD=.417). Responses assessing descriptor completeness ranged from 4.44 to 5.88 (M=5.30, SD=.314).
Interview Protocol Pilot and Validation Panels
The interview protocol pilot panel was provided the four factors, each of which was accompanied by a set of five factor descriptors as developed in the scaling process. The panel members were asked to provide open-ended questions that could be used to rate interviewees within each cluster group. The validation panel completed a similar task. Panel responses were compiled as a question bank that was reviewed to eliminate duplicate questions. After review, 41 possible interview protocol questions remained. These potential questions were further refined, resulting in 18 primary questions and eight sub-questions, which were compiled for use as a draft protocol instrument.
Interview Protocol Verification Panel
The verification panel was asked to evaluate proposed protocol questions for language clarity and completeness. Proposed questions were rated using a Likert scale ranging from 1 (unclear) to 6 (very clear) for language clarity and a similar scale for completeness. Responses assessing language clarity ranged from 4.33 to 5.78 (M=5.24, SD=.380). Responses assessing descriptor completeness ranged from 4.44 to 5.78 (M=5.32, SD=.335).
The Confirmation/Affirmation Panels were created specifically to review the entire scale and protocol instruments, rather than the individual factors and accompanying descriptors. The first, a 15-member Confirmation Panel, was formed consisting of research team members, advanced graduate students, and non-academic individuals from the population under study. Non-academic panel members were included as a means to represent the population of interest in the scale/protocol development process. Each was provided a set of factors with accompanying descriptors and a set of unnumbered, randomized potential instrument questions. They were asked to match the proposed questions with the appropriate set of factor descriptors. This review (N=260) resulted in the following by-factor percentage agreement within groups: Perception (99.90%); Involvement (99.82%); Application and Use (99.97%); and Learning Skill (99.88%). The Confirmation Panel's investigation of the relationship of interview protocol questions to the performance rating scale was considered to be an important step in determining if the instrument would measure what it was intended to measure.
The final panel review, prior to field testing, was conducted by the Affirmation Panel. This panel consisted of five practicing representatives from the field of Measurement and Research. Each was provided a proposed Performance Rating Scale and Interview Protocol. The panel members were asked to qualitatively assess each in terms of form, usability, bias, and ease of use. The researcher met individually with each panel member to gather feedback and recommended actions. The Performance Rating Scale and Interview Protocol were submitted for field testing with minor editorial changes.
Field testing was conducted using a 20% quota sample of 30 interviewees representing the population under study. Care was taken to ensure the 30 participants represented the age, gender, and SES levels in accordance with the study design. The interviews from the 30 participants were rated and scored by five advanced graduate students comprising the research team. Inter-rater reliability was examined using a Pearson Product Moment Correlation to determine strength of relationships corrected for chance. Correlations ranged from .88 to .93, indicating a high level of inter-rater correlation. Intra-rater agreement was assessed by examining the individual rater's scores for the same interview repeated following a two-week delay. Overall scores were examined using the Pearson Product Moment Correlation. Correlations, by factor, ranged from .88 to .92.
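As an illustration only, the Pearson Product Moment Correlation used for the inter-rater and intra-rater comparisons can be sketched as follows. The function name `pearson_r` and the two score lists are hypothetical and are not data from this study; the sketch simply shows how a correlation between two raters' scores for the same interviews might be computed.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Deviations of each score from its rater's mean.
    dev_x = [v - mean_x for v in x]
    dev_y = [v - mean_y for v in y]
    cross = sum(a * b for a, b in zip(dev_x, dev_y))
    ss_x = sum(a * a for a in dev_x)
    ss_y = sum(b * b for b in dev_y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when one rater's scores are constant")
    return cross / sqrt(ss_x * ss_y)


# Hypothetical 1-5 scale scores given by two raters to the same six interviews
# (illustrative values only, not taken from the study).
rater_a = [1, 2, 3, 4, 5, 3]
rater_b = [1, 2, 3, 5, 5, 3]
r_ab = pearson_r(rater_a, rater_b)
print(f"inter-rater r = {r_ab:.3f}")
```

The same function could be applied to one rater's scores from two occasions to obtain an intra-rater coefficient.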
Test-retest reliability estimates were employed as an indication of instrument stability. Fifteen individual participants were identified and five were interviewed and scored by the same male interviewer/scorer. A second set of five participants was identified and interviewed and scored by the same female interviewer/scorer. A third set of five was first interviewed and scored by a male interviewer/scorer with the second interview accomplished by a female interviewer/scorer. The same 15 participants were interviewed two weeks later, replicating the interview procedure established for the first interview. A Pearson Product Moment Correlation technique was used to examine the overall scores. A test-retest correlation of .86 indicated sufficiently high instrument stability to allow its use for further data gathering purposes.
A formal written training guide was developed to standardize the training process. Subject areas included: the use of the demographic information sheet, consent form, the scale and protocol, including the use of prompts, probes, and the tape recorder. The training also included, but was not necessarily limited to, the interview setting, interviewee comfort levels, and interviewer neutrality. Practice interview and scoring sessions were extensively critiqued.
A quota sample consisting of 150 face-to-face interviews representative of the 1990 census data for the Greater Tampa Bay Area was used for this study. According to the U.S. Bureau of the Census (1991), the adult population of the Tampa and St. Petersburg-Clearwater (Greater Tampa Bay) area consisted of 10% African-American, 7% Hispanic, 1% Asian, and .5% Native American Indian. To accomplish the proportionate stratification necessary for ethnicity in this study, the sample included 15 African-American subjects, 11 Hispanic, two Asian, and one Native American Indian. The quota sample was established based upon two genders, three age levels, and five SES levels. This resulted in 30 cells, with five interviews each. This allowed for representative distribution of research participants based on age, gender, and SES level, with race/ethnicity dispersed throughout each cell.
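The arithmetic behind the quota design can be checked with a short sketch. The cell counts and census percentages below come from the text; the variable names and the round-half-up rule are assumptions made for illustration, chosen because they reproduce the subject counts reported above.

```python
from itertools import product

# Design figures stated in the text: 2 genders x 3 age levels x 5 SES levels,
# with five interviews in each cell.
GENDERS = 2
AGE_LEVELS = 3
SES_LEVELS = 5
INTERVIEWS_PER_CELL = 5

# Every gender/age/SES combination is one quota cell.
cells = list(product(range(GENDERS), range(AGE_LEVELS), range(SES_LEVELS)))
total_interviews = len(cells) * INTERVIEWS_PER_CELL

# Census percentages reported in the text for the adult population.
census_pct = {
    "African-American": 10,
    "Hispanic": 7,
    "Asian": 1,
    "Native American Indian": 0.5,
}


def round_half_up(value):
    """Round a non-negative value to the nearest whole person, .5 rounding up.

    This rounding rule is an assumption; the article does not state one.
    """
    return int(value + 0.5)


# Proportionate number of subjects per group in a sample of this size.
ethnic_quota = {
    group: round_half_up(total_interviews * pct / 100)
    for group, pct in census_pct.items()
}

print(len(cells), total_interviews)
print(ethnic_quota)
```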
Researchers have often used classical statistical methods, such as exploratory factor analysis (Kerlinger, 1986), to assess construct validity. Confirmatory factor analysis is a powerful method for investigating the construct validity of a measure (Schmitt & Stults, 1986). Confirmatory factor analysis provides an indication of overall fit and precise criteria for assessing convergent and discriminant validity. Embedded within the purpose of the study is the need to assess the construct validity of the four subscales of the Adult Learner Social Role Performance Interview Protocol.
Covariance matrices were analyzed using confirmatory factor analysis. The data were examined by AMOS version 4.0 maximum likelihood factor analysis (Arbuckle, 1999). This technique assesses the degree to which an expected or hypothesized factor model can effectively reproduce the observed or sample item covariances. Confirmatory factor analysis, in contrast to exploratory factor analysis, begins with an a priori hypothesized model and deductively ascertains its feasibility by offering more definitive empirical evidence of the underlying factor structure of a scale. The model was evaluated three ways. First, departure of the data from the specified model was tested for significance by using a chi-square test (Joreskog & Sorbom, 1989). Joreskog and Sorbom (1989) and Bentler and Bonett (1980) advise against the sole use of the chi-square value in judging the overall fit of the model because of the sensitivity of the chi-square to sample size. Second, goodness-of-fit between the data and the specified model was estimated by employing the Comparative Fit Index (CFI) (Bentler, 1990) and the Tucker-Lewis Index (TLI) (Bentler & Bonett, 1980). Although numerous goodness-of-fit indices have been developed, the CFI was reported because it is less likely to be subject to bias when smaller samples are used (Bagozzi & Heatherton, 1994). Marsh, Balla, and McDonald (1988) have suggested the following criteria to judge fit indices for these purposes: good fit = CFI > .90; adequate but marginal fit = .80 to .89; poor fit = .60 to .79; very poor fit < .60. Third, the hypothesized loadings were examined for statistical significance at the p < .05 level. Although the chi-square for the one-factor model was significant, χ2(2) = 6.850, p = .033, the CFI and the TLI yielded acceptably high goodness-of-fit indices (.994 and .983, respectively). In addition, all subscales loaded significantly (p < .05) on the single-factor model, with loadings ranging from .938 to .960.
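The reported chi-square probability and the fit-index bands can be illustrated with a brief sketch. This is not the AMOS procedure used in the study. It relies on the fact that, for exactly 2 degrees of freedom, the upper-tail chi-square probability has the closed form exp(-x/2). The function names are hypothetical, and the band boundaries are an assumed reading of the Marsh, Balla, and McDonald (1988) criteria in which the adequate band runs from .80 to .89 and a value of exactly .90 falls just below "good".

```python
from math import exp


def chi_square_p_df2(chi_square):
    """Upper-tail p-value for a chi-square statistic with exactly 2 degrees of freedom.

    Valid only for df = 2, where the survival function reduces to exp(-x / 2).
    """
    if chi_square < 0:
        raise ValueError("a chi-square statistic cannot be negative")
    return exp(-chi_square / 2)


def classify_fit(index):
    """Label a fit index using the assumed reading of the bands described above."""
    if index > 0.90:
        return "good"
    if index >= 0.80:
        return "adequate but marginal"
    if index >= 0.60:
        return "poor"
    return "very poor"


# Values reported in the text for the one-factor model.
p_value = chi_square_p_df2(6.850)
cfi_label = classify_fit(0.994)
tli_label = classify_fit(0.983)

print(f"p = {p_value:.3f}")
print(cfi_label, tli_label)
```

Under these assumptions the chi-square value reproduces the reported probability when rounded to three decimals, and both reported indices fall in the "good" band.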
The performance rating scales and interview protocol created for this study served the purpose for which they were designed. The continued development and use of performance rating scales and interview protocols, similar to the ones used in this study, will provide additional tools with which to empirically assess the adult learner within society. Increasing the understanding of the context of learning, that is, the setting in which it occurs, strengthens the concept that education is no longer mere preparation for adulthood, but rather a lifelong activity necessary for the well-being of society. The creation of instruments, the scale and protocol, directly contributes to the ability to identify and assess the role of the adult learner. The ability to assess the societal roles that are both expected and accepted within society provides insight concerning the training and education needed to fulfill those roles.
References
Abney, H. M. (1992/1993). The development and initial validation of a domain of adult social roles and developmental events within Havighurst's model. (Doctoral dissertation, University of South Florida, 1992). Dissertation Abstracts International, 53(07A), 2202.
Arbuckle, J.L. (1999). Amos 4.0 User's Guide. Chicago: SmallWaters Corporation.
Bagozzi, R. P., & Heatherton, T. F. (1994). A general approach to representing multifaceted personality constructs: Application to state self-esteem. Structural Equation Modeling, 1, 25-67.
Bentler, P.M. (1990). Comparative fit indexes in structural models. Psychological Bulletin, 107, 238-246.
Bentler, P.M., & Bonett, D.G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606.
Gay, L. R. (1980). Educational evaluation and measurement. Columbus, OH
Havighurst, R., & Orr, B. (1956). Adult education and adult needs. Chicago: Center for the Study of Liberal Education for Adults.
Joreskog, K. J., & Sorbom, D. (1989). LISREL 7: A guide to the program and applications (2nd ed.). Chicago: SPSS.
Kerlinger, F.N. (1986). Foundations of behavioral research (3rd ed.). New York: Holt, Rinehart and Winston.
Kirkman, M. S. (1994). The development and content validation of contemporary parent, spouse/partner, and worker social role performance rating scales and assessment instruments. (Doctoral dissertation, University of South Florida, 1994). Dissertation Abstracts International, 55(07A), 1793.
Marsh, H. W., Balla, J.R., & McDonald, R. P. (1988). Goodness-of-fit indexes in confirmatory factor analysis: The effects of sample size. Psychological Bulletin, 103, 391-410.
Schmitt, N., & Stults, D. N. (1986). Methodology review: Analysis of multitrait-multimethod matrices. Applied Psychological Measurement, 10, 1-22.
U.S. Bureau of the Census. (1991). Population of metropolitan areas by race and Hispanic origin: 1990. Washington, DC: Department of Commerce.
Witte is an Assistant Professor of Adult Education in the Educational Foundations, Leadership, and Technology Department. Guarino is an Assistant Professor of Measurement and Research in the Educational Foundations, Leadership, and Technology Department. James is a Professor of Adult Education in the Leadership Development Department.