

A New Paradigm for College Content Area Student Assessment

Currently, colleges of education use performance assessments to determine whether candidates have the requisite knowledge and skills. These performance assessments, such as project grades, tests, and course grades, have limited usefulness because they are typically based upon isolated snapshots, generating data points that can only answer questions about knowledge and skills at one particular moment in time. In addition, isolated assessments, such as course grades, are relatively subjective, especially in this age of grade inflation (Johnson, 2003). Finally, these isolated snapshots cannot be compared to other performance assessments because they do not measure the same construct. For example, one snapshot may be the score on an application project demonstrating professional dispositions, whereas another snapshot may be a final exam in a traditional course (e.g., Introduction to Special Education). Extending this example, dispositions and content knowledge have little in common and represent completely dissimilar constructs, thus limiting their usefulness in determining whether a candidate is successfully progressing through a program.

Ensuring that candidates are progressing through a program successfully is important to the individual candidates, the program, and the field of special education in general. First, programs are expensive. A quick internet search of public, private, and for-profit special education programs shows that costs vary but are clearly substantial, ranging from $20,000 to $50,000. Second, professional organizations and accreditation bodies, such as the Council for Exceptional Children and the Council for the Accreditation of Educator Preparation, hold institutions accountable for candidates who are non-completers. Third, special education is a high-need area for teachers across districts in all states (U.S. Department of Education, 2018). Because the number of qualified personnel is limited, a delay in preparing qualified special education teachers can delay educational services for children with disabilities across the United States.

Successful completion of a program is often linked to what candidates can do as well as what they know. For example, certification in the state of Washington for special educators is dependent upon two high-stakes assessments, the Teacher Performance Assessment (edTPA) and the Washington Educator Skills Test (WEST-E). The edTPA is a performance assessment, meaning it measures the skills of a practicing teacher and is completed with real students in real classrooms (e.g., lesson planning, delivery of instruction, student assessment). In fact, candidates upload videos of themselves teaching to demonstrate their competence. The WEST-E, on the other hand, is a multiple-choice exam assessing a candidate's knowledge across the general field of special education (e.g., knowledge of philosophical and historical issues, human growth and development, the impact of disabilities on family and community, professional responsibilities, and research-based strategies for positive academic and behavioral outcomes). The two high-stakes tests, taken together, provide information for the state licensing body regarding the preparation of teachers in both areas: teaching skills and essential knowledge about special education. While these assessments are necessary for licensure, they are completed at the end of a candidate's program. Therefore, they do not inform the candidate or the program about ongoing progress, nor do they provide information useful for supporting the candidate or making critical decisions in a timely manner that could impact the candidate's future.

Isolated snapshots linked to artifacts, such as the development of a lesson plan, can provide useful information for performance assessments (e.g., edTPA), but they have limited usefulness for ensuring that candidates are gaining the fundamental content knowledge that can lead to successful program completion and certification. Therefore, the assessment system should include more than isolated snapshot assessments. The challenge is to provide consistent, meaningful data that measure progress in the acquisition of knowledge, linked throughout the program, for the purpose of providing meaningful individual feedback to candidates as they progress toward the licensure assessments and, ultimately, their teaching goals. Meaningful feedback needs to be timely and linked to the program as a whole, rather than to individual faculty, courses, and/or projects that might provide a skewed reflection of a candidate's competence. A system that provides comprehensive, reliable information about the progress of candidates across a program can promote faculty conversations and collaboration regarding the identification and support of candidates who might otherwise withdraw toward the end of their program, or ultimately fail as teachers, after expending both personal monetary and programmatic resources.

A solution to this problem is to include progress monitoring assessments that answer questions about how well a candidate is progressing through content instruction across the program in comparison to other candidates. Using multiple data points, progress monitoring assessments can answer such questions as the following: (a) What is the rate of growth of a particular candidate, especially in comparison to their peers? (b) Will the current rate of progress lead to meeting an established standard by a certain point in time? and (c) Did a change in support or instruction lead to a change in progress over time?

Progress monitoring scores can be translated into learning trajectories. Successful candidates can be identified, and data can be interpreted based upon individual and expected trends. Thus, candidate trajectories can be compared to a projected aimline as candidates proceed through the program. If discrepancies in growth rates are present, the results from the assessment may be a starting point for identifying a candidate in need of support in a timely manner. When used as such, progress monitoring scores serve as formative assessments. In order to track growth over time, assessments must be reflective of a corpus of skills and/or knowledge and be parallel in form (Fuchs, Compton, Fuchs, & Bryant, 2008; Fuchs & Fuchs, 2008). Basically, the progress-monitoring assessment must be linked to the content of the program as a whole, and not to isolated components, in order to ensure preparation of candidates for licensure assessments.
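The trajectory-and-aimline logic described above can be sketched in a few lines of code. This is a minimal illustration only, not the program's actual tooling (the program used Excel); the scores, quarter counts, and target value are hypothetical.

```python
# Sketch: turn quarterly progress-monitoring scores into a learning
# trajectory (least-squares slope) and project it toward a target quarter.
# All numbers below are hypothetical examples.

def slope(scores):
    """Least-squares slope of scores per quarter (quarters 1..n)."""
    n = len(scores)
    xs = range(1, n + 1)
    mean_x = sum(xs) / n
    mean_y = sum(scores) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, scores))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

def projected_score(scores, target_quarter):
    """Project the candidate's score at target_quarter along their trend."""
    b = slope(scores)
    a = sum(scores) / len(scores) - b * (len(scores) + 1) / 2  # intercept
    return a + b * target_quarter

# Hypothetical candidate with three quarterly scores; compare the
# projection for quarter 8 against a program aimline target.
scores = [22, 30, 37]
print(round(slope(scores), 2))            # growth per quarter
print(round(projected_score(scores, 8)))  # projected quarter-8 score
```

A discrepancy between the projected score and the aimline target would be the prompt for a support conversation, not a verdict on the candidate.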

Curriculum Based Evaluation (CBE) as a Process for Program Assessment

Faculty in teacher preparation programs need to monitor a candidate's progress in the acquisition of knowledge in order to ensure that all candidates are successful in gaining content knowledge as they progress through the special education teacher preparation program. In order to do this, the program has borrowed from Curriculum Based Evaluation (CBE).

In short, CBE is "a systematic problem-solving process for making education decisions" (Howell, Hosp, & Kurns, 2008, p. 353) and is often referred to as Response to Intervention (RTI) or multi-tiered systems of supports (MTSS). It has been used in special education classrooms as well as elementary and high schools to make educational decisions for over 30 years and has been effective in informing faculty and facilitating student achievement (Burns, Appleton, & Stehouwer, 2005; Gersten, Compton et al., 2008; Gersten, Beckmann et al., 2009).

For CBE to work, however, valid and reliable ongoing assessments that closely monitor the learning trajectories of each learner must be in place. Students must be able to be assessed in content areas repeatedly and easily, using sophisticated yet easy-to-administer assessments that closely monitor learning and chart learning trajectories over time, so that teachers can make immediate and accurate decisions about the effectiveness of instruction, accommodations, and modifications. The data generated from these assessments can support teachers, administrators, and other stakeholders in making decisions about the academic or behavioral performance of a student or group of students (e.g., small group, class, grade level, school). These assessments are thus designed to help school districts identify students in need of a variety of supports in order to be successful.

According to Deno (1985), the critical attributes of these effective assessments include the following: (a) simple to use, (b) inexpensive, (c) efficient, and (d) able to be repeatedly administered over time, directly sampling the curriculum. In addition, because they are administered frequently, progress monitoring assessments are sensitive to small changes in the acquisition of knowledge and skills. Results from the administration of these short assessments, often called probes, correlate with other measures of achievement and show student growth in specific content over time. Data from the probes can be graphed, and the use of such progress monitoring charts enables teachers to see the results of student learning, thus supporting timely changes in instruction.

Valid and reliable assessments for these purposes have been established and are used widely in both general and special education settings throughout North America in reading, writing, and math. As an additional note, recent literature suggests that progress monitoring assessments may help to eliminate the cultural bias found in many assessment practices for school-age children (Fore III, Burke, & Martin, 2014). Because these assessments and processes were originally applied with K-12 students, the questions then become the following: (a) whether it is appropriate to apply the process to college-age learners, (b) whether these assessments can promote collaboration among faculty, encouraging targeted conversation regarding candidate progress and the structures required to facilitate successful program completion, and (c) whether these measures will identify candidates in need of support in a timely manner.

Vocabulary as a Measure of Student Growth in Content Area Knowledge

Importance of Vocabulary

A measure of vocabulary knowledge was chosen by faculty as the measure of content knowledge. Vocabulary is important for three reasons: (a) it is key to comprehension in reading and writing, (b) it is important to effective communication, and (c) it is an indicator of content area knowledge and academic achievement for all ages (Laufer, 1997; Saville-Troike, 1984). Additionally, the 2011 National Assessment of Educational Progress (NAEP) report of 6th and 12th grade students supports the claim that vocabulary knowledge is positively correlated with comprehension outcomes (National Center for Education Statistics, 2012). Vocabulary is also an important indicator of success in learning a foreign language (Folse, 2004; Nation, 2001; Zhang & Anual, 2008). Recent research in the field of education has focused upon the importance of teaching vocabulary and has linked deficits in vocabulary knowledge to school failure (Biemiller, 2004; Graves, 2006; Townsend, Filippini, Collins, & Biancarosa, 2012).

Research Supporting Vocabulary Assessments

Research into the use of vocabulary matching as a potential indicator of content area knowledge for middle and high school students began in the early 1990s with several researchers (Nolet & Tindal, 1995; Tindal & Nolet, 1995; Espin & Foegen, 1996; Espin & Tindal, 1998) and has continued through the present (Busch & Espin, 2003; Espin et al., 2013). Espin and Foegen (1996) showed that content vocabulary matching as a measure of performance on a content area task correlated strongly with an outcome measure (i.e., r = .62-.65) and accounted for the largest proportion of variance on the outcome measure when compared to two other common assessments. Two additional studies examined the use of vocabulary matching measures in 7th grade social studies as measures of content knowledge (Espin, Busch, Shinn, & Kruschwitz, 2004; Espin, Busch, & Shinn, 2005). In these two studies, correlations between the vocabulary matching measures and both a teacher-made knowledge test and a standardized test ranged from r = .56 to .84, and correlations with course grades ranged from r = .27 to .51. Furthermore, vocabulary probes have been used successfully in middle and high schools as progress monitoring tools (Mooney, McCarter, Schraven, & Callicotte, 2013). For example, in a study with 198 students from 10 science classrooms, Espin et al. (2013) found correlations between growth rates on 5-minute vocabulary matching probes given across 14 weeks and performance on the ITBS, course grades, and pre-post gains on a science knowledge test.

It is likely that similar results may be found for post-secondary learners. Vocabulary knowledge, therefore, is arguably an important aspect of learning in all areas of higher education, especially in areas that require specialized content.

Depth and Breadth of Vocabulary Knowledge

At first glance, vocabulary assessment can appear to be a comparatively surface-level measure of content area knowledge because it involves a timed assessment requiring the matching of definitions to words. However, the literature on the assessment of vocabulary, and the concepts behind the development of vocabulary assessments, provides a foundation for understanding the potential of vocabulary matching as an indirect measure and reflection of content area knowledge.

Vocabulary knowledge can be measured for breadth and depth. Breadth refers to the amount of vocabulary that one has acquired and can be assessed in a variety of ways, including multiple-choice and matching activities. Depth, on the other hand, is the level of understanding of a word. Understanding of a word can be categorized as: (a) no knowledge of the term, (b) general understanding, (c) use in a specific context, (d) knowledge, but limited recall, and (e) decontextualized knowledge and relationship to other words (Beck, McKeown, & Omanson, 1987; Beck, McKeown, & Kucan, 2013).

Depth can most accurately be assessed by engagement in the use of the word in context. Measuring depth, however, is impractical on a large scale for two main reasons: (a) measurements of words in context are time consuming, and (b) assessing an entire corpus of word knowledge is not possible. In fact, some linguists even question whether testing vocabulary in context gives an accurate measure of vocabulary knowledge (Laufer, 1998). For example, Laufer, Elder, Hill, and Congdon (2004) and Schmitt (1999) pointed out that students who knew a word in context were not able to identify the meaning of the word when given the word in isolation or outside of the specific context. Thus, although assessing depth appears to be valuable, it is not necessarily a realistic or representative way to assess knowledge of vocabulary. Finally, depth at the levels of general understanding and of knowledge with limited recall can be measured through vocabulary matching and multiple choice; thus, matching formats can be used as an indicator of a certain level of depth as well as breadth (Laufer et al., 2004). This concept is key because vocabulary probes are generally created as matching assessments.

As pointed out by Read (1989), measuring breadth of knowledge may appear superficial, but in actuality it is much more likely to be representative of overall vocabulary knowledge than tests attempting to measure depth. Furthermore, Laufer et al. (2004) presented evidence that having breadth of vocabulary knowledge was far better than knowing a few words in depth. In fact, Laufer et al. suggested that tests measuring breadth of vocabulary were superior to those measuring depth. Therefore, vocabulary measures that focus upon breadth of vocabulary knowledge may also indirectly represent depth of knowledge.

Vocabulary in Post-secondary as a Measure of Content Knowledge

Growing evidence supports the usefulness of vocabulary as an indicator of content knowledge at post-secondary levels. For example, medical schools have identified specific vocabulary terms that indicate knowledge of content and have mapped those terms across blocks of courses in order to ensure comprehensive coverage of required material (Dexter, Koshland, Waer, & Anderson, 2012). Similarly, special education requires discipline-specific, highly technical and sophisticated vocabulary, analogous to that found within the medical field, used to represent and explain key concepts. Candidates must be able to utilize this vocabulary not just in everyday speech but also in applying the concepts to instruction. Additionally, candidates must be able to articulate the concepts in a way that is understandable to families and other professionals.

Vocabulary Measures Applied to Special Education Content Knowledge

Vocabulary knowledge has been found to be the largest predictor of success in content areas for middle and high school students (Espin & Deno, 1995). Although this has not been thoroughly researched in higher education, it stands to reason that vocabulary could also serve as an indicator, or predictor, of content knowledge in any professional field, including special education. Key vocabulary can be identified that represents the content area knowledge in multiple program standards, which is also reflected in the content courses.

Context of the Project

Setting. The project was located within the Special Education Program in the College of Education at a Pacific Northwest university. The Special Education plus Elementary Education Program (SPEL) is a dual endorsement program, meaning that candidates study for licensure in special education as well as in general education at the elementary level. The program maintains an enrollment of approximately 180 matriculated candidates at any time. The program is designed for undergraduates and results in a Bachelor of Education. The typical time frame of the program is 10 quarters, approximately three years. The majority of candidates are admitted to the program during their sophomore year or after accumulating 45 general undergraduate credits. The program admits candidates each quarter (i.e., fall, winter, spring). The number of candidates admitted each quarter ranges from 15 to 25. Thus, there are three new incoming cohorts each year and a total of 10 cohorts in the program at any given time.

Faculty. The program had a total of eight tenured and tenure-track faculty and five full-time non-tenure-track faculty. Four of the faculty were full professors, three were associate professors, and one was an assistant professor. Non-tenure-track faculty held either a master's degree in special education or a Ph.D. in special education. Each faculty member's expertise was reflected in special education course offerings (e.g., Introduction to Special Education, Formal and Informal Assessment, Behavior Assessment and Intervention, Reading Instruction for Children with Disabilities, Motivation and Learning, Child Development and Educational Psychology, Curriculum Based Evaluation, Complex Needs, Transitions, Disability Law and IEP, Professional Collaborations, Interventions for Classroom Management, Math and Written Expression Instruction and Interventions, Assistive Technology, and Education, Culture, and Equity).

Focus of the program. It is important to note that one of the major themes of the SPEL program was CBE. In fact, candidates were required to take a full 10-week, quarter-long course, Curriculum-Based Evaluation: Data-Based Assessment for Effective Decision Making. Furthermore, CBE concepts were integrated into other courses throughout the program. The program of study also included a variety of elementary education courses, such as Science Methods, Integrated Arts, Teaching Mathematics K-8, and Social Studies Methods. The Vocabulary Assessment measured terms for special education coursework only and not for general education coursework. Therefore, candidates were not assessed on vocabulary common in elementary education unless a term was also common in special education.

Candidates. Candidates ranged in age between 18 and 28 years. Self-reported ethnicity was 77% Caucasian, 2.5% African American, 8.0% Hispanic, 1% Native American, 9% Asian, and 2.5% other.

Candidates were informed of the purpose of the assessment at the beginning of their program and provided informed consent for use of their data in the research project. All candidates were required to complete the assessment each quarter because the assessment was connected to program improvement and to accreditation. Candidates were reminded at each administration that the purpose of the assessment was to ensure that they were making adequate progress throughout their program of study. Candidates were also reminded that an additional purpose of the assessment was to model the CBE process with candidates throughout the program, demonstrating the effective use of the process and thus supporting one of the core concepts and emphases of the program under authentic conditions. Approximately 97% of candidates provided consent to allow researchers to use their data as part of the research project.

Vocabulary Signature Assessment

Description of the Assessment

The assessment was a standard matching of terms to definitions task. The final product had 100 terms, with ten terms and ten definitions on each page of the assessment, thus 10 pages. Each quarter, 100 terms were randomly selected using an Excel program and the assessment was formatted. Candidates had a total of ten minutes to match all 100 terms, thus indicating automaticity as well as knowledge. The assessment was administered every quarter to all candidates in SPEL. The first administration of the vocabulary assessment was Fall 2015. The following sections discuss the development of the instrument.
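The quarterly assembly step described above can be illustrated in code: randomly select 100 terms from the full corpus and lay them out as 10 pages of 10 term/definition pairs. The program itself used Excel; this Python sketch, with a hypothetical corpus, only shows the logic.

```python
import random

def build_assessment(corpus, n_terms=100, per_page=10, seed=None):
    """Sample n_terms from corpus (a dict of term -> definition) and
    split them into pages of per_page definition/word-bank pairs."""
    rng = random.Random(seed)
    terms = rng.sample(sorted(corpus), n_terms)  # random quarterly draw
    pages = []
    for i in range(0, n_terms, per_page):
        page_terms = terms[i:i + per_page]
        bank = page_terms[:]          # word bank shown beside definitions
        rng.shuffle(bank)             # so order does not give answers away
        pages.append({"definitions": [corpus[t] for t in page_terms],
                      "word_bank": bank})
    return pages

# Tiny hypothetical corpus just to exercise the function.
corpus = {f"term{i}": f"definition {i}" for i in range(1000)}
pages = build_assessment(corpus, seed=42)
print(len(pages))                   # 10 pages
print(len(pages[0]["word_bank"]))   # 10 terms per page
```

Because each quarter draws a fresh random sample from the same corpus, the resulting forms are parallel in the sense required for progress monitoring.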

Development of the Instrument

The development of the instrument was accomplished in four stages. Each stage included multiple steps.

Stage one--Selection of terms. All 13 full-time special education faculty participated in the development of the vocabulary assessment. Faculty in the program participated in identifying unique vocabulary terms for special education content that represented the concepts and knowledge expected to be learned in each course. First, faculty were given instructions to identify terms used in the courses they taught. The number of terms identified ranged from 25 to 75 for each course. Faculty then submitted the terms electronically in order to compile a comprehensive database. The database was then used to search for unique terms, thus eliminating any overlap. More than 1,200 unique terms were initially identified. Faculty were then provided with criteria for creating user-friendly definitions, in contrast to dictionary definitions. The criteria for definitions were the following: (a) relatively short, (b) no utilization of any part of the term, and (c) avoidance of technical jargon. All words and definitions were placed in an Excel file and compiled into a comprehensive list that designated the term, the definition related to the term, and the course of origination.
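The compilation step can be sketched as follows: merge per-course term lists into one database, keeping each term once while recording every course it came from, so that overlapping definitions can be reviewed in stage two. The course names and definitions below are hypothetical, and the actual work was done in Excel.

```python
def compile_database(course_terms):
    """course_terms: dict mapping course name -> {term: definition}.
    Returns one entry per unique term, collecting all submitted
    definitions and source courses for later faculty review."""
    database = {}
    for course, terms in course_terms.items():
        for term, definition in terms.items():
            entry = database.setdefault(term, {"definitions": [], "courses": []})
            entry["definitions"].append(definition)
            entry["courses"].append(course)
    return database

# Hypothetical submissions from two courses; "IEP" overlaps.
submissions = {
    "Intro to Special Education": {"IEP": "a written plan ...",
                                   "LRE": "setting closest to peers ..."},
    "Disability Law and IEP": {"IEP": "individualized plan ...",
                               "FAPE": "education at public expense ..."},
}
db = compile_database(submissions)
print(len(db))                       # 3 unique terms
print(sorted(db["IEP"]["courses"]))  # both source courses retained
```

Terms with more than one recorded definition are exactly the ones that need the small-group consensus work described in stage two.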

Stage two--Review of terms. Since terms were often shared across courses and were defined in multiple ways, faculty gathered for an initial two-hour meeting for the purpose of agreeing upon common understandings of definitions. Faculty who shared vocabulary terms across courses met in small groups and reviewed their set of terms and definitions in order to determine the best definition, representing the most common usage of the term. Collaboration was required in order to reach consensus on a standard definition to be used across all courses. Next, the assessment committee reviewed the terms and definitions to ensure that definitions were formatted in a uniform manner (e.g., definitions stated in the active voice, concise definitions). The assessment committee returned the edited database to the entire faculty for full review, and an updated list was generated. Faculty also inspected each term and its definition for the importance of the term to the program, the clarity of the definition, and its succinctness. Faculty also reviewed the list to ensure that definitions maintained the integrity of the term in relationship to the courses. Words with questionable applicability and/or definitions were either modified again or eliminated from the corpus.

A third compilation of all terms was then created and reviewed by the assessment team again for format. Finally, the entire file was reviewed one last time by the assessment committee for any edits of spelling errors or mechanics. The entire final corpus consisted of approximately 1,000 terms and definitions.

Stage three--Format of the assessment. Once the review was completed, the faculty considered the format of the assessment and the number of terms per page. In order to determine the optimal number of terms per page, faculty asked, "How many terms and definitions would an expert be able to match within a specific time frame?" Using high school probe models, the assessment committee randomly sorted all of the terms using the Excel file. First, a total of 20 definitions were placed on a page with a word bank of 25 words, five of which were distractors. At a faculty meeting, all faculty were given 10 minutes to complete as many items as possible. After the assessment, faculty discussed the experience. Most faculty were highly frustrated with the number of terms on each page. In addition, faculty felt that the distractors were excessive and limited their ability to demonstrate their knowledge. In fact, because of the complexity of the format, some faculty refused to complete the activity. Decisions were made to eliminate the distractors and to ensure that the terms in the word bank were located on the right-hand side of the page, thus encouraging the candidate to read the definition first and then select the term (Appendix A). Appendix A is the first page of the ten-page assessment and is a sample from an actual administration. The assessment team then administered the modified assessment for expert review again at a subsequent faculty meeting. Faculty made the decision to limit the assessment to ten items per page with no distractors.

Stage four--Field test. The assessment committee prepared copies. In order to standardize the administration, simple directions were printed on the cover sheet. Candidates enrolled in SPEL were asked to match the terms to the definitions as quickly as possible. If they did not know an answer, they were directed to skip the item or take a guess. They were directed to complete as many items as possible within the ten-minute time frame. Instructors were directed to administer the assessment at the beginning of class before other work was scheduled.

Copies were housed with the program coordinator, who kept a check-out and check-in sheet. The administration for the field test was completed in Spring quarter, 2015. Each candidate was allowed one attempt; therefore, if candidates had already taken the assessment in a different course, they read or waited patiently until the end of the time set aside for the assessment. Faculty were provided with a key to score the assessments. Scores were reported back to the assessment team, who entered them into an Excel spreadsheet. The field test had two purposes: (a) norming the assessment across quarters and (b) finding and addressing any problems with format or administration.

Results of the field test. Faculty met and reviewed the results of the field test. A distinct trend line was apparent from quarter one through quarter eight. Faculty found that candidates, overall, in the final quarter of the program outperformed candidates in the first quarter. Since this was the first time the assessment was administered, and only as a field test, individual scores for candidates had limited usefulness.

Use of Assessment for Candidates in Need of Support

The use of vocabulary terms as a measure of content knowledge is especially relevant for special education teacher preparation programs because the field of special education has dedicated terms, similar to those found in other specific disciplines such as science, medicine, or engineering, that are not commonly used by the general public (e.g., subtractive bilingualism, pairing, sensory integration therapy, PDDNOS, Board of Education v. Rowley, concurrent validity). Furthermore, as with other fields, special education has some terminology that is shared with the general population (e.g., positive reinforcement, negative reinforcement, extinction, shaping, fading) but that carries different, precise meanings within special education. It is important that candidates know the specific definitions of these terms that have dual usage across multiple contexts. In addition, the quantity of terms in special education is substantial across behavior support, categories of disabilities, special education planning, and legal issues, thus sampling from a broad content area.

The primary purposes of the vocabulary assessment were to model the CBE process in an authentic environment, to provide a means of feedback to candidates regarding their progress in the program, and to identify, early on, those candidates who may have difficulty so that interventions could be instituted in a timely manner. The first step, however, was to determine whether the assessment could provide reliable information. This required whole data sets for multiple cohorts who had completed the program. Having all the data on cohorts from the first quarter through the eighth quarter provided faculty the opportunity to explore individual trends based upon the results of the vocabulary assessment as related to the performance of individual candidates throughout the program.

Faculty currently have whole data sets for three cohorts. These cohorts were admitted in Fall 2015, Winter 2016, and Spring 2016. The cohorts completed all coursework and took the assessment each quarter of the program from their first quarter through their eighth quarter. Candidates did not take the assessment during the two quarters of internship at the end of the program.

Figure 1--Overall Candidate Performance. Figure 1 shows an example of results from one administration of the Vocabulary Assessment. This graph is a scatterplot of scores for all cohorts of candidates enrolled during Fall 2018. Overall, scores for candidates in their first quarter of the program were below scores for candidates enrolled in subsequent quarters. The trend, overall, was ascending, indicating that, on the whole, candidates gained knowledge of terms from one quarter to the next.

In addition, Figure 1 compared each individual candidate's score with their cohort's scores. For example, in the quarter 7 cohort, one candidate matched 11 terms correctly within the ten-minute time frame. This is far below the scores of the other candidates in that same cohort, which ranged from 53 to 75 correct matches. This information, shared by the advisor with the candidate, can be the foundation for meaningful conversations about performance. Naturally, there can be many explanations for this particular candidate's performance (e.g., illness, lack of attention to detail, motivation, or external factors). Knowledge of a candidate's individual performance in relation to their own cohort at one moment in time is important, but it is only a starting point, an opportunity to begin thinking about whether a problem exists, not necessarily about how to solve it. What is more instructive is to examine a visual representation of performance on the assessment across time in relation to trend lines.

Figures 2 and 3--Individual Progress Monitoring. What we found most helpful were the individual performance graphs. While the scatterplot was more global and provided some general information, it was not specific enough to inform decision making: it provided no information targeted to each individual candidate and said nothing about any candidate's progress over time. To remedy this, graphs were developed that showed candidates' scores on the assessment as they progressed from quarter to quarter. Figure 2 represents scores for individual candidates who completed their coursework in Winter 2018; Figure 3 represents scores for candidates who completed their coursework in Fall 2018. The graphs show scores, quarter by quarter, for the highest and lowest performers, as well as a trend line representing the average scores for all candidates from one quarter to the next. Two examples were provided to show the fluctuation of scores from cohort to cohort.

Overall, the graphs showed that the first three data points provided a general indicator of future performance on the vocabulary assessment in the absence of intervention. It is well established in the CBE and RtI research that three data points are commonly considered best practice for determining performance trends within multi-tiered systems of support (Hosp, Hosp, Howell, & Allison, 2013; McMaster & Wagner, 2007). Three data points are required because the first score is not necessarily reflective of knowledge of vocabulary: any initial score may reflect sheer lack of exposure to special education academic language, or may measure the candidate's test-taking strategies, clever guesses, compliance, or motivation rather than command of the targeted terminology used in this assessment.
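The three-data-point trend rule can be made concrete with a small sketch: fit an ordinary least-squares line through a candidate's first three quarterly scores and read the slope as the performance trend. The function and score values below are illustrative assumptions, not the program's actual computation.

```python
# Least-squares slope of quarterly scores: a simple way to quantify
# the trend that the first three administrations establish.
def trend_slope(scores):
    """Slope of an OLS line fit to scores vs. quarter number (1, 2, 3, ...)."""
    n = len(scores)
    xs = range(1, n + 1)
    x_bar = sum(xs) / n
    y_bar = sum(scores) / n
    num = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, scores))
    den = sum((x - x_bar) ** 2 for x in xs)
    return num / den

# A positive slope suggests vocabulary growth; a flat or falling slope
# after three administrations may warrant follow-up.
print(trend_slope([18, 27, 40]))  # 11.0 -> about 11 points gained per quarter
```

With only one or two points the slope is undefined or dominated by noise, which is one way to see why a single first-quarter score carries little predictive value.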

There seems to be little predictive value in first-quarter performance alone, particularly when results for the cohort reflect higher beginning scores, as in Figure 3. By the third administration, however, the first three data points, taken together, established a relatively predictable trend. It required about three quarters to differentiate acquisition of specific vocabulary tied to program requirements from prior knowledge. It also appeared that by the third vocabulary assessment, in quarter three, candidates had engaged with enough coursework content and academic language that the assessment reliably measured content learning, especially vocabulary learning.

Implications for practice. Faculty met regarding the dissemination of information to candidates and established a process. Each advisor has access to the overall scores for each administration (Figure 1). The following steps were determined to be helpful: (a) each quarter, advisors review the graph and invite candidates to visit with them regarding their scores; (b) advisors review graphs for individual candidates after their third quarter in the program to ensure candidates' trend lines are at or above the average, and any candidate whose trend line shows a trajectory below the average trend after the third quarter is contacted for follow-up; (c) the advisor and the candidate discuss possible reasons for the performance on the assessment, its implications, and other evidence and benchmarks of progress in the program (e.g., scores on projects, course grades); (d) advisors continue to review individual candidates' graphs each quarter and meet with candidates, especially those whose trend lines were below the average the quarter before, to determine a plan of action if necessary. In this way, the vocabulary assessment becomes a reliable, data-based tool for fully informing candidates in a timely manner as they progress through the program. This process also models the CBE process for candidates, thus reinforcing an important aspect of assessment and teaching in a classroom.
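The screening step in which advisors compare a candidate's trajectory against the cohort average can be sketched as follows. This is a minimal illustration under assumed data, not the advisors' actual procedure; the comparison here uses mean score level over the quarters reviewed as a simple stand-in for the graphed trend lines.

```python
# After quarter three, flag candidates whose scores run below the
# cohort average for the same quarters (illustrative data).
def below_cohort_average(candidate_scores, cohort_averages):
    """True if the candidate's mean score trails the cohort's mean score."""
    own = sum(candidate_scores) / len(candidate_scores)
    avg = sum(cohort_averages) / len(cohort_averages)
    return own < avg

cohort_avgs = [30, 42, 55]  # hypothetical cohort averages, quarters 1-3
print(below_cohort_average([12, 15, 20], cohort_avgs))  # True  -> follow up
print(below_cohort_average([35, 48, 60], cohort_avgs))  # False
```

A quarterly check like this would automate only the flagging; as the process above emphasizes, the advisor-candidate conversation is where reasons and plans are worked out.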

The intention of this particular project, at this point, was not to intervene in a candidate's progress but to establish whether the vocabulary assessment could be a reliable predictor of future success in the program. Interventions that may be considered as the project continues, based upon the data and individual circumstances, may range from no intervention (e.g., watch and see) to withdrawal from the program, and could include recommendations such as the following: (a) meeting with the course instructor during office hours for clarification of concepts, (b) additional study time, (c) joining a study group, (d) reduction of outside activities, (e) exploration of professional options other than certification, and (f) mentoring from a professional or peer.

The project indicates that failing to intervene in a timely manner for candidates who are experiencing difficulty has dire consequences, not only for the candidates but also for the program and the field in general. This tool provides a means of identifying those candidates, a reality check based upon data, and a prompt for conversations to identify problems and find solutions.

As yet, there is no guarantee that any one intervention, or combination of interventions, will succeed in changing the trends or supporting the candidate in meeting program and certification expectations; that remains to be determined as the project moves forward. However, it is informative to note the outcomes for the two lowest performers in Figure 1 and Figure 2. The lowest performer in Figure 1 withdrew from the program during her final quarter. The lowest performer in Figure 2 did not meet the cut score for passing the WEST-E. In each case, the expense, in time and resources, to the candidate and to the program was immense. Faculty wonder whether, with early intervention, these outcomes could have been improved. A targeted intervention might have changed a candidate's trajectory and contributed to future success, or to a more timely decision regarding a change in major or career.

One of the more fortunate outcomes of the project was that it inspired conversation and collaboration among the faculty and a collegial focus on student retention and possible supports. In addition, faculty took the opportunity to communicate about their courses and how those courses fit within the program overall.

This project will continue over time. The next step will be implementation of the CBE process, meaning that information will be shared with the candidates. Advisors will identify candidates who are having difficulty based upon the data, then discuss and implement interventions if necessary. Over time, it will be interesting to see whether trends on the Vocabulary Assessment change in response to the interventions, and whether those interventions eventually affect successful program completion, or at least provide timely feedback to guide future decisions. For future research, faculty will use the assessment results for continual program improvement.


The purpose of this paper was to provide an overview and rationale for a unique and original method of gauging candidate progress through a special education teacher preparation program. The vocabulary assessment is based upon the theoretical proposition that vocabulary is an indicator of content knowledge. In addition, the work is framed within the CBE and RtI models, which have a 30-year research base. The assessments extrapolate to higher education models frequently used within public schools; in this case, curriculum-based measures were used to monitor the progress of candidates as they proceeded through the special education program. Furthermore, the approach appears to hold promise for identifying candidates who likely need support in academic content; the results showed a difference in candidate acquisition of vocabulary from the beginning of the program to the end.

It is important to note that the vocabulary assessment is only one of many assessments used by the special education program to provide evidence of program quality. In addition, scores on the Washington Educator Skills Test--Endorsement (WEST-E) and on the edTPA provide a thorough picture of the quality of candidates who complete the program, as well as pertinent information that can guide program improvement.

A triangulation of data provides a strong base for decisions about individual support and for analysis of cohort performance. Faculty look forward to more thoroughly applying the principles of quality assessment, instruction, and evaluation. It is our hope that such a system can later serve as a model for other colleges of education.


Beck, I. L., McKeown, M. G., & Kucan, L. (2013). Bringing Words to Life (2nd ed). New York, NY: Guilford.

Beck, I. L., McKeown, M.G., & Omanson, R.C. (1987). The effects and uses of diverse vocabulary instructional techniques. In M.G. McKeown & M.E. Curtis (Eds.), The nature of vocabulary acquisition (pp. 147-163). Hillsdale, NJ: Erlbaum.

Biemiller, A. (2004). Teaching vocabulary in the primary grades: Vocabulary instruction needed. In J. Baumann & E. Kame'enui (Eds.), Vocabulary instruction: Research to practice (2nd ed., pp. 28-40). New York: The Guilford Press.

Burns, M.K., Appleton, J.J., & Stehouwer, J.D. (2005). Meta-analytic review of responsiveness-to-intervention research: Examining field-based and research-based models. Journal of Psychoeducational Assessment, 23(4), 381-384.

Busch, T., & Espin, C.A. (2003). Using curriculum-based measurement to prevent failure and assess learning in content areas. Assessment for Effective Intervention, 28, 49-58.

Deno, S. (1985). Curriculum-based measurement: The emerging alternative. Exceptional Children, 52, 219-232.

Dexter, J., Koshiand, G., Waer, A., & Anderson, D. (2012). Mapping a curriculum database to the USMLE Step 1 content outline. Medical Teacher, 34, 666-675.

Espin, C.A., & Deno, S.L. (1995). Curriculum-based measures for secondary students: Utility and task specificity of text-based reading and vocabulary measures for predicting performance on content-area tasks. Diagnostique, 20, 121-142.

Espin, C.A., Busch, T., & Shinn, J. (2005). Curriculum-based measurement in the content areas: Vocabulary matching as an indicator of progress in social studies learning. Journal of Learning Disabilities, 38, 353-363.

Espin, C.A., Busch, T., Shinn, J., & Kruschwitz, R. (2004). Curriculum-based measurement in the content areas: Validity of vocabulary matching as an indicator of performance in social studies. Learning Disabilities Research and Practice, 16(3), 142-152.

Espin, C., & Foegen, A. (1996). Validity of general outcome measures for predicting secondary students' performance on content-area tasks. Exceptional Children, 62(6), 497-514.

Espin, C.A., Busch, T.W., Lembke, E.S., Hampton, D.D., Seo, L., & Zukowski, B.A. (2013). Curriculum-based measurement in science learning: Vocabulary-matching as an indicator of performance and progress. Assessment for Effective Intervention, 38(4), 203-213.

Espin, C. A., & Tindal, G. (1998). Curriculum-based measurement for secondary students. In M. Shinn (Ed.), Advanced applications of curriculum-based measurement (pp. 214-253). New York: Guilford Press.

Fore III, C., Burke, M.D., & Martin, C. (2014). Curriculum-based measurement: An emerging alternative to traditional assessment for African American children and youth. The Journal of Negro Education, 75(1), 16-24.

Folse, K. (2004). Vocabulary myths: Applying second language research to classroom teaching. Ann Arbor: University of Michigan Press.

Fuchs, D., Compton, D. L., Fuchs, L. S., & Bryant, J. (2008). Making "secondary intervention" work in a three-tier responsiveness-to-intervention model: Findings from the first-grade longitudinal reading study at the National Research Center on Learning Disabilities. Reading and Writing: An Interdisciplinary Journal, 21, 413-436.

Fuchs, L. S., & Fuchs, D. (2008). The role of assessment within the RTI framework. In D. Fuchs, L. S. Fuchs, & S. Vaughn (Eds.), Response to intervention: A framework for reading educators (pp. 27-49). Newark, DE: International Reading Association.

Gersten, R., Compton, D. L., Connor, C. M., Dimino, J., Santoro, L., Linan-Thompson, S., & Tilly, D. (2008). Assisting students struggling with reading: Response to intervention and multi-tier intervention for reading in the primary grades. A practice guide. (NCEE 2009-4045) Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. Retrieved from:

Gersten, R., Beckmann, S., Clarke, B., Foegen, A., Marsh, L., Star, J.R., & Witzel, B. (2009). Assisting students struggling with mathematics: Response to intervention (RTI) for elementary and middle schools (NCEE 2009-4060). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education. Retrieved from

Graves, M.F. (2006). The vocabulary book. New York: Teachers College Press.

Hosp, J., Hosp, M., Howell, K., & Allison, R. (2013). The ABCs of curriculum-based evaluation: A practical guide to effective decision making. New York: Guilford Press.

Howell, K.W., Hosp, J.L., & Kurns, S. (2008). Best practices in curriculum-based evaluation. In A. Thomas & J. Grimes (Eds.), Best practices in school psychology V (pp. 349-362). Bethesda, MD: NASP.

Johnson, V. E. (2003). Grade inflation: A crisis in college education. New York, NY: Springer.

Laufer, B. (1997). Vocabulary and testing. In N. Schmitt & M. McCarthy (Eds.), Vocabulary: Description, acquisition and pedagogy (pp. 303-320). Cambridge: Cambridge University Press.

Laufer, B. (1998). The development of passive and active vocabulary in a second language: Same or different? Applied Linguistics, 19(2), 225-271.

Laufer, B., Elder, C., Hill, K., & Congdon, P. (2004). Size and strength: Do we need both to measure vocabulary knowledge? Language Testing, 21(2), 202-226.

McMaster, K. L., & Wagner, D. (2007). Monitoring response to general education instruction. In S. R. Jimerson, M. K. Burns, & A. M. VanDerHeyden (Eds.), Handbook of response to intervention: The science and practice of assessment and intervention (pp. 223-233). New York: Springer.

Mooney, P., McCarter, K.S., Schraven, J., & Callicoatte, S. (2013). Additional performance and progress validity findings targeting the content-focused vocabulary matching. Exceptional Children, 80(1), 85-100.

Nation, I.S.P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.

National Center for Education Statistics (2012). The Nation's Report Card: Vocabulary results from the 2009 and 2011 NAEP reading assessments (NCES 2013-452). Washington, DC: Institute of Education Sciences, U.S. Department of Education.

Nolet, V., & Tindal, G. (1995). Essays as valid measures of learning in middle-school science classes. Learning Disability Quarterly, 18, 311-324.

Read, J. (1989). Towards a deeper assessment of vocabulary knowledge. Washington, DC: ERIC Clearinghouse on Language and Linguistics (ERIC Document Reproduction Service No. ED 301 048).

Saville-Troike, M. (1984). What really matters in second language learning for academic achievement? TESOL Quarterly, 18, 199-219.

Schmitt, N. (1999). The relationship between TOEFL vocabulary items and meaning, association, collocation and word-class knowledge. Language Testing, 16, 189-216.

Tindal, G., & Nolet, V. (1995). Curriculum-based measurement in middle and high schools: Critical thinking skills in content areas. Focus on Exceptional Children, 27(7), 1-22.

Townsend, D., Filippini, A., Collins, P., & Biancarosa, G. (2012). Evidence for the importance of academic word knowledge for the academic achievement of diverse middle school students. Elementary School Journal, 112(3), 497-518.

U.S. Department of Education (2018). Teacher shortage areas. Retrieved from

Zhang, L. J., & Anual, S.B. (2008). The role of vocabulary in reading comprehension: The case of secondary school students learning English in Singapore. RELC Journal, 39(1), 51-76.


Western Washington University
COPYRIGHT 2019 Project Innovation (Alabama)

Author:Coulter, Gail; Robinson, Leanne; Lambert, Michael Charles
Publication:Reading Improvement
Date:Sep 22, 2019
