An empirical evaluation of Specification Oriented Language in Visual Environment for Instruction Translation (SOLVEIT): a problem-solving and program development environment.
Traditional programming environments lack adequate facilities to seriously meet the needs of novice programmers and suffer from two main deficiencies: (a) they focus on the programming aspect of the process, and (b) they lack a structured facility for problem-solving. Apart from editing and debugging tools, traditional programming environments offer no support for learning software engineering; only the functionalities required to construct, test, and debug code are provided. To a large extent, the software process in these systems remains a methodology disconnected from the language and the tool. This may perhaps be viewed as a natural result of using a tool (the compiler) designed for one purpose--the noninteractive translation of high-level language programs into executable code--for another purpose--teaching programming skills. Programming environments designed for learning must offer facilities beyond basic tools such as editors and debuggers and must take into consideration the cognitive skills required to perform the tasks of problem-solving and programming (Shneiderman, 1980).
An integrated environment supporting the problem-solving and program development approach, starting with the initial activity of understanding the problem and continuing through program implementation, has been developed (Deek & McHugh, 2002). The new environment--Specification Oriented Language in Visual Environment for Instruction Translation (SOLVEIT)--aims to support students' development of problem-solving and cognitive skills; enhance students' acquisition of knowledge necessary for program development; promote students' development of metacognitive abilities; encourage students' favorable perception, attitude, and motivation toward problem-solving and programming; and, as a long-term goal, enhance the retention and transfer of such skills, knowledge, and abilities to other situations.
To promote the development of students' cognitive skills, it is important to explicitly identify those skills and then design systems which encourage their development. To that effect, we have reviewed the various problem-solving methodologies to synthesize a common method which incorporates the essential features of classic problem-solving methods (Deek, 1997). Next, we identified the problem-solving tasks specific to program development and adapted the common problem-solving method to the area of program development (Deek, Turoff, & McHugh, 1999). Each stage of the common method was scrutinized for the appropriate cognitive skills (Bloom, 1956; Sternberg, 1985) it requires to define a Dual Common Model that takes into consideration the cognitive techniques needed at each step of the process. This Dual Common Model (called dual because it integrates problem-solving/program development tasks and the required cognitive activities) serves as the basis for the functional specification of the problem-solving and program development environment discussed here. This environment supports a six stage application of the problem-solving method to programming, starting with the initial activity of problem formulation, and continuing through solution planning, solution design, solution translation, solution testing and solution delivery (Deek & McHugh, 2002).
The empirical study described in this article sought to verify the claims made regarding the cognitive model of SOLVEIT and to investigate the impact resulting from using the SOLVEIT environment as a support tool for problem-solving and program development. The focus of the remaining sections of this article is on the methods, techniques, and results. This includes qualitative formative evaluation, summative evaluation, assessment and instruments, reliability and validity, and results and discussions.
QUALITATIVE METHOD FOR FORMATIVE EVALUATION
User participation and feedback were systematically used during system development. First, predeployment testing of system functionality and its readiness to be integrated into the classroom was performed. This began with an initial version of SOLVEIT and covered user testing and protocol analysis. Two distinct student populations were selected in order to obtain a broad variety of student feedback. One group was comprised of students in a summer program in computer science, designed for highly motivated high school students in grades 9-12. The other group was comprised of students in the computer science component of a summer enrichment program, designed for underprepared high school seniors accepted to attend the New Jersey Institute of Technology (NJIT) as freshmen. These students, as well as other students, are the system's expected future users when taking their first course in computer science.
SOLVEIT was installed and used by students in both summer programs to solve assigned programming projects. Sixty-two students participated in testing. Teaching assistants and tutors selected to work with students in the lab solicited and compiled user feedback. The results were used to enhance the system, correct bugs, and produce a new version prior to integration into the curriculum for the first semester of evaluation.
The system was tested once again before the subsequent semester with 16 first-year students, prior to their first course in computer science. The students solved a complete problem using a new version of SOLVEIT; the instructor and two teaching assistants solicited and compiled user feedback. The testing results, along with the results of a protocol analysis (discussed later), were used to produce a second version for use in the following semester evaluation.
The method used to assess usability and to understand how students formed their mental model of SOLVEIT was protocol analysis (Ericsson & Simon, 1993). This method involves asking potential users of the system to perform a predetermined task using the system and at the same time "think out loud" about what they are doing (Newell & Simon, 1972; Carroll & Thomas, 1982).
Two distinct student groups were once again selected in order to obtain diverse feedback on the user interface and the ease of use of system functionality. The students were asked to think out loud while performing an assigned task in SOLVEIT; the results were recorded and analyzed later. The first group of students consisted of two graduate students whose undergraduate degrees were not in computer science and who were taking an introductory course in program design and data structures. The second group was made up of three students randomly selected from the aforementioned 16-student group. Students participating in the protocol analysis gave a running commentary on what they were attempting to do with the software, what types of problems they encountered, and any other task-related thoughts. Each session was tape recorded and analyzed. Both groups encountered significant problems with the user interface of the information elicitation and the goal decomposition tools. Some minor problems with inconsistencies in the placement of help, next, and back buttons were also reported. Corrective actions were taken by redesigning the screen interface for both the information elicitation and goal decomposition tools. Also, the command buttons were reorganized on all screens. Problems with the goal decomposition tool were uncovered again during a subsequent informal protocol analysis session with the same graduate students, which led to more changes and simplification of the user interface. No other major problems were reported.
The principal method of evaluation was controlled experimentation designed to understand and thereby improve the learning and teaching of problem-solving and program development to novice students. An experiment to test hypotheses and investigate research questions was conducted over two semesters. Hypothesis testing and research questions for this study entailed analysis of students' performance on programming assignments, quizzes, exams, self-administered questionnaires and reports, and in-class student observation and monitoring. Additionally, students' performance in other courses during the same and subsequent semesters, as well as their overall subsequent performance, was examined.
The evaluation took place over two semesters. Three sections of the first course on problem-solving and programming taken by students majoring in computer science, information systems, mathematics, and physics at NJIT were used each semester. In the first semester, two sections were the control and one received the experimental treatment. In the second semester, one was the control and two were experimental. The reported results are based only on the performance of those students who completed the experiment.
There were 81 students (48 students in the control sections and 33 students in the experimental section) in the fall semester and 105 students (30 in the control section and 75 in the experimental section) in the spring. Two thirds of the students in each semester were computer and information science majors. Students filled out pretest questionnaires regarding their demographics, including: whether the students were undergraduates, full-time or part-time status, number of credits taken that semester, first or second-year, whether majoring in computer and information science, age bracket, sex, native language, and ethnic composition. They answered questions regarding their experiences with computers and programming, including: frequency of use of computers, prior programming experience, prior programming language, expected degree, and expected field of employment. The students also completed a self-assessment of their level of programming and problem-solving skills. Cross-tabulations and chi square tests were performed on the results of the course questionnaire items and students' SAT scores and the results of their placement tests. No significant difference was found in any of the categories regarding the students in the two groups.
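The chi square test on a cross-tabulation can be illustrated with a minimal sketch. The group sizes (48 control, 33 experimental) and the two-thirds overall share of computer and information science majors come from the fall semester figures above; the split of majors within each group is hypothetical, chosen only for illustration, and the function name is likewise an assumption rather than part of the study's tooling.

```python
def chi_square_statistic(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    `table` is a list of rows of observed counts; no continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of group and category.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Rows: control (48 students), experimental (33 students).
# Columns: CIS majors, other majors. The per-group split is HYPOTHETICAL.
observed = [[31, 17],
            [23, 10]]

statistic, df = chi_square_statistic(observed)

# Standard critical value of the chi-square distribution for df = 1 at alpha = 0.05.
CRITICAL_VALUE_DF1_05 = 3.841
significant = statistic > CRITICAL_VALUE_DF1_05
print(f"chi-square = {statistic:.3f}, df = {df}, significant at 0.05: {significant}")
```

With these illustrative counts the statistic falls well below the critical value, which is the pattern the study reports for every demographic category: no detectable difference between the groups.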
The experiment was a pre/post control group design covering two different methods of doing problem-solving and programming. The first method used a traditional programming environment (C++ compiler) and the second method used both SOLVEIT for problem-solving activities and the same programming environment. The traditional group was the control group and the SOLVEIT group was the experimental group. Both groups received the same instruction, assignments, quizzes, and exams.
The independent variable was the integration of the SOLVEIT environment into the course for students in the experimental group and the absence of SOLVEIT for students in the control group. The dependent variables fell into four categories. Each category consisted of a set of variables: (a) the problem-solving process, including problem formulation, planning, design, translation, and testing (the last two tested in terms of product); (b) the product, including quality, reliability, readability, and correctness; (c) academic effects, including overall academic performance, performance in comparable courses, performance in the next computer science course, and performance in a composition course; and (d) subjective responses, including perception, attitude, and motivation. Table 1 summarizes the independent and dependent variables of the hypotheses and research questions.
Instrumentation and Data Collection
Researchers analyzing students' problem-solving abilities have developed various assessment methods and evaluation instruments, with comparable objectives and performance (Hartman, 1996; Meier, 1992; Szetela, 1987). How to Evaluate Progress in Problem-solving (Charles, Lester, & O'Daffer, 1987) provided comprehensive guidelines for conducting evaluation strategies for problem-solving. Some of the techniques and instruments for measuring problem-solving progress provided in this book were adapted to fit the needs of this evaluation. New instruments for evaluating the process and the product of problem-solving and for students' self-assessment reports were devised. Each process and product instrument contained a description of possible outcomes, the indicators to be examined, and a scoring scale. An additional instrument for quiz and exercise problems was also devised. These instruments were initially used to grade students' work in a prior course on problem-solving and programming. Some problems were identified and corrected and the instruments were revised before they were used during the experiment in the following two semesters.
Evaluation instruments must be appropriate to an application for the results to be meaningful. All measurements are subject to fluctuations that influence their reliability and validity (Rosenthal & Rosnow, 1991). Reliability refers to the consistency of results obtained using a certain method; validity refers to the appropriateness of the interpretation made of such results (Gronlund, 1985). Both characteristics are essential for any instruments used for evaluation.
The results of performance-based evaluation should be viewed as a combination of the student's ability level and the method used. The results should reflect the student's ability, and an effort should be made to minimize errors (Moore, 1983). Various reliability tests can establish the consistency of results. One test, called inter-rater reliability or internal consistency (Rosenthal & Rosnow, 1991), is essential when performance is judged by an instrument. The reliability of scoring using the performance assessment instruments developed for this evaluation was established in two separate courses prior to this study, where the instruments were used to grade programming assignments and exams. All grading was done by course Teaching Assistants (TAs) who were not given any details of the evaluation plan, and who received training covering course content, teaching skills, grading criteria, and scoring instruments. The grading process was blind as to whether the student was in the experimental or control group. The instructor was in charge of the lecture and preparation of course material, programming assignments, quizzes, and exams, but did not participate in any judging of students' performance. Inter-rater reliability tests were performed on the grading. The correlation coefficients for the three graders for the course showed a high degree of agreement. For example, the reliability coefficients for the graders for the last programming assignment ranged from 0.82 to 0.95, as shown in Table 2.
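One way to compute pairwise agreement among three graders is sketched below. The text does not state which correlation coefficient was used, so the Pearson product-moment coefficient here is an assumption, as are the function names, the grader labels, and the scores, which are hypothetical and not taken from Table 2.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("score lists must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a grader gives constant scores")
    return covariance / sqrt(var_x * var_y)


def pairwise_reliability(scores_by_grader):
    """Return {(grader_a, grader_b): r} for every pair of graders."""
    return {
        (a, b): pearson(scores_by_grader[a], scores_by_grader[b])
        for a, b in combinations(sorted(scores_by_grader), 2)
    }


# HYPOTHETICAL scores given by three graders to the same six submissions.
scores = {
    "TA1": [85, 70, 92, 60, 78, 88],
    "TA2": [82, 74, 90, 65, 75, 91],
    "TA3": [88, 68, 95, 58, 80, 85],
}

for pair, r in pairwise_reliability(scores).items():
    print(pair, round(r, 2))
```

Three graders yield three pairwise coefficients, so a reported range such as 0.82 to 0.95 would correspond to the lowest and highest of the three.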
Various validity tests can be used to establish whether or not an instrument measures what it purports to (Moore, 1983). A nonnumeric measure called content validity (Moore, 1983) is appropriate in this case. A measurement has content validity if a group of experts in the field believe the measurement is actually relevant to the objectives of the study. This is in contrast, for example, to face validity where the measurement merely seems relevant to the human subjects of the study, who are of course not typically experts in the field. The experts who verify content validity, by presumption at least three, analyze both the separate elements of the measurement and the measurement as a whole, and also screen for omitted elements (Moore, 1983). Rosenthal and Rosnow (1991) recommended defining a measurement that will have content validity for a study by using a list of skills or materials the subjects must have mastered per the objectives of the study. These techniques were used to develop the performance assessment instruments for this study. Following Rosenthal and Rosnow (1991), a detailed breakdown of problem-solving and programming subtasks was produced and used as the basis for designing the performance assessment instruments. A description of possible outcomes for each subtask of the problem formulation, planning, design, implementation, and testing stages was developed. This was then correlated with indicators to be examined for outcome evaluation and a scoring scale was associated with each outcome. Following Moore (1983), a group of five experienced computer science teachers examined the instruments, analyzed them, and provided written feedback that was used to refine the instruments, which were subsequently screened for content validity once again by the group of experts.
The data were obtained from multiple sources including: (a) students' pre and posttest questionnaires, (b) student performance on programming assignments, quizzes, exams, (c) students' self-assessment reports, and (d) in-class student observation and monitoring. Five programming assignments, five quizzes, and two major exams for all students, in all sections of the course, were selected for evaluation. Table 3 summarizes the data collection for the hypotheses and research questions.
Hypotheses and Research Questions
Hypotheses and research questions were designed to assess whether the tools within the SOLVEIT environment aided students in determining a solution, producing better results, enhancing their perceptions, attitude, motivation, and the skills and knowledge necessary for problem-solving and program development, or affected academic performance in comparable and subsequent courses. The study examined measures of the four sets of variables in the following categories: process, product, academic, and subjective, as shown in Table 4 (detailed descriptions of each set of variables are included).
The SOLVEIT environment provided tools specifically intended to help the students perform problem formulation, planning, and design tasks. One could expect that these tools would directly affect students' work. These expectations were formulated as hypotheses. SOLVEIT did not contain any tools to promote better quality programming, test for reliability, correctness, and so forth. But since it was thought that the environment would possibly affect these characteristics, it was decided to also investigate such effects, which were formulated as research questions rather than hypotheses because of the more indirect nature of the expected effect. Similarly, changes in the subjective responses of the subjects, such as favorable perception, better attitudes, and motivation, were also formulated as research questions.
The hypotheses were designed to test the relationship between the various tools included in the system and students' performance. The major assumption of the hypotheses was that students using SOLVEIT will perform better on problem-solving and program development tasks than students not using the system. The research questions, on the other hand, were designed to examine unforeseen effects not directly related to the tools in SOLVEIT. Tables 5-8 describe the hypotheses and the research questions. Detailed discussion of hypotheses and research questions, the rationale for their selection, their measures and impact, and the procedure and instruments for their verification is provided later in this article. Measures refer to what should be evaluated to establish the hypotheses, such as how well a problem is formulated. Assessment refers to deliverables that should be examined to obtain a measure, such as the problem description, and mental and structured models of the formulated problem. Impact refers to the effect of the performance in this aspect on the process of problem-solving and programming or general academic accomplishment.
Problem-solving is a process, entailing problem formulation, planning, and design, whose output is a product, including coding, documentation, and results, which is performed by subjects, the novice learners of programming, who participate in an academic environment and exhibit subjective responses, such as satisfaction and commitment. Separate evaluation instruments were developed to assess students' performance in each of the process stages. Translation, testing, and delivery stages were evaluated as part of the product of problem-solving with separate instruments to assess student solution quality, reliability, readability, and correctness. Academic performance was evaluated using existing measures such as transcript grades. Finally, an evaluation instrument was developed to assess students' subjective responses.
Process hypotheses. Three hypotheses were designed to test the effects of SOLVEIT on problem formulation, planning, and design. The following presents details on these hypotheses and the instruments designed to assess students' work in these stages.
H 1 Problem Formulation. The process of problem-solving requires a variety of cognitive skills and begins with problem formulation (Polya, 1945; Rubinstein, 1975; Greeno, 1978), which requires understanding the question, the meaning of the problem's terminology, and the identification of its facts. Techniques such as verbalization create an initial understanding of the problem. For example, making a drawing, talking, or answering questions about the problem aid the task of problem understanding. A more precise model of the problem requires elicitation and organization of all relevant information and elimination of irrelevant information. SOLVEIT requires the student to describe the problem to be solved in written form. The student interacts with the system and answers questions about the problem, which prompts the student to think and construct interpretations about problem facts, conditions, and constraints, and facilitates problem understanding. Using the information elicitation tool of SOLVEIT, students gather information about the goal, givens, unknowns, conditions, and constraints of the problem, which enables the student to form a concise model of the problem.
Problem formulation was measured using the students' problem description, preliminary mental model, and structured problem representation. Outcomes of the problem formulation stage for both experimental and control groups for programming assignments, quizzes, and exams were assessed using the instrument in Table 9. On the basis of the given problem, the students were asked to write, in their own words, the statement of the problem, to answer questions about the problem, and to identify and organize the information needed to solve the problem. The assessment instrument used a scale of 0-4. The impact of students in the experimental group demonstrating superior problem definition and facts identification skills is that they will be able to construct a more precise problem model, leading to a better solution plan and improved correctness.
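A rubric of this kind can be represented as a small data structure, as in the sketch below. The three deliverables and the 0-4 scale are taken from the description above; averaging the three scores into one stage score is an assumption made for illustration, since the aggregation rule of the Table 9 instrument is not described here, and the function name is hypothetical.

```python
SCALE_MIN, SCALE_MAX = 0, 4

# Deliverables named in the text as the basis for measuring problem formulation.
DELIVERABLES = (
    "problem description",
    "preliminary mental model",
    "structured problem representation",
)


def formulation_score(scores):
    """Validate per-deliverable scores on the 0-4 scale and return their mean.

    Taking the mean is an ASSUMPTION; the actual instrument may combine
    indicators differently.
    """
    missing = [d for d in DELIVERABLES if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {missing}")
    for deliverable in DELIVERABLES:
        value = scores[deliverable]
        if not SCALE_MIN <= value <= SCALE_MAX:
            raise ValueError(f"{deliverable!r} score {value} is outside 0-4")
    return sum(scores[d] for d in DELIVERABLES) / len(DELIVERABLES)


# HYPOTHETICAL scores for one student's submission.
example = {
    "problem description": 4,
    "preliminary mental model": 3,
    "structured problem representation": 2,
}
print(formulation_score(example))
```

Validating the range before aggregating keeps a mis-keyed grade from silently distorting a group comparison.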
H 2 Solution Planning. Planning is the cognitive activity where the development of an appropriate solution strategy begins (Duncker, 1945; Newell & Simon, 1972; Wickelgren, 1974; Mayer, 1983). The student considers various alternatives to achieving the goal of the problem, subdivides the goal into subgoals, and identifies the tasks needed to accomplish each subgoal. The information identified in the previous stage is related to each subgoal, with its role and meaning defined. This begins the process of progressing toward each subgoal, eventually producing a complete solution. SOLVEIT requires the student to first outline the strategy to solve the problem, explicitly refine the goal into subgoals, define the tasks associated with each subgoal, and finally transform the information into a formal data representation.
Solution planning and goal decomposition were measured using the students' initial plan, the identified major components of the problem, and the structured data model. Outcomes of the planning stage of students in the experimental and control groups for programming assignments, quizzes, and exams were assessed using the instrument in Table 10. On the basis of the outcomes of the problem formulation stage, students were asked to generate alternative plans from which they would select a particular solution strategy, break the problem down into major components, and organize and associate facts with the major components. The impact of students in the experimental group demonstrating superior strategic planning, early decomposition, and data representation skills is that they will be able to produce a more carefully planned solution, improving solution design quality and correctness.
H 3 Solution Design. Design is the cognitive activity where students organize and refine the components of the solution strategy and define specifications to be translated into program code (Wirth, 1971, 1975; Dijkstra, 1976). There are two levels of design. The first is a high-level design where a structure for a solution is produced, typically in visual or outline form. This involves organizing and sequencing subgoals, determining whether subgoals require further refinement, establishing the relation among solution components, and associating data with specific subgoals. Subsequent detailed design transforms subgoals into algorithmic specifications, preparing the solution logic for translation. SOLVEIT supports a modular design methodology that allows the student to decompose and represent the problem in terms of smaller subproblems. Using a structured chart representation, the subproblems are presented visually as modules along with a data description table showing the data flow between the various modules. The algorithmic logic and module specification details are constructed within SOLVEIT.
Solution design and module decomposition were measured using the students' refinement, sequencing, and organization of subcomponents and the specification of functions and data interfaces, as well as the logic specification. Outcomes for both groups were assessed using the instrument in Table 11, which assesses students' solution design skills. On the basis of the outcome of the planning stage, students were asked to produce a well-organized, refined, and specified design, including charts and algorithmic specifications. The impact of students in the experimental group demonstrating superior refinement and specification skills is that they will be able to produce a more carefully designed solution, leading to better solution quality, program implementation, and correctness.
Product research questions. To complete the problem-solving and program development process, students translate a detailed design into a programming language, which is then followed by solution testing. The translation and testing stages are evaluated in terms of solution quality, reliability, readability, and correctness using four research questions that examine "product measures."
Research questions are more appropriate than hypotheses for these measures because the version of SOLVEIT tested only had tools for formulation and planning and simulated stubs for the design stage. There were no tools available for the later stages of translation, testing, and delivery. Unlike process measures where we claimed, in forming the hypotheses, that using the various tools of SOLVEIT will have positive cognitive benefits, with product measures we simply asked questions and searched for answers on whether such an advantage will translate into producing a better solution to a given problem. Any solution that correctly meets the requirements of the problem is considered a correct solution. The quality of solution, however, is a measure that extends beyond correctness and includes quality, reliability and readability.
RQ 1 Quality of Solutions. Quality in programming and problem-solving entails finding and implementing a well-suited solution for the problem. While SOLVEIT does not provide tools to facilitate the choice of data or algorithm structures, it is nonetheless reasonable to ask whether the systematic approach to problem-solving in SOLVEIT will lead a student to create higher quality solutions. Typically, there are multiple alternative choices for algorithm and data structures when solving a problem, some more appropriate and better than others. Choosing well-suited algorithms, data, and control structures for a specific problem situation leads to higher solution quality. Quality was measured on the basis of the students' choice of data structures, algorithms, control structures, and language constructs for sample programming assignments, quizzes, and exams, as assessed using the instrument in Table 12. On the basis of the problem requirements, students were expected to produce an effective problem solution by selecting suitable algorithms and constructs for the given problem. The answer to this research question will demonstrate whether the problem-solving skills developed by students in the experimental group impact solution quality, as demonstrated by the use of more appropriate algorithms, data, and control structures than those used by students in the control group.
RQ 2 Reliability of Solutions. Reliability refers to whether or not a program provides a complete and robust solution to a problem. As with program quality, no explicit tools were provided in the initial version of SOLVEIT to promote either solution reliability or to facilitate code testing. However, it was reasonable to ask whether the systematic approach fostered by SOLVEIT would lead to more reliable solutions. Ensuring reliability requires a programmer to verify that a program will function properly under a broad range of test cases, will check for valid input, and will anticipate and respond to invalid input. Reliability was measured on the basis of how well students tested their programs. They were expected to develop test data suited to verification of program reliability based on the problem requirements and the solution. Their efforts were assessed using the instrument in Table 13. The answer to this research question will demonstrate whether the problem-solving skills developed by students in the experimental group impact their development of more reliable programs which, when tested at the successive stages of program development, survive the exhaustive code testing necessary to verify program correctness more often than the programs of students in the control group.
RQ 3 Readability of Solutions. Solution readability refers to the clarity of the solution and code. Readability is a function of program documentation and programming style. Though not explicitly supported, it was expected that SOLVEIT would indirectly foster writing well-organized, readable programs. Program comprehension and modification are facilitated when comments and explanations are embedded within the code, explaining the approaches and techniques used to solve the problem. Maintenance would be difficult without such adequate documentation. Program style is additionally enhanced by establishing and adhering to coding conventions and guidelines that contribute to producing readable solutions. Readability was measured on the basis of the student's use of comments and explanations within the code describing the approach and technique used to solve the problem and provision of user documentation. Documentation and style of solution code of students in experimental and control groups for programming assignments, quizzes, and exams were assessed using the instrument in Table 14. Students were expected to produce programs that could be easily read and understood, with each module documented using comments to explain code, data definition statements, and a consistent programming style that enhances readability. The answer to this research question will demonstrate whether participation in the experimental group impacts the documentation skills and habits essential for writing understandable programs, as needed to support changing program logic or functionality, or reusing previously written code to solve new problems.
RQ 4 Correctness of Solutions. Correctness in problem-solving and programming refers to the accuracy of the solution specification, the fidelity of the program code to the preliminary solution design, and the correspondence between the program results and the problem requirements. Errors can be made at any stage of the process, so solution specification, program code, and the correspondence of program results with problem requirements must each be addressed. Correctness was measured on the basis of solution specifications, program code, and execution results for students in the experimental and control groups for programming assignments, quizzes, and exams, and assessed using the instrument in Table 15. On the basis of problem requirements, students were expected to produce solution specifications, program code, and program results satisfying the problem being solved. Each module's specification, code, and results were carefully examined to ensure the module reflected the solution plan and that the results reflected the problem requirements. The answer to this research question will show whether or not problem-solving skills developed by students in the experimental group will impact solution quality as demonstrated by their ability to produce more accurate solution specifications, code consistent with solution design, and correct results more often than students in the control group.
Academic performance evaluation. Four hypotheses were designed to test the effect of SOLVEIT on students' general academic performance as well as their performance in related scientific, computing, and writing courses. The following presents details on these hypotheses and the methods to assess students' work.
H 4 Overall Academic Performance. A key goal of education is to allow students to move from guided to independent learning, and be able to transfer knowledge and strategies from old to new problems (Greeno, Collins, & Resnick, 1996). The independent learner must demonstrate not only self-instruction and self-regulation of learning, but also retention and transfer of knowledge (Schoenfeld, 1992). The problem-solving and cognitive skills promoted by SOLVEIT foster independent learning and are required throughout the academic curriculum. Possible effects of SOLVEIT on students were measured using their overall academic performance, as assessed using the students' GPA in the same and subsequent two semesters. The impact is on overall academic performance.
H 5 Academic Performance in Comparable Course. Previous studies examined the question of knowledge retention and transfer from one area to another, with different studies pointing to diverse conclusions (Mayer, 1981; Perkins, Hancock, Hobbs, Martin, & Simmons, 1986; Salomon & Perkins, 1987; Martin & Hearne, 1990). The problem-solving and cognitive skills promoted by SOLVEIT are directly related to, indeed partly modeled on, the skills required in mathematics. The effect of SOLVEIT on student ability in mathematical problem-solving was measured using their academic performance in a corequisite math course, as assessed using the students' grade for the course. The impact is on academic performance in a related area.
H 6 Academic Performance in Subsequent CS Course. The transfer of knowledge and strategies (Greeno, Collins, & Resnick, 1996) within the same area of study is expected. The problem-solving and cognitive skills promoted by SOLVEIT are the same for the subsequent computer science course. The effect of SOLVEIT on student performance in the subsequent computer science course was measured using their academic performance in the course, as assessed using the students' grade for the course. The impact is on academic performance in computer science.
H 7 Writing Ability. The issues of knowledge retention and transfer from one area to another are relevant here as well. The problem-solving and cognitive skills addressed in SOLVEIT are quite relevant to the skills required in composition, with points of contact ranging from the formulation of a thesis (problem formulation) to structure decomposition of topics. The effect of SOLVEIT on student writing ability was measured using their academic performance in a required composition course, as assessed using the students' grade for the course. The impact is on academic performance in an apparently remote area.
Subjective research questions. In addition to performance-based measures of the process and product of problem-solving, the effect of the SOLVEIT environment on both student perception of their learning experience, and on student attitude and motivation, was also investigated. Two research questions were designed to test the effect of SOLVEIT on students' perception and attitude/motivation. Perception refers to students' feelings toward their learning environment and toward the methodology for learning problem-solving and programming. Attitude and motivation, on the other hand, refer to the commitment of students to the course, as evidenced in their attitude to learning and their motivation for achievement. Both measures were examined from the viewpoint of both the student and the teacher. Just as in the case of product measures, the research question posed was whether or not using SOLVEIT would have cognitive benefits that would translate into favorable perceptions, better attitudes, and increased motivation toward learning problem-solving and programming.
Two methods were used to evaluate these measures: (a) student self-reporting and (b) observations and monitoring of student performance. Self-reporting allowed students to participate in ongoing evaluation of their progress by providing information about their performance on homework problems, quizzes, or exams, and on difficulties they might be encountering. Self-reporting was restricted to a limited number of assignments, not used for course grading, and only anecdotal reports of results were to be collected. The course questionnaires, which were also self-reporting mechanisms, included questions related to both perception and attitude/motivation. The student activities observed and monitored included class attendance, the completion and quality of assigned work, and interest in course topics as determined by the course instructor. The following discusses these subjective measures in more detail, and describes the associated research questions.
RQ 5 Perception. Student perception about the learning environment is an important satisfaction measure (Gagne & Driscoll, 1988). Perception (Rokeach, 1972) is difficult to determine. The evaluation of perception depends on the students' recognition and communication of their beliefs and feelings. Students form perceptions based on their observation about particular situations or experiences (Solso, 1988). The research question investigated whether or not using SOLVEIT improved student confidence regarding learning, increased student satisfaction in their learning experience, or enhanced the relevance of learning the course goals.
Student satisfaction was measured on the basis of posttest questionnaires and on periodic student self-assessment reports on their experience with problem-solving and program development. These reports were examined to uncover successes, difficulties, or other relevant information. To assess student perception toward their learning experience, students were asked to reflect on specific problem-solving experiences and to evaluate their own performance through open-ended comments, which were to be written immediately after completion of a programming assignment or exam problem. The course questionnaires also included a series of questions pertaining to perception. Table 16 shows the instrument used by the students for their self-assessment. The answer to this research question demonstrates whether or not the SOLVEIT instructional environment affects the perceptions that students in the experimental group hold of problem-solving and programming methodology.
RQ 6 Attitude and Motivation. The research literature contains many definitions for attitude. Rokeach (1972) offered the following: "An attitude is a relatively enduring organization of beliefs around an object or situation predisposing one to respond in some preferential manner." Attitude is problematic to measure because it is difficult to determine what data to include or exclude as part of an attitude (Rokeach, 1972). Nonetheless, successful learning experiences are known to shape students' general attitudes toward learning, and motivation to achieve is reflected in performance in school (Gagne, 1985). In particular, the use of a specific teaching method or tool can have indirect positive effects on students' attitude (Mager, 1968; Papert, 1980) in addition to its direct cognitive benefits. Differences between the two groups may be observed in students' commitment to the course, their motivation to accomplish the required course tasks, their interest in the topic, as well as their feedback on the learning experience. The research question investigated whether there were differences between the two groups regarding students' attitude and motivation as a result of the use of SOLVEIT.
Students' commitment was measured using students' answers on posttest questionnaires, their course record for attendance in lecture and recitation-laboratory sessions, quality of course work, and timely submission of homework. An observational and monitoring record of students' commitment and performance was maintained in addition to the overall course grading database. This included information such as attendance, class participation, and meeting course due dates. The posttest questionnaires included a series of questions on attitude and motivation. This information was used to assess students' attitude and motivation toward problem-solving and programming. The answer to this research question will show whether or not the instructional context for students in the experimental group impacts on their attitude toward what they are learning and their motivation regarding their responsibilities to the course requirements.
RESULTS AND DISCUSSIONS
Seven hypotheses and six research questions were examined as previously described. To test these hypotheses and research questions, a spectrum of analysis of variance (ANOVA) tests and means comparisons for students' performance were performed. This section describes and discusses the results for the hypotheses and research questions given in Tables 5-8.
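To make the reported F values easier to interpret, the following is a minimal sketch of how a one-way ANOVA F statistic can be computed for an experimental and a control group. It is not the authors' analysis script: the function name `one_way_anova_f` and the score lists are hypothetical and are used only for illustration. With two groups, the F statistic equals the square of the two-sample t statistic computed with pooled variance.

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    Each argument is a sequence of scores for one group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical rubric scores on a 0-4 scale; NOT data from the study.
experimental = [4, 3, 4, 3, 4, 3]
control = [2, 1, 2, 3, 1, 2]

f_stat, df_b, df_w = one_way_anova_f(experimental, control)
print(f"F({df_b}, {df_w}) = {f_stat:.3f}")
```

The p value that accompanies each F value is the upper-tail probability of the F distribution with the same two degrees of freedom; a statistics package would normally supply it.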
Process Hypotheses Results
The following three hypotheses refer to the process variables defined in Table 1 and the hypotheses in Table 5. The experimental group's scores on the midterm and final exams showed statistically significant differences for hypotheses 1 to 3. In contrast, the scores on programming assignments and quizzes were statistically similar for both groups, which was expected, given the nature of the quizzes and the context in which the programming assignments were given. The quizzes were administered in class immediately following the introduction of a specific concept, and involved primarily ad hoc syntax questions with minimal problem-solving. Since both groups had the same level of exposure to the material when these quizzes were given, the performance was comparable, as expected. As to the programming assignments, although only the experimental students used SOLVEIT, all students were allowed to obtain various kinds of supplemental assistance, such as discussing the problem with others, using problems solved in class as guiding examples, and obtaining help from the instructor, TAs, and tutors in the school's learning center. Since every student had a comparable level of assistance, it was expected that the results on programming assignments would be comparable, as they were. The exams, on the other hand, were different from the quizzes and programming assignments in key respects. In contrast to the quizzes, the exams posed substantive questions on problem formulation, planning, and design, and unlike the programming assignments, students worked independently on the exam questions. Thus, the statistically significant differences in the performance of the experimental and control groups on the exams can be attributed to the sole independent variable; namely, whether or not the students had prior exposure to SOLVEIT.
H1: Strongly Supported. Using the instrument described in Table 9, the results of problem formulation questions on midterm and final exams for the two semesters revealed statistically significant differences between the two groups as follows: final 1 (F=15.652, p=.001), midterm 2 (F=13.596, p=.001), and final 2 (F=13.304, p=.001). Since the tool was available for only a week before midterm 1 was given, the results on the problem formulation question on midterm 1 were, as expected, not significant. The mean for final 1 was 3.42 for the experimental group and 1.74 for the control group out of a possible score of 4; for midterm 2 the corresponding results were 2.41 (experimental) and 1.20 (control); for final 2 the corresponding results were 2.37 (experimental) and 1.27 (control).
H2: Strongly Supported. The instrument described in Table 10 was used to evaluate the solution planning questions on midterm and final exams for the two semesters, with the following results: final 1 (F=9.056, p=.004), midterm 2 (F=2.002, p=.160), and final 2 (F=13.958, p=.001). The differences between the two groups were statistically significant on both final exams; the midterm 2 difference (p=.160) was not. The mean for final 1 was 2.58 for the experimental group and 1.48 for the control group out of a possible score of 4; for midterm 2 the corresponding results were 2.05 (experimental) and 1.57 (control); for final 2 the corresponding results were 2.25 (experimental) and 1.08 (control). Once again, the results on the solution planning question on midterm 1, given after a week's exposure to the tool, were not significant.
H3: Weakly Supported. Using the instrument in Table 11, the results of the solution design questions on midterm and final exams for the two semesters indicate only a slight difference between the two groups. The results for the first semester were as follows: final 1 (F=1.977, p=.164). The mean for final 1 was 2.71 for the experimental group and 1.69 for the control group out of a possible score of 4. The results on the solution design question on midterm 1, given after a week's exposure to the tool, were not significant. The results on the second semester exams were not statistically significant, although even in these cases the mean scores for the exam questions were higher for the experimental group. The means for midterm 2 were 1.87 (experimental) and 1.63 (control); the means for final 2 were 1.76 (experimental) and 1.57 (control). An explanation for the minimal differences between the two groups is that the solution design tools in the prototype of SOLVEIT used in this experiment were only simulated presentations with no actual functionality, as opposed to the fully functional problem formulation and planning tools.
Product Research Questions Results
The following four research questions refer to the product variables defined in Table 1 and the research questions defined in Table 6. The experimental group's scores on the midterm and final exams for the first semester showed statistically significant differences for research questions 1 to 3, but mixed results for research question 4. The results for semester two were not statistically significant, although the experimental means were generally higher than the control means on the research questions. The reason for these inconclusive results in the second semester appears to be that the section on both midterm and final exams of the second semester, which was designed to test these research questions, was excessively difficult. The scores for both groups on this section were extremely low. Indeed, the highest score on any of these questions in the second semester was lower than the lowest score on any of the corresponding sections for the previous semester. Despite this, the means of the experimental group on these questions generally exceeded the means of the control group. A more detailed explanation is given later. As with hypotheses 1 to 3, related scores on programming assignments and quizzes were statistically similar for both groups, as expected.
RQ1: Supported. Students' scores on specific midterm and final exam questions dealing with solution quality were graded using the instrument in Table 12. The first semester reveals statistically significant differences between the two groups as follows: midterm 1 (F=8.258, p=.005) and final 1 (F=7.779, p=.007). The means for midterm 1 were 7.28 (experimental) and 5.15 (control) out of a possible score of 8; the means for final 1 were 5.00 (experimental) and 2.38 (control). The results on the second semester exams were not statistically significant, though the mean score on midterm 2 was slightly higher for the experimental group. The means for midterm 2 were only 2.52 (experimental) and 2.43 (control) out of a possible score of 8. The means for final 2 were 1.71 (experimental) and 1.77 (control) out of a possible score of 8. As previously indicated, these second semester means were dramatically lower than the first semester means for all of research questions 1 to 4, in both groups. The same graders and grading procedures were used in the two semesters of the experiment; thus the authors believe the low scores for both groups indicate the related questions were too difficult, making it problematic to obtain statistically significant results.
RQ2: Supported. Students' scores on specific midterm and final exam questions dealing with solution reliability were graded using the instrument in Table 13. The first semester reveals statistically significant differences between the two groups as follows: midterm 1 (F=10.872, p=.001) and final 1 (F=7.453, p=.008). The means for midterm 1 were 6.03 (experimental) and 3.79 (control) out of a possible score of 8; the means for final 1 were 6.15 (experimental) and 3.67 (control). The results on the second semester exams were not statistically significant. Once again, as for RQ1, the scores on the second semester exams were dramatically lower. Indeed, the means for midterm 2 were only 2.09 (experimental) and 2.23 (control) out of a possible score of 8, while the means for final 2 were only 1.71 (experimental) and 1.65 (control) out of 8.
RQ3: Supported. Students' scores on specific midterm and final exam questions dealing with solution readability were graded using the instrument in Table 14. The first semester reveals statistically significant differences between the two groups as follows: midterm 1 (F=7.778, p=.007) and final 1 (F=11.430, p=.001). The means for midterm 1 were 4.53 (experimental) and 2.65 (control) out of a possible score of 8; the means for final 1 were 6.35 (experimental) and 3.81 (control). The results on the second semester exams were not statistically significant, although in these cases the mean scores for the exam questions were higher for the experimental group. Once again, the second semester scores were dramatically lower than those of the first semester. The means for midterm 2 were 1.11 (experimental) and 1.03 (control) out of a possible score of 8. The means for final 2 were 1.66 (experimental) and 1.54 (control) out of 8.
RQ4: Supported. Students' scores on specific midterm and final exam questions dealing with solution correctness were graded using the instrument in Table 15. The first semester results were mixed: the difference between the two groups was statistically significant on midterm 1 (F=14.025, p=.000) but not on final 1 (F=1.718, p=.195). The means for midterm 1 were 6.22 (experimental) and 3.29 (control) out of a possible score of 8; the means for final 1 were 3.23 (experimental) and 2.08 (control). The results on the second semester exams were not statistically significant, although the mean scores for the exam questions were once again higher for the experimental group. Just as for RQ1 to RQ3, the second semester scores were dramatically lower. The means for midterm 2 were only 1.91 (experimental) and 1.67 (control) out of a possible score of 8, while the means for final 2 were only 1.51 (experimental) and .85 (control) out of 8.
Academic Performance Hypotheses Results
The next four hypotheses refer to the academic performance evaluation variables defined in Table 1 and the hypotheses defined in Table 7.
H4: Not Supported. The students' GPAs over the semester of the experiment and the subsequent two semesters were used. They reveal no statistically significant differences between the two groups.
H5: Not Supported. The students' course grades in calculus I and II were used. They reveal no statistically significant differences between the two groups.
H6: Weakly Supported. The students' grades in the follow-up course in computer science, taken in the subsequent semester, were used. They do not reveal statistically significant differences between the two groups. Nonetheless, the means for the experimental group exceed the control means for both semesters. The means for the semester 1 group were 3.44 (experimental) and 2.71 (control) out of a possible score of 4; the means for the semester 2 group were 2.69 (experimental) and 1.75 (control).
H7: Weakly Supported. The students' course grades in English composition were used. Although they do not reveal statistically significant differences for either semester, the means for the experimental group exceeded the means for the control group in both experiments. The means for the first semester group were 2.98 (experimental) and 2.59 (control) out of a possible score of 4. The means for the second semester group were 2.70 (experimental) and 2.46 (control).
Subjective Research Questions Results
The following two research questions refer to the subjective variables defined in Table 1 and the research questions defined in Table 8. Subjective differences were examined based on students' self-reporting as well as by observation and monitoring of students' performance. For self-reporting, the results of a posttest questionnaire and periodic student self-assessment reports were examined. The posttest questionnaire contained questions that addressed perception and motivation; the periodic self-assessment reports addressed students' perceptions; and the observation and monitoring records were used to evaluate student motivation.
RQ5: Not Supported. All students who took the final exam completed a posttest questionnaire containing a sequence of questions which quantified their perception of their own problem-solving and programming ability, level of motivation, satisfaction with the course, and attitude. About half of the questions dealt with perception while the other half dealt with motivation. Students selected responses on a scale of 1-5. The results revealed no significant differences between the experimental and control groups for either semester. However, for the second semester, the means for the experimental group exceeded the means for the control group on every question related to motivation and perception.
The responses to the request for periodic self-reporting were minimal during both semesters. Students were asked to provide feedback on their performance on homework problems, exams, and any difficulties they might be encountering in the course, using the instrument in Table 16. However, these reports were not easily obtained. The self-assessment technique required a written response, which was viewed as burdensome by the students. The number and quality of responses were minimal, apparently because they did not contribute to the students' grades. The small sample size precluded performing any meaningful statistical analysis. The authors refer to Charles, Lester, and O'Daffer (1987), who described important factors that hinder students' self-reporting and that appear to apply directly in this case. Their work indicated that students may resent spending time on activities that are not graded and not directly related to the course work; students may also simply not remember all the important information about their experiences; while others may not possess the writing skills necessary for such a task. The last factor may also be relevant in this study. The introductory programming course is normally taken in the first year concurrently with English composition, with many of these students taking remedial composition. Indeed, the results of the pretest questionnaire indicate 52% of the students reported English was not their native language, a factor that may have been relevant to the low number of written responses.
RQ6: Not Supported. As indicated previously, about half the questions on the posttest questionnaires addressed motivation. However, aside from higher means for the experimental group during the second semester, the results revealed no significant differences between the experimental and control groups.
Instructors' observation and monitoring of students' performance included records of class attendance, produced work, and timely submission of homework which were used as indicators of students' commitment to the course. Other evidence of students' attitude and motivation toward learning problem-solving and programming was also investigated. For example, to examine students' commitment to the course, the authors used the dates on which the five unannounced quizzes were given as attendance indicators, and the rate of submission of the five programming assignments as motivation indicators. Records on late assignments were also kept. No significant difference was found in attendance rate or timely submission of assignments. However, the instructors' evaluation of the work quality of students in the experimental group was better, which was consistent with the results of the process hypotheses and product research questions.
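The commitment indicators described above reduce to simple per-student rates. The sketch below shows one way such rates could be computed; it is an illustration only, and the record layout (`StudentRecord`), the function name `commitment_indicators`, and the sample values are hypothetical rather than taken from the study's grading database.

```python
from dataclasses import dataclass

NUM_QUIZZES = 5      # five unannounced quizzes, used as attendance indicators
NUM_ASSIGNMENTS = 5  # five programming assignments, used as motivation indicators


@dataclass
class StudentRecord:
    """Hypothetical per-student monitoring record."""
    quizzes_taken: int          # quizzes the student was present for
    assignments_submitted: int  # assignments handed in, on time or late
    late_assignments: int       # assignments handed in after the due date


def commitment_indicators(rec: StudentRecord) -> tuple[float, float, float]:
    """Return (attendance rate, submission rate, on-time submission rate)."""
    attendance = rec.quizzes_taken / NUM_QUIZZES
    submission = rec.assignments_submitted / NUM_ASSIGNMENTS
    on_time = (rec.assignments_submitted - rec.late_assignments) / NUM_ASSIGNMENTS
    return attendance, submission, on_time


# Illustrative record, not data from the study.
example = StudentRecord(quizzes_taken=4, assignments_submitted=5, late_assignments=1)
attendance, submission, on_time = commitment_indicators(example)
print(attendance, submission, on_time)
```

Group-level comparisons would then be made on these per-student rates in the same way as for the exam scores.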
SOLVEIT is based on a cognitive model for problem-solving and program development. This article reports the results of a statistical analysis conducted to evaluate the effectiveness of the SOLVEIT environment. SOLVEIT integrated problem-solving, program development, and the corequisite cognitive foundation. The researchers hypothesized that the system would have positive benefits on the development of cognitive skills and abilities required for problem-solving and programming. Most previous research focused on developing such environments rather than evaluating their effectiveness. The current research implemented a cognitively oriented assessment method and related instruments to evaluate the process and product of problem-solving and program development as well as its subjective and academic effects, including a longitudinal analysis of the impact on related courses. Each cohort was tracked for two semesters beyond the initial treatment semester. The experiment included system testing by protocol analysis and examined the effect of the environment on students' problem-solving and program development skills. The results of the evaluation indicate that students in the experimental group acquired a significantly higher level of competence in both problem-solving and program development skills than the control group.
Table 1 Summary of Independent and Dependent Variables for Hypotheses and Research Questions
Category | No. | Independent Variable | Dependent Variable
Hypothesis (process) | 1 | SOLVEIT (presence or absence) | Problem Formulation
 | 2 | | Solution Planning
 | 3 | | Solution Design
Research Question (product) | 1 | SOLVEIT (presence or absence) | Quality
 | 2 | | Reliability
 | 3 | | Readability
 | 4 | | Correctness
Hypothesis (academic) | 4 | SOLVEIT (presence or absence) | Overall Academic
 | 5 | | Comparable Courses
 | 6 | | Next CS Course
 | 7 | | Composition Course
Research Question (subjective) | 5 | SOLVEIT (presence or absence) | Perception
 | 6 | | Attitude and Motivation

Table 2 Inter-Rater Reliability for the Three Graders
 | Grader 1 | Grader 2 | Grader 3
Grader 1 | 1.00 | 0.82 | 0.92
Grader 2 | 0.82 | 1.00 | 0.95
Grader 3 | 0.92 | 0.95 | 1.00

Table 3 Data Collection Summary
Category | Data Collection | Timing
Hypotheses (process) | Programming Assignments, Quizzes | Throughout the semester
 | Midterm Exam | Mid-semester
 | Final Exam | End-of-semester
Research Questions (product) | Programming Assignments, Quizzes | Throughout the semester
 | Midterm Exam | Mid-semester
 | Final Exam | End-of-semester
Hypotheses (academic) | Transcript Grades | Same/Subsequent Semester
Research Questions (subjective) | Self-reports | Periodic
 | Observation and Monitoring |
 | Pre/post Questionnaires | Beginning/end-of-semester

Table 4 Definitions of Dependent Variables' Categories
Category | Definition
Process | Problem solving and program development method and cognitive skills required to produce solution.
Product | Solution as a product of problem solving process.
Academic | Performance in related or subsequent academic subjects.
Subjective | Perception, attitude, and motivation toward problem solving and programming.

Table 5 Process Hypotheses
Hypotheses (process) | Measures | Assessment | Impact
H1. Students in the experimental group will show superior problem understanding as demonstrated by their ability to clearly and correctly state problems and extract problems' facts better than students in the control group. | Problem Formulation | Problem Description, Preliminary Mental Model, and Structured Problem Representation | Problem Model, Solution Plan, and Correctness
H2. Students in the experimental group will show superior planning skills as demonstrated by their ability to provide detailed and clear plans, complete goal refinements and representation of facts better than students in the control group. | Solution Planning | Strategy Discovery, Goal Decomposition, and Data Modeling | Solution Plan, Solution Design, and Correctness
H3. Students in the experimental group will show superior design skills as demonstrated by their ability to refine, sequence, and organize solution components as well as specify data and algorithmic logic better than students in the control group. | Solution Design | Organization and Refinement, Function/Data Specification, and Logic Specification | Solution Design, Solution Translation, Quality, and Correctness

Table 6 Product Research Questions
Research Question (product) | Measures | Assessment | Impact
RQ 1. Will students in the experimental group produce higher quality programs compared to students in the control group? | Program Quality | Algorithms, Data/Control Structures, and Language Constructs | Solution Quality
RQ 2. Will students in the experimental group produce more complete and robust programs compared to students in the control group? | Code Reliability | Test Cases/Results | Solution Reliability and Correctness
RQ 3. Will students in the experimental group produce clearer and more understandable solutions compared to students in the control group? | Code Readability | Commented Code and User Documentation | Program Reuse, Maintenance, and Modification
RQ 4. Will students in the experimental group produce more accurate solution specification and program code compared to students in the control group? | Solution Correctness | Solution Specification, Program Code, and Test Results | Solution Quality

Table 7 Academic Hypotheses
Hypotheses (academic) | Measures | Assessment | Impact
H4. Students in the experimental group will show overall improved academic performance compared to students in the control group. | Overall Academic Performance | GPA | Improved Overall Academic Accomplishment
H5. Students in the experimental group will show improved academic performance in courses with comparable academic goals compared to students in the control group. | Academic Performance in Comparable Course | Co-requisite Math Course Grade | Improved Academic Accomplishment in Related Area
H6. Students in the experimental group will show improved academic performance in the subsequent course in computer science. | Academic Performance in Subsequent CS Course | Subsequent CS Course Grade | Improved Academic Accomplishment in CS
H7. Students in the experimental group will show improved academic performance in composition and writing. | Writing Ability | Composition Course Grade | Improved Academic Accomplishment in Composition

Table 8 Subjective Research Questions
Research Question (subjective) | Measures | Assessment | Impact
RQ 5. Will students in the experimental group have a more favorable perception of their learning experience compared to students in the control group? | Students' Satisfaction | Students' Feedback and Questionnaires | Students' Perception of Problem Solving and Programming
RQ 6. Will students in the experimental group exhibit better attitude and increased motivation toward learning problem solving and programming compared to students in the control group? | Students' Commitment | Students' Records, Feedback and Questionnaires | Students' Attitude and Motivation

Table 9 Instrument for Assessing Students' Problem Formulation Skills
The Process - Formulating the Problem
Outcome | Indicator | Scoring Scale
Excellent representation of problem and complete identification of relevant facts, indicating full understanding, required to solve the problem. | Problem is clearly and correctly stated. All goals, givens, and unknowns are identified. | 4
Reasonable representation of problem and identification of almost all relevant facts, indicating adequate understanding, required to solve the problem. | Problem is correctly stated. Most goals, givens, and unknowns are identified. | 3
Incomplete representation of problem and/or identification of facts, indicating some understanding, but not enough to solve the problem. | Problem is partially stated and/or some facts are identified. | 2
Inappropriate representation of problem and inability to identify relevant facts, indicating complete misunderstanding, required to solve the problem. | Problem statement is incorrect and meaningless facts are identified. | 1
Lack of problem representation and identification of relevant facts, indicating complete misunderstanding, required to solve the problem. | No problem representation/fact identification attempted or completely irrelevant work. | 0

Table 10 Instrument for Assessing Students' Planning Skills
The Process - Planning the Solution
Outcome | Indicator | Scoring Scale
Excellent planning strategy and refinement of goals that will lead to a correct solution for the problem. | Detailed and clear planning. Complete goal refinement, task identification, and data representation. | 4
Reasonable planning strategy and refinement of goals that could lead to a correct solution for the problem. | Adequate planning. Sufficient goal refinement, task identification, and data representation. | 3
Incomplete planning strategy and/or some evidence of goal refinement, but not enough to solve the problem. | Partially correct planning and/or some goal refinement, task identification, and data representation. | 2
Inappropriate planning strategy and complete lack of adequate goal refinement necessary to solve the problem. | Incorrect planning and meaningless goal refinement. | 1
Lack of planning and refinement necessary to solve the problem. | No planning/refinement attempted or completely irrelevant work. | 0

Table 11 Instrument for Assessing Students' Design Skills
The Process - Designing the Solution
Outcome | Indicator | Scoring Scale
Excellent design strategy and module specifications that will lead to a good quality solution for problem. | Complete module decomposition, organization, and detailed specifications. | 4
Reasonable design strategy and module specifications that could lead to a solution for the problem. | Sufficient module decomposition, organization, and sufficient specifications. | 3
Incomplete design strategy and/or evidence of module specification, but not enough to solve the problem. | Partial design and/or some module specifications. | 2
Inappropriate design strategy and specifications that will not lead to a solution for the problem. | Improper module decomposition, organization, and specification. | 1
Lack of design and specifications necessary to solve the problem. | No design/specifications attempted or completely irrelevant work. | 0

Table 12 Instrument for Assessing Solution Quality
The Product - Solution Quality
Outcome | Indicator | Scoring Scale
Well suited solution is produced. | Most appropriate algorithms, data structures, control structures, and language constructs for this problem situation are chosen. | 2
Minimally acceptable solution is produced. | Program accomplishes its task, but lacks coherence in choice of either data and/or control structures. | 1
Unacceptable solution Program solution lacks coherence in 0 quality is produced. choice of both data and control structures. Table 13 Instrument for Assessing Solution Reliability The Product - Solution Reliability Outcome Indicator Scoring Scale Robust solutions Program functions properly under 2 is produced. all test cases. Works for all valid input, and responds to all invalid input. Minimum requirements Program functions under limited 1 solution is produced. test cases or works only for valid input and fails to respond to invalid input. Unacceptable solution Program fails under most test 0 quality is produced. cases. Table 14 Instrument for Assessing Solution Readability The Product - Solution Readability Outcome Indicator Scoring Scale Clear and understandable Program includes commented code, 2 solution is produced. meaningful identifers, indentation to clarify logical structure, and user instructions. Minimally documented Program lacks clear documentation 1 solution is produced. and/or user instructions. Unacceptable solution Program is totally incoherent. 0 quality is produced. Table 15 Instrument for Assessing Solution Correctness The Product - Solution Correctness Outcome Indicator Scoring Scale Appropriate solution Correct solution specifications, 2 in produced. program code and results consistent with problem requirements. Incomplete solution Partial solution 1 in produced. specifications/program code and/or some results. No solution or totally No solution specifications/ 0 inappropriate solution is program code, or results produced. inconsistent with problem requirement. Table 16 Instrument for Students' Self-Assessment Reports Self-assessment Report Completing this report provides the instructor and the TA with a retrospective feedback on your success, difficulties, feelings or anything else you wish to comment regarding the problem you have just solved. This enables you to communicate your thoughts to us throughout the semester. 
Please be as candid and informal as you wish. Answers can be as short (or as long) as you feel is necessary (use the back side of this form or attach additional sheets). The following is intended to give you some direction to the report: 1. When I first saw the problem ... 2. I formed the solution by ... 3. This problem solving experience ... 4. Anything else?
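As an illustration only, the scoring scales of Tables 9 through 15 can be encoded so that a grader's entries are checked against the allowed range of each instrument. The sketch below is not part of the study's instruments: the dictionary keys, the function names, and the idea of summing scores into process and product subtotals are all assumptions made here for the example. Only the scale bounds (0 to 4 for the process instruments, 0 to 2 for the product instruments) come from the tables.

```python
# Maximum score per instrument, read off Tables 9-15. The short key names
# are hypothetical labels chosen for this sketch.
PROCESS = {"formulation": 4, "planning": 4, "design": 4}      # Tables 9-11
PRODUCT = {"quality": 2, "reliability": 2,
           "readability": 2, "correctness": 2}                # Tables 12-15
MAX_SCORE = {**PROCESS, **PRODUCT}


def validate_scores(scores):
    """Return a copy of scores if every instrument has an in-range integer."""
    missing = sorted(set(MAX_SCORE) - set(scores))
    unknown = sorted(set(scores) - set(MAX_SCORE))
    if missing or unknown:
        raise ValueError(f"missing={missing}, unknown={unknown}")
    for name, value in scores.items():
        is_int = isinstance(value, int) and not isinstance(value, bool)
        if not is_int or not 0 <= value <= MAX_SCORE[name]:
            raise ValueError(f"{name}: {value!r} is outside 0..{MAX_SCORE[name]}")
    return dict(scores)


def subtotals(scores):
    """Sum process and product scores separately.

    Assumption: the source tables define per-instrument scales only and do
    not state that scores are added together.
    """
    checked = validate_scores(scores)
    return {
        "process": sum(checked[name] for name in PROCESS),
        "product": sum(checked[name] for name in PRODUCT),
    }


if __name__ == "__main__":
    example = {"formulation": 4, "planning": 3, "design": 2,
               "quality": 2, "reliability": 1,
               "readability": 2, "correctness": 1}
    print(subtotals(example))  # {'process': 9, 'product': 6}
```

Keeping the process and product groups separate mirrors the split in the tables between "The Process" and "The Product"; any weighting or combined total would be a further design decision not described in the source.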
References

Bloom, B.S. (Ed.). (1956). Taxonomy of educational objectives, Handbook I: Cognitive domain. New York: McKay.
Carroll, J.M., & Thomas, J.C. (1982). Metaphor and the cognitive representation of computing systems. IEEE Transactions on Systems, Man, and Cybernetics, 12(2), 107-115.
Charles, R., Lester, F., & O'Daffer, P. (1987). How to evaluate progress in problem-solving. Reston, VA: National Council of Teachers of Mathematics.
Deek, F.P., & McHugh, J. (2002). SOLVEIT: An experimental environment for problem-solving and program development. To appear in Journal of Applied Systems Studies, 2(2).
Deek, F.P. (1997). An integrated environment for problem-solving and program development. Unpublished doctoral dissertation, New Jersey Institute of Technology.
Deek, F.P., Turoff, M., & McHugh, J. (1999). A common model for problem solving and program development. IEEE Transactions on Education, 42(4), 331-336.
Dijkstra, E. (1976). A discipline of programming. Englewood Cliffs, NJ: Prentice Hall.
Duncker, K. (1945). On problem-solving. Psychological Monographs, 58(5), Whole no. 270.
Ericsson, K.A., & Simon, H.A. (1993). Protocol analysis: Verbal reports as data. Cambridge, MA: MIT Press.
Gagne, R.M. (1985). The conditions of learning (4th ed.). New York: Holt, Rinehart, and Winston.
Gagne, R.M., & Driscoll, M.P. (1988). Essentials of learning for instruction. Englewood Cliffs, NJ: Prentice Hall.
Greeno, J.G. (1978). Natures of problem-solving abilities. In W.K. Estes (Ed.), Handbook of learning and cognitive processes (Vol. 5, pp. 239-270). Hillsdale, NJ: Lawrence Erlbaum.
Greeno, J.G., Collins, A.M., & Resnick, L.B. (1996). Cognition and learning. In D.C. Berliner & R.C. Calfee (Eds.), Handbook of educational psychology (pp. 15-46). New York: Macmillan.
Gronlund, N.E. (1985). Measurement and evaluation in teaching (5th ed.). New York: Macmillan Publishing.
Hartman, H. (1996). Intelligent tutoring (Preliminary ed.). Clearwater, FL: H&H Publishing.
Mager, R.F. (1968). Developing attitude toward learning. Belmont, CA: Fearon.
Martin, B., & Hearne, J.D. (1990). Transfer of learning and computer programming. Educational Technology, 30(1), 41-44.
Mayer, R.E. (1981, March). The psychology of how novices learn computer programming. ACM Computing Surveys, 13(1), 121-141.
Mayer, R.E. (1983). Thinking, problem-solving, cognition. New York: W.H. Freeman.
Meier, S.L. (1992). Evaluating problem-solving processes. Mathematics Teacher, 85(8), 664-666.
Moore, G.W. (1983). Developing and evaluating educational research. Boston, MA: Little, Brown and Company.
Newell, A., & Simon, H.A. (1972). Human problem-solving. Englewood Cliffs, NJ: Prentice Hall.
Papert, S. (1980). Mindstorms: Children, computers and powerful ideas. New York: Basic Books.
Perkins, D.N., Hancock, C., Hobbs, R., Martin, F., & Simmons, R. (1986). Conditions of learning in novice programmers. Journal of Educational Computing Research, 2(1), 37-56.
Polya, G. (1945). How to solve it. Princeton, NJ: Princeton University Press.
Rokeach, M. (1972). Beliefs, attitudes, and values. London: Jossey-Bass.
Rosenthal, R., & Rosnow, R. (1991). Essentials of behavioral research: Methods and data analysis (2nd ed.). New York: McGraw-Hill.
Rubinstein, M. (1975). Patterns of problem-solving. Englewood Cliffs, NJ: Prentice Hall.
Salomon, G., & Perkins, D. (1987). Transfer of cognitive skills from programming: When and how? Journal of Educational Computing Research, 3(2), 149-169.
Schoenfeld, A.H. (1992). Learning to think mathematically: Problem-solving, metacognition, and sense making in mathematics. In D. Grouws (Ed.), Handbook for research on mathematics teaching and learning. New York: Macmillan.
Shneiderman, B. (1980). Software psychology: Human factors in computer and information systems. Boston: Little, Brown and Company.
Solso, R.L. (1988). Cognitive psychology (2nd ed.). Boston: Allyn and Bacon.
Sternberg, R.J. (1985). Beyond IQ: A triarchic theory of human intelligence. Cambridge, MA: Cambridge University Press.
Szetela, W. (1987). The problem of evaluation in problem-solving: Can we find solutions? Arithmetic Teacher, 35, 36-41.
Wickelgren, W.A. (1974). How to solve problems. San Francisco: W.H. Freeman and Company.
Wirth, N. (1971). Program development by stepwise refinement. Communications of the ACM, 14(4), 221-227.
Wirth, N. (1975). Algorithms + data structures = programs. Englewood Cliffs, NJ: Prentice Hall.