
Mining student data captured from a web-based tutoring tool: initial exploration and results.

In this article we describe the initial investigations that we have conducted on student data collected from a web-based tutoring tool. We have used data mining techniques such as association rules and symbolic data analysis, as well as traditional SQL queries, to gain further insight into the students' learning and to deduce information for improving teaching. In our work, applying data mining serves two purposes: (a) to better understand how students come to grips with the tool and assimilate the knowledge they need to learn, and (b) to extract pedagogically relevant information that may influence or help improve teaching.


With the emergence of e-learning, flexible education, and the increasing number of students in some fields, online teaching tools are becoming more and more important. They provide a more or less personalised environment where learners can learn at their own pace, access tutorial lessons, practice exercises, and receive explanations and feedback on their performance. These benefits to the learners are extremely valuable, and we are witnessing "a quiet revolution taking place in the classrooms" (Forster, 2002). However, less attention has been given to the reflection and monitoring that could improve the teaching itself. Since online teaching tools are computer-based, they can store complete student answers, including mistakes made while solving exercises. Because they are online, all this information, for all students using the tool, can be stored on a common server rather than locally. Having electronic access to complete student answers makes it possible to extract pedagogically relevant information and to give the teacher feedback on how a class, a group of students, or an individual student is going. It also makes it possible to gain more insight into how students get along with the tool and the content.

The Logic-ITA is a web-based Intelligent Teaching Assistant system that is currently used within the School of Information Technologies, University of Sydney, for an undergraduate course on formal languages and logic. Its aim is to facilitate the whole teaching and learning process by helping the teacher as well as the learner. It allows students to practice formal proofs in propositional logic while receiving feedback, and it keeps the lecturer informed about the progress the class is making and the problems encountered. The system embeds the Logic Tutor, a web-based intelligent tutoring system intended for the students that stores their complete work, including mistakes, along with tools dedicated to the teacher for managing teaching configuration settings and material, as well as for collecting and analysing data. A multimedia article on the Logic Tutor can be found in Abraham, Crawford, Lesta, Merceron, and Yacef (2001) and a description of the Logic-ITA in Lesta and Yacef (2002). We are now extending the capabilities of the system to provide more information and more intelligent help to the teacher. First results in this direction can be found in Merceron and Yacef (2003).

In this article we investigate the impact that the analysis of the data collected in such a tool can have on the whole process of teaching and learning. We use data mining techniques on the data stored by the Logic-ITA to better investigate the impact on learning and to improve teaching. More precisely, symbolic data analysis allows us to gain further insight into the students' learning, while association rules of items, which we apply to mistakes, open new perspectives for improving teaching. With e-learning, complete student answers will be increasingly available in electronic format. This work shows some possibilities of what can be done with them.

There is an increasing interest in providing assistance to the human teacher and in integrating him or her formally into the loop (Jean, 2000; Kinshuk, Patel, Oppermann, & Russell, 2001; Kinshuk, Hong, & Patel, 2001; Leroux, Vivet, & Brezillon, 1996; Virvou & Moundridou, 2001; Vivet, 1992; Yacef, 2002), and this is supported by the combination of computational intelligence with web-based education (Calvo & Grandbastien, 2003; Vasilakos, Devedzic, Kinshuk, & Pedrycz, 2004; Yacef, 2003). However, much less work has been done on helping with the diagnosis and assessment of learning, and with the analysis and synthesis of results. Implicative statistical analysis (Gras et al., 1996; Gras, Briand, Peter, & Philippe, 1997) has been developed to extract information from data gathered among students. It is supported by the C.H.I.C. software. C.H.I.C. accepts standard data, where students are described in a homogeneous way. This is not exactly what happens with the data we get from the Logic Tutor, since students do not necessarily attempt the same exercises, nor the same number of exercises. Jean's (2000) PepiDiag and PepiProfil system, like the Logic-ITA, collects data from students' exercises, reports them to the teacher, and provides tools to analyse these results. One main difference resides in the fact that it processes one student's data at a time, whereas the Logic-ITA combines data from all students. The tool OASIS (Smail & Hussmann, 2003) bears similarities to the Logic-ITA in the sense that it stores complete answers of students, including wrong answers, and provides extensive statistics. However, it does not have a tutoring facility and provides only a yes/no answer to students when they enter a result for an exercise. No mistake diagnosis is provided.

In this article we describe the investigation that we have conducted on the data collected from such a system, how this data is used to gain further insight into the students' learning, and how it can impact the teaching. The article is organized as follows. First we present the student data manipulated and stored in the Logic-ITA, both on the student's side, the Logic Tutor, which allows students to practice logical proofs, and on the teacher's side, which structures all the answers entered by all students for the teacher. Then we explain the impact of mining the data in the Logic-ITA from a learning aspect, looking at the correlation between exam performance and Logic Tutor activity, as well as using symbolic data analysis. Next we describe the impact of mining the data in the Logic-ITA from a teaching perspective, extracting pedagogically relevant information through SQL queries and associations of mistakes. We then conclude the article.


From the Student's Side: The Logic Tutor

The Logic Tutor is an online Intelligent Tutoring System (Abraham et al., 2001) allowing students to build formal proofs in propositional logic while receiving step-by-step, contextualised feedback. It uses a conventional interface allowing forward and sequential construction of proofs, as opposed to other computerised educational systems for this domain such as Croy (1989, 1999) and Scheines and Sieg (1994). We make no particular claim for this style of interface; we simply kept the same interface as the one used in previous years without the Logic Tutor. Our aim was to design an Intelligent Teaching Assistant system and then evaluate its usefulness using the previous year as a control group, and it would have been more difficult to interpret results if we had also changed the style of the interface.

We will describe the relevant data that the Logic Tutor stored in each student model in the context of an exercise, since this is the input data to the mining methods that we will describe in the following sections.

Exercises start with a given set of premises, that is, a set of well-formed formulae (wff) of propositional logic, and exactly one wff, the conclusion. The task then consists of deriving the conclusion from the premises, step-by-step, using laws of equivalence and rules of inference (we will refer to both of these as rules for the rest of this article). Figure 1 shows a screen shot of the interface. Here the student was given the first two lines (lines 0 and 1) and the conclusion at the bottom left corner, that is "C." For each step, the student must fill out a new line, entered at the bottom of the screen. The student needs to do the following:

1. enter a formula in the Formula section;

2. choose, from a pop-up menu, the rule used to derive this formula from one or more previous line(s) (Rules);

3. enter the references of those previous lines (Line References); and

4. enter the premises the formula relies on (Premises).

For example, in Figure 1 the student is currently deriving the formula "C," applying the rule "Indirect Proof" to the formulae of lines 2 and 7. Because lines 2 and 7 rely respectively on premises {2} and {0,1,2} (as can be seen in the first column of the screen) and Indirect Proof removes the premise 2, the line entered relies on premises {0,1}. This is actually the last step of the exercise, deriving the conclusion.

There are often many ways to prove an argument valid. The important aspect is that the reasoning must be sound; the actual path followed is not important, as long as each step is valid. In this regard, our approach is less sophisticated than one such as Model-Tracing (Anderson, Corbett, Koedinger, & Pelletier, 1995), which identifies the student's reasoning by checking his/her answers against predetermined solutions. The Logic Tutor instead assesses the validity of each step on the fly, but not its appropriateness. During the exercise, the system only assesses whether the line entered by the student is logically valid and whether or not the conclusion has been reached, but it does not know how far away the conclusion is, nor whether the step entered moves towards it. Hence students have total freedom in the reasoning they choose to follow. However, once the exercise is finished, the system can then evaluate whether all the steps were actually useful. This has a consequence for the evaluation of the student: the performance of a student is calculated in terms of whether or not the conclusion was finally reached, whether or not mistakes were made along the way, and whether or not useless steps were entered (Table 1).
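The performance calculation just described can be sketched as a small decision function over the three criteria. This is a hedged reconstruction, not the actual Logic Tutor code: the exact codes are defined in Table 1, and the meaning assigned here to performance 2 (completed with useless steps) is our assumption.

```python
# Hedged sketch of a performance code built from the three criteria above:
# conclusion reached, mistakes made, useless steps entered. The actual
# codes are defined in Table 1; performance 2's meaning is an assumption.

def performance(reached, n_mistakes, n_useless, n_lines):
    if n_lines == 0:
        return 0                       # looked at the exercise, entered nothing
    if reached:
        if n_mistakes == 0 and n_useless == 0:
            return 1                   # completed cleanly
        if n_mistakes == 0:
            return 2                   # completed, with useless steps (assumed)
        return 3                       # completed after correcting mistakes
    return 4 if n_mistakes == 0 else 5  # gave up without / after mistakes
```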


Handling of mistakes. After each line is entered, the system first checks its syntactical validity (right type of data, syntax of the formula), then its logical validity. For the latter, the formula, the rule, the references, and the premises entered must all be consistent. If they are not, the system checks whether altering just one of them could give a valid result. If so, the system uses this substitution to provide hints to the student. For example, suppose the line entered is invalid but becomes valid when the rule is altered from Modus Ponens to Modus Tollens; the hint would then be to try applying Modus Tollens. If no such substitution exists, the system looks up its database of common mistakes and tries to match the line with one of them. As common mistakes come with remedial hints, the student receives the corresponding feedback. For example, a common mistake is to apply Simplification directly to (B & C) to derive C, without first applying Commutation. This common mistake has an associated feedback, and the student reads "This was an invalid application of Simplification (Simp). Applying Simplification to (B & C) only lets you deduce the left hand side: B. Try to use Commutation first."

This means that there are two important aspects in a mistake: its type (for example Wrong reference lines) and the rule involved (for example Modus Ponens).
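The two-stage diagnosis described above (single-component substitution first, then a lookup in the common-mistakes database) could be sketched as follows. All names here (diagnose, is_valid_step, COMMON_MISTAKES) are hypothetical illustrations, not the actual Logic Tutor API.

```python
# Illustrative sketch of the two-stage diagnosis described above. All
# names are hypothetical, not the actual Logic Tutor implementation.

RULES = ["Modus Ponens", "Modus Tollens", "Simp", "Com"]

# Toy common-mistakes database: (matcher, remedial hint).
COMMON_MISTAKES = [
    (lambda line: line["rule"] == "Simp" and line.get("took_right_side"),
     "Applying Simplification to (B & C) only lets you deduce the left "
     "hand side: B. Try to use Commutation first."),
]

def diagnose(line, is_valid_step):
    """Feedback for an invalid line; is_valid_step checks logical validity."""
    # Stage 1: would altering just the rule give a valid result?
    for rule in RULES:
        if rule != line["rule"] and is_valid_step(dict(line, rule=rule)):
            return f"Hint: try applying {rule} instead of {line['rule']}."
    # Stage 2: match the line against the common-mistakes database.
    for matches, hint in COMMON_MISTAKES:
        if matches(line):
            return hint
    return "This step is not logically valid."
```

In the real system the substitution check also covers the formula, the references, and the premises, not only the rule.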

All exercise steps and mistakes are stored in the current student model. In the example shown in Figure 1, the student model would contain the whole exercise with all its steps, including the erroneous steps that can no longer be seen on the screen, together with the type of mistake and the rule involved. At the end of the exercise, a performance is also calculated and stored. There are six possible performances, as shown in Table 1. Then, according to the level of difficulty of the exercise (which depends on the variety and the complexity of the rules needed to solve it, as well as the number of steps required to reach the conclusion), the student's level may be updated. For example, if the student has completed a sufficient number of exercises at the current level and successfully used the rules that correspond to that level, then s/he will be allowed to move on to the next level (for more details, see Lesta & Yacef, 2002).

The student model data is used in the Logic Tutor for adapting the exercise content to the student. It can also be browsed by the student at any time for reflection purposes.

From the Teacher's Side: The LT-Analyser

The Logic-ITA contains two tools dedicated to the teacher. The LT-Configurator, which we will not describe here, has authoring features: it allows the teacher to set up exercises and student levels, progression criteria through the curriculum, and so on. The second one is the LT-Analyser, which collates all the student information in a database that the teacher can query and visualise graphically.

The student models, all centralized in one place on the server, contain the history of all exercises attempted and mistakes made. The LT-Analyser regularly scans all the student models and builds a database collating all that information. The various tables in this database are summarised in Table 2.

The database is in Microsoft Access and is connected to Microsoft Excel. The teacher can then query the database with either piece of software and visualise the results as graphs in MS Excel. The aim of the LT-Analyser is to provide information to the teacher about the class, so that s/he can adapt his/her teaching accordingly, or be aware of individuals needing assistance. This is especially relevant for large classes and for distance teaching. We will show in the section Simple Queries to Get a First Impression of the Class how teachers can use this tool.


Analysis of Marks

As explained earlier, the Logic-ITA is currently used in an undergraduate course at the School of Information Technologies, University of Sydney. The course is on theoretical aspects of computer science, and the student population for this particular course is mainly composed of computer science and engineering students. Within the normal assessment of the course, we used only the assessment items concerned with logic formal proofs. We gave similar homework in 2000, 2001 and 2002, as well as a similar exam question on formal proofs. The homework consisted of logic proofs, given the premises and the conclusion. The exam question was a logic proof to complete, with some parts of the proof provided, and the student had to fill in the missing parts.

We analysed the results over the various assessments, comparing those of the control group (the 2000 class, 431 students who did not use the tool at all) with those of the two groups who used the Logic Tutor (the 2001 class, with 390 students who used the geographically constraining unix-based Logic Tutor, and the 2002 class, with 245 students who used the web-based Logic Tutor). Results showed that 2001 students scored higher in both assessments than 2000 students, with effect sizes of 0.4 and 0.6 sigma respectively. The 2002 students scored higher again, reaching an effect size of around 1 sigma over the control group. An ANOVA test confirmed a statistical difference between the results (Yacef, 2003).

We also looked at the correlation between the results at the final exam question and the student activity in the Logic Tutor. Table 3 compares exam question results with the levels that the students reached in the Logic Tutor (on the left) and with the number of exercises they attempted (on the right). The right columns indicate the mean, the standard deviation in parentheses and the number of scores.

As we can see, scores appear to relate to the level reached and the activity of the student on the Logic Tutor. Students who used the tool the most and who progressed the furthest through the curriculum received higher marks. The only odd result is for the students in level 3, where the mean is only 3. This is due to three students in this set who did not attempt the question at all (and therefore received 0). Omitting them would have given a mean of 5.3.

These observations, made in hindsight, suggest that it is important to motivate students to use the tool as much as possible in order to improve their results. It is actually planned to give this year's students access to these statistical data, to show them that more practice can yield better results in their exam.


To understand better how students use the tool, how they practice, and how they come to master both the tool and logical proofs, we have performed an analysis of the Logic-ITA data using symbolic data analysis methods (Diday, 2000).

Symbolic Data Analysis

By standard data, one means data that consist of a set of individuals, where all individuals are described in a homogeneous way, usually by a list of variables that can be numerical or categorical. The collection of the exercises of the Logic-ITA is an example of standard data. Each exercise is described by an identity number called qid, a level, its length called nlines, and the number of different rules used in its proof, called nrules. The variables qid and level are categorical while nlines and nrules are numerical. These data are presented in the table questioncontext.

By contrast, the collection of students stored by the Logic-ITA is not an example of standard data. Because the Logic Tutor is a tool that students access freely, different students attempt different exercises, and a different number of them. Further, different students make different mistakes. It would be very cumbersome and counterintuitive to have a table with all students and, as variables, all possible exercises, with an empty field wherever a student has not attempted a particular exercise. Further, we wish to group students into categories according to the number of exercises they have attempted. Symbolic data analysis allows us to do both: it allows structured information to be introduced in the definition of an individual, that is, to characterize a student only by the list of exercises he/she has attempted, and it allows individuals to be grouped into symbolic objects.


We have performed symbolic data analysis on the data obtained from the Logic-ITA with the tool SODAS. We have created the following symbolic objects: the first object represents all students who have attempted exactly 1 exercise, the second represents all students who have attempted exactly 2 exercises, and so on; the last one represents all students who have attempted 21 or more exercises (see Table 4).

These objects have been defined this way to get a good picture of the students who do not use the tool much (1 exercise, 2 exercises and 3 exercises), and then to follow roughly the quartiles of the data collected during the year 2001. How students are distributed among these objects is shown below. Nb gives the total number of students in the object while % gives the percentage. The last object, for 21 exercises and more, is almost empty for year 2002, hence it has been merged with the preceding one.
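As an illustration, the grouping of students into these symbolic objects could be sketched as below. The exact quartile-based bin edges used for the 2001 data are not reproduced here, so the intermediate edges (4-6, 7-10, 11-20) are placeholders.

```python
# Sketch of grouping students by number of attempted exercises into the
# symbolic objects described above. The intermediate bin edges are
# placeholders, not the actual quartile boundaries of the 2001 data.
from collections import defaultdict

def bin_label(n_exercises):
    if n_exercises <= 3:          # individual bins for 1, 2 and 3 exercises
        return str(n_exercises)
    if n_exercises <= 6:
        return "4-6"
    if n_exercises <= 10:
        return "7-10"
    if n_exercises <= 20:
        return "11-20"
    return "21+"                  # open-ended last object

def group_students(attempts_per_student):
    """attempts_per_student: dict student_id -> number of attempted exercises."""
    groups = defaultdict(list)
    for student, n in attempts_per_student.items():
        groups[bin_label(n)].append(student)
    return groups
```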

Using the existing tables of the Logic-ITA, we have defined new variables for students. The variable Ratio_faults gives the average number of mistakes a student has made per attempted exercise. The variable Ratio_Correct gives the average number of correct lines a student has entered per attempted exercise. The variable Finish counts how many exercises a student has managed to finish successfully.
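These three variables can be computed directly from a student's exercise records. A minimal sketch, assuming each record carries a mistake count, a correct-line count, and a completion flag (the field names are illustrative, not the actual Logic-ITA schema):

```python
# Minimal sketch of the three derived variables. Field names
# ('mistakes', 'correct_lines', 'finished') are illustrative.

def student_variables(exercises):
    """exercises: list of per-exercise records for one student."""
    n = len(exercises)
    return {
        "Ratio_faults":  sum(e["mistakes"] for e in exercises) / n,
        "Ratio_Correct": sum(e["correct_lines"] for e in exercises) / n,
        "Finish":        sum(1 for e in exercises if e["finished"]),
    }
```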

We have used SODAS to visualise these objects as histograms along five axes: Finish, Ratio_Correct, Ratio_faults, Performance and Level. We summarize all the symbolic objects constructed with SODAS and present them along these axes in the series of tables below.

Axis 1: Finish

How to read Tables 5 and 6

For example for the second line, among the students who have attempted 2 exercises, 47% could not complete any of them, 32% could complete one and 21% could complete both. And similarly for the following lines.


We can clearly see that the more students practice, the more exercises they are able to complete. In the year 2001, one notices a dramatic gap between the first three objects and the subsequent ones. More than half of the students (56%) who attempt only one exercise do not complete it. Among the students who attempt two or three exercises, more than 40% do not complete any exercise. This percentage drops dramatically when four or more exercises are attempted. For the year 2002, the trend is similar. However, the dramatic drop appears earlier, when two or more exercises are attempted.

Axis 2: Ratio_Correct

How to read Tables 7 and 8

For example for the second line, among the students who have attempted 2 exercises, 37% could not enter any correct line for any of the two exercises attempted, 11% could enter an average of 3 correct lines per exercise attempted, 5% could enter an average of 4 correct lines per exercise attempted, 11% could enter an average of 7 correct lines per exercise attempted, 5% could enter more than 8 but not more than 10 correct lines, 26% could enter more than 10 but not more than 15 correct lines and 5% could enter more than 15 correct lines on average per exercise attempted.


For the object 1 exercise in year 2002, one notices that the percentage of students who have not entered any correct line (46%) is the same as the percentage of exercises with Finish 0. It appears to us that this is not mere coincidence. These two figures may well describe students who start the tool, look at an exercise and give up, without trying to do anything.

The average number of correct lines entered per attempted exercise should be interpreted with care, as different exercises have different numbers of lines: the longer an exercise is, the more correct lines one has to enter to complete it. Further, the more exercises students do, the more likely they are to do exercises shorter or longer than the average exercise. The column 0 does not solely represent weak students who cannot enter anything correctly. It also includes students who take a look at an exercise and decide, for some reason, not to do it (performance 0). However, the trend here is also clear and, of course, similar to that of Finish: the more they practice, the more correct lines students are able to enter on average. There are fewer students attempting an exercise without entering any line in year 2002 than in 2001.

Axis 3: Ratio_faults

How to read Tables 9 and 10

For example for the second line, among the students who have attempted 2 exercises, 42% did not make any mistake for the two exercises attempted, 5% made on average 1 mistake per exercise attempted, 11% made an average of 2 mistakes per exercise attempted, 16% made an average of 3 mistakes, another 16% made an average of 4 mistakes and 11% made more than 5 mistakes on average per exercise attempted.


For the object 1 exercise, one notices that the percentage of students in 2001 who have not made any mistake is the same as the percentage of exercises with performance 0 (38%, see Table 11). Again, it appears to us that this is not mere coincidence. These two figures may well describe students who start the tool, look at an exercise and give up for some reason, without trying to do anything.

As with Ratio_Correct, this axis has to be interpreted with care. The more exercises students do, the more different logical rules they manipulate and the more mistakes they are likely to make. Here too, column 0 does not solely represent clever students who do not make any mistake. It also includes students who take a look at an exercise and decide, for some reason, not to do it (performance 0) and, therefore, make no mistake! However, the trend here is also clear and confirms those of Finish and Ratio_Correct: the more students practice, the fewer mistakes they make on average. Columns with a high number of mistakes tend to disappear in both tables as the number of attempted exercises grows. One notices a larger number of mistakes for year 2002 than for year 2001 (columns 6, 7, 8-9, 9+), which may be surprising at first. However, we think that this fact has to be put in relation with column 0 of Performance below. There are far fewer students who just take a look at an exercise without attempting anything in year 2002 than in year 2001. Therefore, the number of mistakes per attempted exercise is also higher on average in year 2002 than in year 2001.

Axis 4: Performance

How to read Tables 11 and 12

For object 2 exercises, 42% of the exercises attempted by students who attempted 2 exercises yielded performance 0, 5% of the exercises yielded performance 1, 32% yielded performance 3, 5% yielded performance 4 and 12% yielded performance 5. Performance 2 is not present for this object. The meaning of performances was described in Table 1.


For the year 2001, columns 0 and 5 do not show any clear trend across the objects. The proportion of students who decide not to do an exercise (without giving any reason) and the proportion of students who give up after making mistakes because the exercise is too difficult (performance 5) remain stable as students attempt more and more exercises. By contrast, the proportion of students who do exercises without making any mistake (performance 1 when they complete and performance 4 when they give up) grows as they attempt more exercises, while the percentage of students who complete exercises after correcting mistakes (performance 3) diminishes. For the year 2002, there is no real clear trend for students who practice enough (4 and more attempted exercises). For students who do not practice much (the first 3 objects), the general trend reappears: the more they practice, the more they complete exercises. Adding the performances showing that students completed the exercise (1, 2 and 3) gives 56% for 1 exercise, 76% for 2 exercises and 74% for 3 exercises.

Axis 5: Level

How to read Tables 13 and 14

For example for 2 exercises, 17% of the exercises attempted by students who attempted 2 exercises correspond to a difficulty level of 0, 6% of the exercises to a level 1 and 77% to a level 5. As briefly explained in the section on the Logic Tutor, exercises have a level of difficulty ranging from 0 (unknown) to 5 (very difficult). The reason why there are so many exercises attempted at level 5 is that students were asked to do an assignment on the Logic Tutor and that this assignment involved exercises of level 5.


For both years, students prefer to practice general exercises from the database of exercises (level 5), followed by exercises from level 0. Exercises from level 0 are not stored in the database: either students copy them from books, or they have downloaded them from the course's web site. The more students practice, the more they try exercises of all levels from the database, including levels 1, 2, 3, and 4.

Overall comments: This analysis confirms what student surveys and student marks reveal: the tool is useful, and the more students practice, the more successful they become at doing formal proofs; 2001 and 2002 follow the same general trend.

However, students in 2002 performed better than students in 2001, as the tables show. The proportion of students who do not complete any exercise is higher in 2001 than in 2002. The number of correct lines per attempted exercise is, on average, higher in 2002 than in 2001. Grouping together Performances 1, 2, and 3 generates Table 15. It shows clearly that students performed better in 2002 than in 2001. There are two main differences between year 2001 and year 2002. First, in 2002 the tool was put online. Students could use it at university as in 2001, but also at home. Thus they could access it whenever it was convenient for them, and this may explain why the figures in Performance 0 are much lower in 2002 than in 2001. In 2001, they may have accessed the tool, tried to do something, and then had to leave the terminal because their lab time had expired. Also, because in 2002 they could access the tool when it was really convenient for them to do so, not just when they had their time in the lab, they may have been able to concentrate better and thus perform better. The second difference is that in 2002, an assignment had to be done with the Logic Tutor. Thus students may have been more motivated to perform well from the beginning.


Presently, the information that can be extracted from the database includes descriptive statistics given by SQL queries and association of mistakes given by the implementation of a fast algorithm for association rules.

Simple Queries to Get a First Impression of the Class


At any time, the teacher can use SQL queries on the database collated by the LT-Analyser to retrieve information, or visualise it through Microsoft Excel graphs. In particular, the following queries were made:

1. analysis of the most common mistakes: most common mistakes made by students, number of mistakes per logic rule, percentage of incorrect usage of each rule;

2. analysis of exercises: exercises that caused the most mistakes, how many students have attempted each question, average performance of all students in a specific question; and

3. status on students' progress: number of students per level (we set five levels), the types of errors per level of exercise, and students who stagnate at level 1 for more than 10 exercises.
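The first family of queries above can be sketched against an in-memory SQLite mirror of the mistakes table. The table and column names here (mistake, student, rule, mistake_type) are assumptions for illustration; the actual Logic-ITA database is in Microsoft Access.

```python
# Hedged sketch of query family (1) above, run against an in-memory
# SQLite stand-in. Table and column names are assumptions; the actual
# Logic-ITA database is in Microsoft Access.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mistake (student TEXT, rule TEXT, mistake_type TEXT)")
conn.executemany(
    "INSERT INTO mistake VALUES (?, ?, ?)",
    [("s1", "Indirect Proof", "Premise set incorrect"),
     ("s1", "Indirect Proof", "Premise set incorrect"),
     ("s2", "Indirect Proof", "Wrong reference lines"),
     ("s2", "Modus Ponens",  "Wrong reference lines")])

# Number of mistakes per logic rule, most error-prone rule first.
per_rule = conn.execute(
    """SELECT rule, COUNT(*) AS n
       FROM mistake GROUP BY rule ORDER BY n DESC""").fetchall()
print(per_rule)  # [('Indirect Proof', 3), ('Modus Ponens', 1)]

# Most common (rule, mistake type) combinations.
common = conn.execute(
    """SELECT rule, mistake_type, COUNT(*) AS n
       FROM mistake GROUP BY rule, mistake_type ORDER BY n DESC""").fetchall()
print(common[0])  # ('Indirect Proof', 'Premise set incorrect', 2)
```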

Examples of queries and charts are shown in Figure 2: the breakdown per rule of incorrect and correct usage (top left window), with the associated graph for the most misused rules (bottom left window); the breakdown of the types of mistakes for each rule (top right window); and the breakdown of mistakes for one particular rule, Indirect Proof (bottom right chart). Other common queries concern the exercises producing the most mistakes, or the average performance for a particular exercise.



The teacher queries the student database with the purpose of identifying the most common mistakes and the logic rules causing the most problems, and of getting a general feeling for how well the class is going.

For instance, in Figure 2 the teacher sees that Modus Ponens and Simplification were the most used rules and that they were used correctly around two thirds of the time. Indirect and Conditional Proofs, as well as Disjunctive Syllogism, generated a lot of errors. In particular, the incorrect calculation of premises was a predominant cause of errors. This indicates strongly to the teacher that the IP and CP rules must be re-explained, with a strong focus on the manipulation of premises. The pie chart also shows that improvement can be made in the mastering of general elements of the formal proof (manipulation of premises, choice of reference lines).

Overall, some important results we found in 2002 were the following:

1. Analysis of the most common mistakes: out of 2746 mistakes in total, the most frequent mistake was Premise set incorrect, made with Indirect Proof and Conditional Proof. These were also the most frequently misused rules (71% and 59% of their uses, respectively). This is due to the fact that they are the most difficult to grasp: they both require the assumption of an additional premise (for example the negation of the conclusion for Indirect Proof, i.e., proof by contradiction) and then the removal of this premise to reach the conclusion. In raw numbers, the most mistakes were made with Modus Ponens (1146); however, it was also the most frequently used rule, and in the end it was used incorrectly 30% of the time.

2. Analysis of exercises: not surprisingly, the exercises that produced the most mistakes were the ones involving Indirect and Conditional Proof but they were also attempted by a larger proportion of students.

3. Status on students' progress: more than half the class stayed on level 1 (this includes students who only logged in once or twice), then 20% moved to level 2 and 10% reached levels 3, 4, and 5. Reaching these last three levels means that the student is able to complete successfully almost all the exercises attempted.

These findings are mostly useful for the revision lectures and for the next teaching semester. Since time does not allow reviewing everything in class with the students before the exam, we were able to focus the revision lecture on the most frequent mistakes and the most incorrectly used rules. Teachers can also use this information to reflect on their own teaching and change their strategies or content in the following semesters. For example, in our course, the level and nature of the mistakes made with the two most difficult rules (IP and CP) suggested that changing the way the underlying concepts are introduced would improve students' understanding. At a more general level, if the teacher decides to change something in the curriculum or the teaching practice, s/he can compare data from various years to investigate whether there is a statistically significant difference between the two methods. This can be extremely useful for teaching quality control.

Association of Mistakes

The goal of association mining techniques is to find items, in our case mistakes, often occurring together. We will first describe the general algorithm. If the reader is familiar with the algorithm, s/he may wish to proceed directly to the following section.

Method (General Algorithm)

We suppose that we have a population of N transactions and each transaction is a set of items. Table 16 gives an illustration. Transaction 1, for instance, is the set {1, 2}. Items often occurring together are given by rules of the following form:

3->4, support 40%, confidence 66%, or 1->2, support 20%, confidence 66%.

The proper way to write a rule should be {3}->{4}. To simplify, we write 3->4 instead. The first rule means that if item 3 is present, then item 4 is also present. This is supported by 40% of the transactions with a confidence of 66%.

The concepts of support and confidence have a precise meaning, which we introduce now. Let T_1, T_2, ..., T_N be N transactions, and let I be the set of items occurring in the transactions T_i. In our example, we have I = {1, 2, 3, 4, 5, 6, 7} and 10 transactions. One looks for rules of the form X->Y, where X and Y are subsets of I, having a support and a confidence above a minimum threshold.

a- Support: sup(X->Y) = |{T_i with X, Y subsets of T_i}| / N. In other words, the support of a rule X->Y is the number of transactions that contain both X and Y divided by N, the total number of transactions.

b- Confidence: conf(X->Y) = |{T_i with X, Y subsets of T_i}| / |{T_i with X subset of T_i}|. In other words, the confidence of a rule X->Y is the number of transactions that contain both X and Y divided by the number of transactions that contain X.

The purpose of support is to ensure that only items occurring often enough in the data are taken into account when establishing the association rules. Confidence measures whether Y is really implied by X: if X naturally occurs very often, then almost any subset Y could be associated with it. A high enough confidence ensures that the association between X and Y is not merely a coincidence caused by the naturally high frequency of, say, X.
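The two definitions translate directly into code. The sketch below computes support and confidence over the transactions of Table 16, assuming each transaction is represented as a Python set of items.

```python
# Transactions of Table 16, represented as sets of items.
transactions = [
    {1, 2}, {1, 2, 3}, {1, 3, 4}, {2, 4}, {2, 4, 5},
    {2, 5}, {3, 4}, {3, 4, 5}, {3, 4, 5, 7}, {3, 5, 6},
]

def support(x, y=frozenset()):
    """Fraction of the N transactions containing every item of X and Y."""
    xy = set(x) | set(y)
    return sum(1 for t in transactions if xy <= t) / len(transactions)

def confidence(x, y):
    """Among transactions containing X, the fraction also containing Y."""
    return support(x, y) / support(x)

print(support({3}, {4}))     # 0.4: rule 3->4 has support 40%
print(confidence({3}, {4}))  # about 0.667: confidence 66%
```

On the same data, confidence({1}, {2}) gives the 66% quoted for the rule 1->2.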

The algorithm (Agrawal & Srikant, 1994) works by constructing several lists. First, the list [L.sub.1] of single items having the desired support is constructed. Let us take a minimum support of 20% with the data of Table 16. Item 1 occurs in three transactions. Thus, sup(1) = 3 / 10 = 30% > 20%, therefore item 1 belongs to [L.sub.1]. Making similar calculations for all items leads to [L.sub.1] = (1, 2, 3, 4, 5). From [L.sub.1], one deduces the list [L.sub.2] of pairs having a support above or equal to the minimum. For example, sup(1,2) = 2 / 10 = 20%, since only two transactions contain the subset {1,2}. Similarly, we obtain sup(1,4) = 1 / 10 = 10%. Making similar calculations for all possible pairs of items of [L.sub.1] gives, in our example, [L.sub.2] = ((1, 2), (1, 3), (2, 4), (2, 5), (3, 4), (3, 5), (4, 5)). Then, the list [L.sub.3] of triples with support above or equal to the minimum is constructed, which gives here [L.sub.3] = ((3, 4, 5)). In our example no further list can be added because no quadruple has enough support.

From each list [L.sub.i], i > 1, one tries all combinations of antecedent and consequent, and only rules with a confidence above the desired threshold are kept. For example, conf(1->2) = 2 / 3 = 66% and conf(5->3) = 3 / 5 = 60%. Keeping the rules having a confidence greater than or equal to 60%, we get the association rules shown in Table 17.
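The level-wise construction above can be sketched as follows. This is a minimal, unoptimised version; the actual algorithm of Agrawal and Srikant adds candidate pruning for efficiency. Run on the data of Table 16 with a 20% minimum support, it reproduces the lists [L.sub.1], [L.sub.2], and [L.sub.3] computed by hand above.

```python
from itertools import combinations

transactions = [
    {1, 2}, {1, 2, 3}, {1, 3, 4}, {2, 4}, {2, 4, 5},
    {2, 5}, {3, 4}, {3, 4, 5}, {3, 4, 5, 7}, {3, 5, 6},
]

def sup(itemset):
    return sum(1 for t in transactions if set(itemset) <= t) / len(transactions)

def frequent_itemsets(min_sup):
    """Level-wise search: L1 from single items, then Lk from the
    k-combinations of the items still frequent (naive, no pruning)."""
    items = sorted({i for t in transactions for i in t})
    levels, k = [], 1
    while True:
        lk = [c for c in combinations(items, k) if sup(c) >= min_sup]
        if not lk:
            return levels
        levels.append(lk)
        items = sorted({i for c in lk for i in c})  # drop infrequent items
        k += 1

def rules(levels, min_conf):
    """Split each frequent itemset of size >= 2 into antecedent X and
    consequent Y; keep the rules X->Y with enough confidence."""
    out = []
    for lk in levels[1:]:
        for itemset in lk:
            s = set(itemset)
            for r in range(1, len(itemset)):
                for x in combinations(itemset, r):
                    y = tuple(sorted(s - set(x)))
                    conf = sup(itemset) / sup(x)
                    if conf >= min_conf:
                        out.append((x, y, sup(itemset), conf))
    return out

levels = frequent_itemsets(0.2)
print(levels[0])  # L1: the five frequent single items
print(levels[2])  # L3: [(3, 4, 5)]
```

Calling rules(levels, 0.6) then yields the rules of Table 17, including the pair-antecedent rules 4,5->3 and 3,5->4.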

Application to the Data from the Logic-ITA

To find association rules using the data of the Logic-ITA, we first have to clarify what transactions and items mean in our context. Obviously, transactions have to do with students and exercises, and items have to do with mistakes. However, there are various interpretations.

Not all students make the same use of the tool. Some have logged in for one session only, while others have logged in many times. When using the tool, a student may attempt one or several exercises, and not all students do the same exercises. Thus, a transaction could be the set of mistakes made by a student on a specific exercise, the set of mistakes made by a student on all attempted exercises, or the set of mistakes made by a student on all exercises attempted during one session. We have considered the first two interpretations only, as our data cannot distinguish between two different sessions held on the same day (only dates are recorded, not times).

Consider the mistake Wrong rule used Modus Ponens described in the section on the Logic Tutor. The mistake has two parts: Wrong rule used and Modus Ponens. Had the student used Disjunctive Syllogism instead of Modus Ponens, the mistake would have been Wrong rule used Disjunctive Syllogism. From the first part alone, we know that the student has not provided the right justification matching the two lines. From the second part, we know that some mistake has been made while using the Modus Ponens rule. When looking for associations of mistakes, we could take the full mistake name as an item, in the present case Wrong rule used / Modus Ponens; or we can focus on the diagnosis, that is, what went wrong, here Wrong rule used, and ignore the name of the rule used; or we can focus on the rule involved in the mistake, here Modus Ponens, and ignore the diagnosis. We have allowed all three interpretations.
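These interpretations can be sketched as alternative ways of grouping the same mistake records. The record format below is illustrative, not the actual Logic-ITA schema: each record is a (student, exercise, diagnosis, rule) tuple, and a transaction is a set of items.

```python
from collections import defaultdict

# Illustrative mistake records: (student, exercise, diagnosis, rule).
records = [
    ("s1", "ex1", "Wrong rule used", "Modus Ponens"),
    ("s1", "ex2", "Premise set incorrect", "Indirect Proof"),
    ("s2", "ex1", "Wrong rule used", "Disjunctive Syllogism"),
]

def transactions(records, by, item):
    """Group mistake records into transactions.
    by="student"          -> one transaction per student (all exercises)
    by="student-exercise" -> one transaction per (student, exercise) pair
    item="full" / "diagnosis" / "rule" -> which part of the mistake is the item."""
    key = {"student": lambda r: r[0],
           "student-exercise": lambda r: (r[0], r[1])}[by]
    part = {"full": lambda r: r[2] + " / " + r[3],
            "diagnosis": lambda r: r[2],
            "rule": lambda r: r[3]}[item]
    groups = defaultdict(set)
    for r in records:
        groups[key(r)].add(part(r))
    return list(groups.values())

print(transactions(records, by="student", item="diagnosis"))
# e.g. s1 contributes the transaction {"Wrong rule used", "Premise set incorrect"}
```

The resulting lists of sets can be fed directly to the association rule search of the previous section.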


The purpose of looking for mistakes often occurring together is for the teacher to emphasize subtleties while explaining concepts to students. Thus, it makes sense to have a support that is not too low. We fixed it at 60%.

We have obtained associations when a transaction is the set of mistakes on all exercises attempted by a student, and an item is interpreted as the diagnosis, in other words the type of the mistake. With other interpretations, the support was too low. Both in 2001 and 2002, the associations found for diagnosis made use of the three following items: Rule can be applied, but deduction incorrect, Premise set incorrect and Wrong number of line references given.

Let us explain what these messages mean, referring to the example shown in Figure 1. Consider line 3. If the student enters the formula A instead of ~A, the mistake Rule can be applied, but deduction incorrect is generated: indeed, Modus Tollens can be applied, but the deduced proposition is ~A, as given in Figure 1, not A. Suppose now that the student gives only {1} in the Premise References field. Then the mistake Premise set incorrect is generated. Suppose instead that the student gives only {1} in the Line References field. Then a Wrong number of line references given mistake is generated, because two reference lines are needed.

All possible permutations of these 3 items gave associations with a support above 60% and a confidence above 76% as can be seen in Table 18 and Table 19.


The fact that the support was too low when rules are (part of) the item leads us to think that the difficulty in logical proofs lies not in the rules themselves, but in how to use them.

The associations found show relations between mistakes involving line numbers in the premises (Premise set incorrect), line numbers in the reference lines a logic rule applies to (Wrong number of line references given), and incorrect use of logic rules (Rule can be applied, but deduction incorrect). This confirms observations made by human tutors. First, students often have difficulty grasping all the details required in a proof: one has to provide not only a logic rule, but also the lines it applies to, and these are different from the premises involved. Second, students do not realize at once that there are two kinds of logic rules: rules of equivalence, which are applied to one formula only, and rules of inference, which are mostly applied to two formulas. Most importantly, rules of equivalence can be applied to subparts of a formula, whereas rules of inference can only be applied to whole formulae. For example, in the formula ((A & B)->C) we can validly replace (A & B) with (B & A) using the rule of equivalence Commutation, but it is not valid to deduce B from ((A->B)->C) and A using the rule of inference Modus Ponens.

We believe that the details (line numbers, premises involved) are well taken care of by the Logic Tutor itself: the tool forces students not to omit any of these details, whereas they tend to be sloppy when using paper and pencil only. However, these results indicate to lecturers that, in the future, they should put more stress on the differences between rules of equivalence and rules of inference once these have been introduced.

Finally, one notices that the support figures are lower for the year 2002 than for the year 2001. This is consistent with the improvement in student performance observed both in the marks obtained and in the results of the symbolic data analysis presented earlier.


In this article, we have reported on our initial exploration of using data mining techniques to analyse the data captured by web-based tutoring tools. We have described how this information may be used to improve both learning and teaching. Students benefit from a tutor that can give them step-by-step feedback: they get a chance to understand their mistakes quickly, as they make them, not only after getting their work back from a human tutor. Teachers get a chance to know their students better. Using symbolic data analysis, they are provided with information that gives them more insight into how students get along with the tool and the content. Querying the database, they know who is doing well and who has difficulties, what the most common mistakes are, and so on. They may extract hidden information, such as mistakes often associated together, and thereby reflect on their way of explaining concepts to students.

We are currently investigating several directions in which to pursue this work. First, there is clearly scope for expanding the range of data mining techniques. For example, classification and clustering techniques appear very relevant to this pedagogical context, and more information can be extracted from the database, such as classifying students according to their difficulties or abilities (Aguado, Merceron, & Voisard, 2003). However, experience shows that it is not always straightforward to use data mining techniques in a way that is relevant to the particular field of education. For example, there are temporal, sequential, and contextual aspects in the data that are very important. Mistakes made by students do not all have the same weight: a mistake made the first time a concept was introduced and then never made again may be interpreted as positive evidence of learning, showing that the student learned from the mistake and has now assimilated the concept. On the other hand, a mistake made on an exercise that was too difficult for the student at the time (for example, the student was at level 1 and the exercise was at level 5) could practically be ignored. Further work includes taking a closer look at implicative statistical analysis and studying how it could complement our present work.

The next step is naturally to facilitate a more sophisticated analysis and data mining process. We are developing a platform called TADA-Ed that integrates various visualisation and data mining techniques to extract pedagogically relevant information and help teachers gain more insight into the homework of their students. TADA-Ed is not specific to the Logic-ITA: its modular design makes it extensible to accept data in various formats. Usability issues are on our agenda too, since we cannot expect teachers to be proficient in data mining in order to extract pedagogical information.

Second, there is also a large scope for applying these methods to much richer data. The more sophisticated the student models are, and the richer their representation of the student's background, cognitive abilities, and knowledge, the more information we should be able to extract.

The ways the information can be exploited are numerous. We have only shown here some aspects that can be used by the teacher in a classroom. Other possibilities include the use of this information by the tutoring system to provide better and more proactive feedback to the student.
Table 1 Performance Measures and Meaning

0 Exercise unfinished
1 Solution found, no mistake made, shortest path
2 Solution found, no mistake made, longer path
3 Solution found, mistakes made
4 No solution found, no mistake
5 No solution found, mistakes made.

Table 2 Database Scheme

Table: Student
Attributes: student's login, tutorial group
Description: tutorial group each student belongs to

Table: Mistake
Attributes: student's login, exercise id, name of mistake made, rule involved, date
Description: mistakes made for each question attempted by each student

Table: Exercise context
Attributes: exercise id, exercise level, number of lines
Description: general context of each exercise

Table: Performance
Attributes: student's login, exercise id, number of lines and number of rules used in student's solution, performance measure (between 0 and 5), date
Description: each student's overall performance in each question attempted

Table: Correct usage
Attributes: student's login, exercise id, line number in the proof, rule used, date
Description: rules used correctly for each student

Table: Count logins
Attributes: student's login, number of times logged in
Description: number of times each student has logged in

Table: Student level
Attributes: student's login, student's level
Description: current level of each student

Table 3 Breakdown of Exam Question Results per Student Level and per
Activity on the Logic Tutor (year 2002)

Student level   Results
5               6.6 (0.7), N=3
4               6.1 (1.1), N=10
3               3.0 (3.1), N=7
2               5.1 (2.8), N=46
1               4.6 (2.9), N=179

Activity on the Logic Tutor   Results
More than 10 exercises        5.4 (2.5), N=36
6 to 9 exercises              5.1 (2.7), N=48
3 to 5 exercises              5.0 (2.8), N=56
0 to 2 exercises              4.2 (2.9), N=105

Table 4 Symbolic Objects and Distribution

Objects Nb. 2001 % 2001 Nb. 2002 % 2002

 1 16 6.89 13 7.73
 2 19 8.18 31 18.45
 3 22 9.48 18 10.71
 4-6 63 27.15 48 28.83
 7-10 57 24.56 33 19.35
11-15 29 12.50 19 11.30
16-20 16 6.89 6 3.57
21 + 10 4.31

Table 5 Distribution of Students According to the Number of Attempted
Exercises (Row) and the Number of Completed Exercises (Column) for Year

Fin. 0 1 2 3 4 5 6 7 8 9 10 11 12 14 15 16 19

 1 56 44
 2 47 32 21
 3 41 14 31 14
 4-6 14 8 29 27 13 6 3
 7-10 9 12 16 16 19 16 5 5 2
11-15 10 3 14 10 10 14 14 14 3 3 3
16-20 6 6 25 13 13 19 6 6 6
21 + 30 30 10 10

Fin. 20 21 26

21 + 10 10

Table 6 Distribution of Students According to the Number of Attempted
Exercises (Row) and the Number of Completed Exercises (Column) for Year

Fin. 0 1 2 3 4 5 6 7 8 9 10 11 12 14 15 16 19

 1 46 54
 2 13 23 65
 3 6 11 39 44
 4-6 4 8 27 19 29 10 2
 7-10 3 6 18 36 12 18 3 3
11-15 16 16 16 21 5 5 11 5 5
16 + 17 17 17

Fin. 20 21 26

16 + 33 17

Table 7 Distribution of Students According to the Number of Attempted
Exercises (Row) and the Number of Correct Lines per Attempted Exercise
(Column) for Year 2001

R_cor. 0 1 2 3 4 5 6 7 8 9-10 11-15 16 +

 1 44 6 6 25 19
 2 37 11 5 11 5 26 5
 3 23 5 9 18 14 5 9 9 5 5
 4-6 6 5 11 13 6 11 6 6 6 14 13 2
 7-10 5 2 19 9 12 14 9 9 11 7 2 2
11-15 3 3 10 17 10 24 17 3 3 7
16-20 13 6 25 25 6 13 13
21 + 20 40 30 10

Table 8 Distribution of Students According to the Number of Attempted
Exercises (Row) and the Number of Correct Lines per Attempted Exercise
(Column) for Year 2002

R_cor. 0 1 2 3 4 5 6 7 8 10 10 +

 1 46 8 8 8 8 23
 2 3 3 6 6 10 3 10 26 23 10
 3 6 11 22 11 11 17 17 6
 4-6 2 2 19 13 17 27 8 6 4 2
 7-10 3 3 15 9 27 33 6 3
11-15 11 5 26 21 21 11 5
16 + 17 17 17 33 17

Table 9 Distribution of Students According to the Number of Attempted
Exercises (Row) and the Average Number of Mistakes Made per Attempted
Exercise (Column) for Year 2001

R_fau. 0 1 2 3 4 5 5+

 1 38 25 19 19
 2 42 5 11 16 16 11
 3 27 14 27 14 9 9
 4-6 8 22 24 29 6 10 2
 7-10 5 33 32 16 7 5 2
11-15 38 45 17
15-21 31 50 13 6
21 + 60 30 10

Table 10 Distribution of Students According to the Number of Attempted
Exercises (Row) and the Average Number of Mistakes Made per Attempted
Exercise (Column) for Year 2002

R_fault 0 1 2 3 4 5 6 7 8-9 9 +

 1 62 15 15 19
 2 16 13 16 10 6 16 10 3 10
 3 11 28 17 17 11 6 11
 4-6 6 15 21 15 13 13 4 4 8 2
 7-10 3 9 30 33 6 6 6 3 3
11-15 21 32 16 11 16 5
16 + 17 50 17 17

Table 11 Distribution of Students According to the Number of Attempted
Exercises (Row) and the Performance Achieved in the Attempted Exercises
(Column) for Year 2001

Performance 0 1 2 3 4 5

 1 38 44 19
 2 42 5 32 5 16
 3 39 12 3 24 9 12
 4-6 34 11 1 38 5 11
 7-10 43 8 1 30 7 11
11-15 37 10 1 26 13 14
16-20 32 12 1 21 17 18
21 + 24 18 23 15 21

Table 12 Distribution of Students According to the Number of Attempted
Exercises (Row) and the Performance Achieved in the Attempted Exercises
(Column) for Year 2002

Performance 0 1 2 3 4 5

 1 23 15 38 23
 2 5 26 2 48 11 8
 3 9 30 44 9 7
 4-6 20 17 0 44 8 10
 7-10 18 13 0 39 12 17
11-15 32 10 0 33 11 15
16 + 10 22 48 10 10

Table 13 Distribution of Students According to the Number of Attempted
Exercises (Row) and the Level of the Attempted Exercises (Column) for
Year 2001

Level 0 1 2 3 4 5

 1 19 6 75
 2 17 6 77
 3 12 5 83
 4-6 10 7 1 82
 7-10 12 4 2 0 82
11-15 16 8 8 1 68
16-20 11 21 20 49
21 + 10 17 31 3 39

Table 14 Distribution of Students According to the Number of Attempted
Exercises (Row) and the Level of the Attempted Exercises (Column) for
Year 2002

Level 0 1 2 3 4 5

 1 15 23 62
 2 18 3 79
 3 13 87
 4-6 8 6 0 86
 7-10 12 11 0 1 76
11-15 11 9 6 2 73
16 + 5 15 13 12 4 51

Table 15 Comparison of Cumulated Performance for all Exercises Finished
With or Without Mistakes (Performance 1, 2, or 3)

 Perf. 1+2+3 (2001) Perf. 1+2+3 (2002)

 1 44 53
 2 37 76
 3 39 74
 4-6 50 61
 7-10 39 53
11-15 37 43
16-20 34 70
21 + 41

Table 16 Transactions and Their List of Items

Transactions Items sets

 1 {1, 2}
 2 {1, 2, 3}
 3 {1, 3, 4}
 4 {2, 4}
 5 {2, 4, 5}
 6 {2, 5}
 7 {3, 4}
 8 {3, 4, 5}
 9 {3, 4, 5, 7}
10 {3, 5, 6}

Table 17 Association Rules Obtained from the Data in Table 16

Rule Support Confidence

1->2 20% 66%
1->3 20% 66%
3->4 40% 66%
4->3 40% 66%
5->3 30% 60%
5->4 30% 60%
4, 5->3 20% 66%
3, 5->4 20% 66%

Table 18 Association Rules for Year 2001

Rule can be Wrong number
applied, but of line
deduction references Premise set
incorrect given incorrect Support Confidence

X 70% 90%
 X 70% 80%
X 65% 84%
 X 65% 82%
 X 70% 83%
 X 70% 88%
 X X 60% 87%
X 60% 78%
 X 60% 72%
X X 60% 92%
 X 60% 76%
X X 60% 87%

Rule can be Wrong number
applied, but of line
deduction references Premise set
incorrect given incorrect Support Confidence

 X 70% 90%
X 70% 80%
 X 65% 84%
X 65% 82%
 X 70% 83%
 X 70% 88%
X 60% 87%
 X X 60% 78%
X X 60% 72%
 X 60% 92%
X X 60% 76%
 X 60% 87%

Table 19 Association Rules for Year 2002

Rule can be Wrong number
applied, but of line
deduction references Premise set
incorrect given incorrect Support Confidence

X 65% 87%
 X 65% 80%
X 61% 82%
 X 61% 79%
 X 67% 83%
 X 67% 87%

Rule can be Wrong number
applied, but of line
deduction references Premise set
incorrect given incorrect Support Confidence

 X 65% 87%
X 65% 80%
 X 61% 82%
X 61% 79%
 X 67% 83%
 X 67% 87%


We thank Romain Buquet, Vincent Chagny, Gregory Debord, Raphael Di Cicco and Olivier Fayau for having implemented the association rule algorithm and extracted the association rules for mistakes. We thank Samir Belkacem for having performed the analysis with SODAS.


Abraham, D., Crawford, L., Lesta, L., Merceron, A., & Yacef, K. (2001). The logic tutor: A multimedia presentation. Interactive Multimedia Electronic Journal of Computer-Enhanced Learning, 3(2).

Agrawal, R. & Srikant, R. (1994). Fast Algorithms for Mining Association Rules. In Proceedings of the 20th International Conference on Very Large Databases, (pp. 487-499), Santiago, Chile.

Aguado, B., Merceron, A. & Voisard, A. (2003). Extracting Information from Structured Exercises. In Proceedings of the 4th International Conference on Information Technology Based Higher Education and Training ITHET03, Marrakech, Morocco.

Anderson, J. R., Corbett, A. T., Koedinger, K. R., & Pelletier, R. (1995). Cognitive tutors: Lessons learned. The Journal of the Learning Sciences, 4(2), 167-207.

Calvo, R., & Grandbastien, M. (2003). Towards intelligent learning management systems, workshop proceedings. Sydney, Australia, University of Sydney.

CHIC [Online]. Available:

Croy, M.J. (1989). CAI and empirical explorations of deductive proof construction. The Computers and Philosophy Newsletter, 4, 111-127.

Croy, M.J. (1999). Graphic interface design and deductive proof construction. Journal of Computers in Mathematics and Science Teaching, 18(4), 371-386.

Diday, E. (2000). Analysis of symbolic data: Exploratory methods for extracting statistical information from complex data. Heidelberg, Germany: Springer-Verlag.

Forster, A. (2002, October). Online teaching and learning. SYNERGY. Sydney, Australia.

Gras, R., Almouloud, S., Bailleuil, M., Larher, A., Polo, M., Ratsimba-Rajohn, H. & Totohasina, A. (1996). L'implication statistique. Nouvelle methode exploratoire de donnees. Application a la didactique. Grenoble, France, La pensee sauvage.

Gras, R., Briand, H., Peter, P. & Philippe, J. (1997). Implicative statistical analysis. In Proceedings of the Fifth Conference of the International Federation of Classification Societies (IFCS-96), (pp. 412-419). Kobe, Japan: Springer-Verlag.

Jean, S. (2000). Pepite: un systeme d'assistance au diagnostic de competences. Unpublished doctoral dissertation, University of Le Mans, Le Mans, France.

Kinshuk, Patel, A., Oppermann, R. & Russell, D. (2001). Role of Human Teacher in Web-based Intelligent Tutoring Systems. Journal of Distance Learning, 6(1), 26-35.

Kinshuk, Hong, H., & Patel, A. (2001). Human Teacher in Intelligent Tutoring System: A Forgotten Entity! In H. R. Okamoto, Kinshuk, & J. Klus (Eds.) Advanced learning technology: Issues, achievements and challenges. Los Alamitos, CA: IEEE Computer Society.

Leroux, P., Vivet, M., & Brezillon, P. (1996). Cooperation between a pedagogical assistant, a group of learners and a teacher. In Proceedings of the European Conference on AI in Education, (pp. 379-385), Lisbon, Portugal.

Lesta, L., & Yacef, K. (2002). An intelligent teaching-assistant system for logic. In S. Cerri & F. Paraguo (Eds.), International Conference on Intelligent Tutoring Systems (ITS'02), Biarritz, France. Berlin: Springer-Verlag.

Merceron, A., & Yacef, K. (2003). A web-based tutoring tool with mining facilities to improve learning and teaching. In F. Verdejo & U. Hoppe (Eds.), Proceedings of the 11th International Conference on Artificial Intelligence in Education, Sydney, Australia. Burke, VA: IOS Press.

Scheines, R., & Sieg, W. (1994). Computer environments for proof construction. Interactive Learning Environments, 4(2), 159-169.

Smail, C., & Hussmann, S. (2003). The implementation and evaluation of an individualised, web-based, formative and summative assessment software tool for large classes. In Proceedings of ITHET'03, (pp. 87-92), Marrakech, Morrocco.

SODAS [Online]. Available:

Vasilakos, T., Devedzic, V., Kinshuk, & Pedrycz, W. (2004). Computational intelligence in web-based education. Journal of Interactive Learning Research, 15(4), 299-318. Special Issue.

Virvou, M., & Moundridou, M. (2001). Adding an instructor modelling component to the architecture of ITS authoring tools. International Journal of Artificial Intelligence in Education, 12, 185-211.

Vivet, M. (1992). Uses of ITS: Which role for the teacher? In E. Costa (Ed.), New directions for intelligent tutoring systems. Berlin, Heidelberg, New York: Springer-Verlag.

Yacef, K. (2002). Intelligent teaching assistant systems. In Kinshuk (Ed.), Proceedings of the International Conference on Computers in Education (ICCE'02), Auckland, New Zealand.

Yacef, K. (2003). Some thoughts about the synergetic effects of integrating ITS and LMS technologies together to the service of education. In R. Calvo & M. Grandbastien (Eds.), Proceedings of Towards Intelligent Learning Management Systems, held in conjunction with AIED'03, (pp. 174-182), University of Sydney, Australia.

Yacef, K. (2003). Experiment and evaluation results of the Logic-ITA. Technical report 542. School of Information Technologies, University of Sydney.


ESILV--Pole Universitaire Leonard de Vinci, France


University of Sydney, Australia
COPYRIGHT 2004 Association for the Advancement of Computing in Education (AACE)

Author: Yacef, Kalina
Publication: Journal of Interactive Learning Research
Date: Dec 22, 2004