
"Structural effectiveness for concept extraction through conditional probability".

INTRODUCTION

A concept is a comprehensive entity that can be identified from a set of queries that repeatedly appear together (Bruno M. Fonseca et al. 2005). A concept might describe a synonym relation between words, a specific semantic relationship, or more generic information carried by words. This indicates that concept keywords can be identified. Textual documents on the web are mostly retrieved through keywords. Conceptual extraction can nevertheless be achieved, if not fully then to an acceptable level, through concept keywords (Masaru Ohba et al. 2005; Sergio Guadarrama & Marta Garrido, 2006). How can plain keywords be distinguished from concept keywords? A concept keyword is a word that represents a key concept used to comprehend the subject content. Such concept keywords can be tagged to documents for identification and extraction purposes. However, concept keywords alone do not represent the concepts in textual documents. Concept-to-text generation can be achieved by structuring the textual content, that is, by arranging the facts into a coherent text (Mirella Lapata, 2003). Structuring thus refers to how the information within the written text of a document is organized. To obtain a hierarchical structure, it is recommended to assume a tree-like structure as an analogy during the structuring process. For example, the components (or words) may be treated as the leaves of the tree, which express the content, while the nodes specify how this content is grouped through metaphorical (rhetorical) relations. Such relations may include contrast, sequencing and elaboration of the content. More than one tree representation may be needed if the domain content has a large number of facts and rhetorical relations. It has been demonstrated that sentences may be represented by a set of words with informative features, such as a verb and its subject or a noun and its modifier.
This paper attempts to demonstrate the merits of three chosen categories of structuring textual, domain-dependent documents so as to extract concepts as accurately as possible. It elaborates the experimental procedures used with these three categories on the textual documents of a selected domain, namely 'C Language'. The three categories are selected and designed so that the categorization leads from 'pure structuring' to 'ill structuring' of the document through 'concept conditional structuring' (variations in the number of words representing a concept), which is the novelty of this research work. The findings will be useful in mobile learning environments, where instructions need to be small and crisp (Subhatul Marjan, 2014). In a mobile learning environment, a learner might send short messages made of broken sentences that mix domain words with learning (instructional style) concept words; in other words, the content sent would be ill structured. This justifies a study of three different structuring arrangements. The experimental results determine the effectiveness of concept extraction through a comparative study of the three types of structuring. Though pure structuring and ill structuring are logically visible to a reader of the document, the conditional structuring proposed by this research is achieved by representing four types of expression, namely 'factual', 'procedural', 'problem solving' and 'conceptual (perceiving)'. In fact, all four representations are conceptual in one form or another. To find the relative presence of these four types, Naive Bayes conditional probability theory is applied (explained later). Efficiency is determined by analyzing the computational time required for extraction. In addition, the accuracy of extracting the correct documents is analyzed.
The paper limits its scope of 'concept' to these four chosen categories. The work is supported by a literature study, and this paper forms part of a larger research program of the author(s). Conclusions are drawn from the comparative studies, which will be of immense use to concept extraction research.

Literature Support for Problem Formulation:

The use of a taxonomy of concept words for defining learning objectives (or for comprehending the concept of any textual document) in instructional materials has been suggested (Gagne Robert M. 1985). It is therefore established that domain-dependent concepts or learning concepts (of instructional materials) can be identified with the help of relevant concept keywords. A framework for contextual analysis of documents based on pedagogical issues has been documented (Omwenga and Rodrigues 2006). A multitude of instructional design theories that apply taxonomies of concept words have been adopted to assist learners (Hansson 2006). Concept keywords have been shown, with some objective measurements, to be useful for understanding concept documents (Saleema Amershi et al. 2009). Human-selected ideal concept keywords can be tagged to documents, and tf/idf (Term Frequency-Inverse Document Frequency) applied to plain (non-conceptual) keywords has produced an approximation of the ideal, human-selected ones.

This shows that contextual analysis for identifying the concepts of textual documents is possible with the help of conceptually related keywords. In support of this, additional published work on commercial systems describes 'ConceptNet', a commercial site and internet-based capability that uses fuzzy logic to compare concepts that are expressed by words. 'ConceptNet' is a structured resource, as is 'WordNet', and concepts can be described by words, but there are many different ways of doing so (Sergio Guadarrama & Marta Garrido, 2006). While 'ConceptNet' is meant for extracting commonsense knowledge from web users, 'WordNet' is meant for organizing and categorizing concepts by a group of experts. This observation and the next are relevant to our proposed work. 'ConceptNet' does not have an ontology, since it is not intended to be complete and sound but fairly approximate. It includes a natural language analyzer. Another important observation of this published work is that 'ConceptNet' uses thematic grouping of parsed sentences that form sets of concepts, such as 'conceptually related to', 'is a', 'property of', 'part of', 'made of', 'defined as', 'capable of', 'prerequisite of', 'effect of', 'used for', 'desire of' and 'motivation of', under heads such as agents, events, spatial, affective and things. It is interesting to note that the comprehension of any domain-dependent concept can be achieved through these non-domain-specific groups of words. Thus a relationship between concepts and sentences (sets of words) has been tried out and established. It is thus clear that classification (or grouping) of texts is necessary for extracting information, for understanding the concepts and for transforming the text to produce summaries (Hammouda et al. 2004).
It is further suggested that, for feature extraction from text, rules may be applied to build relationships between words rather than using pure words alone. The Naive Bayes classification technique has been applied to classify textual documents from their features. These classifications are useful for describing the domain of the text (content). However, accuracy may fall in some cases because of negative examples. In view of this literature, the research problem is to design and classify documents into the desired structural forms and, by applying Naive Bayes conditional probability theory, to determine whether concept extraction is efficient and accurate, and thereby to validate the appropriate structuring procedure for concept extraction from textual documents.

Methodology:

An experimental study is proposed that uses selected topics of 'C Language' as a case study (Kochan 1991). Three categories of documents (files) are designed under three forms of structure, namely: i. pure structure; ii. structure with four conditional representations, namely 'Factual', 'Procedural', 'Problem solving' and 'Conceptual'; and iii. 'ill structured', with only very minimal domain words (telegraphic words) kept in the document. These three designed documents are subjected to a concept search using the concept keywords of a given short conceptual sentence (the input to the proposed algorithm). The analysis covers the CPU time consumed for searching and the computation of probability values for successful extraction of concept words. Both independent probability for pure structured words and conditional probability (Naive Bayes) for conditionally structured words are tried out. The results demonstrate which structuring gives the best efficiency and accuracy of concept extraction. The algorithm is written in Java.

Experimental setup:

Stage I:

The subject content (here, 'C Language') is split into selected topics. Each topic is further split into structured forms, each grouped into exactly one category: 'factual', 'procedural', 'problem solving' or 'conceptual'. The splitting is done carefully so that each piece of content is self-contained, represents only one category and is not combined with any other. Even though this is not generally the case with real-world documents, it was necessary for the sake of the experiment. In other words, a reader of any one split-up content can be sure that it belongs to only one specific category. Such split-up and structured contents are stored in separate files called 'Fully Structured Objects' (FSO).

Stage II:

These FSOs are further classified according to the four selected concept structures. The pre-classified concept words related to these four categories (see Table 1.0) that are found in an FSO are tagged to that FSO. The tagged files are termed 'Conditionally Structured Objects' (CSOs). Stemming is done and stop words are removed from these CSOs. The CSOs thus end up as small, independent and semi-structured forms.
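The tagging step of Stage II can be sketched in Java as follows. This is a minimal illustration and not the authors' implementation: the class name `ConceptTagger`, the short stop-word list and the crude suffix stripping (a stand-in for a real stemmer) are all assumptions, and the keyword sets are only small samples taken from Table 1.0.

```java
import java.util.EnumMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only: counts Table 1.0 sample keywords per structure category. */
final class ConceptTagger {

    enum Category { FACTUAL, PROCEDURAL, PROBLEM_SOLVING, CONCEPTUAL }

    // Small, non-exhaustive samples of the concept keywords listed in Table 1.0.
    static final Map<Category, Set<String>> KEYWORDS = new EnumMap<>(Category.class);
    static {
        KEYWORDS.put(Category.FACTUAL, Set.of("what", "note", "define", "state", "underline", "distinguish"));
        KEYWORDS.put(Category.PROCEDURAL, Set.of("how", "explain", "illustrate", "label", "stress", "identify"));
        KEYWORDS.put(Category.PROBLEM_SOLVING, Set.of("apply", "calculate", "solve", "construct"));
        KEYWORDS.put(Category.CONCEPTUAL, Set.of("why", "analyze", "justify", "design"));
    }

    // Hypothetical stop-word list; the paper does not say which list it uses.
    static final Set<String> STOP_WORDS = Set.of("the", "a", "an", "is", "it", "of", "to", "that", "are");

    // Crude suffix stripping used in place of a real stemmer (an assumption).
    static final String[] SUFFIXES = {"es", "ed", "ing", "s", "d"};

    /** True if the token, or the token minus one known suffix, is a keyword. */
    static boolean matches(String token, Set<String> keywords) {
        if (keywords.contains(token)) {
            return true;
        }
        for (String suffix : SUFFIXES) {
            if (token.length() > suffix.length() && token.endsWith(suffix)
                    && keywords.contains(token.substring(0, token.length() - suffix.length()))) {
                return true;
            }
        }
        return false;
    }

    /** Counts, for each category, the tokens of the text that match its keywords. */
    static Map<Category, Integer> countStructureWords(String text) {
        Map<Category, Integer> counts = new EnumMap<>(Category.class);
        for (Category c : Category.values()) {
            counts.put(c, 0);
        }
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty() || STOP_WORDS.contains(token)) {
                continue; // skip empty tokens and stop words
            }
            for (Category c : Category.values()) {
                if (matches(token, KEYWORDS.get(c))) {
                    counts.merge(c, 1, Integer::sum);
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(countStructureWords("It is stated that no commas are allowed. It should be noted."));
    }
}
```

A document would then be tagged with every category whose count is non-zero. Because some keywords in Table 1.0 appear under more than one category, a single word may contribute to several counts in this sketch.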

Stage III:

The FSOs (of Stage I) are separately edited into ill-structured forms for the sake of the experiments. This is done by removing non-domain-specific words and other unnecessary stop words. These documents are then stemmed. Ultimately these files have only domain keywords, and they are termed 'Ill Structured Objects' (ISO).
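The reduction of an FSO to an ISO can be sketched as a simple filter that keeps only domain words. The class name `IllStructuredBuilder` and the domain vocabulary below are hypothetical, since the paper does not list its domain words, and stemming is left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustrative sketch only: keeps the domain words of a text and drops everything else. */
final class IllStructuredBuilder {

    // Hypothetical domain vocabulary for the 'int' data type topic (an assumption).
    static final Set<String> DOMAIN_WORDS = Set.of(
            "int", "integer", "constant", "digit", "digits", "octal", "hexadecimal", "base");

    /** Returns the domain words of the text, in their original order. */
    static List<String> keepDomainWords(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+")) {
            if (DOMAIN_WORDS.contains(token)) {
                kept.add(token);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        System.out.println(keepDomainWords("It is an integer constant."));
    }
}
```

In this form the structure words ("what", "stated", "noted" and so on) are discarded along with the stop words, which is why an ISO can no longer signal a structural category.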

Representations of these three structural forms are presented, along with the required data, in Table 2.0 for 10 chosen topics. The data shown in Table 2.0 are the number of pure domain words, the number of conditional concept words (selected from the pre-defined words, see Table 1.0) and the probability values (independent as well as Naive Bayes). Application of Naive Bayes is valid where association rules prevail (Kamruzzaman et al. 2004). The dependability of association rules with a Naive Bayes classifier has been shown by research on text classification in data mining. However, because this method ignores negative examples for any specific class, the accuracy may fall in some cases. The negative representation in our selected categories may be minimal, as the concept words do not repeat across the four selected categories.

Experimental Procedure:

Users are asked to provide the required concepts in the form of keywords containing both domain-specific and conditional concepts (e.g. "How Pointers work?"). For this sample input, the algorithm is expected to fetch the particular FSO, namely "Working Principle of Pointers". The algorithm then analyses i. FSOs, ii. CSOs and iii. ISOs for successful extractions. The procedure uses independent probability values of domain words on FSOs and ISOs, but Naive Bayes probability values on CSOs. Table 1.0 presents the conditional concepts and the pre-defined concept words (assumed by the researcher herself, in addition to those available from the literature (Suriakala, M. and Sambanthan, T.G., 2008)). Note that the structure words shown are only samples and are not exhaustive. For the computation of conditional probability values many more words have been considered. The experiment is repeated several times with different inputs (cases). The extractions have been plotted along with the CPU times consumed by the extraction procedures for the three structured documents. The algorithm also analyses the input concept words and classifies them into the four structural categories by comparing them with the pre-classified concept words, apart from the domain-dependent keywords. Input words not found in the pre-defined categories are ignored. The objective of the experiment is to demonstrate the efficiency of the proposed structuring.
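The handling of the user input described above can be sketched as follows. The class `QueryClassifier`, its `ParsedQuery` holder and the domain vocabulary are hypothetical names introduced for illustration, and the keyword sets are small samples from Table 1.0; this is a sketch of the idea rather than the algorithm used in the experiments.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only: splits a query into structural categories and domain words. */
final class QueryClassifier {

    // Sample keywords per structure category, a small subset of Table 1.0.
    static final Map<String, Set<String>> CATEGORY_KEYWORDS = new LinkedHashMap<>();
    static {
        CATEGORY_KEYWORDS.put("Factual", Set.of("what", "define", "list", "state"));
        CATEGORY_KEYWORDS.put("Procedural", Set.of("how", "explain", "demonstrate", "illustrate"));
        CATEGORY_KEYWORDS.put("Problem Solving", Set.of("how", "apply", "solve", "calculate"));
        CATEGORY_KEYWORDS.put("Conceptual", Set.of("why", "analyze", "justify", "design"));
    }

    // Hypothetical domain vocabulary (an assumption; not listed in the paper).
    static final Set<String> DOMAIN_WORDS = Set.of(
            "pointer", "pointers", "array", "arrays", "function", "functions", "int");

    /** Result holder: the categories and domain words recognized in a query. */
    static final class ParsedQuery {
        final List<String> categories = new ArrayList<>();
        final List<String> domainWords = new ArrayList<>();
    }

    static ParsedQuery parse(String query) {
        ParsedQuery parsed = new ParsedQuery();
        for (String token : query.toLowerCase(Locale.ROOT).split("[^a-z]+")) {
            if (token.isEmpty()) {
                continue;
            }
            if (DOMAIN_WORDS.contains(token)) {
                parsed.domainWords.add(token);
                continue;
            }
            for (Map.Entry<String, Set<String>> entry : CATEGORY_KEYWORDS.entrySet()) {
                if (entry.getValue().contains(token) && !parsed.categories.contains(entry.getKey())) {
                    parsed.categories.add(entry.getKey());
                }
            }
            // Tokens that match neither list are ignored.
        }
        return parsed;
    }

    public static void main(String[] args) {
        ParsedQuery parsed = parse("How Pointers work?");
        System.out.println(parsed.categories + " " + parsed.domainWords);
    }
}
```

For the sample input "How Pointers work?", the word "how" is listed under both 'Procedural' and 'Problem Solving' in Table 1.0, so this sketch reports both categories; "pointers" is kept as a domain word and "work" is ignored.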

Key parameters for the analytical study include efficiency and accuracy. Efficiency is calculated in terms of computational time required for extracting FSOs, CSOs and ISOs, whereas accuracy is determined in terms of extracting the correct concept document file according to conceptual and domain words input by the user.

Independent and Conditional Probabilities for Concept Structuring:

As per the Naive Bayes conditional probability theorem, the probability of occurrence of any instance of an event 'e' is represented as $P(e)$. The probability of event 'e' being in category 'C' is $P(e \mid C)$. The probability of occurrence of a particular instance of the category $C_i$ is $P(C_i)$. The probability of generating event 'e' in a given category $C_i$ is $P(C_i \mid e)$. Applying Bayes' theorem, the probability of instance 'e' of category $C_i$ is computed as:

$$P(e \mid C_i) = \frac{P(e) \times P(C_i \mid e)}{P(C_i)} \qquad (1)$$

The event of equation (1) is for one document.

For all the objects (files),

$$P(e \mid C) = \prod_{i=1}^{N} \frac{P(e) \times P(C_i \mid e)}{P(C_i)} \qquad (2)$$

Where N in equation (2) is the total number of objects and the capital Pi ($\prod$) denotes the product of the values (elements) of each object.
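Equations (1) and (2) can be sketched in Java as below. The class name `BayesSketch`, the method names and the numeric inputs in `main` are hypothetical and chosen only to show the arithmetic; each row of the input array is assumed to hold the three values P(e), P(Ci|e) and P(Ci) for one object.

```java
/** Illustrative sketch only: direct evaluation of equations (1) and (2). */
final class BayesSketch {

    /** Equation (1): P(e|Ci) = P(e) * P(Ci|e) / P(Ci). */
    static double conditional(double pE, double pCiGivenE, double pCi) {
        if (pCi <= 0.0) {
            throw new IllegalArgumentException("P(Ci) must be positive");
        }
        return pE * pCiGivenE / pCi;
    }

    /** Equation (2): product of the equation (1) values over all objects. */
    static double productOverObjects(double[][] objects) {
        double product = 1.0;
        for (double[] o : objects) {
            // Each row is {P(e), P(Ci|e), P(Ci)} for one object (an assumed layout).
            product *= conditional(o[0], o[1], o[2]);
        }
        return product;
    }

    public static void main(String[] args) {
        // Hypothetical probabilities, not taken from the experiments.
        double[][] objects = { {0.5, 0.4, 0.8}, {0.25, 0.5, 0.5} };
        System.out.println(conditional(0.5, 0.4, 0.8));
        System.out.println(productOverObjects(objects));
    }
}
```

For a large number of objects the product in equation (2) becomes very small, so a practical implementation might sum logarithms instead; that variation is not described in the paper and is mentioned here only as a design option.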

RESULTS AND DISCUSSIONS

The computation of probability values (both independent for FSOs and ISOs and conditional for CSOs) is demonstrated below for a sample topic on "'int' Data type". The actual textual data are shown in Figures 1.0, 2.0 and 3.0 for factual, procedural and ill structured categories. Table 2.0 presents all computed values including processing time consumed by CPU for 10 chosen topics.

Fig. 1.0: Factual Category CSO.

What is 'int' data type? It is an integer constant. It is stated to
consist of a sequence of one or more digits. A minus sign
distinguishes that the value is negative. Values like 158, -10, and
0 are all valid examples. It should be noted that no embedded
spaces are allowed between digits. It should be underlined that no
commas are allowed even for large numbers.

Fig. 2.0: Procedural Category CSO.

Explain expression of 'int' data type.

An expression may consist of symbols, characters and constants. It
is illustrated that constants are important to fully understand the
operations of 'C'. It is identified with two formats in 'C'. If the
first digit of the integer constant is zero, then the integer is
taken as labeled in octal notation that is in base 8. In that case,
the remaining digits of the value must be valid base-8 digits and
therefore must be from 0 through 7. It is again illustrated that if
the integer constant is preceded by a 0 and the letter x (capital
and lower cases are not distinguished) then it is labeled as
hexadecimal notation that is in base 16. In that case, the
remaining digits of the value must be valid base-16 digits and
therefore must be from 0 through 15. It is stressed that the
letters a through f (or A through F) are used for representing
digits 10 through 16.

Fig. 3.0: Ill Structured Document.

'int' data type.
integer constant. Consists of sequence of digits. Minus sign
means negative. No embedded spaces allowed. No commas allowed.
Expression may have symbols, characters and constants. There are
two formats. If the first digit is zero, then it is octal with
base 8. The remaining digits must be from 0 through 7. If the
first digit is 0 and the letter x (capital and lower case) then
it is hexadecimal with base 16. The remaining digits
must be 0 through 15. For digits 10 through 16 letters a through
f (or A through F) are used.


For the topic "'int' data type", two documents (CSOs) designed for the 'Factual' and 'Procedural' structures are shown in Fig. 1.0 and Fig. 2.0 as samples. An ill structured document (ISO) is also prepared from the total document (FSO) and is shown in Fig. 3.0. The total number of words is 67 in the factual document (CSO) and 164 in the procedural document (CSO). The number of factual concept words in the first document (CSO) is 5, while the number of procedural concept words in the procedural document (CSO) is 8. The total number of domain words in the combined document (FSO) is 45. The total number of words in the combined document (FSO) is 231. The total number of words in the ill structured document (ISO) is 98. The total number of domain words in the ISO is 39.

The independent probability value for FSO is 45 / 231 = 0.195. The independent probability value for ISO is 39 / 98 = 0.459. The conditional probability value for factual CSO is calculated as:

$$\frac{(1/67) \times (5/25)}{(5/67)}$$

where the total number of non-domain structure words is 25. Similarly, the conditional probability value for procedural CSO is calculated as:

$$\frac{(1/164) \times (8/38)}{(8/164)}$$

where the total number of non-domain structure words is 38.

Thus the conditional probability value for factual CSO is 0.04 and for procedural CSO is 0.026.
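The two conditional values can be checked against equation (1) symbolically. The symbols W (total words in the CSO), k (structure words of the category found in the CSO) and T (total structure words of the category) are introduced here only for illustration, and the assignment of the quoted fractions to the terms of equation (1) is inferred from the worked figures rather than stated in the text:

```latex
% Assumed mapping onto equation (1): P(e) = 1/W, P(C_i | e) = k/T, P(C_i) = k/W
P(e \mid C_i) = \frac{\frac{1}{W} \cdot \frac{k}{T}}{\frac{k}{W}} = \frac{1}{T}

% Factual CSO: W = 67, k = 5, T = 25
P(e \mid C_{\mathrm{factual}}) = \frac{1}{25} = 0.04

% Procedural CSO: W = 164, k = 8, T = 38
P(e \mid C_{\mathrm{procedural}}) = \frac{1}{38} \approx 0.026
```

Under this reading the document length and the number of matched words cancel, so the value for a single CSO depends only on the total number of structure words in its category.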

The experimental study is conducted with 10 topics and the results are tabulated in Table 2.0.

For the purpose of comparisons so as to prove the advantage of structuring the content, averages of the two probability values and the computational time taken for extractions are determined. Note that suitable total structure words in the category of factual, procedural, problem solving and conceptual have been considered with values 25, 40, 15 and 35 respectively. The average and consolidated results are presented below in Table 3.0.

Total probability value for CSO (for parallel the values are added and for serial they are multiplied): 0.17. The computational results for the comparative study are plotted, and the average probability values for the three structures are shown in Fig. 4.0.

It is observed from Fig. 4.0 that, even though the probability value of ISO is higher than that of CSO, ISO cannot distinguish structural concepts. Both FSO and ISO consider only domain words as concepts, like pure keywords. Hence the efficiency of CSO is demonstrated through Fig. 4.0. Within the CSO, the individual average conditional probability values of the four structures are presented in Fig. 5.0. It is observed from Fig. 5.0 that the probability of discovering the 'Problem Solving' concept is higher than that of the others.

Fig. 6.0 shows the average processing times (in ms) for all the cases. On average, the processing time consumed by CSO is less than that of FSO. Besides, with CSO users may choose any particular structural concept of their choice, which is not possible with FSO and ISO. Hence the effectiveness of CSO is also validated.

Conclusions:

The experimental results clearly demonstrate that, even though the computational time consumed for extracting ill structured documents is less than that for the other types of structured documents, the accuracy of domain concept extraction suffers considerably in this case. Besides, FSO and ISO cannot distinguish a structural concept from a mere domain word. Further, both FSO and ISO consider pure keywords and not concepts. The experimental results also show that concepts cannot be extracted accurately through pure domain words from fully structured documents unless the documents are tagged with concept words. Ill structured documents likewise cannot convey a meaningful concept. It is concluded that conditional structuring of documents with instructional concept words, when these are tagged to the documents, will be more accurate and efficient in extracting concepts.

ARTICLE INFO

Article history:

Received 12 March 2015

Accepted 28 April 2015

Available online 1 June 2015

Corresponding Author: S. Florence Vijila, Research Scholar, Manonmaniam Sundaranar University, Tamil Nadu, India, Assistant Professor, CSI Ewart Women's Christian College, Melrosapuram, TamilNadu. E-mail: florencevijila@yahoo.com

REFERENCES

Bruno M. Fonseca, Paulo Braz Golgher, Bruno Possas, and Berthier A. Ribeiro-Neto, 2005. "Concept-based Interactive Query Expansion", ACM Conference on Information and Knowledge Management (CIKM), Bremen, Germany, pp: 696-703.

Gagne Robert, M., 1985. "The conditions of learning and theory of Instruction". 4th edition. New York: Holt, Rinehart, and Winston. xv, 361pages.

Hammouda, K., M. Kamal, 2004. "Efficient phrase-based document indexing for Web document clustering", IEEE Transactions on Knowledge and Data Engineering, 16(10): 1279-1296.

Hansson, H., 2006. 'The use of Net-Learning in Higher Education in the Nordic Countries', In Pre-information for the presentation, Kyoto, Japan.

Kamruzzaman, S.M., Farhana Haider and Ahmed Ryadh Hasan, 2004. "Text Classification using Association Rule with a Hybrid Concept of Naive Bayes Classifier and Genetic Algorithm", Proc. 7th International Conference on Computer and Information Technology (ICCIT-2004), Dhaka, Bangladesh, pp: 682-687.

Kochan, Stephen, 1991. "Programming in C", CBS Publishers & Distributors, New Delhi.

Masaru Ohba, Katsuhiko Gondow, 2005. "Toward mining 'concept keywords' from identifiers in large software projects". ACM SIGSOFT Software Engineering Notes, 30(4): 1-5.

Mirella Lapata, 2003. Probabilistic Text Structuring: Experiments with Sentence Order, Proceedings of ACL-2003, Association for Computational Linguistics, Stroudsburg, PA, USA.

Omwenga, E.I. and A.J. Rodrigues, 2006. 'Towards an Education Evaluation Framework: Synchronous and Asynchronous e-Learning Cases', Journal of the Research Centre for Educational Technology, Kent, [online], http://www.rcetj.org/Default.aspx?type=art&id=4756

Saleema Amershi and Cristina Conati, 2009. "Combining Unsupervised and Supervised Classification to Build User Models for Exploratory Learning Environments", Journal of Educational Data Mining, Article, 2, 1(1), Fall.

Sergio Guadarrama, Marta Garrido, 2006. "Concept-Analyzer: A tool for analyzing fuzzy concepts", B. Reusch, editor, Computational Intelligence: Theory and Practice, 164: 353-366. Springer.

Subhatul Marjan, 2014. "Making Mobile Learning Implementation More Effective", http://www.elearningserv.com/blog/making_mobile_learning_implementation_more_effective/ 2004.

Suriakala, M. and T.G. Sambanthan, 2008. "Problem Centric Objectives for Conflicting Technical Courses", The Indian Journal of Technical Education, 31: 87-90.

(1) S. Florence Vijila and (2) Dr. K. Nirmala

(1) Research Scholar, Manonmaniam Sundaranar University, Tamil Nadu, India, Assistant Professor, CSI Ewart Women's Christian College, Melrosapuram, Tamil Nadu

(2) Associate Prof. of Computer Applications, Quaid-e-Millath Govt. College for Women, Chennai, Tamil Nadu, India

Table 1.0: Pre-defined sample words and functionalities of the four
Structures.

Category   Structure         Input                What is it?
           Category

1          Factual           Facts             Knowledge of facts.
2          Procedural        Procedures;       Knowledge of how to
                               Algorithms,       perform a sequence
                               Processes         of operations.
3          Problem Solving   Heuristics;       How to develop a
                               Methods;          solution plan.
                               Techniques        Knowledge of problem
                                                 types, organizing
                                                 frameworks and
                                                 mental models.
4          Conceptual        Concepts;         Knowledge of problem
                               Schemas; Models   types, organizing
                                                 frameworks, mental
                                                 models.

Category   Structure            Concept Keywords (Samples)
           Category

1          Factual              list, what, note, define, tell,
                                  name, locate, identify,
                                  distinguish, acquire, write,
                                  underline, relate, state, recall,
                                  select, repeat, recognize,
                                  reproduce, measure, memorize.
2          Procedural           demonstrate, explain, how, write,
                                  detail, summarize, illustrate,
                                  interpret, contrast, predict,
                                  associate, distinguish, identify,
                                  show, label, collect, experiment,
                                  recite, classify, stress, discuss,
                                  select, compare, translate,
                                  prepare, change, rephrase,
                                  differentiate, draw, explain,
                                  estimate, fill in, choose, operate,
                                  perform, organize.
3          Problem Solving      apply, calculate,
                                  illustrate, solve, make
                                  use of, predict, how,
                                  construct, assess,
                                  practice, restructure,
                                  classify.
4          Conceptual           analyze, resolve, justify, infer,
                                  combine, integrate, why, plan,
                                  create, design, generalize, assess,
                                  decide, rank, grade, test,
                                  recommend, select, explain, judge,
                                  contrast, survey, examine,
                                  differentiate, investigate,
                                  compose, invent, improve, imagine,
                                  hypothesize, prove, predict,
                                  evaluate, rate.

Table 2.0: Computed results from the experimental studies.

Topic No.&      Document type               FSO
Title
                                  Total No.      No. of
                                  of Words    Domain Words

1. Data Types      Combined         1540          340
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

2. Variables       Combined          766          150
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

3. Expression      Combined         2030           89
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

4. Functions       Combined         1868           59
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

5. Looping         Combined         3402           87
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

6. 'if' branch     Combined          988           34
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

7. Arrays          Combined         1240           86
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

8. Pointers        Combined         2340          534
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

9. Structures      Combined         1128           62
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

10. Files          Combined          786          127
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

Topic No.&      Document type              FSO
Title
                                  Independent   Extraction
                                  Probability   Processing
                                                 Time ms

1. Data Types      Combined          0.22           98
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

2. Variables       Combined          0.20           55
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

3. Expression      Combined          0.04          102
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

4. Functions       Combined          0.03           97
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

5. Looping         Combined          0.03          143
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

6. 'if' branch     Combined          0.03           67
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

7. Arrays          Combined          0.07           84
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

8. Pointers        Combined          0.23          102
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

9. Structures      Combined          0.05           88
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

10. Files          Combined          0.16           64
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

Topic No. &     Document type              CSO
Title
                                  Total No.    No. of
                                  of Words    Structure
                                                Words

1. Data Types      Combined          --          --
                    Factual          632         20
                  Procedural        1128         38
                Problem Solving      434         12
                  Conceptual         232         18

2. Variables       Combined          --          --
                    Factual          320          9
                  Procedural         546         16
                Problem Solving      112          9
                  Conceptual         120         16

3. Expression      Combined          --          --
                    Factual          765         18
                  Procedural         844         32
                Problem Solving     1256         12
                  Conceptual         480         16

4. Functions       Combined          --          --
                    Factual          842         21
                  Procedural        1026         39
                Problem Solving     1440         15
                  Conceptual         640         18

5. Looping         Combined          --          --
                    Factual          854         19
                  Procedural        1240         23
                Problem Solving     2760         17
                  Conceptual        1026         19

6. 'if' branch     Combined          --          --
                    Factual          240          8
                  Procedural         686         17
                Problem Solving      850          9
                  Conceptual         126          9

7. Arrays          Combined          --          --
                    Factual          642         17
                  Procedural         988         22
                Problem Solving     1020         10
                  Conceptual         230          6

8. Pointers        Combined          --          --
                    Factual          544         12
                  Procedural        1054         18
                Problem Solving     1604         10
                  Conceptual        1240         10

9. Structures      Combined          --          --
                    Factual          542         10
                  Procedural         888         13
                Problem Solving     1020         10
                  Conceptual         646          8

10. Files          Combined          --          --
                    Factual          241          6
                  Procedural         542         10
                Problem Solving      667          8
                  Conceptual         209          4

Topic No. &     Document type             CSO
Title
                                  Conditional   Extraction
                                  Probability   Processing
                                                Time (ms)

1. Data Types      Combined           --            --
                    Factual          0.04           44
                  Procedural         0.03           80
                Problem Solving      0.06           32
                  Conceptual         0.03           18

2. Variables       Combined           --            --
                    Factual          0.04           30
                  Procedural         0.03           48
                Problem Solving      0.07            9
                  Conceptual         0.03           10

3. Expression      Combined           --            --
                    Factual          0.04           67
                  Procedural         0.03           70
                Problem Solving      0.07          102
                  Conceptual         0.03           33

4. Functions       Combined           --            --
                    Factual          0.04           68
                  Procedural         0.03           79
                Problem Solving      0.07           83
                  Conceptual         0.03           58

5. Looping         Combined           --            --
                    Factual          0.04           70
                  Procedural         0.03          101
                Problem Solving      0.07          198
                  Conceptual         0.03           92

6. 'if' branch     Combined           --            --
                    Factual          0.04           19
                  Procedural         0.03           60
                Problem Solving      0.07           79
                  Conceptual         0.03           11

7. Arrays          Combined           --            --
                    Factual          0.04           55
                  Procedural         0.03           82
                Problem Solving      0.07           98
                  Conceptual         0.03           19

8. Pointers        Combined           --            --
                    Factual          0.04           48
                  Procedural         0.03           98
                Problem Solving      0.07          124
                  Conceptual         0.03          111

9. Structures      Combined           --            --
                    Factual          0.04           49
                  Procedural         0.03           74
                Problem Solving      0.07           87
                  Conceptual         0.03           52

10. Files          Combined           --            --
                    Factual          0.04           18
                  Procedural         0.03           40
                Problem Solving      0.07           53
                  Conceptual         0.03           14

Topic No. &     Document type              ISO
Title
                                  Total No.      No. of
                                  of Words    Domain Words

1. Data Types      Combined          722          276
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

2. Variables       Combined          302          102
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

3. Expression      Combined          868           53
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

4. Functions       Combined          898           54
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

5. Looping         Combined         1498           64
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

6. 'if' branch     Combined          408           30
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

7. Arrays          Combined          553           76
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

8. Pointers        Combined         1002          384
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

9. Structures      Combined          504           38
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

10. Files          Combined          249           86
                    Factual          --            --
                  Procedural         --            --
                Problem Solving      --            --
                  Conceptual         --            --

Topic No. &     Document type             ISO
Title
                                  Independent   Extraction
                                  Probability   Processing
                                                Time (ms)

1. Data Types      Combined          0.38           50
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

2. Variables       Combined          0.34           33
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

3. Expression      Combined          0.06           61
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

4. Functions       Combined          0.06           66
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

5. Looping         Combined          0.04           98
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

6. 'if' branch     Combined          0.07           28
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

7. Arrays          Combined          0.14           37
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

8. Pointers        Combined          0.38           49
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

9. Structures      Combined          0.08           40
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

10. Files          Combined          0.35           38
                    Factual           --            --
                  Procedural          --            --
                Problem Solving       --            --
                  Conceptual          --            --

Table 3.0: Average Values for Comparisons.

Structure   Average Total   Average Total    Average Total
Type        No. of Words    Domain Words    Structure Words

FSO             1609            156.8             --
CSO (Fa)        562.2            --               14
CSO (Pr)        894.2            --              22.8
CSO (PS)       1116.3            --              11.2
CSO (Co)        494.9            --              12.4
ISO              700            116.3             --

Structure     Average       Average         Average
Type        Independent   Conditional   Processing Time
            Probability   Probability   for Extraction
               Value         Value           (ms)

FSO            0.106          --              90
CSO (Fa)        --           0.04            46.8
CSO (Pr)        --           0.03            73.2
CSO (PS)        --           0.07            86.5
CSO (Co)        --           0.03            41.8
ISO            0.19           --              50

Legend: Fa: Factual; Pr: Procedural; PS: Problem Solving;
Co: Conceptual.
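
The CSO averages in Table 3.0 can be recomputed from the per-topic rows. The Python sketch below does this for total words and extraction time; the figures are copied from the CSO tables (topics 1 to 10), while the dictionary and function names are illustrative and not part of the original study.

```python
from statistics import mean

# Per-topic CSO values copied from the tables above, in topic order 1..10.
cso_total_words = {
    "Fa": [632, 320, 765, 842, 854, 240, 642, 544, 542, 241],
    "Pr": [1128, 546, 844, 1026, 1240, 686, 988, 1054, 888, 542],
    "PS": [434, 112, 1256, 1440, 2760, 850, 1020, 1604, 1020, 667],
    "Co": [232, 120, 480, 640, 1026, 126, 230, 1240, 646, 209],
}
cso_time_ms = {
    "Fa": [44, 30, 67, 68, 70, 19, 55, 48, 49, 18],
    "Pr": [80, 48, 70, 79, 101, 60, 82, 98, 74, 40],
    "PS": [32, 9, 102, 83, 198, 79, 98, 124, 87, 53],
    "Co": [18, 10, 33, 58, 92, 11, 19, 111, 52, 14],
}


def averages(columns):
    """Mean of each document type's column, rounded to one decimal place."""
    return {doc_type: round(mean(values), 1) for doc_type, values in columns.items()}


avg_words = averages(cso_total_words)
avg_time = averages(cso_time_ms)

for doc_type in cso_total_words:
    print(f"CSO ({doc_type}): {avg_words[doc_type]} words, {avg_time[doc_type]} ms")
```

Both sets of recomputed means agree with the CSO rows of Table 3.0.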
COPYRIGHT 2015 American-Eurasian Network for Scientific Information

Article Details
Author: Vijila, S. Florence; Nirmala, K.
Publication: Advances in Natural and Applied Sciences
Article Type: Report
Date: Jun 15, 2015
Words: 4846