
A neural network based software retrieval system with fuzzy-related thesaurus.


Abstract: The quality of both the classification scheme and the retrieval queries has a significant impact on the performance of a software retrieval system. A classification scheme based on a Nested Self-Organising Map (NSOM) and a query refinement mechanism based on a fuzzy-related thesaurus are proposed to improve both. An NSOM consists of a top map and a set of nested maps. Retrieval on the top map maintains high recall, while retrieval on the nested maps enhances precision. A fuzzy-related thesaurus can be generated from an NSOM. The user can formulate an improved query by adding terms or by replacing an original query term with an appropriate term stored in the thesaurus. The experimental results reveal that both the NSOM and query refinement significantly improved the retrieval performance.

Keywords: Self-organising map, Software retrieval, Query refinement, Thesaurus.

I. Introduction

The performance of a retrieval system is usually measured by recall and precision. Recall is the proportion of relevant material retrieved, measuring how well a system retrieves all the relevant material. Precision is the proportion of retrieved material that is relevant, measuring how well the system retrieves only the relevant material. Recall and precision tend to be inversely related. When a search is broadened to achieve better recall, precision tends to go down, and vice versa. In the context of software retrieval, for a given query we want to find the most relevant software component so that it can be reused with minimum effort to adapt it to a new application. In this case, precision is more important than recall. However, if recall is not maintained at a reasonable level, some of the most relevant software components may be missed. Therefore, improving precision without excessively compromising recall is crucial for software retrieval systems. To achieve such retrieval performance, an appropriate classification scheme must be developed.

Another factor that significantly influences retrieval performance is the quality of retrieval queries. Formulating precise and effective queries in information retrieval systems has always been a difficult task, even for experienced users [1]. When searching for information to solve a problem, people often do not have a clear idea of what information is needed. Searching for information may be regarded as a situation of irresolution or an anomalous state of knowledge, in which users believe that the knowledge that can help solve their problem exists, but they are unable to characterise the problem or articulate their information needs adequately. Therefore, ill-defined queries are very common in retrieval systems, and a query refinement mechanism is necessary to help improve retrieval performance.

A large amount of effort has been devoted to finding suitable approaches to building software retrieval systems [2-4]. However, Mili et al. [5] concluded that this issue has not been satisfactorily solved. This paper presents a neural network based software retrieval system with a fuzzy-related thesaurus to enhance retrieval precision without excessively compromising recall. Both the classification scheme and the fuzzy-related thesaurus based query refinement are based on an unsupervised neural network, the Self-Organising Map (SOM) [6].

SOM has been extensively used in document classification [7-9]. Such classifications are usually coarse-grained and cannot accommodate high precision in information retrieval [8]. We developed a neural network architecture, called the Nested Self-Organising Map (NSOM), to achieve a better balance between recall and precision. The NSOM based classification is done in two levels, and the accuracy of the classification is enhanced from the first, coarse-grained level to the second, fine-grained level. The coarse-grained classification at the first level is used to maintain a high level of recall, and precision is improved by the fine-grained classifications at the second level.

Query refinement is an essential information retrieval tool that interactively recommends new terms related to a particular query so that the user can improve the quality of the query. When the user interacts with a retrieval system, the system provides term excerpts that are considered relevant to a particular user query. The user can then reformulate the query by adding terms from the excerpts or replacing an original query term with a related term stored in the excerpts. This is called thesaurus-based query refinement. A fuzzy-related thesaurus is used in this retrieval system.

Several retrieval experiments have been done and very promising results have been observed. The NSOM based retrieval improved precision in comparison with other retrieval systems. The query refinement worked well for the ill-defined queries and a significant improvement on both recall and precision was obtained.

The remainder of the paper is organised as follows. Section II presents an overview of the retrieval system. Section III describes how to represent a natural language query for the retrieval. Section IV presents a sample NSOM based retrieval session to show how the retrieval system works. Section V discusses the query refinement mechanism. The experimental results are presented in Section VI. Section VII concludes the paper.

II. Overview of the Software Retrieval Systems

This software retrieval system is based on the textual documentation associated with software components rather than on the components themselves. For the purpose of reuse, the most significant characteristic of a software component is its functionality. Most software components come with textual descriptions of their functionality in the form of system descriptions, operation manuals, user documents and so on. Because the major interest in software retrieval is the functionality of the required components, these natural language documents can be used as surrogates for the software components in the process of software classification and retrieval.

The system consists of three modules: the representation scheme, which uses an automatic indexing approach to transform the software documents into feature vectors; the NSOM based classification scheme, which classifies the feature vectors; and the retrieval mechanism with a fuzzy-related thesaurus.

The representation scheme uses an automatic free-text indexing method to identify the features associated with each component. An automatic indexing method is used to identify the indices associated with a software document collection, called a corpus. Each index can be considered as a feature belonging to the document from which it is identified. The total number of the indices obtained from a corpus is the dimension of the feature vector. The feature space containing a number of feature vectors will be presented to the SOM as its input data.
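As a rough illustration of this representation step, the following Python sketch turns a toy corpus into feature vectors. It assumes a deliberately crude indexer (lowercasing and stop-word removal only) and raw term-frequency weights; the paper's actual indexing method is the one described in [11], and the stop-word list, function names and one-line descriptions here are all invented for the example.

```python
from collections import Counter

# Hypothetical stop-word list; not taken from the paper.
STOP_WORDS = {"a", "an", "the", "of", "to", "and", "over", "in", "from"}


def index_terms(text):
    """Crude stand-in for automatic indexing: lowercase, split, drop stop words."""
    words = (w.strip(".,;:?!\"'()").lower() for w in text.split())
    return [w for w in words if w and w not in STOP_WORDS]


def build_feature_space(corpus):
    """Collect every index term in the corpus; its size is the vector dimension."""
    return sorted({t for text in corpus.values() for t in index_terms(text)})


def to_feature_vector(text, vocabulary):
    """Term-frequency vector over the vocabulary (the weighting is an assumption)."""
    counts = Counter(index_terms(text))
    return [float(counts[term]) for term in vocabulary]


# Toy corpus: invented one-line descriptions of real Unix commands.
corpus = {
    "ftp": "transfer files to and from a remote host",
    "telnet": "connect to a remote host",
    "ls": "list the files in a directory",
}
vocabulary = build_feature_space(corpus)
vectors = {name: to_feature_vector(text, vocabulary) for name, text in corpus.items()}
```

Every document vector has the same length as `vocabulary`, which is what allows the whole collection to be presented to a SOM as input data.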

A SOM can learn from its input data. Each input stimulus elicits a localised response. This corresponds to a non-linear projection of the input data onto the network that makes the most important semantic relationships among the input data items geometrically explicit [6]. It is this property of the SOM that makes it useful for classification. An NSOM consists of a number of SOMs organised in two levels: a single map at the top level and a number of nested maps at the second level. Each nested map contains a subset of the original document collection. A coarse-grained classification on the top map supports high recall, and the fine-grained classifications on the nested maps accommodate high precision.
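To make the idea of a localised response concrete, here is a minimal SOM training sketch in Python. It is a generic textbook-style SOM, not the NSOM training procedure of [10]: the rectangular grid, the Gaussian neighbourhood, the linear decay schedules and all parameter values are assumptions chosen for brevity.

```python
import math
import random


def winning_node(weights, x):
    """Node whose weight vector is closest (Euclidean distance) to input x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


def train_som(data, rows, cols, epochs=50, lr0=0.5, radius0=None, seed=0):
    """Train a small rectangular SOM; returns a dict (row, col) -> weight vector."""
    rng = random.Random(seed)
    dim = len(data[0])
    radius0 = radius0 or max(rows, cols) / 2.0
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        # Learning rate and neighbourhood radius both shrink as training proceeds.
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)
        radius = max(radius0 * (1.0 - frac), 0.5)
        for x in rng.sample(data, len(data)):
            bmu = winning_node(weights, x)
            for node, w in weights.items():
                grid_dist = math.dist(node, bmu)
                if grid_dist <= radius:
                    # Gaussian neighbourhood: nodes near the winner move the most.
                    h = math.exp(-(grid_dist ** 2) / (2 * radius ** 2))
                    for i in range(dim):
                        w[i] += lr * h * (x[i] - w[i])
    return weights


# Four toy feature vectors forming two obvious groups.
data = [[1.0, 0.0, 0.0], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [0.0, 0.1, 0.9]]
som = train_som(data, rows=3, cols=3)
positions = [winning_node(som, x) for x in data]
```

After training, `positions` gives the grid location of each input; similar inputs tend to land on the same or neighbouring nodes, which is the property the classification relies on.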

The retrieval mechanism enables the user to express queries in natural language without the need of understanding the inner working of the retrieval mechanism. A domain dependent fuzzy-related thesaurus is developed for query refinement to help improve the retrieval performance for ill-defined queries. The details of the representation and classification schemes have been reported in [10-11]. This paper will concentrate on the retrieval mechanism. A typical retrieval process is presented in Figure 1.

[FIGURE 1 OMITTED]

III. Query Representation

The first step in retrieval is the formulation of queries. Traditionally, queries are specified by the user according to an authorised formalism. The user is usually required to have good knowledge of the formalism and of the inner workings of the software library, so formulating queries costs considerable user effort. This system accepts natural language queries to minimise the user effort in query formulation. The queries first undergo an indexing process and are then represented as query vectors based on the indexing results.

Query indexing includes deleting stop words, stemming, replacing single terms with concept classes, and phrase formation. The specific methods used in the indexing process are the same as those used in software document indexing and have been described in detail in [11]. As a result of indexing, queries are represented by a set of features. Assume that there is an n-dimensional document feature space to which a query will be mapped. The query will also be represented by an n-dimensional vector $q_i = [\sigma_{i1}, \sigma_{i2}, \ldots, \sigma_{ij}, \ldots, \sigma_{in}]^T$. Each element of the query vector corresponds to the presence or absence of a certain feature in the document feature space. If all the features assigned to a query are included in the document feature space, the query vector is formed simply by assigning the weights of these features to the corresponding elements and assigning zero to all the other elements of the query vector. Unfortunately, this is not always the case. The following steps may be used to deal with query features that are not included in the document feature space:

Step 1: A domain-dependent dictionary is employed to find a synonymous document feature for a given query feature.

Step 2: Query features that fail to find synonymous document features in Step 1 will be discarded because these features are considered irrelevant to the components stored in the system.

The formed query vector can then be submitted to the retrieval system and the system will search the desired component(s) by projecting the query vector onto a pre-established NSOM to find the potential retrieval candidates.
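The query-vector construction and the two fallback steps above can be sketched as follows. This is only an illustration under stated assumptions: the function name `build_query_vector`, the uniform weight of 1.0, the toy feature space and the small synonym dictionary are all hypothetical, and the paper does not specify how feature weights are computed.

```python
def build_query_vector(query_features, document_features, synonyms, weight=1.0):
    """Map indexed query features onto the n-dimensional document feature space.

    query_features: features produced by indexing the query
    document_features: ordered list of the n document features
    synonyms: hypothetical domain dictionary, query feature -> document feature
    """
    position = {feature: j for j, feature in enumerate(document_features)}
    vector = [0.0] * len(document_features)
    discarded = []
    for feature in query_features:
        if feature not in position:
            # Step 1: look for a synonymous document feature.
            mapped = synonyms.get(feature)
            if mapped not in position:
                # Step 2: no synonym in the feature space, so discard the feature.
                discarded.append(feature)
                continue
            feature = mapped
        vector[position[feature]] = weight
    return vector, discarded


document_features = ["copy", "file", "network", "print"]
synonyms = {"transfer": "copy", "internet": "network"}
query_vector, discarded = build_query_vector(
    ["transfer", "file", "internet", "quickly"], document_features, synonyms
)
# query_vector is [1.0, 1.0, 1.0, 0.0]; "quickly" has no counterpart and is discarded.
```

Returning the discarded features alongside the vector is a design choice of this sketch; it makes it easy to show the user which query terms had no effect.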

IV. NSOM-based Retrieval

The architecture of the NSOM is shown in Figure 2. It consists of a top map (TM) and a number of nested maps (NMs), of which only one example is shown. The TM contains the whole software component collection, and the NMs are software component maps of sub-collections of the whole set of components. Assume there is a certain sub-area of the TM whose centroid is node c (the hexagon enclosed by dotted lines in Figure 2); a number of documents will be mapped within this sub-area. The number of features associated with this sub-collection is much smaller than that of the full collection because of the small size of the sub-collection. Therefore, the sub-collection can be represented by a set of feature vectors with a much lower dimension. These feature vectors are used to train an NM. On completion of the training, a more elaborate map of the sub-collection is formed on the NM. The detailed algorithm for dividing the top map into a number of NMs has been presented in [10].

[FIGURE 2 OMITTED]

This architecture enables a two-step retrieval process. The first step conducts a coarse-grained retrieval on the TM, while the second step performs a fine-grained retrieval on the NMs. First, the query vector is mapped onto the TM. The location of the query vector on the TM determines the corresponding NM. The query vector is then mapped onto the NM, and the matched cluster is found on the NM. The components located in the matched cluster are ranked according to their similarity to the query. The ranked component list is returned to the user as the retrieval candidates.
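The two-step process can be sketched in Python as below. Several simplifications are assumed and are not taken from the paper: the matched cluster is reduced to the single winning node, similarity is measured by cosine, the query is supplied already expressed in each map's own feature space, and every map, node and vector in the example is invented.

```python
import math


def cosine(u, v):
    """Cosine similarity; returns 0.0 when either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def winning_node(weights, x):
    """Node whose weight vector is closest (Euclidean distance) to x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


def two_step_retrieve(query_by_map, top_map, nm_of_node, nested_maps, members):
    """Coarse retrieval on the TM, then fine retrieval on the selected NM.

    query_by_map: map name -> query vector in that map's feature space
    top_map: TM node -> weight vector
    nm_of_node: TM node -> name of the nested map covering that node
    nested_maps: NM name -> (NM node -> weight vector)
    members: NM name -> NM node -> {component name: feature vector}
    """
    # Step 1: the winning node on the top map selects the nested map.
    nm_name = nm_of_node[winning_node(top_map, query_by_map["TM"])]
    # Step 2: map the query onto that nested map and rank the matched components.
    nm_query = query_by_map[nm_name]
    nm_winner = winning_node(nested_maps[nm_name], nm_query)
    candidates = members[nm_name].get(nm_winner, {})
    ranked = sorted(candidates,
                    key=lambda name: cosine(candidates[name], nm_query),
                    reverse=True)
    return nm_name, ranked


# Invented toy data: a 2-node top map and one nested map with two nodes.
top_map = {(0, 0): [1.0, 0.0], (0, 1): [0.0, 1.0]}
nm_of_node = {(0, 0): "NM-net", (0, 1): "NM-files"}
nested_maps = {"NM-net": {(0, 0): [1.0, 0.0, 0.0], (0, 1): [0.0, 0.0, 1.0]}}
members = {"NM-net": {
    (0, 0): {"ftp": [1.0, 0.2, 0.0], "tftp": [0.8, 0.0, 0.0], "telnet": [0.3, 1.0, 0.0]},
    (0, 1): {"ping": [0.0, 0.0, 1.0]},
}}
query_by_map = {"TM": [0.9, 0.1], "NM-net": [1.0, 0.0, 0.0]}
result = two_step_retrieve(query_by_map, top_map, nm_of_node, nested_maps, members)
```

Keeping a separate query vector per map reflects the point made above that each NM is trained on lower-dimensional feature vectors than the TM.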

The sample retrieval session presented here is based on an NSOM containing all the manual pages in the first section of the Unix User Manual. The TM is shown in Figure 3, where the number associated with each neuron indicates how many components reside at that neuron. Assume that the user query "How to transfer files over Internet?" is issued and the corresponding query vector is projected onto a node on the TM, called the winning node (marked as "c" in Figure 3). The matched cluster is the small hexagon plotted in solid lines in Figure 3, where 14 components reside. The retrieval targets for the query are ftp and tftp, which are included in the cluster. If the retrieval procedure stops here, a high recall (1.00) and a poor precision (0.14) will be obtained.

[FIGURE 3 OMITTED]

A corresponding NM (shown in Figure 4) is constructed based on a sub-collection containing 53 components located in the area enclosed by the dotted lines in Figure 3. The original query will then be mapped onto the NM and a matched cluster enclosed by the dotted lines in Figure 4 is found. Components located in the cluster are tftp, telnet and ftp. The retrieval recall is 1 and precision is 0.666. Comparing this result with the result achieved at the TM, a significant improvement of precision is observed.
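The figures quoted for this session follow directly from the definitions of recall and precision, as the short Python check below shows. The names of the 12 non-target components in the top-map cluster are not given in the paper, so placeholders are used; only the counts and the three nested-map components come from the text.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


relevant = {"ftp", "tftp"}

# Top-map cluster: 14 components, 2 of them targets (other names are placeholders).
tm_cluster = {"ftp", "tftp"} | {f"other-{i}" for i in range(12)}
# Nested-map cluster as listed in the text.
nm_cluster = {"tftp", "telnet", "ftp"}

tm_recall, tm_precision = recall_precision(tm_cluster, relevant)  # 2/2 and 2/14
nm_recall, nm_precision = recall_precision(nm_cluster, relevant)  # 2/2 and 2/3
```

Recall stays at 1 in both steps because both targets survive the narrowing, while precision rises from 2/14 to 2/3.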

[FIGURE 4 OMITTED]

The enhanced retrieval performance achieved by the NSOM is under an assumption that the user queries are defined adequately. However, as discussed earlier, formulating precise and effective queries has always been a difficult task and ill-defined queries will have negative impact on the retrieval performance. A query refinement mechanism is provided to help the user overcome this difficulty.

V. Query Refinement with a Fuzzy-Related Thesaurus

For any given query feature, thesaurus-based refinement provides temporary feature excerpts in which the stored words or phrases have various kinds of relationships with the query feature. A relational thesaurus is usually used for this purpose [12]. However, a relational thesaurus contains only a limited set of feature relationships among the terms, such as synonym, generic-specific, and whole-part. In reality, the relationships among features are more complex than those identified in such a thesaurus, and identifying these complex relationships explicitly is difficult. For the purpose of query refinement, however, if groups of features can be found such that each group characterises a certain cluster, the features in a group can be considered closely related. Assume that the user issues an ill-defined query containing only a few features that are relevant to the desired cluster. If the user expands the query with features from the group that characterises the desired cluster, the refined query is very likely to target that cluster. Fortunately, these feature groups can be identified using the trained SOMs. Software components located in a certain cluster on a map have similar functionality, and this functionality is characterised by the most highly weighted features associated with the cluster. In other words, highly weighted features in a cluster describe similar concepts. Thus, these features can be considered fuzzy-related.

For a given query feature, its fuzzy-related features contained in an NSOM can be dynamically obtained. A given query feature can be mapped onto the TM and a winning node will be identified. The top 20 highly weighted features associated with the cluster are eligible for collecting into a fuzzy-related thesaurus. These highly weighted features will be provided to the user for expanding the query or replacing an original feature with a fuzzy-related new feature to improve the quality of the query.
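A minimal sketch of collecting fuzzy-related features is given below. It assumes that "highly weighted" means the largest summed weight over the weight vectors of the nodes in the matched cluster; the paper does not spell out the exact computation, and the function name, feature names and weights are invented for illustration.

```python
def fuzzy_related_features(cluster_weight_vectors, feature_names, top_k=20):
    """Return the top_k features with the highest summed weight in a cluster."""
    totals = [0.0] * len(feature_names)
    for weights in cluster_weight_vectors:
        for j, value in enumerate(weights):
            totals[j] += value
    ranked = sorted(range(len(feature_names)), key=lambda j: totals[j], reverse=True)
    return [feature_names[j] for j in ranked[:top_k]]


feature_names = ["file", "transfer", "remote", "print", "host"]
# Weight vectors of the nodes in the cluster around the winning node (toy values).
cluster = [
    [0.9, 0.8, 0.6, 0.0, 0.5],
    [0.7, 0.9, 0.4, 0.1, 0.6],
]
suggestions = fuzzy_related_features(cluster, feature_names, top_k=3)

# The user expands an ill-defined query with one of the suggested features.
original_query = ["file"]
refined_query = original_query + [s for s in suggestions if s not in original_query][:1]
```

With these toy weights the suggestions are `transfer`, `file` and `host`, so the one-term query `file` is expanded with `transfer`.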

VI. Retrieval Experiments

Several retrieval experiments have been done to assess the performance of the retrieval system. The measures used in the experiments are recall and precision. The hypotheses to be tested in the experiments are:

1. The NSOM based classification can enhance retrieval performance in comparison with other retrieval systems.

2. The thesaurus based query refinement can improve the retrieval results for ill-defined queries.

3. Incorporating the query refinement into an NSOM based retrieval system will further enhance the retrieval performance for ill-defined queries.

The first experiment, testing hypothesis 1, was intended to assess the effectiveness of the NSOM based classification alone, without involving the query refinement mechanism. The NSOM used in the experiment is the one presented in Section IV, which contains all the manual pages in the first section of the Unix User Manual. First, the NSOM was compared with Guru, a software retrieval system considered capable of achieving better-than-average retrieval performance [13]. Then the NSOM was compared with a publicly available full-text retrieval system, Personal Librarian (PL). The NSOM based system achieved an improvement of 4.59% on recall and 16.60% on precision in comparison with Guru, and an improvement of 35.87% on recall and 52.24% on precision in comparison with PL. The details of the experiment have been reported in [10].

A second experiment, testing hypothesis 2, was intended to test the fuzzy-related thesaurus based query refinement alone, without involving NSOM based classification. The thesaurus was applied to a small collection containing only 97 Unix manual pages classified on a single SOM, and promising results have been reported in [14].

This paper presents the third experiment, which focuses on hypothesis 3. The retrievals are carried out using two different procedures: one uses thesaurus based query refinement and the other does not. The retrieval results obtained from the two procedures are then compared. Salton [15] stated three requirements that any representative test procedure must satisfy:

1. The queries, used for test purposes, must be user search requests actually submitted to and processed by the system.

2. The test collection must consist of documents originally included in the library, chosen in such a way that any advance knowledge concerning the retrievability of any given component by either system is effectively ignored.

3. The number of components of retrieval candidates selected by the two systems must correspond to the same cut-off.

The first requirement was satisfied because the query set used in the experiment was collected from the users. The author conducted a survey from several Unix users at Southern Cross University, Australia, and collected a number of queries for retrieving Unix manual pages. Among the collected queries, 14 poorly defined queries were selected for the experiment.

For the second requirement, the test collection corresponds exactly to Section 1 of the Unix User Manual, i.e. the collection on which the NSOM presented in Section IV was built. No advance knowledge was used when choosing the documents; the whole of Section 1 of the Unix manual was simply used.

As far as the third requirement is concerned, a retrieval cut-off in NSOM-based retrieval is determined by a cluster distance, dc. When a query is projected onto a map, the winning node and the cluster distance will specify a certain area on the map. Components located in such an area will be selected as retrieval candidates. For example, the area enclosed by the dotted lines in Figure 4 is the cluster found for the query "How to transfer files over Internet?". The winning node for the query is marked as c' and the cluster distance is 1. As retrieval precision is much more crucial than retrieval recall in software retrievals, a small cluster will achieve higher precision than a big cluster. The same cluster distance dc = 1 is chosen for both retrieval procedures. The retrieval results achieved by the two procedures can then be compared.
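One way to read the cluster-distance cut-off is as a grid distance on the hexagonal map. The sketch below takes that reading as an assumption: it uses axial (q, r) coordinates and the standard hexagonal grid distance, neither of which is stated in the paper, and the node contents are invented.

```python
def hex_distance(a, b):
    """Grid distance between two hexagonal cells in axial (q, r) coordinates."""
    dq, dr = a[0] - b[0], a[1] - b[1]
    return (abs(dq) + abs(dr) + abs(dq + dr)) // 2


def cluster_nodes(winner, nodes, dc=1):
    """All nodes whose grid distance from the winning node is at most dc."""
    return [n for n in nodes if hex_distance(n, winner) <= dc]


def retrieval_candidates(winner, node_members, dc=1):
    """Components residing on the nodes inside the cluster."""
    names = []
    for node in cluster_nodes(winner, node_members, dc):
        names.extend(node_members[node])
    return names


# A 5 x 5 patch of axial coordinates centred on the winning node.
nodes = [(q, r) for q in range(-2, 3) for r in range(-2, 3)]
small_cluster = cluster_nodes((0, 0), nodes, dc=1)  # winner plus its 6 neighbours

# Toy node contents: only the first two nodes lie within distance 1 of (0, 0).
node_members = {(0, 0): ["ftp"], (1, 0): ["tftp"], (1, 1): ["ls"]}
candidates = retrieval_candidates((0, 0), node_members, dc=1)
```

Under this reading, dc = 1 yields a seven-node hexagon around the winning node, and a larger dc widens the cluster at the likely cost of precision.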

The experiment consists of the following steps:

* The queries were submitted to the system without any query refinement. The average recall and precision of the retrieval results were obtained.

* The queries were given to 3 subjects randomly selected from 6 volunteers and they were asked to use the thesaurus to refine the queries and submit them to the system to get the retrieval results.

* For each query, average recall and average precision based on the three subjects' retrieval results were calculated.

* An overall average recall and precision for all the queries were calculated. The retrieval performance is compared in Table 1.
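The averaging in the last two steps can be sketched as follows. The numbers are invented purely to show the arithmetic and are not the paper's results; the function names are likewise hypothetical.

```python
from statistics import mean


def average_per_query(results):
    """results: query -> list of (recall, precision) pairs, one per subject."""
    return {
        query: (mean(r for r, _ in pairs), mean(p for _, p in pairs))
        for query, pairs in results.items()
    }


def overall_average(per_query):
    """Average the per-query recall and precision over all queries."""
    recalls = [r for r, _ in per_query.values()]
    precisions = [p for _, p in per_query.values()]
    return mean(recalls), mean(precisions)


# Invented results for two queries, each refined by three subjects.
refined_results = {
    "q1": [(1.0, 0.5), (1.0, 0.5), (0.5, 0.5)],
    "q2": [(0.5, 0.25), (0.5, 0.25), (0.5, 0.25)],
}
per_query = average_per_query(refined_results)
overall_recall, overall_precision = overall_average(per_query)
```

Averaging per query first and then across queries gives every query the same influence on the overall figures, regardless of how the subjects' individual results vary.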

The retrieval result for normal queries shown in Table 1 was obtained in the first experiment. Comparing this result with the retrieval result of the ill-defined queries (without query refinement) a large decline of both recall and precision in the ill-defined queries was observed. This means the quality of the queries has significant impact on the retrieval performance. For the same ill-defined query set the retrieval results after the fuzzy-related thesaurus based query refinement have been improved substantially. The recall and precision are improved by 32% and 75% respectively.

The three subjects agreed that the thesaurus based query refinement helped them choose appropriate terms for the refinement. The thesaurus provides intuitive information about the terms relevant to a given query, helping the user find the desired terms when the user suffers from the experience known as "I cannot explain what I want, but I'll recognize it if I see it".

VII. Conclusions

In this paper, the NSOM based software retrieval system with a fuzzy-related thesaurus based query refinement is discussed. The problem of coarse-grained classification occurring in the previous SOM-based applications is isolated on the TM and these clusters can be fine-grained on the NMs. As a result, the NSOM based retrieval can achieve better retrieval performance. The retrieval on the TM maintains a high level of recall, and the retrieval on the NMs enhances precision. A trained NSOM classifies not only the software components but also the features associated with the components. Based on the classified features a fuzzy-related thesaurus can be generated to accommodate query refinements. The user can reformulate a query by adding terms to or replacing a query term from the original query with an appropriate term stored in the thesaurus.

Experimental results were compared and discussed. The results reveal that the NSOM based retrieval enhanced retrieval performance in comparison with Guru. Guru's retrieval performance was believed to be more than satisfactory and better than the average information retrieval systems [13]. To test the effectiveness of the fuzzy-related thesaurus based query refinement, retrieval results with query refinements and without query refinement for the same set of ill-defined queries are collected. It was observed that the retrieval system was capable of achieving a much more effective retrieval performance using the fuzzy-related thesaurus based query refinement. The improvements of the recall and precision achieved by the query refinement are 32% and 75% respectively.

References

[1] B. Velez, R. Weiss, M. Sheldon and D. Gifford. "Fast and Effective Query Refinement". In Proceedings of the 20th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 6-15, 1997.

[2] R. Prieto-Diaz. "Implementing Faceted Classification for Software Reuse", Communications of the ACM, 34(5), pp. 88-97, 1991.

[3] Y. Maarek, D. Berry and G. Kaiser. "An Information Retrieval Approach for Automatically Constructing Software Libraries", IEEE Transactions on Software Engineering, 17 (8), pp. 800-813.

[4] T. Isakowitz and R. Kauffman. "Supporting Search for Reusable Software Object", IEEE Transactions on Software Engineering, 22 (6), pp. 407-423, 1992.

[5] A. Mili, R. Mili and R. Mittermeir. "A Survey of Software Reuse Libraries", Annals of Software Engineering, 5, pp. 349-414, 1998.

[6] T. Kohonen. Self-Organising Maps, Springer-Verlag, Berlin, 1997.

[7] T. Kohonen. "Self-Organisation of Very Large Document Collections: State of the Art". In Proceedings of the 8th International Conference on Artificial Neural Networks, Springer, Skovde, Sweden, pp. 55-74, 1998.

[8] R. Orwig, H. Chen and J. Nunamaker. "A Graphical, Self-Organising Approach to Classifying Electronic Meeting Output", Journal of the American Society for Information Science, 48 (2), pp. 157-170, 1997.

[9] X. Lin, D. Soergel and G. Marchionini. "A Self-Organising Semantic Map for Information Retrieval". In Proceedings of the 14th Annual International ACM/SIGIR Conference on Research and Development in Information Retrieval, Chicago, pp. 262-269, 1991.

[10] H. Ye. "A Self-Organized Software Library". In Neural Networks Applications in Information Technology and Web Engineering, Borneo Publishing, 2005.

[11] H. Ye and B. W. N. Lo. "Toward a Self-Structuring Software Library", IEE Proceedings--Software, 148 (2), pp. 45-55, 2001.

[12] G. Salton. Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer, Addison-Wesley, 1989.

[13] Y. Maarek. "Software Library Construction from an IR Perspective", SIGIR Forum, 25 (2), pp. 8-18, 1991.

[14] H. Ye and H. Liu. "A Fuzzy-Related Thesaurus for Query Refinement", Neural Processing Letters, 19 (2), pp. 97-107, 2004.

[15] G. Salton. "Recent Studies in Automatic Text Analysis and Document Retrieval", JACM, 20 (2), pp. 258-278, 1973.

Author Biography

Dr Huilin Ye is a Senior Lecturer at the School of Electrical Engineering and Computer Science, the University of Newcastle, Australia. She received a BEng at Harbin Engineering University, China in 1982 and a PhD in Software Engineering at Southern Cross University, Australia in 2001. Her research interests include software classification and retrieval, neural network applications, feature modelling in software product lines, and data mining.

Huilin Ye

School of Electrical Engineering and Computer Science, The University of Newcastle, Callaghan, NSW 2308, Australia
Huilin.Ye@newcastle.edu.au

Table 1. Comparison of retrieval results

Retrieval                                             Average recall   Average precision

Normal queries                                        0.866            0.743
Ill-defined queries without query refinement          0.573            0.362
Ill-defined queries with thesaurus-based refinement   0.756            0.632

COPYRIGHT 2007 Research India Publications
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2007 Gale, Cengage Learning. All rights reserved.

Author:Ye, Huilin
Publication:International Journal of Computational Intelligence Research
Date:Jan 1, 2007