
In plain English, please; communicating with computers using natural language processing.


IN PLAIN ENGLISH, PLEASE

Communicating with computers using natural language processing.

A natural language processor is a computer program that permits the computer to understand human language. "Understand" in this context means the computer can accept ordinary human language input through the keyboard and then perform the required computing tasks. Hence, the program transforms everyday written language into a form usable by a computer.

Human languages are complex and incorporate many features that produce ambiguity: different meanings for the same words and different meanings produced by sentence structures, idioms and metaphors. Many problems must be solved in transforming them for use by a computer. The foremost is converting potentially ambiguous input into an unambiguous statement. A natural language processor must deal with more than one potential interpretation of input.

Over the years, this has been a stumbling block. Many researchers claimed their natural language programs could perform impressive feats of understanding. Although commercial software vendors advertised their products as "programs that understand you so you don't have to understand them," the claims often exceeded the capabilities. The programs frequently appeared to perform well under highly structured experimental conditions but failed when confronted with real world tasks.

Recently, however, two developments contributed to progress in practical applications of natural language processing technology.

* Dramatic increases in computer speed and memory capacity, together with the growing cost-effectiveness of computing, have made it possible to apply greater capabilities to the task.

* A shift toward simplicity--from trying to emulate human understanding of language to achieving a capacity to understand sufficient for carrying out practical tasks.

Businesses are not likely to see natural language processors with full human understanding of language in the near future. Researchers pursuing this goal are producing techniques with some practical applications in peripheral aspects of language processing (such as word recognition), but they still appear to be quite far from emulating the higher cognitive functions needed to duplicate real human understanding.

The American Institute of CPAs information technology research subcommittee has been studying this field and has prepared a management advisory services special report on natural language processing--An Introduction to Natural Language Processing, to be published in April. This article was adapted from that report.

ADVANTAGES

Using ordinary human language to input and execute computing functions reduces a barrier to computer use. Consequently, productivity will be improved by

* Increasing users' abilities to handle complex tasks. Natural language allows tasks to be combined with a common language thread.

* Transforming text into a knowledge base. Natural language processors can scan large amounts of text to extract meaningful information and create a database.

* Expediting access to computer data. Natural language allows one to query existing databases or request reports without knowing exact coding, spellings or syntax.

* Extending the use of data to a larger population. Different people can use different terms to obtain the same results.

* Decreasing user training requirements. Natural language processors reduce the time required to learn new applications. They also benefit infrequent users by eliminating the time spent relearning specific commands.

Natural language processing technology is being used in several practical applications including

* Structured conversational systems.

* Interactive training systems.

* Interfaces to databases.

* Knowledge acquisition from texts.

STRUCTURED CONVERSATIONAL SYSTEMS

Cognitive Systems, Inc. (CSI), a company founded by artificial intelligence researcher Dr. Roger Schank, has produced several structured conversational systems that provide advice to users through conversational dialogue in specific areas.

Quick-Quoter. One of these systems is Quick-Quoter, a homeowners' insurance sales system that gives customers information, ranging from estimated rebuilding costs and premium quotes to detailed coverage explanations, in ordinary English. The system works by leading customers through a conversation oriented toward gathering the information needed to make a premium quote. Along the way, it describes basic coverages, endorsements and personal property schedules. Customers can interrupt the process at any time by asking questions in their own words:

"What is the difference between guest medical and liability coverage?"

Quick-Quoter explains these insurance terms.

"Am I covered for hurricanes?"

Quick-Quoter provides details on specific coverage points.

"I don't know how far I am from a fire station. Can I guess?"

Quick-Quoter helps answer confusing underwriting questions.

StreetSmart. Another system is StreetSmart, a stock portfolio advisory product that enables retail brokerage customers to get advice and information in ordinary English on the 1,600 Value Line stocks. The demonstration version observed by the subcommittee offered the opinions of five "experts" whose viewpoints range from a conservative contrarian's to an aggressive risktaker's. A user can submit a portfolio to one or all of the experts for evaluation, or ask StreetSmart questions in his or her own words:

"How am I doing?"

StreetSmart provides a historical performance evaluation for the portfolio.

"What do you think of my portfolio?"

StreetSmart does a critique of the user's portfolio.

"What do you recommend?"

StreetSmart calls on an "expert" for recommendations.

"What is the P/E of EXXON?"

StreetSmart retrieves the price-earnings ratio of EXXON from the Value Line database.

INTERACTIVE TRAINING SYSTEMS

CSI also has combined natural language processing with computer graphics for interactive training systems. These systems differ from conventional computer-based instructional systems in using natural language processing technology to engage learners in a Socratic dialogue with the system. The Socratic method is regarded by some as a superior approach to teaching because its question-and-answer format requires the active participation of students in the learning process. There are five major components of these systems.

* A domain model contains the system's knowledge of the subject.

* A natural language interface understands student questions and responses.

* Computer graphics provide a realistic simulation of a work environment.

* A tutoring model guides the dialogue and selects appropriate teaching strategies.

* A student model evaluates progress and learning difficulties.

CSI says natural language processing technology is the key element in the system. By enabling students to ask questions in their own words, the technology provides the dynamic exchange of a Socratic dialogue.

One example of an interactive training system is Teller Trainer, a system for training new hires to master the basic skills and information needed for the job of bank teller. The system asks a student to perform a typical banking task, such as depositing a check in a savings account and getting back $50 in cash. The student learns through a combination of doing (moving money, filling out forms, accessing balance information, etc., on the graphically simulated teller station) and asking questions ("How do I deposit checks?"). The system responds conversationally to these questions, either directly or through analogies that help students figure out the correct procedures for themselves.

INTERFACES TO DATABASES

The majority of practical applications of natural language processing technology are in interfacing with databases. An interface is a language or code that permits a user or program to communicate with other programs.

Natural language interfaces with databases require far less training than structured command languages and other conventional interfaces, and they can be used rapidly and accurately in situations in which the user needs to access data in unanticipated ways. This area has been of great interest because it can improve the productivity of people who need computer-stored information to make decisions. Whether or not users are computer literate, through an effective natural language interface they can access information in a database in seconds or minutes instead of relying on traditional, more time-consuming channels of information gathering. They can ask questions without having to stop and think of computer commands or search documentation.

Some products of this type are listed in exhibit 1. They vary somewhat in their underlying methodologies and the extent of their capabilities but, for the most part, they share basic linguistic knowledge and query processing characteristics.

Linguistic knowledge. The products have enough built-in linguistic knowledge--words and grammar--to permit the computer to understand user input. The lexicon, or vocabulary, consists of a predefined base vocabulary plus a domain-specific vocabulary (terms specific to a given database and additional terms defined by the user). The more highly capable products come with large base vocabularies and automatically incorporate the database field names in the domain-specific vocabulary. In addition, these systems can perform certain functions (such as retrieval and sorting) when asked simple questions by the user. See the LINGUISTIC KNOWLEDGE table for functions that can be performed with certain questions or commands.
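The lexicon described above can be pictured with a minimal Python sketch. It is an illustration only, not any vendor's implementation: the base vocabulary, the field names and the helper functions `build_lexicon` and `unknown_words` are all assumptions made for this example.

```python
# Hypothetical base vocabulary shipped with a product: word -> role.
BASE_VOCABULARY = {
    "list": "retrieve",
    "show": "retrieve",
    "sorted": "sort",
    "average": "aggregate",
    "total": "aggregate",
}


def build_lexicon(field_names, user_synonyms):
    """Combine base words, database field names and user-defined synonyms."""
    lexicon = dict(BASE_VOCABULARY)
    # Field names are added automatically as domain-specific vocabulary.
    for field in field_names:
        lexicon[field.lower()] = f"field:{field}"
    # Each user-defined synonym must point at an existing field.
    for word, field in user_synonyms.items():
        if field not in field_names:
            raise ValueError(f"unknown field: {field}")
        lexicon[word.lower()] = f"field:{field}"
    return lexicon


def unknown_words(question, lexicon):
    """Return the words of a question that the lexicon does not cover."""
    words = [w.strip("?.,\"'").lower() for w in question.split()]
    return [w for w in words if w and w not in lexicon]


lexicon = build_lexicon(
    field_names=["name", "salary", "branch"],
    user_synonyms={"pay": "salary", "office": "branch"},
)
print(lexicon["pay"])                                # field:salary
print(unknown_words("Show average pay", lexicon))    # []
print(unknown_words("Show average wages", lexicon))  # ['wages']
```

In this sketch, a word such as "wages" that falls outside the lexicon is the kind of term a system would ask the user to define before going further.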

Query processing steps. The products process a query in the following steps:

* The user asks a question in ordinary English. For example, "Who are the New York employees?"

* If clarification is needed, the system asks for it. For example, "Do you mean New York City or New York State?"

* If the query contains an unknown word, the system allows the user to define or edit it. For an example, see the screens in exhibits 2 (System presents options to define or edit an unknown word) and 3 (System permits user to define synonyms for database field names).

* The system processes the query using the information in its grammar and lexicon. It then tries to produce one or more interpretations.

* The system states its understanding of the query and presents it to the user for verification. Because it is designed to accept a wide variety of inputs and the language entered could be ambiguous, before proceeding the system shows the user its plan for dealing with each query. At this point ambiguities are discovered and resolved: The user can either confirm the system's interpretation or request that it go on to another one. For an example, see exhibit 4 (System permits user to approve or disapprove interpretation).

* When an interpretation has been verified, the corresponding database commands are executed.

* The underlying database system retrieves the answer and displays the results.
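The verification loop in the steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how any of the products works internally: the `LEXICON` table, the condition strings and the functions `interpret` and `process_query` are hypothetical, and a real system would use a grammar rather than substring matching.

```python
# Hypothetical lexicon: a phrase may map to more than one database condition.
LEXICON = {
    "new york": ["city = 'New York City'", "state = 'NY'"],
    "chicago": ["city = 'Chicago'"],
}


def interpret(question):
    """Return every candidate interpretation of the question as a condition."""
    text = question.lower()
    for phrase, conditions in LEXICON.items():
        if phrase in text:
            return list(conditions)
    return []  # unknown wording: the user would be asked to define it


def process_query(question, confirm):
    """Offer interpretations one at a time until the user confirms one."""
    for condition in interpret(question):
        paraphrase = f"List employees where {condition}"
        if confirm(paraphrase):
            return condition  # a real system would now run database commands
    return None


# Simulated user who rejects the city reading and accepts the state reading.
answers = iter([False, True])
chosen = process_query("Who are the New York employees?", lambda p: next(answers))
print(chosen)  # state = 'NY'
```

The `confirm` callback stands in for the screen on which the user approves or rejects the system's stated understanding; nothing is executed until one interpretation has been accepted.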

One available product differs significantly from the others because it is not connected to a particular database management system and its query processing does not result in direct execution of database commands. Parlance, the natural language database interface system produced by BBN Laboratories (BBN), translates natural language input into structured query language (SQL), the standard command language used with many database management systems, and the SQL commands are then applied to the underlying database. Hence, BBN's natural language interface could be used with the many different relational database systems that support SQL.
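To make the idea of translating English into SQL concrete, here is a small Python sketch that handles one sentence pattern and runs the result against an in-memory SQLite database. It does not represent BBN's method: the regular expression, the `employees` table, the sample rows and the `to_sql` function are all assumptions made for this illustration.

```python
import re
import sqlite3

# One hypothetical sentence pattern; a real interface would rely on a full
# grammar and lexicon rather than a single regular expression.
PATTERN = re.compile(
    r"list the (?P<title>\w+?)s with salaries over \$(?P<amount>[\d,]+)",
    re.IGNORECASE,
)


def to_sql(question):
    """Translate a matching question into a parameterized SQL statement."""
    match = PATTERN.search(question)
    if match is None:
        raise ValueError("question not understood")
    amount = int(match.group("amount").replace(",", ""))
    sql = "SELECT name FROM employees WHERE title = ? AND salary > ? ORDER BY name"
    return sql, (match.group("title").lower(), amount)


# Illustrative table; the names and figures are made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, title TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Miller", "manager", 45000), ("Johnson", "manager", 38000),
     ("Lee", "clerk", 52000)],
)

sql, params = to_sql("List the managers with salaries over $40,000.")
rows = conn.execute(sql, params).fetchall()
print(rows)  # [('Miller',)]
```

Because the output is plain SQL, the same translation step could in principle sit in front of any database that accepts SQL, which is the portability point made above.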

KNOWLEDGE ACQUISITION FROM TEXTS

ATRANS, a commercial product, and ELOISE, a system developed as an exploratory project for the Securities and Exchange Commission, are examples of natural language processing applications in acquiring knowledge from texts.

ATRANS. This is a system for automatic analysis and formatting of texts from messages containing instructions to carry out funds transfers, which may have been sent in various formats. The system separates a funds transfer from other types of messages, extracts all the components of the transfer (for example, amount, dates and parties), identifies the account numbers for the relevant transfer parties and rewrites the result in standard formats for funds transfer messages. The system reads the message left to right the way a human operator would, extracting all information relevant to the money transfers contained in the message. The work is performed by a parser, which breaks down the messages and words so they can be converted into machine language.
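The general shape of this task, separating transfer messages from other traffic and pulling out their components, can be sketched in Python. The sketch below uses simple regular expressions, which is a deliberate simplification and not how ATRANS works; the patterns, the sample messages and the `extract_transfer` function are hypothetical.

```python
import re

# Hypothetical patterns for this sketch only.
AMOUNT = re.compile(r"\$\s?([\d,]+(?:\.\d{2})?)")
ACCOUNT = re.compile(
    r"(?:account|acct\.?)\s*(?:no\.?|number|#)?\s*(\d{6,})", re.IGNORECASE
)
TRANSFER_WORDS = re.compile(r"\b(transfer|pay|remit|credit)\b", re.IGNORECASE)


def extract_transfer(message):
    """Return a standard-format record, or None if not a funds transfer."""
    amount = AMOUNT.search(message)
    if not TRANSFER_WORDS.search(message) or amount is None:
        return None
    return {
        "amount": float(amount.group(1).replace(",", "")),
        "accounts": ACCOUNT.findall(message),
    }


record = extract_transfer(
    "Pls remit $12,500.00 from acct 00123456 to account no. 99887766."
)
print(record)  # {'amount': 12500.0, 'accounts': ['00123456', '99887766']}
print(extract_transfer("Meeting moved to Tuesday."))  # None
```

Whatever wording the sender chose, the output always has the same fields, which is what allows downstream funds transfer processing to work from a standard format.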

This system has been in commercial use by a major corporation for more than two years. The user reportedly has realized significant productivity improvements. The manual processing of money transfer payments requires approximately four to eight minutes per transfer; the average human operator is capable of completing 100 transfers per day. Using ATRANS, processing time is improved to 20 seconds per transfer and operator productivity is increased to 500 transfers per day.
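The reported figures can be compared directly. The short Python calculation below uses only the numbers quoted above; the variable names are illustrative.

```python
# Figures quoted in the article.
manual_minutes = (4, 8)      # manual processing time per transfer
automated_seconds = 20       # processing time per transfer with ATRANS
manual_per_day, automated_per_day = 100, 500

# Per-transfer speedup at the low and high end of the manual range.
speedups = [m * 60 / automated_seconds for m in manual_minutes]
# Gain in transfers completed per operator per day.
throughput_gain = automated_per_day / manual_per_day

print(speedups)         # [12.0, 24.0]
print(throughput_gain)  # 5.0
```

On these numbers, each transfer is processed 12 to 24 times faster, while daily output per operator rises fivefold.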

ELOISE. The SEC developed EDGAR (electronic data gathering, analysis and retrieval), a system to permit companies to submit SEC filings electronically. In 1985, during the pilot test of the EDGAR system, a subsystem was implemented using natural language processing technology to read the text of proxy statements and index the documents as an aid to subsequent analysis. This subsystem is called ELOISE, which stands for English language oriented indexing system for EDGAR. Its goal was to index the documents filed with the SEC by concepts contained in them so analysts can readily obtain answers to questions such as, "Which of this year's proxy statements contain bylaws changes to create a new class of stock?" Since the indexing is by concepts rather than keywords, ELOISE must identify the concepts underlying the text by using two knowledge bases:

* An English language knowledge base containing knowledge about English grammar, sentence structure and meanings of words and phrases.

* An SEC knowledge base containing specific knowledge about items of interest to the SEC as well as vocabulary unique to proxy statements.
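The difference between indexing by concept and indexing by keyword can be illustrated with a toy Python sketch built around two small lookup tables standing in for the two knowledge bases. It is an assumption-laden illustration, not a description of ELOISE: the table contents, the sample documents and the functions `normalize` and `index_concepts` are invented for this example.

```python
# Hypothetical stand-ins for the two knowledge bases.
ENGLISH_KB = {  # general wording -> normalized term
    "amend": "change", "amendment": "change", "revise": "change",
    "create": "create", "establish": "create", "authorize": "create",
}
SEC_KB = {  # concept -> normalized terms that must all be present
    "bylaw change": {"bylaws", "change"},
    "new class of stock": {"create", "class", "stock"},
}


def normalize(text):
    """Lowercase the words and map them through the English knowledge base."""
    words = (w.strip(".,;:\"'").lower() for w in text.split())
    return {ENGLISH_KB.get(w, w) for w in words if w}


def index_concepts(documents):
    """Map each concept to the documents whose wording expresses it."""
    index = {concept: [] for concept in SEC_KB}
    for doc_id, text in documents.items():
        terms = normalize(text)
        for concept, required in SEC_KB.items():
            if required <= terms:
                index[concept].append(doc_id)
    return index


documents = {
    "proxy-A": "Proposal to amend the bylaws and authorize a second class of common stock.",
    "proxy-B": "Election of directors and ratification of auditors.",
}
index = index_concepts(documents)
print(index["bylaw change"])        # ['proxy-A']
print(index["new class of stock"])  # ['proxy-A']
```

Note that the first document never uses the words "change" or "create", yet it is indexed under both concepts; a keyword search on those words would have missed it.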

OPPORTUNITIES FOR THE ACCOUNTING PROFESSION

Given the nature of accounting work and the current capabilities of natural language processing technology, five applications are likely to become common in the future.

General research. Access to online databases is a common application today. Natural language interfaces will allow easier use and promote the development of even larger text-oriented databases.

Data selection. Traditional methods of extracting data require knowledge of computer programming and file layout to make a selection. Natural language would eliminate these requirements by allowing users to specify selection terms using human language.

Intelligent tutors. Natural language allows computer-based training courses to include more effective question-and-answer sessions. Trainees are able to interact with the tutor using ordinary language instead of structured codes.

Document interpretation. Natural language processors provide the ability to scan a variety of documents such as newspapers, correspondence and manuals. The result could be an interpretation of the underlying concepts gathered from any single fact or the intuitive linkage of multiple facts contained in the text.

Analytical review. Natural language could enhance the analytical review process by extracting information from multiple sources such as accounting records, corporate minutes and other databases.

LINGUISTIC KNOWLEDGE
Function performed               Command or question
Selective retrieval              "List the managers with salaries
                                 over $40,000."
Sorting                          "Show the managers sorted
                                 alphabetically within branch
                                 office."
Mathematical calculations        "What is the total of Miller's and
                                 Johnson's salaries?"
Aggregating operations --        "Give me the average salary of
maximum, minimum, count, etc.    the managers in each branch
                                 office."
Date computations                "Who will become 65 in 1992?"
Selective updating of multiple   "Increase the salaries of
data records                     managers in the Chicago office
                                 by 10%."


[Exhibits 1 to 4 Omitted]

KARL G. KING, CPA, is a partner of Crowe, Chizek and Company, South Bend, Indiana. He was chairman of the American Institute of CPAs information technology research subcommittee when the report An Introduction to Natural Language Processing was written. RAYMOND W. ELLIOTT is a partner of Coopers & Lybrand, New York. He is currently chairman of the technology research subcommittee. Both are contributing editors to the Journal's Micros/Technology department.
COPYRIGHT 1990 American Institute of CPA's
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 1990, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Author:Elliott, Raymond W.
Publication:Journal of Accountancy
Date:Mar 1, 1990
Words:2450