Voice recognition technology for persons who have motoric disabilities.

John has quadriplegia. Yet John controls his stereo, opens his refrigerator, turns on his ceiling fan, and manages his architectural firm by using just his voice and a computer. Although this sounds like science fiction, voice recognition technology (VRT) exists today and makes such applications possible.

VRT may provide an important opportunity for accommodating the physically challenged by furnishing an acceptable, if not exceptional, alternative for people who are unable to use a computer keyboard. Some proponents have even suggested that technology may serve as an electronic bill of rights to the physically challenged (Lazzaro, 1990).

Technology, however, is not a panacea. Technology is a tool that must be understood so it may be used correctly to provide an efficient and effective solution for a given problem. To this end, this paper will describe how VRT works, as well as various applications, considerations, and resources for VRT in light of its role for supporting the physically challenged.

The Growth of Adaptive Technology

The Americans with Disabilities Act (ADA) of 1990 was designed to reduce barriers to the education, rehabilitation, and employment of individuals with disabilities. The law provides the equivalent of civil rights protection to individuals with disabilities against discrimination and guarantees equal opportunity in terms of employment, education, public accommodations, transportation, government services, and telecommunications.

By law, organizations are required to provide "reasonable accommodation" to make existing facilities accessible to individuals, or to make any modification or adjustment to a job or work environment that would allow a qualified individual with a disability to perform the job.

Adaptive technology (AT) may provide a key that opens doors of opportunity for many individuals with disabilities. Adaptive technology may be an enhancement or modification of information technology (IT) that allows individuals with disabilities to have access to a computer and information, or AT may be designed specifically for use by individuals with disabilities (Lazzaro, 1990). In short, adaptive technology is hardware or software that allows individuals with disabilities to use a computer (Brown, 1992).

For example, individuals who are blind may use speech synthesizers, which provide auditory screen reading. Individuals with low vision can use software that magnifies the print on the screen. For individuals with neurologic disabilities, several types of adaptations exist. These include word prediction software, software that allows simultaneous keystrokes to be entered consecutively, large keypads, headsticks, and voice recognition software.

Organizations may find that adaptive technology, especially voice recognition software, provides a cost-effective solution for supporting the ADA's directive for "reasonable accommodation." According to the President's Committee on Employment of People with Disabilities, for example, the estimated average cost for furnishing adaptive technology for an employee is less than $1,000 (Rifkin, 1991, p. 25). By providing the needed adaptive technology, employers may remove an individual from disability income and provide the individual with the opportunity for a productive life; thus, the employer may gain a loyal, dedicated, and productive worker while society benefits from the individual's talents, skills, and increased purchasing power. Additionally, the individual contributes to the economy rather than relying on public support, which makes adaptive technology a good investment for all.

How Computers Recognize the Human Voice

Voice recognition technology allows a computer to respond to voice commands by sending text to the console of a personal computer much like a standard keyboard. The VRT system may be used with different types of software such as the computer's operating system, word processing, spreadsheets, and database management systems. Commercially available products show improved performance while decreasing in price. Several products exist today that can understand a limited vocabulary of clearly and separately enunciated words.

To use a VRT system, the individual must "train" the system to recognize his or her voice. To train the system, the user repeats selected words a number of times. A microphone picks up the individual's speech, which is in analog (wave) form, and a digital signal processor breaks the speech waves down into patterns of binary digits that represent the vocal sounds of human speech. These binary patterns are compared to a table consisting of the binary patterns of valid words. If a match is found, the spoken word is accepted by the computer (Marchewka & Goette, 1992). Today, the best systems are about 98% accurate, and many universities and businesses are currently looking at ways to improve this technology (Evans, 1988).
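The table lookup described above can be illustrated with a minimal sketch. It is not the method of any particular product: the 8-bit patterns, the TEMPLATES table, the Hamming-distance comparison, and the max_distance tolerance are all assumptions made for illustration, and real systems compare much richer acoustic features.

```python
from typing import Dict, List, Optional


def hamming_distance(a: List[int], b: List[int]) -> int:
    """Count the positions at which two equal-length bit patterns differ."""
    if len(a) != len(b):
        raise ValueError("patterns must have the same length")
    return sum(x != y for x, y in zip(a, b))


def recognize(pattern: List[int],
              templates: Dict[str, List[int]],
              max_distance: int = 1) -> Optional[str]:
    """Return the closest trained word, or None if nothing is close enough."""
    best_word, best_dist = None, None
    for word, template in templates.items():
        dist = hamming_distance(pattern, template)
        if best_dist is None or dist < best_dist:
            best_word, best_dist = word, dist
    if best_dist is not None and best_dist <= max_distance:
        return best_word
    return None  # no match: the spoken word is rejected


# Hypothetical table of trained words (illustrative bit patterns only).
TEMPLATES = {
    "open":  [1, 0, 1, 1, 0, 0, 1, 0],
    "close": [0, 1, 1, 0, 1, 0, 0, 1],
}
```

With this table, a pattern that differs from the "open" template in a single bit is still accepted as "open", while an all-zero pattern is rejected because it is four bits away from both templates.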

Types of VRT Systems

There are three basic types of voice recognition systems: discrete-utterance, connected-word, and continuous speech. Discrete-utterance is the most widely available, and the individual must pause between the uttered words. Pausing between words may become tedious, especially when multiple sentences are required; however, these single utterances are more easily recognized and require less complicated computer hardware than connected-word or continuous speech systems.
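One way to see why pauses make discrete utterances easier to recognize is that silence gives the system clear word boundaries. The sketch below is only an illustration under stated assumptions: the input is a list of per-frame loudness values, and silence_threshold and min_pause_frames are hypothetical parameters, not settings from any product discussed here.

```python
from typing import List, Tuple


def split_utterances(energies: List[float],
                     silence_threshold: float = 0.1,
                     min_pause_frames: int = 3) -> List[Tuple[int, int]]:
    """Split a loudness sequence into utterances separated by pauses.

    Returns (start, end) frame indices for each utterance, end exclusive.
    A pause must last at least min_pause_frames frames to end an utterance.
    """
    segments = []
    start = None        # first frame of the utterance in progress
    last_voiced = None  # most recent frame above the silence threshold
    silent_run = 0      # consecutive silent frames seen inside an utterance
    for i, energy in enumerate(energies):
        if energy > silence_threshold:
            if start is None:
                start = i
            last_voiced = i
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= min_pause_frames:
                segments.append((start, last_voiced + 1))
                start = None
                silent_run = 0
    if start is not None:
        # Input ended before a full pause was observed.
        segments.append((start, last_voiced + 1))
    return segments
```

For example, split_utterances([0, 0.5, 0.6, 0, 0, 0, 0.7, 0.8, 0.0, 0.9, 0]) yields [(1, 3), (6, 10)]: the three-frame pause separates two utterances, while the single silent frame inside the second one is too short to split it.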

Connected-word systems, on the other hand, allow for short utterances of multiple words, but definite beginning and ending points are required. This type of VRT system allows the individual to say several words before pausing. Since connected-word systems follow natural speech patterns more closely, more complex matching algorithms are required and the level of accuracy is reduced.

The most complex voice recognition systems allow for true continuous speech. The ultimate goal of VRT is to have a computer that recognizes speech similar to the way humans recognize speech. Although products to support true continuous speech are still being developed, products do exist that allow for script read input and one sentence utterance (Clements, 1987).

VRT systems also vary in terms of vocabulary size. Smaller vocabularies are easier to train, while larger vocabulary systems require highly complex recognition algorithms. Moreover, if connected-word systems use large vocabularies, the possible word orderings increase exponentially and require more sophisticated hardware (Clements, 1987).
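The growth in possible word orderings can be made concrete with a little arithmetic. The sketch below assumes, for simplicity, that any word in the vocabulary may follow any other, so a connected utterance of n words drawn from V words has V to the power n candidate sequences. The function name and the example sizes are illustrative only.

```python
def possible_word_sequences(vocabulary_size: int, sequence_length: int) -> int:
    """Number of distinct word sequences when any word may follow any other."""
    if vocabulary_size < 0 or sequence_length < 0:
        raise ValueError("sizes must be non-negative")
    return vocabulary_size ** sequence_length


# A 100-word vocabulary and a three-word utterance:
small = possible_word_sequences(100, 3)    # 1,000,000 candidates
# A 1,000-word vocabulary and the same three-word utterance:
large = possible_word_sequences(1000, 3)   # 1,000,000,000 candidates
```

Under this assumption, a tenfold increase in vocabulary multiplies the candidates for a three-word utterance a thousandfold, which suggests why large-vocabulary connected-word systems need more capable hardware.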

In addition, VRT systems are either speaker-dependent or speaker-independent. Speaker-dependent systems require an individual to train a vocabulary or list of specific words that are used by the system. In short, the system must be told who is using the system. With large vocabularies, speaker training time becomes a problem since a tradeoff exists between the amount of word training required and the accuracy of recognition. On the other hand, individuals with speech impediments can train speaker-dependent systems to understand their pronunciations.

Speaker-independent systems have models for word recognition built into the system. Training each word in the vocabulary is not required, but a large degree of accuracy may be lost. Phone companies are experimenting with VRT applications that are speaker-independent and only allow for a limited vocabulary.

Speaker-adaptive systems provide a compromise between speaker-independent and speaker-dependent systems. Adaptive systems include basic word models. An individual trains only certain words (or utterances) that comprise the syllables needed to make up other words. The system adapts the original models to the speaker's voice as the speaker uses the system.
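The idea of adapting a built-in model to a speaker during use can be sketched as a simple running average. This is an assumption chosen for illustration, not the algorithm of any named product: a word model is treated as a list of numeric features, and the hypothetical rate parameter controls how far each new sample pulls the model toward the speaker's voice.

```python
from typing import List


def adapt_model(model: List[float], sample: List[float],
                rate: float = 0.2) -> List[float]:
    """Move a word model a fraction `rate` of the way toward a new sample."""
    if len(model) != len(sample):
        raise ValueError("model and sample must have the same length")
    if not 0.0 <= rate <= 1.0:
        raise ValueError("rate must be between 0 and 1")
    return [(1.0 - rate) * m + rate * s for m, s in zip(model, sample)]


# Hypothetical generic model and one speaker's pronunciation of the same word.
generic = [0.0, 1.0]
speaker = [1.0, 1.0]
adapted = adapt_model(generic, speaker, rate=0.5)  # [0.5, 1.0]
```

Applying adapt_model repeatedly with samples from the same speaker moves the model steadily closer to that speaker's pronunciation, while a small rate keeps a single unusual utterance from distorting it.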

Systems vary in the amount of background noise filtered. With some VRT systems problems arise if there are even two or three people talking in the background while someone is using the system. Machinery noises also can affect the recognition accuracy of VRT systems.

Systems also have differing degrees of robustness, which is the ability of the system to recognize variances in speech. The performance of the system may decrease when the voice becomes fatigued or when emotion is reflected in the voice. For example, the system may have a lower accuracy rate for recognizing words when the user shouts, speaks angrily, delays utterances, or has a cold.

Experiences with VRT

Parts of this paper were written and edited using Dragon Systems, Inc.'s product called DragonDictate [TM]. The system may be used with most microcomputer software and has a 30,000 word vocabulary. The system is speaker-adaptive and requires some training for each individual user.

As with most speaker-adaptive systems, an individual trains several common word models that consist of various commands and character-type utterances including system commands, the alphabet, and numbers. Once these basic models are trained, the user may use most of the popular word processing applications.

When the system encounters a word it does not recognize, the system displays a menu of several likely choices. If a word is not on the menu, the user may start spelling the word using the international radio alphabet (e.g., alpha, bravo, charlie). Each time a letter is spoken, an on-line dictionary creates a new menu of possible choices that match the letters spelled so far. If the individual sees the word on the menu, he/she may choose the corresponding number to complete the word. The system then builds a model based on the spoken utterance and word name.
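The spelling menu described above behaves like a prefix search over a dictionary. The following sketch is an illustration only and does not reflect how DragonDictate is implemented; the CODE_WORDS table, the menu_for function, and the sample word list are all hypothetical.

```python
from typing import Iterable, List

# Spelling-alphabet code words mapped to the letters they stand for.
CODE_WORDS = {word: word[0] for word in (
    "alpha", "bravo", "charlie", "delta", "echo", "foxtrot", "golf",
    "hotel", "india", "juliett", "kilo", "lima", "mike", "november",
    "oscar", "papa", "quebec", "romeo", "sierra", "tango", "uniform",
    "victor", "whiskey", "xray", "yankee", "zulu",
)}


def menu_for(spoken: Iterable[str], dictionary: Iterable[str],
             menu_size: int = 5) -> List[str]:
    """Return up to menu_size dictionary words matching the spelled prefix."""
    prefix = "".join(CODE_WORDS[word.lower()] for word in spoken)
    matches = sorted(w for w in dictionary if w.startswith(prefix))
    return matches[:menu_size]


# Hypothetical on-line dictionary.
WORDS = ["cab", "cable", "cabin", "call", "dog"]
```

Here menu_for(["charlie"], WORDS) returns ["cab", "cabin", "cable", "call"], and adding "alpha" and "bravo" narrows the menu to ["cab", "cabin", "cable"], mirroring how each spoken letter produces a new, shorter list of choices.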

As with most new software packages, the individual will spend some time becoming familiar with the system. After a relatively short learning period, the individual should find him/herself using the VRT package at an acceptable word-per-minute rate.

VRT Applications

There has been unparalleled growth in adaptive technology for individuals with disabilities since 1985. Although VRT may be used in many applications for both individuals with and without disabilities, it may play a key role in adaptive technology. For example, VRT plays a special role at the Computer Evaluation and Learning Lab (CELL) Program in Palo Alto, California. CELL is a program sponsored by the Veterans Administration to help disabled veterans lead satisfying and productive lives (Anonymous, 1991).

In 1988, Prab Robots, Inc. announced the first voice controlled robotic workstation to be sold to persons with quadriplegia. The system can load and run various computer software, as well as answer the phone by using voice commands (Mangelsdorf, 1988). In addition, Terri and David Ward are the state-appointed resident curators of the Toll House Future Home, a fully automated computer-driven house. For example, a voice command can be used to activate security cameras, turn up the volume on the stereo, change the setting of the thermostat, or operate the oven. This house combines VRT with other technologies to give individuals with quadriplegia complete control. Since David Ward has quadriplegia, he plans to make full use of these features (Johnson, 1990).

Businesses are also beginning to recognize the benefits of VRT. Boeing reports success with VRT applications concerning a computer programmer who became quadriplegic because of an accident. The programmer's supervisor claims that there has been a "tremendous positive change in the programmer's personality," and that he is now more productive than the other non-handicapped programmers (Clements, 1987).

VRT Resources

Before investing in any type of technology, the individual should weigh several considerations such as:

* Task

* Characteristics of the individual (e.g., disability limitations of the individual)

* Cost and Benefits

* Training

* Level of Support

For example, it is important to match the technology to the task that must be accomplished. If the technology is not suited for a particular task, then even the best technology will not succeed. To successfully match the individual with the technology, the abilities and limitations of each must be understood. The individual using the technology, the business employing the individual, a charitable organization, or the government may pay for adaptive technology. The alternatives for purchasing software and hardware, training the user, and providing maintenance and upgrades should be explored.

The Job Accommodation Network (JAN) is one of many organizations that exist to aid people with questions about adaptive technology. In addition, IBM established a National Support Center for Persons with Disabilities to serve as an information clearinghouse, and Apple Computer founded a Worldwide Disability Solutions Group. Table 1 provides information on organizations and VRT products. Individuals with disabilities may obtain information about available resources by contacting these agencies. Moreover, publications, such as Closing The Gap (612-248-3294), can provide additional information about adaptive technology products and services.

Conclusion

VRT has potential for many applications and is in varying stages of development. Cost and performance are important criteria, but convenience, reliability, robustness, and technology constraints are prominent considerations. Problems of VRT include silence intervals, synchronization, vocabulary size, multiple speakers, and background noise (Clements, 1987).

Voice recognition technology may be a consideration for the accommodation of the physically challenged. VRT provides an acceptable, if not exceptional, alternative to keyboard entry for people who are unable to use a keyboard. VRT may provide individuals with disabilities greater opportunities for employment. This not only benefits the individual, but benefits society as well.

References

Anonymous. (1991, May). Occupational therapy and computers bring independence to veterans. Vanguard, 37(6), p. 11.

Brown, C. (1992). Assistive technology computers and persons with disabilities. Communications of the ACM, 35(5), 36-44.

Clements, M. A. (1987). Voice recognition systems can be designed to serve a variety of purposes. Industrial Engineering, 19(9), 44-47.

Evans, R. (1988). Computers that finally talk and listen: Has the time finally come? International Management (UK) (Europe Edition), 43, 53-56.

Johnson, M. (1990, March 26). For the disabled, home is where the future is. Computerworld, p. 20.

Lazzaro, J. J. (1990, August). Opening doors for the disabled. BYTE, pp. 258-268.

Mangelsdorf, M. E. (1988, December). Catch-22. Inc., p. 22.

Marchewka, J. T. & Goette, T. (1992). Will computers have ears? Implications of speech recognition technology. Business Forum, 17(2), 26-29.

Rifkin, G. (1991, February 11). Technology offers disabled a chance to make their mark. Computerworld, p. 25.

Trace Research and Development Center. (1990, January). Speech input systems. Trace Quick Sheets - #32.

Tanya Goette, Kennesaw State College, Department of Decision Sciences and Business Law, P.O. Box 444, Marietta, GA 30061.
COPYRIGHT 1994 National Rehabilitation Association
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 1994, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Author:Marchewka, Jack T.
Publication:The Journal of Rehabilitation
Date:Apr 1, 1994