Muscle tension dysphonia in patients who use computerized speech recognition systems

Abstract

The use of speech recognition systems as a replacement for other types of transcription systems is increasing rapidly, partly because many people are unable to use conventional keyboards as a result of upper-extremity repetitive strain injury (RSI). However, the frequent or continuous use of such systems can cause muscle tension dysphonia in some patients. The scientific literature suggests that there is an association between upper-extremity RSI and muscle tension dysphonia. We present a retrospective case series of five patients with workplace upper-extremity RSI who developed muscle tension dysphonia soon after they began using discrete computerized speech recognition software. The diagnosis of dysphonia was based on laryngovideostroboscopy, acoustic analyses, and voice load testing. All patients had normal voice when using everyday speech, but speaking into the computer resulted in the rapid onset of aperiodicity, strain, and a decrease in fundamental frequency. In three of the five patients, laryngovideostroboscopy showed posterior glottic overapproximation, but no other abnormalities.
Treatment was centered on voice therapy and avoidance of long periods of using computerized speech recognition systems. The condition of three of the five patients improved with therapy. We conclude that computer speech recognition programs can lead to the onset of muscle tension dysphonia in some patients. These patients can be successfully treated with voice therapy.

Introduction

Computerized speech recognition systems receive voice input through a microphone and convert it into written text. Various systems differ with respect to the style of speech input. Discrete systems require distinct enunciation of each word with a short pause between words. Continuous systems do not require such a pause and thus allow for a more natural rate and flow of speech input. Speech recognition systems are assumed to be useful replacements for a variety of manual transcription systems, particularly for patients who have repetitive strain injury (RSI) of the upper extremity, a condition often related to computer keyboard use. However, frequent or continuous use of speech recognition systems can cause muscle tension dysphonia, which is another form of RSI. (1) In this article, we describe our retrospective review of the cases of five patients with workplace RSI who developed muscle tension dysphonia soon after they began using a computerized speech recognition system. Although an association between workplace RSI and muscle tension dysphonia has been suggested previously, (2) to our knowledge, our report is the first to present objective evidence of vocal dysfunction and to describe treatment options.

Patients and methods

Two of the authors (K.I. and T.B.), both speech pathologists, retrospectively reviewed the outpatient medical records of the five patients and noted epidemiologic characteristics, signs and symptoms, and related factors (table). The five patients (three men and two women) ranged in age from 33 to 53 years. All five had a history of upper-extremity RSI severe enough to limit use of a keyboard, and all had used a variety of discrete speech recognition systems to continue their work. One patient (patient 2) had gastroesophageal reflux disease and a history of vocal abuse; none of the others had a history of voice pathology, and none smoked or habitually drank alcohol. All patients had been evaluated by objective and perceptual means. Laryngovideostroboscopy was performed on four of these patients during phonation to examine the symmetry and morphology of the vocal folds and
surrounding structures and to measure the periodicity of the mucosal waveform. Laryngovideostroboscopy was performed with a stroboscopy unit (Kay Elemetrics; Lincoln Park, N.J.) and either a 70° or 90° rigid laryngoscope, which was passed transorally. Acoustic analysis of fundamental frequency was performed with a Computerized Speech Lab system (Model 4100; Kay Elemetrics). This analysis was performed on the patient's habitual voice as well as the "computer voice," which the patient produced when using a voice recognition system. Three of the five patients also underwent a voice load test, which evaluates voice over time in 5-minute intervals while the patient speaks all voiced segments repeatedly without breaks. (3)

Results

All five patients developed symptoms of dysphonia within 2 to 8 weeks after they began using a voice recognition system. The typical symptom pattern included hoarseness, which progressed to a strained and fatigued voice and in some cases proceeded to temporary aphonia. In addition, all patients complained of progressive odynophonia. Objective data revealed several key similarities among patients. First, each patient's computer voice tended to differ from his or her normal speaking voice in both pitch and quality. The computer voice's fundamental frequency was 5 to 30% lower than that of the normal speaking voice, and the computer voices had a monotonous quality.
Second, in most patients, vocal function was relatively normal during natural speaking, but it quickly deteriorated into dysphonia when the computer voice was used. Results of the voice load test showed normal values at the beginning of the test, but progressive strain and aperiodicity became evident during 8 minutes of continuous speaking. In fact, patient 1 was unable to complete the voice load test because of aphonia and odynophonia. Laryngovideostroboscopy was performed on all but one patient (patient 5), whose fibromyalgia precluded even the modest amount of neck flexion needed to undergo the test. In all patients examined, there was a slight overapproximation of the posterior glottis, with or without arytenoid overlap, during phonation. Patient 4 also had a small anterior glottic gap and mild vocal fold hypervascularity.
Following therapy (discussed below), three of the five patients (patients 1, 2, and 3) experienced improvement; the other two (patients 4 and 5) experienced persistent symptoms despite more than 1 year of therapy. Two patients (patients 3 and 5) switched from using a discrete speech recognition system (Dragon Dictate; Lernout and Hauspie; Burlington, Mass.) to using a continuous system (Dragon Naturally Speaking; Lernout and Hauspie). Patient 3 had been unable to tolerate the discrete system for longer than a few minutes, but he was able to use the continuous system up to a total of 3.5 hours per day without strain. Patient 5 showed no improvement after switching systems.

Discussion

Speech recognition systems have advanced markedly since the commercial debut of discrete systems in the early 1990s. In particular, speech recognition technology took a major step forward in 1997 when continuous speech recognition was introduced. (4) Use of these systems is increasing rapidly. The newest generation of systems is claimed to be highly accurate (>95%), enables transcription at approximately normal speech rates (100 to 125 words/min), and is compatible with the latest personal computers. (4) One notable drawback of both the discrete and continuous speech recognition systems is that they require that patients enunciate in an expressionless, monotonous voice with a low pitch and at increased volume to consistently produce high rates of transcription accuracy.
The results of previous studies have suggested that such a speech style results in muscle fatigue and eventual injury in susceptible persons (5,6) and is a basis for workers' compensation claims. (3) The pattern of posterior glottic overapproximation that we found in our study is similar to the type 3 "supraglottic anteroposterior contraction" described by Morrison and Rammage, who classified different types of muscle tension dysphonia on the basis of morphology. (6) The type 3 pattern, sometimes referred to as the "Bogart-Bacall syndrome" in reference to the two well-known film actors, (7) causes effortful voicing and rapid fatigue in affected patients when they speak in a low-pitched voice. Treatment of muscle tension dysphonia includes long-term voice therapy and modification of the work environment to provide for frequent periods of voice rest and to limit total speaking time. Patients who continue to use voice recognition systems are trained to avoid using low-pitched, monotonous speech. Complete voice rest is often impractical for patients who have comorbid RSI, which itself limits their ability to communicate. Additional therapeutic goals include the elimination of hard glottal attack, balance of oral and nasal resonance, forward placement of voicing, and development of relaxed phonation and articulation.
We do not know whether patients who are susceptible to RSI of the upper extremity are also more susceptible to muscle tension dysphonia. The issue was studied by Kambeyanda et al, who used a survey instrument to document the presence of both RSI and dysphonia in patients who used discrete speech recognition systems. (2) Although they found a statistically significant relationship between the two conditions, they also found that the absence of RSI did not preclude the development of dysphonia and, conversely, that many patients with RSI reported no voice problems. Neither do we know whether future speech recognition systems will be able to prevent the onset of dysphonia by accurately recognizing speech delivered at a normal tone, pitch, and rate. As computer voice technology continues to evolve, this unique cause of muscle tension dysphonia may become just an historical medical footnote. Nonetheless, given the current state of speech recognition technology, clinicians should be aware of this particular form of RSI.

Table. Patient characteristics and selected results of objective testing for muscle tension dysphonia

Pt.  Age/sex  Speech recognition system                                                        Associated RSI*           F0* for normal voice (Hz)
1    33/M     Dragon Dictate (discrete)                                                        Tendinitis in the wrists  104
2    34/F     IBM Via Voice (discrete)                                                         Carpal tunnel syndrome    309
3    36/M     Dragon Dictate (discrete) for 1 yr, then Dragon Naturally Speaking (continuous)  Carpal tunnel syndrome    128
4    45/F     Dragon Dictate (discrete)                                                        Carpal tunnel syndrome    N/A
5    53/M     Dragon Dictate (discrete) for 1 yr, then Dragon Naturally Speaking (continuous)  Fibromyalgia              102 to 105

Pt.  F0 for computer voice (Hz)  Perceptual score on voice load test (†)                                                    LVS* findings
1    83 to 85                    0 at start; patient was unable to finish because of aphonia and odynophonia                PO*; the arytenoid overlapped the posterior glottis
2    220                         0 at start, 3+ aperiodicity at finish                                                      PO
3    121                         0 at start, 3+ strain at finish                                                            PO; minor overlapping of the arytenoids during phonation; posterior overpressure
4    N/A                         Not performed because of dysphonia onset within 5 min of starting on the discrete system   PO; slight anterior glottic gap; mild hypervascularity
5    98                          Not performed because of dysphonia onset within 15 min of starting on the discrete system  None; patient was unable to flex his neck

* RSI = repetitive strain injury; F0 = voice output frequency; LVS = laryngovideostroboscopy; PO = posterior overapproximation.
(†) Scale of 0 (least severe dysfunction) to 5 (most severe dysfunction).
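
The 5 to 30% drop in fundamental frequency reported in the Results can be checked against the table values. The short Python sketch below is an illustration only and is not part of the authors' analysis. It assumes that a reported range (for example, 83 to 85 Hz) can be represented by its midpoint; the names `f0_hz`, `percent_drop`, and `drops` are hypothetical. Patient 4 is omitted because no frequency values were available.

```python
# F0 values in Hz, copied from the table as (normal voice, computer voice).
# ASSUMPTION: reported ranges are replaced by their midpoints for illustration.
f0_hz = {
    1: (104.0, 84.0),   # computer voice reported as 83 to 85 Hz
    2: (309.0, 220.0),
    3: (128.0, 121.0),
    5: (103.5, 98.0),   # normal voice reported as 102 to 105 Hz
}


def percent_drop(normal_hz: float, computer_hz: float) -> float:
    """Relative decrease of the computer-voice F0 versus the normal voice, in percent."""
    return 100.0 * (normal_hz - computer_hz) / normal_hz


# Percent decrease per patient, rounded to one decimal place.
drops = {pt: round(percent_drop(n, c), 1) for pt, (n, c) in f0_hz.items()}

for pt, drop in drops.items():
    print(f"Patient {pt}: computer voice F0 is {drop}% lower")
```

Under the midpoint assumption, each computed decrease falls between 5 and 30%, which agrees with the range stated in the text.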

Acknowledgment

The authors recognize the Kaiser Foundation.

References

(1.) Haxer MJ, Guinn LW, Hogikyan ND. Use of speech recognition software: A vocal endurance test for the new millennium? J Voice 2001;15:231-6.
(2.) Kambeyanda D, Siager L, Cronk S. Potential problems associated with use of speech recognition products. Assist Technol 1997;9:95-101.
(3.) Izdebski K, Manace ED, Harris JS. The challenge of determining work-related voice/speech disabilities in California. A multi-disciplinary laryngology and voice pathology evaluation. In: Dejonckere PH, ed. Occupational Voice: Care and Cure. The Hague, Netherlands: Kugler; Monroe, N.Y.: Library Research Associates, 2001:149-54.
(4.) Zafar A, Overhage JM, McDonald CJ. Continuous speech recognition for clinicians. J Am Med Inform Assoc 1999;6:195-204.
(5.) Stemple JC, Stanley J, Lee L. Objective measures of voice production in normal subjects following prolonged voice use. J Voice 1995;9:127-33.
(6.) Morrison MD, Rammage LA. Muscle misuse voice disorders: Description and classification. Acta Otolaryngol 1993;113:428-34.
(7.) Koufman JA, Blalock PD. Classification and approach to patients with functional voice disorders. Ann Otol Rhinol Laryngol 1982;91:372-7.

David E.L. Olson, MD; Raul M. Cruz, MD; Krzysztof Izdebski, PhD; Tracey Baldwin, MA-CCC-S

From the Department of Head and Neck Surgery, Kaiser Permanente Medical Center, Oakland, Calif. (Dr. Olson, Dr. Cruz, and Ms. Baldwin), and the Pacific Voice and Speech Foundation, and the Department of Otolaryngology-Head and Neck Surgery, University of California
Medical Center, San Francisco (Dr. Izdebski). Reprint requests: David E.L. Olson, MD, ENT Associates SW, 404 Yauger Way, #150, Olympia, WA 98502. Phone: (360) 357-6314; fax: (360) 705-3745; e-mail: budelsky@pacbell.net

Originally presented at the Bay Area Residents' Research Symposium; June 2, 2000; Berkeley, Calif.