
CARNEGIE MELLON DEMONSTRATES SPEECH TRANSLATION SYSTEM IN VIDEO CONFERENCE WITH RESEARCH PARTNERS IN GERMANY AND JAPAN

 PITTSBURGH, Jan. 28 /PRNewswire/ -- Scientists from Carnegie Mellon University's Center for Machine Translation, in conjunction with research partners in Germany and Japan, today conducted a first-ever international video conference using computer speech translation to translate among spoken English, Japanese and German.
 The research group, known as the Consortium for Speech Translation Research (CSTAR), includes ATR Interpreting Telephony Laboratories (Kyoto, Japan), Carnegie Mellon, Siemens AG (Munich, Germany), and the University of Karlsruhe (Karlsruhe, Germany).
 In today's demonstration, researchers in Pittsburgh, Kyoto and Munich conversed with each other in their own languages, via their speech translation systems, about registering for an international conference. Each partner's system accepts a spoken input sentence, recognizes it, translates it into each of the two other partners' languages and resynthesizes it (speaks it aloud) on the other side.
 Dialogues among the three partners were demonstrated by connecting the three countries' systems via telephone data lines. To allow for time differences, three similar demonstrations were held in Pittsburgh, Munich and Kyoto at different times during the day.
 Carnegie Mellon's system, Janus, is a speaker-independent, continuous speech translation system that can translate spoken English or German into German, English and Japanese. Its 500-word vocabulary is geared to the conference registration domain.
 Janus produces translations in approximately 1.5 times real time, which means that a 2.5-second sentence in one language takes approximately 3.5 seconds to be recognized, translated and resynthesized in another. The system operates on a single Hewlett-Packard Co. 9000 Series 720 workstation with 64 megabytes of memory, an analog-to-digital converter and a DECtalk speech synthesis device. All other modules and components run in software only. Communication among sites is carried over data modems on standard telephone lines, along with video conferencing connections via PictureTel.
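 The real-time figure above is simple arithmetic on the utterance length. As a rough, hypothetical sketch (the function and its name are invented here, not taken from the Janus software):

```python
# Hypothetical helper illustrating the real-time-factor arithmetic quoted above.
# Assumes total latency scales linearly with utterance length.

def end_to_end_latency(utterance_seconds: float, real_time_factor: float = 1.5) -> float:
    """Estimated seconds to recognize, translate and resynthesize one utterance."""
    return utterance_seconds * real_time_factor

# A 2.5-second sentence at about 1.5x real time lands near the
# "approximately 3.5 seconds" figure cited in the release.
print(end_to_end_latency(2.5))  # 3.75
```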
 The system works by understanding spoken input in one language, translating it into another and synthesizing speech in the target language. Carnegie Mellon senior research scientist Alex Waibel, who directs the projects at Carnegie Mellon and the University of Karlsruhe, combines neural networks with other, more traditional technologies, such as knowledge-based and statistical processing techniques, to help the system accomplish this task. He said neural networks deliver a high degree of accuracy, efficiency and portability because they allow portions of the system to be learned or adapted to other domains and languages.
 When a user speaks into the Janus system, the speech signal is digitized into a computer-readable form and pre-processed. Using this front-end signal analysis, the recognizer detects the sounds in the input stream by applying neural network-based classification strategies. It passes the top 50 sentence hypotheses to the translation module, which selects the most plausible sentence.
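 As a rough sketch of that hypothesis-selection step (the data shapes, scores and names below are assumptions for illustration, not the actual Janus code), the recognizer's n-best list can be rescored and the most plausible candidate kept:

```python
# Illustrative n-best selection; Hypothesis, the scores and the combination rule
# are invented for this sketch and are not taken from Janus.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Hypothesis:
    text: str              # candidate sentence from the recognizer
    acoustic_score: float  # how well the candidate matches the audio

def select_most_plausible(n_best: List[Hypothesis],
                          language_score: Callable[[str], float]) -> Hypothesis:
    """Pick the candidate whose combined acoustic and language score is highest."""
    return max(n_best, key=lambda h: h.acoustic_score + language_score(h.text))

# Toy usage with a stand-in language score that rewards in-vocabulary words;
# a real system would use a statistical or knowledge-based language model here.
candidates = [
    Hypothesis("I would like to register", acoustic_score=-11.5),
    Hypothesis("eye wood like two register", acoustic_score=-11.3),
]
known_words = {"i", "would", "like", "to", "register"}

def score(sentence: str) -> float:
    return sum(1.0 for w in sentence.lower().split() if w in known_words)

print(select_most_plausible(candidates, score).text)  # "I would like to register"
```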
 This sentence analysis is performed by three alternative strategies -- a syntactic parser, a neural network-based parser and a semantic parser. All three analysis modules work to determine the "Interlingua," or language-independent meaning and intent of the spoken input sentence. Using this interlingua representation, an output can subsequently be generated in each of several output languages. The speech translation process is then completed by transmitting the translated message and synthesizing the spoken utterance in the target language.
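 A minimal sketch of the interlingua idea (the frame fields and generator templates below are invented for illustration; the release does not spell out Janus's actual representation):

```python
# Hypothetical interlingua frame and per-language generators; field names and
# templates are invented for this sketch.

# Language-independent meaning of a registration request, regardless of whether
# it was originally spoken in English, German or Japanese.
frame = {"speech_act": "request", "action": "register", "object": "conference"}

def generate_english(f: dict) -> str:
    if f["speech_act"] == "request" and f["action"] == "register":
        return "I would like to register for the conference."
    return ""

def generate_german(f: dict) -> str:
    if f["speech_act"] == "request" and f["action"] == "register":
        return "Ich möchte mich für die Konferenz anmelden."
    return ""

# The same frame drives generation in every target language.
for generate in (generate_english, generate_german):
    print(generate(frame))
```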
 Waibel said future research efforts at Carnegie Mellon will focus on making Janus more robust against spontaneous speech, with its interruptions, disfluencies and broken sentences; on enlarging the system's vocabulary; and on adding new domains and languages. An important element of this future research will be automatic learning of speech and language knowledge, so the system can automatically learn new vocabularies, new domains and new languages through interaction with the user.
 Waibel also plans to incorporate multi-modal capabilities into Janus, including gesture, lip and face recognition, as well as hand modeling, character recognition and eye tracking to make cross-language human-to-human communication more effective.
 Research into speech recognition and translation has been underway for more than 30 years. Only recently have advances in computing achieved performance levels that make applications such as speech translation feasible.
 Waibel sees two kinds of markets emerging for this technology: portable speech translation devices useful for travel, meetings and sales presentations, and services offered by centralized providers such as telephone, cable and cellular communication organizations.
 He predicts that voice-activated dictionaries and phrase books may become available within the next five years, and that spontaneous speech translation devices, initially limited to common domains such as travel planning, hotel reservations and restaurant ordering, may appear around the turn of the century.
 -0- 1/28/93
 /CONTACT: Anne Watzman of Carnegie Mellon University, 412-268-2900/


CO: Carnegie Mellon University; ATR Interpreting Telephony Laboratories; Siemens AG; University of Karlsruhe
ST: Pennsylvania
IN: CPR
SU:


CD-MK -- PG001 -- 0131 01/28/93 09:59 EST
