NIST hosts workshop on language recognition.

The National Institute of Standards and Technology (NIST) hosted the 2003 NIST Language Recognition Workshop at NIST in April 2003. Held in cooperation with Department of Defense (DoD) sponsors, the workshop reviewed the recent evaluation of research systems for language recognition. Six sites representing organizations from around the world participated in the evaluation, demonstrating current state-of-the-art capabilities for detecting the languages used in segments of conversational telephone speech. The participants were MIT Lincoln Laboratory, the OGI School of Science and Engineering of the Oregon Health & Science University working in collaboration with the Institute of Acoustics of the Chinese Academy of Sciences, the Speech Research Lab of Queensland University of Technology, R523 (DoD), the Department of Electrical Engineering of the University of Washington, and a collaboration of the Institut de Recherche en Informatique de Toulouse and the Laboratoire Dynamique du Langage (Lyon).

In the evaluation, each system was presented with numerous test segments of conversational speech with durations of approximately 3 s, 10 s, or 30 s. The system had to decide for each of 12 target languages whether the speech segment was in that particular language. The target languages were Arabic, English, Farsi, French, German, Hindi, Japanese, Korean, Mandarin, Spanish, Tamil, and Vietnamese. The test segments came from previously collected corpora of telephone conversations in each of these languages as well as in Russian.

NIST researchers gave presentations summarizing the overall performance results and analyzing how performance varied with segment duration, speaker gender, and the languages being tested. One surprising finding was that language detection performance was generally better on female speech than on male speech.

NIST conducted the last such evaluation and workshop in 1996. Two of the participating sites in 2003, MIT and OGI, also participated in the 1996 evaluation. Each of these sites had results this year that were considerably better than their performance seven years earlier.

More information about the 2003 NIST Language Recognition Evaluation is available at www.nist.gov/speech/tests/lang/index.htm. CONTACT: Alvin Martin, (301) 975-3169; alvin.martin@nist.gov or Mark Przybocki, (301) 975-3347; mark.przybocki@nist.gov.