Processing Power of Voice Recognition Technologies Requires Enhancement for Continuous Speech Recognition
PALO ALTO, Calif. -- The real-time speech recognition technology currently found in voice portals consumes immense processing power and was considered unviable until recently. The computation-intensive hidden Markov model (HMM) technology of the mid-1980s improved the ability to recognize word relationships and ultimately led to the development of powerful speech-recognition applications.
For systems to understand and respond to continuous speech, manufacturers have to rely on a large amount of processing power, which can be cost-prohibitive. When users speak at natural speed, it becomes difficult to associate specific sounds with particular words. Because users do not pause between words, processing naturally spoken phrases in real time can be tricky.
If you are interested in a virtual brochure, which provides system integrators and other industry participants with an overview of the latest analysis of the Advances in Voice Recognition Technology, then send an e-mail to Mireya Castilla - Corporate Communications at mireya.castilla@frost.com with the following information: your full name, company name, title, telephone number, e-mail address, city, state, and country. We will send you the information via e-mail upon receipt of the above information.
"Predominantly software-only engines demand more processing power than can be provided by traditional digital signal processing (DSP) boards," says Frost & Sullivan Research Analyst Arjun Chokkappan. "These boards are used in the interactive voice response (IVR) systems and they need additional processors to supplement the IVR processing power and support and manage the system."
Nortel's modern speech-processing platform integrates technologies into a range of media processing server (MPS) platforms. MPS systems configured with additional speech servers decrease the response time of a voice recognition solution. The speech server is a speech-processing platform within an IVR/media processing platform offering choices, investment protection and scalability. The advanced system software developed on this platform integrates with industry-standard components to offer the advantages of open architecture systems.
"The design employs high-performance processors that plug into a separate resource subsystem integrated into the core operating architecture of the IVR/media server platform," notes Chokkappan.
"This approach provides a cost-effective and scalable resource for running advanced speech recognition and analysis."
Voice recognition systems also need to make allowances for the diverse enunciations and intonations of the same word by different people. The resulting difficulty of interpreting speech variability has led to the development of complex pattern analysis.
Apart from accents, voice recognition systems have trouble filtering out background noise, especially on calls made from mobile phones. Although better microphones have remedied this issue to a small extent, wind, murmurs, and music still need to be properly isolated from the voice.
To address these concerns, ScanSoft introduced the OpenSpeech(TM) Recognizer (OSR), a speech recognition solution for telephony applications. A prominent feature of this solution is its ability to let applications understand a range of words and phrases without requiring highly complex grammar rules.
"It can also help applications automatically adapt recognizer parameters to optimize performance, separate speech from background noise through superior endpointing and speech detection algorithms, and provide unmatched scalability using the patented Finite State Transducer technology," explains Chokkappan.
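As a rough illustration of what endpointing means in practice, the Python sketch below marks where speech starts and stops by comparing short-frame energy against an estimated noise floor. It is a textbook-style simplification and not ScanSoft's algorithm, whose details are not described here. The function names, the 160-sample frame length, the threshold ratio, and the synthetic test signal are all assumptions made for the example.

```python
import math
import random


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each full, non-overlapping frame."""
    last_start = len(samples) - frame_len
    return [
        sum(s * s for s in samples[i:i + frame_len]) / frame_len
        for i in range(0, last_start + 1, frame_len)
    ]


def find_endpoints(samples, frame_len=160, ratio=4.0, noise_frames=5):
    """Return (start, end) sample indices of detected speech, or None.

    Assumption for this sketch: the first `noise_frames` frames contain
    background noise only, so their average energy is the noise floor.
    """
    energies = frame_energies(samples, frame_len)
    if len(energies) <= noise_frames:
        return None
    noise_floor = sum(energies[:noise_frames]) / noise_frames
    threshold = ratio * max(noise_floor, 1e-12)
    # Frames whose energy clearly exceeds the noise floor count as speech.
    active = [i for i, e in enumerate(energies) if e > threshold]
    if not active:
        return None
    return active[0] * frame_len, (active[-1] + 1) * frame_len


def make_demo_signal(rate=8000):
    """Synthetic stand-in for a call: noise, a 440 Hz tone, then noise."""
    rng = random.Random(0)

    def noise():
        return rng.uniform(-0.01, 0.01)

    lead = [noise() for _ in range(1600)]
    tone = [0.5 * math.sin(2 * math.pi * 440 * n / rate) + noise()
            for n in range(2400)]
    tail = [noise() for _ in range(1600)]
    return lead + tone + tail


if __name__ == "__main__":
    print(find_endpoints(make_demo_signal()))  # (1600, 4000)
```

A fixed energy threshold of this kind works on steady background noise but would struggle with the wind, murmurs, and music mentioned above, which is why production recognizers rely on more elaborate speech detection.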
Advances in Voice Recognition Technology is part of the Information Communication Technology vertical subscription service. It examines technology and application viewpoints, evaluates technologies such as Speech Application Language Tags (SALT) and Voice Extensible Markup Language (VoiceXML), and assesses key innovations in voice recognition. The research enables companies to align their positioning strategies to benefit from the changing technologies. Executive summaries and analyst interviews are available to the press.
Technical Insights is an international technology analysis business that produces a variety of technical news alerts, newsletters, and research services.
Frost & Sullivan, a global growth consulting company, has been partnering with clients to support the development of innovative strategies for more than 40 years. The company's industry expertise integrates growth consulting, growth partnership services, and corporate management training to identify and develop opportunities. Frost & Sullivan serves an extensive clientele that includes Global 1000 companies, emerging companies, and the investment community by providing comprehensive industry coverage that reflects a unique global perspective and combines ongoing analysis of markets, technologies, econometrics, and demographics. For more information, visit www.frost.com.
Advances in Voice Recognition Technology D357
Keywords in this release: voice recognition technology, Speech Application Language Tags, SALT, Voice Extensible Markup Language, VoiceXML, Interactive Voice Response, IVR, Hidden Markov Model, HMM, digital signal processing, DSP, media processing server, MPS, OpenSpeech(TM) Recognizer, OSR, Finite State Transducer