In beta.

Five companies joined the "In Beta" session to demonstrate technology that wasn't quite ready for full-production release but was certainly interesting, enlightening, and useful. These products ran the gamut, from in-the-weeds technology to business management. Some products were in pre-alpha release, while others were in beta, but all received a great deal of attention from show attendees.

* Deutsche Telekom Laboratories showed its pre-beta tool, Multimodal Application Builder. If you've ever tried to build a multimodal application, you know just how difficult it can be to juggle all of the moving parts.

The goal of this tool is to generate 80 percent of a multimodal application's code automatically, and to create output that is compatible with the standards work of the World Wide Web Consortium's Multimodal Interaction Working Group. The tool is built on top of the widely used Eclipse development environment and accepts drag-and-drop graphics, XML descriptions, or both as input; the output includes CCXML, SCXML, speech recognition grammars, and HTML or Flash for graphical interactions. Developers then add the back-end integration that lets the application manipulate the data.

* Loquendo showed a pre-alpha tool--actually, a research-in-progress tool--that it hopes will be able to add emphasis to its Loquendo TTS output. Adding emphasis to TTS can provide important cues to the listener. Emphasis Director, working in conjunction with Loquendo's TTS Director, gives voice user interface designers an easy way to experiment with TTS emphasis.

The tool itself is very simple to use: Users enter text, highlight the portion they want emphasized, adjust the emphasis level, and then the tool will play audio with and without added emphasis. The ultimate output of the tool is marked-up text in Loquendo's format. In the future, Loquendo intends to add other emotions to the tool.
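
As a rough illustration of what "marked-up text" can look like, here is a minimal sketch that wraps a highlighted span in the `<emphasis>` element from the W3C's SSML standard. This is an assumption made for illustration only: the article does not describe Loquendo's own markup format, and the function name `mark_emphasis` is hypothetical. The output is a bare fragment, not a complete SSML document.

```python
from xml.sax.saxutils import escape

# Emphasis levels defined by the W3C SSML specification.
SSML_LEVELS = {"strong", "moderate", "none", "reduced"}


def mark_emphasis(text, start, end, level="moderate"):
    """Wrap text[start:end] in an SSML <emphasis> element.

    Hypothetical helper: SSML stands in for Loquendo's format,
    which the article does not document.
    """
    if level not in SSML_LEVELS:
        raise ValueError(f"unknown emphasis level: {level!r}")
    if not (0 <= start < end <= len(text)):
        raise ValueError("start/end must select a non-empty span of text")

    before, target, after = text[:start], text[start:end], text[end:]
    # Escape each piece so characters such as '&' or '<' stay valid XML.
    return (
        "<speak>"
        + escape(before)
        + f'<emphasis level="{level}">'
        + escape(target)
        + "</emphasis>"
        + escape(after)
        + "</speak>"
    )


if __name__ == "__main__":
    sentence = "Your flight leaves at nine"
    i = sentence.index("nine")
    print(mark_emphasis(sentence, i, i + len("nine"), "strong"))
```

A designer comparing versions with and without emphasis, as the tool allows, would in this sketch simply compare the plain sentence with the marked-up one.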

* Lyrix provides an interesting business service, currently in beta. Limitations of cell phone technology mean most people who use cell phones for business use their personal phones, and their companies reimburse them for business calls. This creates a significant bookkeeping burden on employees and the company, and employee errors generate significant costs.

Lyrix Mobiso offers a software-as-a-service solution for use with data-capable mobile phones. It uses a central directory of phone numbers and a central "reach" number. The system compensates employees for inbound and outbound business calls; personal calls remain private. The system also integrates with customer relationship management software applications, such as SugarCRM.
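
The bookkeeping idea can be sketched in a few lines. The following is a hypothetical illustration, not Lyrix's implementation: it assumes that a call counts as business when the other party appears in the central directory, and the names `Call` and `reimbursable_minutes` are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    other_party: str  # phone number of the other party
    minutes: int      # billed length of the call


def reimbursable_minutes(calls, business_directory):
    """Sum the minutes of calls whose other party is in the directory.

    Calls to numbers outside the directory are treated as personal
    and are skipped entirely, so they never appear in the total.
    """
    return sum(c.minutes for c in calls if c.other_party in business_directory)


if __name__ == "__main__":
    directory = {"+15550100", "+15550101"}  # assumed company directory
    calls = [
        Call("+15550100", 12),  # business
        Call("+15559999", 30),  # personal, ignored
        Call("+15550101", 5),   # business
    ]
    print(reimbursable_minutes(calls, directory))
```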

* RebelVox showed its pre-alpha re-engineering of the basic telephone call. Instead of dialing a number and then waiting either for the other person to pick up or for the call to go to voicemail, the user simply selects the other person's telephone number from his smartphone's screen and begins to speak. The recipient can break in and listen to the call live; or, if he chooses not to take the call, the call is saved as a recording.

RebelVox intends to integrate the underlying platform into other products and to offer a full multimodal application programming interface. Some interesting unanswered questions about this platform still linger: For example, if someone calls me, and I break into the call toward the end, then is it possible to hear what was already said so the person who calls me doesn't have to repeat himself? Because this platform transforms how we think about phone calls, I think it's safe to say we will see some interesting emergent behavior and new telephone manners in the future.

* Voxeo's Tropo platform is aimed squarely at ordinary Internet developers who want to add telephony, automatic speech recognition (ASR), and text-to-speech (TTS) capabilities to their applications. To reel them in, Tropo provides Internet-friendly technical and business interfaces.

Tropo can accept calls, run scripts, and provide touch-tone recognition, ASR, and TTS, the features we all expect from a typical hosted environment. But it also incorporates many features that Internet developers expect from services, such as extensive interfaces to control the application and support for multiple programming languages, including Python, PHP, Ruby, JavaScript, and Groovy. Tropo also includes a pleasant surprise: multimodal interfaces. Text and instant messaging serve as input and output modes that share the ASR grammars and TTS text. The business model is also typical of Internet services: Businesses purchase credit in advance and are charged a fixed, per-minute rate without contracts or long-term commitments.
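
To make the idea of shared grammars and prompt text concrete, here is a minimal sketch in which one dialog step serves both a voice channel and a text channel. It does not use Tropo's actual API; the class `DialogStep` and its methods are hypothetical, and the "grammar" is reduced to a simple list of accepted words.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DialogStep:
    prompt: str     # spoken by TTS on a call, sent as text in a chat
    choices: tuple  # accepted answers, shared by both channels

    def render(self, channel):
        """Return the output for a channel from the same prompt text."""
        if channel == "voice":
            return {"tts": self.prompt}
        # Text channels list the choices, since the user cannot hear them.
        return {"text": f"{self.prompt} ({', '.join(self.choices)})"}

    def interpret(self, user_input):
        """Match a speech transcript or a typed message against the choices."""
        normalized = user_input.strip().lower()
        return normalized if normalized in self.choices else None


if __name__ == "__main__":
    step = DialogStep("Which department?", ("sales", "support"))
    print(step.render("voice"))
    print(step.render("im"))
    print(step.interpret(" Sales "))
```

The point of the design is that the prompt and the accepted answers are written once, so adding a text channel does not require a second copy of the dialog.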






COPYRIGHT 2010 Information Today, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2010 Gale, Cengage Learning. All rights reserved.

Article Details
Title Annotation:SpeechTEK Hands-On
Author:Yudkowsky, Moshe
Publication:Speech Technology Magazine
Geographic Code:1USA
Date:Jan 1, 2010
