VoiceXML versus SALT: selecting a voice application standard.

When it comes to speech application standards, it seems we've been asking all the wrong questions. The VXML versus SALT debate is currently a hot topic in the IT conference rooms of organizations that rely on efficient, cost-effective contact centers. Phrases like "intense competition" and "battle royale" are bouncing around the trade press. Rival consortiums are at work writing specs and generating headlines, and some of the biggest names in technology have entered the VXML versus SALT fray.

So, which speech standard will win, VXML or SALT? A good question, but possibly the wrong question. Given the real reasons organizations deploy speech-enabled technologies and the fundamental nature of technology standards themselves, our focus should be more on the application than on the application standard. Ask yourself this: how often does someone using an accounting program or other application know or care whether that software was written in C or in Java? When was the last time a customer hung up the phone and said, "Hey, that was the best VoiceXML application I have ever heard"?

The fact is, it's easy enough to get caught up in the debate over which standard is superior and which will dominate. Standards matter. But what matters most to the end user, and therefore what should matter most to contact centers and their system suppliers, is the quality and performance of the application itself. With the end user in mind, now may be the time to ask some different and more relevant questions about VXML and SALT. The answers may surprise you.

The Future Of Speech

One thing is certain: speech recognition is the future of voice automation and a very important part of a customer or employee self-service strategy.
In fact, some industry analyst firms indicate that over 80 percent of all customer interaction is still done with a telephone call. Interactive voice response (IVR) is now a foundation technology of the customer service marketplace and is common in the contact centers of companies in financial services, insurance, telecommunications and a wide range of other industries. As customer-oriented organizations seek to drive both service performance and cost efficiencies, the next major wave of investment is the incorporation of speech-based IVR solutions. The entire concept of IVR systems is being transformed by new and powerful speech-enabling innovations such as speech recognition, text-to-speech and speaker verification. By incorporating speech technologies into their voice automation systems, enterprises are increasing productivity, cost efficiency and customer satisfaction.

But, as we have seen in so many other technology environments, the move to speech-driven automation has sparked an intense discussion over the relative merits and viability of the standards that underlie this still-emerging technology: Voice eXtensible Markup Language (VXML) and Speech Application Language Tags (SALT). These two evolving standards are making headlines as industry analysts, development groups and IT vendors jockey for position in the growing speech-enabled marketplace.
Here are snapshot views of what their respective forums have to say about each.

VoiceXML

First published in 2000 by a consortium of 500 companies under the auspices of the VoiceXML Forum, VoiceXML has been described as the HTML of the voice Web. VXML is an open, standard markup language for voice applications. Originally developed for telephony applications, VXML harnesses the large Web infrastructure created for HTML to simplify the development and implementation of voice applications. Control of the VoiceXML standard has been given to the World Wide Web Consortium (W3C), and that group published the VoiceXML 2.0 version upon which a number of product solutions are now based. The VoiceXML Forum says VXML takes advantage of several industry trends, including the growth of the World Wide Web and the migration of the Web beyond the desktop computer, as well as improvements in computer-based speech recognition and text-to-speech synthesis.

SALT

As described by the SALT Forum, SALT extends existing Web markup languages
to enable multimodal and telephony access to the Web. The SALT 1.0 specification enables multimodal and telephony-enabled access to information, applications and Web services from personal computers, telephones, tablet PCs and wireless personal digital assistants. This powerful multimodal access will allow end users to interact with applications in a number of ways, such as audio, speech and synthesized speech, plain text, mouse or keyboard, video or graphics. The SALT 1.0 specification is currently under consideration within the World Wide Web Consortium.

This is what the parties behind each standard have to say about themselves. But how should managers of contact centers evaluate the relative pros and cons of SALT and VoiceXML?

Five Questions To Ask

If you are an IT manager responsible for the performance of a contact center, Web infrastructure or any other form of customer or employee self-service, and you see speech-enabled automation as a natural part of your user interface, which standard is right for you? Here are five questions that go straight to the heart of the VXML versus SALT debate.
Ask them, and get the right answers, before you make a decision on speech technology standards.

1. What is your current Web infrastructure?

Is your existing Web infrastructure built on J2EE or .NET? If you are developing applications in the .NET environment, the Microsoft Speech Server SALT browser provides a very clean, seamless integration. The Microsoft Speech SDK provides speech development tools and ASP.NET components that integrate into the Microsoft Visual Studio .NET development environment and the .NET application server. So for contact centers or Web development groups with an existing .NET Web environment, SALT is the obvious choice. Companies that have adopted the J2EE Web infrastructure may have an easier time developing VoiceXML applications. Technically, VoiceXML and SALT browsers will work with any Web server.
However, the development tools that are included by VoiceXML vendors are usually Java-based, while the tools included with the Microsoft SALT browser are tied to .NET. Java developers can still take advantage of the Microsoft Speech Server and development tools by using a .NET server with Web services that communicate with back-end J2EE components for data access and business transactions. While it is certainly possible to make SALT work in a Java environment, many J2EE-based organizations will probably choose VoiceXML.

2. Are speed and vendor support important to you?

If you want to deploy an open-standards, speech-enabled voice automation application rapidly on a proven technology platform, then VoiceXML provides an advantage in terms of time-to-market and a diversity of vendor offerings. Compared to the relatively new SALT standard, the more mature VoiceXML has been under development for several years and is now in its second major specification release. Additionally, product support for VoiceXML has been introduced by most (if not all) IVR vendors in the marketplace. Organizations can leverage VoiceXML to immediately deploy an open-standards IVR with full integration to the call center, PBX, ACDs and CTI
, and enjoy the technical support and service of established system suppliers. Vendors that support both the VoiceXML and SALT standards give companies an added degree of flexibility; they can deploy now using established VXML solutions and, if conditions warrant, migrate smoothly to SALT-based applications at some point in the future. In fact, a single customer or employee call could seamlessly involve applications built on both standards.

3. Do you need multimodal access?

If multimodal access by devices including mobile phones and wireless PDAs, in addition to traditional telephony and Web browsers, is a requirement, SALT has an edge.
This doesn't mean you can't voice-enable a Web site using VoiceXML. X+V (or XHTML plus Voice) extends the VoiceXML specification by adding multimodal attributes. However, if multimodal access is a central issue in your deployment, SALT may be your best option given its more granular control of multimodal events and the fact that this capability was built into the requirements of its design from day one.

For example, the New York City Department of Education (NYC DOE) boasts the largest school system in the country with over one million students. To optimize the children's educational experience by addressing parental concerns and encouraging parental involvement, the NYC DOE is using a speech-enabled application on a SALT-powered platform to enable parents to check things such as their child's attendance record, course grades and lunch menu for the day. Much of this information is already available to parents via the NYC DOE Web site, but the department is using speech technologies to give round-the-clock access to parents who don't have consistent access to a computer.

4. Which standard will best support your existing infrastructure?

The standard you select must support your existing technology infrastructure. However, the plain truth is that VoiceXML and SALT are equally inadequate in their ability to integrate into a call center environment. VoiceXML and SALT are presentation-layer specifications, meaning they address the user interface (voice and multimodal), but do not address integration or back-end functionality requirements.
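To make the presentation-layer point concrete, here is a minimal sketch in Python, offered as an illustration rather than as any vendor's implementation. It builds a small VoiceXML 2.0 page that does nothing but speak a result, while the data lookup lives in ordinary application code outside the markup. The function names, the student IDs and the attendance data are all hypothetical; only the `vxml`, `form`, `block` and `prompt` elements and the namespace come from the VoiceXML 2.0 specification.

```python
import xml.etree.ElementTree as ET

VXML_NS = "http://www.w3.org/2001/vxml"  # VoiceXML 2.0 namespace


def lookup_attendance(student_id: str) -> str:
    """Hypothetical back-end call; a real system would query a database or host."""
    records = {"1001": "present", "1002": "absent"}  # made-up sample data
    return records.get(student_id, "unknown")


def render_attendance_vxml(student_id: str) -> str:
    """Return a VoiceXML 2.0 page that only speaks a result fetched elsewhere."""
    # Integration work happens here, in application code, not in the markup.
    status = lookup_attendance(student_id)

    # The markup itself describes only the voice user interface.
    vxml = ET.Element("vxml", {"version": "2.0", "xmlns": VXML_NS})
    form = ET.SubElement(vxml, "form", {"id": "attendance"})
    block = ET.SubElement(form, "block")
    prompt = ET.SubElement(block, "prompt")
    prompt.text = f"The attendance status for today is {status}."
    return ET.tostring(vxml, encoding="unicode")


if __name__ == "__main__":
    print(render_attendance_vxml("1001"))
```

Everything a voice browser sees is the generated page; the lookup, and anything like call control or CTI, sits outside it, which is why those pieces have to come from somewhere other than the standard.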
At best, standards provide a baseline framework for such things as hardware and operating system platforms (Intel, Windows, Linux), telephony integration (ISDN, SS7, SIP) and the voice user interface (VoiceXML and SALT). But standards do not encompass all of the components needed to integrate and deploy a voice automation solution. If we consider technologies such as call control, CTI and legacy host integration, we see that standards do not cover every crucial element necessary for a successful voice automation solution. Tools for operating, maintaining and administering the systems, and for developing and debugging applications, are also needed to support the full lifecycle of a deployment, and the standards themselves do not adequately address them.

To create a workable solution, you need all of these elements, only some of which are supported by an open standard. In the end, it's up to the solutions vendor to provide all of the product components that are needed to develop, maintain and report on a voice application. In fact, an organization can rely on open standards to build, perhaps, half of an open IVR solution, and the rest must be supplied by a vendor.
Because neither SALT nor VXML provides all of the features needed for an IVR solution, organizations that must deploy these solutions in the context of the larger call center environment may wish to seek out solutions that support both SALT and VXML. There are no agreed-upon standards for call control, CTI or data/host integration, for example, which means different vendors will deploy widely differing solutions. To ensure optimum flexibility, choose a solution that supports either your preferred standard or both, so that the elements of your existing or planned infrastructure are covered.

5. Which standard will win in the contact center market?

The answer is, we just don't know, and in reality we don't need to know. There will always be new and competing ideas on standards, and while either SALT or VXML may one day emerge as the dominant player, they may coexist equally for a significant period of time. In fact, there's even talk that the two standards may one day come together into one. The other thing of which we can be certain is that standards such as VXML and SALT will continue to evolve, and that new standards will be created to address new functionality in the future. That's why it makes no sense to delay the launch of a flexible, cost-efficient voice automation solution until the standards sort themselves out.

Standards In Perspective

It is easy enough to get caught up in the debate over the relative advantages of one standard or another. Standards matter. What matters more is that you get the speech-enabled application right.
In the end, customers interface with and react to applications and not the standards that help enable those applications. By keeping your focus on the quality and efficiency of the customer interaction, and the wider set of automated voice technologies needed to support more natural and effective communications, you can put the ongoing debate over standards in its proper perspective. Organizations invest in voice recognition solutions to improve the quality of customer interactions. Standards are just a means to that more important end.

If you are interested in purchasing reprints of this article (in either print or HTML format), please visit Reprint Management Services online at www.reprintbuyer.com or contact a representative via e-mail at reprints@tmcnet.com or by phone at 800-290-5460. For information and subscriptions, visit www.TMCnet.com or call 203-852-6800.

By George T. Platt, Intervoice

George Platt is senior vice president for Business Development and Marketing for Intervoice (www.intervoice.com).