
XML helps untangle the Web.

XML stands for eXtensible Markup Language. XML grew out of SGML and HTML, but while HTML allows users to define the presentation of data (what it looks like on a browser screen), XML allows users to define the meaning of data. XML allows document designers to classify discrete pieces of data and store them in central database repositories, essentially turning unstructured documents into structured documents. Major technology vendors such as Microsoft, Sun, and Oracle are supporting XML with new products and technologies designed to take advantage of its ability to structure many document types for content management, data conversion projects, supply chain management, and messaging.

Unlike HTML tags, XML tags are dynamic and user-generated. Document designers can introduce tags right in the document; a document that follows XML's basic syntax rules, such as properly nested and closed tags, is called "well-formed." They can also specify more complex tag definitions in linked documents called DTDs (Document Type Definitions), or in more expressive XML schemas, which describe the structure of data and allow document designers to label distinct data elements. Tags are not predefined, but may be freely defined by users, departments, organizations, and even industries. Car manufacturers, chemists, genealogists, and real-estate brokers are among the industry and professional groups that are actively using common DTDs, which serve as a type of lexicon for related XML documents. XML and its resulting flavors are mechanisms for communities of interest to create their own ontology: a common system codifying the concepts that are meaningful to that community.
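
As a rough illustration of user-defined tags, the Python sketch below parses a small document and checks whether a string is well-formed. The element names (listing, address, price, bedrooms) are invented for this example and are not taken from any real industry DTD. It uses only the standard library's xml.etree.ElementTree, which checks well-formedness but does not validate against a DTD or schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical real-estate listing; the tag names are made up for illustration.
LISTING = """<listing id="42">
  <address>12 Elm Street</address>
  <price currency="USD">185000</price>
  <bedrooms>3</bedrooms>
</listing>"""


def is_well_formed(text):
    """Return True if text parses as XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


root = ET.fromstring(LISTING)
price = root.find("price")
print(root.tag, root.get("id"))           # element name and its attribute
print(price.text, price.get("currency"))  # labeled data, not just display text
print(is_well_formed("<listing><price>1</listing>"))  # mismatched tags
```

The point of the sketch is that the tags say what each value means, so a program can ask for the price directly instead of guessing from page layout.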

XML is not itself a product, but a language that undergirds other applications, databases, and processes in the marketplace. For example, the Aberdeen Group reports that thanks to improvements in enterprise application integration (EAI), global Internet connectivity, and reduced transaction costs via the Web, XML-based collaboration underpins a multi-billion-dollar e-Commerce and B-to-B marketplace.

New Technologies

Mike Hogan of POET Software lists a selection of new XML-enabled technologies, including Internet search engines, e-commerce, objects, EDI, data re-purposing, and content personalization.

Internet Search Engines. Search engines can access contextual information instead of just keywords. For example, when performing a search on "Oracle," the user will not have to contend with multiple pages of questionable psychic links. Instead, the search can be narrowed down to fields tagged "database vendors." There are certainly challenges to wide acceptance: older browsers don't recognize XML pages, each industry will have to set up its own standard structure, and Web site content providers must tag the pages according to standardized structures. In addition, search engine indexing applications will have to hold tag information as metadata and must learn each standard structure in the collection of indexed documents, and search interfaces will have to display the options so structural information is available to searchers.
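
The "Oracle" example can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the index layout, the category tag standing in for an agreed industry structure, and the example.com URLs. It is not how any particular search engine works.

```python
import xml.etree.ElementTree as ET

# Hypothetical index of pages; <category> stands in for an agreed industry tag.
INDEX = """<pages>
  <page url="http://example.com/a">
    <category>database vendors</category>
    <title>Oracle product overview</title>
  </page>
  <page url="http://example.com/b">
    <category>psychic services</category>
    <title>Ask the Oracle</title>
  </page>
</pages>"""


def search(index_xml, keyword, category=None):
    """Return URLs of pages whose title contains keyword.

    When category is given, only pages tagged with that category match.
    """
    hits = []
    for page in ET.fromstring(index_xml).iter("page"):
        title = page.findtext("title", default="")
        if keyword.lower() not in title.lower():
            continue
        if category is not None and page.findtext("category") != category:
            continue
        hits.append(page.get("url"))
    return hits


print(search(INDEX, "oracle"))                               # keyword only
print(search(INDEX, "oracle", category="database vendors"))  # narrowed by tag
```

A keyword-only search returns both pages, while the search narrowed by tag returns only the database vendor.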

Electronic Commerce. E-commerce suffers because of the bewildering variety of online product information, pricing, payment options, and checkout routines. Although the dot-com industry birthed several shopping tools using intelligent agents, they were generally unhelpful because, as Hogan commented, "They have an even harder time than humans in trying to make sense of the digital morass presented by HTML." XML repositories can allow online stores to present product information in a standard, structured format, independent of page design. XML has a particular impact here because numbers, such as prices, are notoriously difficult to interpret without context.
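
A minimal sketch of why structured product data helps a shopping agent: once prices are tagged as prices, comparing offers is a lookup rather than a guess. The catalog layout, store names, and SKU below are hypothetical, and no real product schema is implied.

```python
from decimal import Decimal
import xml.etree.ElementTree as ET

# Hypothetical catalog shared by two stores.
CATALOG = """<catalog>
  <product sku="A100">
    <name>Laser printer</name>
    <offer store="StoreOne"><price currency="USD">299.00</price></offer>
    <offer store="StoreTwo"><price currency="USD">279.50</price></offer>
  </product>
</catalog>"""


def cheapest_offer(catalog_xml, sku):
    """Return (store, price) for the lowest-priced offer of sku, or None."""
    root = ET.fromstring(catalog_xml)
    for product in root.iter("product"):
        if product.get("sku") != sku:
            continue
        offers = [
            (offer.get("store"), Decimal(offer.findtext("price")))
            for offer in product.iter("offer")
        ]
        return min(offers, key=lambda pair: pair[1]) if offers else None
    return None  # unknown SKU


print(cheapest_offer(CATALOG, "A100"))
```

Because the price is a labeled element with a currency attribute, the agent never has to infer from page layout which number on the page is the price.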

Self-describing BLOBs and Distributed Object File Systems. BLOB (not the 1950s horror movie) stands for Binary Large OBject. BLOBs may be any type of file format, including Word or PowerPoint documents, images, CORBA objects, and so on. File systems maintain just small amounts of information about their files, which might include size, author, creation and modification dates, and file location, but very little else. But an ODBMS (object database management system) can use an XML linking mechanism to bind a BLOB with a set of descriptive XML metadata, rendering it readable to the database without the need for conversion. For example, this would allow a CRM system to not only display information about a customer's buying habits, but also intelligently locate and present customer e-mail dealing with a particular issue or product.
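
The idea of binding a BLOB to descriptive XML metadata can be sketched as follows. This is only an illustration under stated assumptions: the descriptor fields (filename, size, customer, topic), the sample e-mails, and the in-memory list standing in for a repository are all hypothetical, and no particular ODBMS or linking mechanism is modeled.

```python
import hashlib
import xml.etree.ElementTree as ET


def describe_blob(data, filename, customer, topic):
    """Build a small XML descriptor for an opaque byte string."""
    meta = ET.Element("blob", {"sha256": hashlib.sha256(data).hexdigest()})
    ET.SubElement(meta, "filename").text = filename
    ET.SubElement(meta, "size").text = str(len(data))
    ET.SubElement(meta, "customer").text = customer
    ET.SubElement(meta, "topic").text = topic
    return meta


def find_blobs(descriptors, customer, topic):
    """Return filenames of blobs whose metadata matches customer and topic."""
    return [
        d.findtext("filename")
        for d in descriptors
        if d.findtext("customer") == customer and d.findtext("topic") == topic
    ]


# Hypothetical repository: the descriptors are searchable, the bytes stay opaque.
store = [
    describe_blob(b"Printer jams daily...", "mail-001.eml", "ACME", "printer"),
    describe_blob(b"Invoice question...", "mail-002.eml", "ACME", "billing"),
]
print(find_blobs(store, "ACME", "printer"))
print(ET.tostring(store[0], encoding="unicode"))
```

The search never opens the binary content; it works entirely from the metadata bound to each BLOB.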

Electronic Data Interchange (EDI). EDI provides standard message formats and element dictionaries that enable businesses to exchange data via any electronic messaging service. XML combined with EDI provides a standard framework to exchange different types of data such as invoices, healthcare claims, and project status. There are multiple exchange possibilities, including transactions, Application Program Interfaces (APIs), web automation, database portals, catalogs, workflow documents, or messages. The recipients can then search, decode, manipulate, and display the resulting information using a variety of applications.

Data Re-purposing. Tom Rhoton, Director of Product Marketing at WhizBang! Labs, believes that re-purposing is XML's greatest contribution. When documents are broken into elements, they lend themselves to very efficient content searches. And by using XML, users can not only extract relevant information much more easily; they can also efficiently plug the information into their own web pages, documents, and presentations. This will allow companies to reuse certain content pieces over and over again in a variety of ways, making content creation much more cost effective.
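
Re-purposing can be shown with one source document feeding two outputs. The article structure (headline, summary, body) and the two output formats below are assumptions chosen for illustration.

```python
import html
import xml.etree.ElementTree as ET

# Hypothetical source document, written once and reused below.
ARTICLE = """<article>
  <headline>Quarterly results</headline>
  <summary>Revenue rose 8% on strong storage sales.</summary>
  <body>Full text of the article...</body>
</article>"""


def as_html_teaser(article_xml):
    """Render the headline and summary as an HTML fragment for a web page."""
    root = ET.fromstring(article_xml)
    headline = html.escape(root.findtext("headline", default=""))
    summary = html.escape(root.findtext("summary", default=""))
    return f"<h2>{headline}</h2><p>{summary}</p>"


def as_slide_bullet(article_xml):
    """Render only the headline as a plain-text bullet for a presentation."""
    root = ET.fromstring(article_xml)
    return "* " + root.findtext("headline", default="")


print(as_html_teaser(ARTICLE))
print(as_slide_bullet(ARTICLE))
```

Each output picks the elements it needs by name, so the same content can be reused without retyping or screen-scraping.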

Content Personalization. Content personalization, which includes intelligent pull, agent accumulation, and push, today depends on some level of human interaction to filter and present information. But XML would enable agents or search engines to automatically filter information and pull out only the desired new material, which the user could then format and deliver into a variety of different document types.

XML Storage Requirements

XML-defined data and related documents are stored in repositories, most commonly relational databases. However, XML's storage requirements are not perfectly suited to that environment and call for special handling. Most of the major relational databases, such as Oracle9i and Microsoft SQL Server 2000, have XML capability. Even so, XML's complex linking features and other capabilities strain the resources of a traditional RDBMS. XML linking alone is extensive and demanding: still a work in progress at the World Wide Web Consortium (W3C), it adds functionality for high-function hypertext and hypermedia.
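
One way to see the mismatch is to store a small XML tree in a relational table, one row per element. The sketch below uses Python's built-in sqlite3 module. The node table and the order document are hypothetical, and this is only one of several possible storage layouts, not how any named database product stores XML.

```python
import sqlite3
import xml.etree.ElementTree as ET

DOC = (
    "<order><customer>ACME</customer>"
    "<items><item>disk</item><item>tape</item></items></order>"
)


def shred(conn, root):
    """Store each element as a row that points at its parent row."""
    conn.execute(
        "CREATE TABLE node ("
        "id INTEGER PRIMARY KEY, parent_id INTEGER, tag TEXT, text TEXT)"
    )

    def insert(elem, parent_id):
        cur = conn.execute(
            "INSERT INTO node (parent_id, tag, text) VALUES (?, ?, ?)",
            (parent_id, elem.tag, (elem.text or "").strip()),
        )
        row_id = cur.lastrowid
        for child in elem:
            insert(child, row_id)

    insert(root, None)


conn = sqlite3.connect(":memory:")
shred(conn, ET.fromstring(DOC))

# Walking one level down the tree already requires a self-join.
rows = conn.execute(
    """
    SELECT c.text
    FROM node AS p
    JOIN node AS c ON c.parent_id = p.id
    WHERE p.tag = 'items' AND c.tag = 'item'
    ORDER BY c.id
    """
).fetchall()
print([r[0] for r in rows])
```

Each additional level of nesting adds another join (or a recursive query), which hints at why deep hierarchies and rich links are awkward in a flat row-and-column model.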

According to Hogan, object databases are ideal to handle XML data since they handle objects in their native forms and can manage XML's hierarchical tree navigation and rich linking. Object databases can also handle arbitrary, variable-length data types and interrelated data, crucial to the various data types linked within structured XML content. In addition to standard XML data storage, the ideal repositories would offer tightly integrated XML-specific tree navigation, version tracking, arbitrary link management, import/export, the ability to publish structured content on the Web, and support for object-oriented as well as scripting programming languages.

XML does present its challenges. Aberdeen warns that XML-based collaboration is not completely developed: it lacks applications that can easily and quickly transform data among the multiplying XML schemas, traditional e-Business formats like electronic data interchange (EDI), and back-office data typically held in relational databases. And while some industry consortiums have successfully developed common XML definitions for inter-company data exchanges, other industries still struggle to adopt XML standards for their business.
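
The transformation problem can be sketched in its simplest form: two partners describe the same purchase order with different tag names. The vocabularies and the mapping table below are hypothetical. Real transformations, often written in XSLT, also have to restructure and convert data rather than just rename tags.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from one partner's tag names to another's.
TAG_MAP = {
    "PurchaseOrder": "order",
    "Buyer": "customer",
    "PartNo": "sku",
    "Qty": "quantity",
}


def translate(elem, tag_map):
    """Return a copy of elem with tags renamed per tag_map.

    Tags that are missing from tag_map are kept unchanged.
    """
    out = ET.Element(tag_map.get(elem.tag, elem.tag), dict(elem.attrib))
    out.text = elem.text
    out.tail = elem.tail
    for child in elem:
        out.append(translate(child, tag_map))
    return out


source = ET.fromstring(
    "<PurchaseOrder><Buyer>ACME</Buyer><PartNo>D-17</PartNo><Qty>4</Qty></PurchaseOrder>"
)
target = translate(source, TAG_MAP)
print(ET.tostring(target, encoding="unicode"))
```

Even this trivial case needs a hand-maintained mapping for each pair of vocabularies, which suggests how the effort grows as schemas multiply.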

Will XML transform the Internet from a massive collection of slippery data into an intelligent transport mechanism? It has the potential. And for the many people who struggle with the ungainly mess we call the Web, it can't happen too soon.
COPYRIGHT 2002 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2002, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Internet
Author:Chudnow, Christine
Publication:Computer Technology Review
Date:Jan 1, 2002
