
Document (re)Presentation: Object-orientation, Visual Language, and XML.


* Argues that document analysis and design can integrate ideas from modern text theory into object-oriented thinking

* Demonstrates how object-orientation and visual language may be used to map text structures onto perceptual object configurations

With the increased use of XML (Extensible Markup Language) and derived markup languages, the production of technical documentation will, to a great extent, become a software engineering enterprise based on the so-called object-oriented paradigm. One of the main advantages of object-orientation is that it provides document designers and tool developers with a framework for addressing conceptual as well as technological issues in document analysis and design in a consistent way based on set-theoretical principles.

However, the premise of this article is that object-orientation alone, despite its useful formal foundation, is not quite sufficient to describe communicative aspects of documents, in particular the functional and rhetorical roles of visual design. Instead, the article makes the case for an approach to document analysis and design that seeks to integrate ideas from modern text theory into object-oriented thinking. By combining object-orientation with theoretical notions about text, it will eventually be possible to create a common formal vocabulary for thinking and talking about digital document architectures and, more narrowly, the relationship between markup, information design, and meaning.

This article focuses on what we may call document (re)presentation, the transformation of marked-up XML files in plain text into customized, formatted user documents (and vice versa). Thus, its main aim is to demonstrate how the combination of object-orientation and Horn's notions of visual language morphology, syntax, semantics, and pragmatics may be used to analyze and describe the mapping of formally expressed text structures onto perceptual object configurations. I hope that by integrating these two perspectives, we will have a more complete set of analytic tools for dealing with digital documentation in its various guises. In addition, a brief attempt is made to raise the question of whether, and to what extent, the coupling of object-orientation and visual language might also be exploited more directly for design purposes in an XML context.


XML is becoming a key element in many organizations' strategies to cope with the increasing demands for updated, flexible, and usable information products. XML and XML technologies should help them

* Ensure consistency in, and across, documents

* Publish information content in multiple media

* Reuse information modules across documents

* Facilitate information retrieval and processing

* Customize information content

Using XML essentially means marking up or tagging document content in a standardized, explicit, and unambiguous fashion. This coded information may then be used for various purposes: it may be queried or filtered; it may be rendered in various forms and modes; and it may be transformed into new information structures. But XML only adds real value to an organization's information assets if it is based on a thorough analysis of what needs the enrichment of data is going to fulfill (processing, presentation, management, and so forth).

Document production with XML may benefit from an object-oriented approach. An object-oriented view of documents is one that defines documents as information systems consisting of objects that form a whole according to some plan or purpose (definitions in this section are taken from Martin and Odell 1998). And an object-oriented approach to the development of such systems entails analysis, design, and construction. Analysis takes as input real-world documents and returns conceptual models of them, while document design is the mapping of conceptual models onto implementation models. Construction is the process of creating working systems using design models.

Figure 1. A procedure marked up in XML (excerpt)

<procedure author="Jones" version="1">
  <title>How to copy text</title>
  <step>
    <instruction>Select the text you would like to copy</instruction>
  </step>
  <step>
    <instruction>Choose <screen.object>Copy</screen.object> from the <screen.object>Edit</screen.object> menu</instruction>

The goal of object-oriented document analysis and design of technical communication is to produce document models that identify, classify, and formally describe

* Documentation objects (What types of objects do we find in various technical genres, and what properties do those object types have?)

* Object relationships and structures (What relationships do we find among objects, and how do we describe the internal structure of those objects?)

* Transformation of objects (How can objects be transformed into new objects, and to what extent may objects be aggregated in new--possibly virtual-- structures?)

* Object behavior (How can document objects respond to external stimuli--for instance, mouse clicks?)

The output of these modeling activities is used in the construction process to generate actual (online) documentation for real users.

As an implementation formalism, XML lends itself to object-oriented document analysis, design, and construction since XML inherently takes an object-oriented view of information. In XML, a document is a hierarchically structured tree of labeled content objects to which further information values may be assigned. Consider, for instance, the simple example shown in Figure 1: a familiar text type in technical communication marked up in XML.

The root of the tree is the object called procedure. This element contains four daughter objects, or elements, namely title, step, step, and tip. The two step elements contain subelements (instruction, result, screen.shot), some of which, in turn, may themselves contain additional objects (screen.object). Start and end tags (<> </>) signal the beginning and the end of a document element. Last, attribute values are attached to the root of the tree to provide useful metainformation about the document as a whole (author and version) and to the screen.shot element to locate the position of a particular graphics file on the network.
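The tree just described can also be inspected programmatically. The following Python sketch, which uses only the standard library, parses a procedure of the shape described above and lists its daughter objects and attributes. The wording of the result and tip elements and the position of result and screen.shot inside the second step are not given in the text, so they are placeholders assumed for illustration only.

```python
import xml.etree.ElementTree as ET

# Assumed completion of the procedure: the result and tip wording, and the
# placement of result and screen.shot in the second step, are hypothetical.
FIGURE_1 = r"""<procedure author="Jones" version="1">
  <title>How to copy text</title>
  <step>
    <instruction>Select the text you would like to copy</instruction>
  </step>
  <step>
    <instruction>Choose <screen.object>Copy</screen.object> from the <screen.object>Edit</screen.object> menu</instruction>
    <result>The text is placed on the clipboard</result>
    <screen.shot file="n:\graphics\copy.gif" />
  </step>
  <tip>You can also press Ctrl+C</tip>
</procedure>"""

root = ET.fromstring(FIGURE_1)


def outline(element, depth=0):
    """Return one indented line per element, mirroring the tree structure."""
    lines = ["  " * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


# Daughter objects of the root, in document order.
daughters = [child.tag for child in root]

# Attribute values attached to the root and to the screen.shot element.
metadata = dict(root.attrib)
graphics_files = [shot.get("file") for shot in root.iter("screen.shot")]

print(root.tag, metadata, daughters, graphics_files)
print("\n".join(outline(root)))
```

The outline function makes the hierarchy visible as indentation, which is one simple way of checking that a document instance has the object structure its author intended.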

Object-oriented document analysis, design, and construction based on XML will normally include the following five phases.

1. Document analysis

Document analysis comprises a study of legacy documentation and often an analysis of future needs for design changes or enhanced functionality. The deliverable of this phase is usually a formal report describing the content and structure of existing publication types, and a list of possible changes to be made in future versions of these documents.

2. Markup model selection or development

The tags that should be available to writers must be identified. Therefore, a markup model or vocabulary must be selected or developed. A markup model, also known as a markup language, is normally defined in a Document Type Definition (DTD), a prescriptive genre model specifying what content objects a certain document class may, or must, consist of; the order in which these elements are permitted to occur; and any additional properties that they might possess. Document objects may be modeled on the basis of

* Their structural function in the text (for example, chapter, section, heading, list item)

* The way they are presented (for example, bold, italics, indentation)

* Their data type (for example, rule, graphics, animation, text)

* Their meaning in the real world (for example, date, author, abstract, procedure, screen.object)

In a generic markup language like XHTML, tags exist to mark up all four kinds of objects. For example:

* Structural: <p> (paragraph), <body>, <div> (division)

* Presentational: <b> (bold), <i> (italics), <br> (break)

* Data-oriented: <img> (image), <hr> (horizontal rule)

* Semantic: <address>, <dt> (definition term)

In XML vocabularies developed for specific purposes or special business areas, more descriptive semantic tags are needed. For instance, as shown above, a markup language for user software documentation would contain tags to mark up information objects such as procedural discourse constituents, keyboard functions, and screen objects.
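The prescriptive role of a markup model in this phase (which objects may or must occur, and in what order) can be illustrated with a small Python sketch. It is a stand-in for a DTD, not DTD syntax: the CONTENT_MODEL table and the conforms function are hypothetical names, and the occurrence constraints are assumptions chosen to fit the procedure example rather than rules stated in the article.

```python
import xml.etree.ElementTree as ET

# Hypothetical content model: element name -> ordered list of
# (child name, minimum occurrences, maximum occurrences or None for unbounded).
CONTENT_MODEL = {
    "procedure": [("title", 1, 1), ("step", 1, None), ("tip", 0, 1)],
    "step": [("instruction", 1, 1), ("result", 0, 1), ("screen.shot", 0, 1)],
}


def conforms(element):
    """Check child order and occurrence counts against CONTENT_MODEL, recursively."""
    model = CONTENT_MODEL.get(element.tag)
    children = list(element)
    if model is not None:
        position = 0
        for name, minimum, maximum in model:
            count = 0
            # Consume the run of consecutive children with the expected name.
            while position < len(children) and children[position].tag == name:
                count += 1
                position += 1
            if count < minimum or (maximum is not None and count > maximum):
                return False
        if position != len(children):
            return False  # an unexpected or misordered element is left over
    # Elements without an entry in the model are left unconstrained.
    return all(conforms(child) for child in children)


valid = ET.fromstring(
    "<procedure><title>T</title><step><instruction>Do</instruction></step></procedure>"
)
misordered = ET.fromstring(
    "<procedure><step><instruction>Do</instruction></step><title>T</title></procedure>"
)
print(conforms(valid), conforms(misordered))
```

A real DTD would also declare attributes such as author and version; the sketch leaves them out to keep the ordering logic in view.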

3. Documentation writing and markup

This phase involves the construction of so-called document instances, marked-up XML documents in plain text, using the selected markup language. The markup model here functions as a kind of template indicating to authors what tags are available for insertion in the text they are working on. Document instances are also known as abstractions because they contain only structured data and usually no information about how the data will be rendered. This separation of content and presentation is very much at the heart of XML-based publishing.

4. Style sheet design

Styling--more generally, information design--is best seen as an activity separate from writing and one that is accomplished through the use of style sheets. In its simplest form, a style sheet is a set of rules that assigns styles--layout and typography--to the components of an XML document, but much more sophisticated forms of styling are possible. Style sheets may totally reorder the content elements of an XML document; they may insert additional text, graphics, or even multimedia objects into a document; and they may attach programs (scripts) to document objects to make them interactive.

Currently, two types of style sheets are supported by XML: Cascading Style Sheets (CSS) and style sheets constructed in XSLT (Extensible Stylesheet Language Transformations). While CSS style sheets are said to merely decorate XML files, XSLT style sheets transform XML document structures into new information structures in XML or other formats, such as HTML or PDF. Therefore, XSLT style sheets are sometimes referred to as transformation sheets (see Kay 2000).

5. Rendition of abstractions

Finally, abstractions are linked to style sheets to create renditions, documents designed to be viewed, read, or listened to by real users. Since style sheets themselves are reusable documents, multiple document instances may share the same style sheet, and one document instance may be linked to different style sheets, which facilitates customized information dissemination.
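The idea that one abstraction may be linked to several style sheets can be sketched in a few lines of Python. Here a "style sheet" is reduced to its simplest form, a table of rules assigning an HTML tag and some inline CSS to element names. SCREEN_STYLE, PRINT_STYLE, and render are hypothetical names, and the style values are arbitrary; the sketch illustrates the principle and is not a model of CSS or XSLT processing.

```python
import xml.etree.ElementTree as ET
from html import escape

INSTANCE = (
    "<procedure><title>How to copy text</title>"
    "<step><instruction>Select the text you would like to copy</instruction></step>"
    "</procedure>"
)

# Two hypothetical style sheets: element name -> (HTML tag, inline CSS).
SCREEN_STYLE = {
    "title": ("h1", "font-family: sans-serif"),
    "instruction": ("p", "margin-left: 2em"),
}
PRINT_STYLE = {
    "title": ("h2", "font-family: serif"),
    "instruction": ("li", "font-size: 10pt"),
}


def render(element, stylesheet):
    """Depth-first walk that emits styled HTML for every element that has a rule."""
    inner = escape(element.text or "") + "".join(
        render(child, stylesheet) + escape(child.tail or "") for child in element
    )
    if element.tag not in stylesheet:
        return inner  # no rule for this element: pass its content through
    tag, css = stylesheet[element.tag]
    return f'<{tag} style="{css}">{inner}</{tag}>'


abstraction = ET.fromstring(INSTANCE)
screen_rendition = render(abstraction, SCREEN_STYLE)
print_rendition = render(abstraction, PRINT_STYLE)
print(screen_rendition)
print(print_rendition)
```

Because the rules live outside the document instance, either table could equally be applied to any other instance of the same document class.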


In object-oriented terminology, the transformation of abstractions into renditions can be defined as the mapping of one object structure onto another. More precisely, document (re)presentation involves mappings from formally specified object configurations to perceptual object configurations. In fact, XSLT style sheets do not directly map a formal object structure onto a perceptual object configuration but rather from one formal object structure to another, which is then rendered as a perceptual entity.

In such mappings,

* Objects may be retained, deleted, or generated.

* Objects of one data type may be transformed into another (for example, text to image).

* Object relations may be preserved or changed.

* Objects are eventually formatted.

The relation between document representations and presentations is a many-to-many relation: one abstraction may be rendered in a number of ways, and one rendition may realize many underlying markup structures.
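The mapping operations listed above can be made concrete with a small Python sketch that turns one step of a procedure into an XHTML-like structure. The function name to_rendition, the choice of div, p, b, and img as target objects, and the wording of the result element are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# One step of a procedure; the result wording is a hypothetical placeholder.
ABSTRACTION = (
    r"<step><instruction>Choose <screen.object>Copy</screen.object> from the "
    r"<screen.object>Edit</screen.object> menu</instruction>"
    r"<result>The text is placed on the clipboard</result>"
    r'<screen.shot file="n:\graphics\copy.gif" /></step>'
)


def to_rendition(step):
    """Map a step element onto an XHTML-like div, one operation per branch."""
    div = ET.Element("div")
    for child in step:
        if child.tag == "instruction":
            # Retained: the text is kept; nested screen.object markup is flattened.
            ET.SubElement(div, "p").text = "".join(child.itertext())
        elif child.tag == "result":
            # Generated: a bold label derived from the tag name is added.
            paragraph = ET.SubElement(div, "p")
            label = ET.SubElement(paragraph, "b")
            label.text = "Result:"
            label.tail = " " + (child.text or "")
        elif child.tag == "screen.shot":
            # Transformed: an empty element with a file attribute becomes an image.
            ET.SubElement(div, "img", src=child.get("file"))
        # Any other element type falls through and is left unrealized.
    return div


rendition = to_rendition(ET.fromstring(ABSTRACTION))
print(ET.tostring(rendition, encoding="unicode"))
```

Note that the output is itself a formal object structure, which only becomes a perceptual entity once a browser or similar tool displays it.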

Object mapping raises not only the fundamental question of how logical object structures relate to perceptual object structures but also the more practical issue of how we go about writing rules (style sheets) that appropriately map underlying, formally expressed text structures onto visual designs, and vice versa. For instance, how do we ensure that the style sheets we produce result in document designs that are not only valid visual manifestations of the data structures they are meant to reflect but also rhetorically appropriate in the communication contexts for which they are intended? Such issues cannot be addressed with reference to object-oriented concepts alone. What is also needed is a theoretical framework that provides insights into the perceptual and rhetorical dimensions of document design--and in particular, the interaction of verbal and visual elements.


One such framework might be Horn's theory of visual language (Horn 1998). Visual language is the integration of words, images, and shapes into unified functional communication units. Visual language, of course, has a long history, but, to be sure, its significance as a communication vehicle has become enormous in today's world of complexity and rapid change.

In his analysis of the properties of visual language, Horn draws on well-established categories in linguistics, namely morphology, syntax, semantics, and pragmatics.

Expanding the scope of these notions, Horn demonstrates how visual language, like ordinary language, involves a set of basic elements (morphology), rules for combining these elements (syntax), the meaning of combined elements (semantics) and the actual use of elements for specific communicative and rhetorical purposes (pragmatics). And the ability to "speak visual language" presupposes knowledge of all four components.

As far as morphology is concerned, Horn operates with a typology of visual language primitives. The typology contains three major classes of morphological elements:

* Words (linguistic units which can be combined into phrases, sentences, and blocks of text)

* Shapes (abstract gestalts like points, lines, forms, and sometimes spaces)

* Images (objects that resemble entities in the natural world)

These elements have various visible properties such as color, size, thickness, texture, orientation, and transparency.

Visual language syntax defines the constraints on combinations of words, shapes, and images in two- or three-dimensional space. In exploring how verbal and visual elements may be arranged in spatial patterns, Horn makes extensive use of insights from gestalt psychology, one of the most influential sources in the 20th century for studying principles of human perception. In particular, Horn draws on the gestalt principles of

* Figure-ground segregation (Visual objects are always perceived against a background.)

* Proximity (Visual objects closest together are perceived as a group.)

* Similarity (We tend to group together visual objects that are similar in terms of form, size, color, and so forth.)

* Common region (Visual objects enclosed by a line are perceived as a group.)

* Connectedness (Visual objects connected by a line are perceived as a group.)

* Good continuation (We tend to group together visual objects appearing to be direct continuations of each other.)

* Closure (We tend to organize visual objects into closed structures rather than open ones.)

Another important component of visual language syntax is the construct of visual topologies--recurring, recognizable spatial patterns of morphological elements. Visual topologies include matrix structures and networks, and are usually manifested as diagrams of different kinds (tables, tree structures, flowcharts, Euler diagrams, and so forth).

Visual language semantics is about the meaning of composite visual language elements: What meanings emerge when words, shapes, and images are aggregated in specific spatial patterns, and what semantic functions do individual elements serve in such clusters? Visual language semantics covers both the semantics of what we may call content communication units and meanings arising from element clusters whose primary purpose is organizational or navigational.

Visual language pragmatics denotes the application of visual language to solve communication problems in different contexts and areas, that is, document design practices. Here the focus is on the question of who uses visual language, and for what purposes, and on how the interaction of verbal and visual elements contributes to communication effectiveness in specific situations.


How can we use the combination of object-oriented and visual language concepts to analyze and describe particular instances of abstraction-to-rendition mappings? Consider once again the procedure in XML presented in Figure 1.

This procedure might be rendered in many ways, including those shown in Figure 2, Figure 3, Figure 4, and Figure 5.

Figure 2 is a fairly straightforward rendering of the underlying abstraction. All element content has been extracted and output in the rendition in the same order as in the underlying structure. In the case of the result and tip elements, the tag names have been incorporated to indicate the communicative purpose of the object.

All conversions are text-to-text except one, namely the mapping of the empty element <screen.shot file="n:\graphics\copy.gif" /> to an image (in fact, a mapping from one morphology type to another). From a visual language semantics point of view, the image expands the meaning of the preceding text object by visualizing the entities that it refers to (the copy command and its position in the menu system).

Object relations in the text are primarily signaled through the use of the gestalt principle of similarity. Notably, identical formatting is applied to objects of the same category.

In Figure 3, the conversion has resulted in several object transformations.

1. The screen.shot element has been deleted, or left unrealized if you like. This figure demonstrates that some objects are obligatory--that is, they must always be realized--while others are optional--that is, they need not be mapped. Thus, using terminology from another text-theoretical framework, namely rhetorical structure theory (see Mann and Thompson 1988), we may distinguish between nuclear and satellite rendering objects.

2. A new organizational structure has been imposed. In Figure 1, the result and tip objects are not on the same hierarchical level: Result is the daughter of a step element, while tip is a sister of that step element. In Figure 3, however, the result and tip objects are aggregated into one perceptual object. This is done by invoking the principles of similarity and proximity: the two objects are designed similarly and placed close to one another.

3. The order of the tip and result elements has been changed. In terms of semantic relations, this does not seem illogical. Because the tip element really signifies an alternative to the instruction, there is no reason why the two should be placed apart. (Whether or not the design actually has an effect on the communication quality or usability of the procedure is a matter that must be empirically tested.)
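The three transformations just described can be sketched as a restructuring of the object tree before any formatting is applied. In the Python sketch below, figure_3_structure is a hypothetical function, the result and tip wording is a placeholder, and the exact layout of Figure 3 is not reproduced; only the deletion, aggregation, and reordering are modeled.

```python
import xml.etree.ElementTree as ET

# Assumed completion of the procedure; result and tip wording is hypothetical.
FIGURE_1 = r"""<procedure author="Jones" version="1">
  <title>How to copy text</title>
  <step>
    <instruction>Select the text you would like to copy</instruction>
  </step>
  <step>
    <instruction>Choose <screen.object>Copy</screen.object> from the <screen.object>Edit</screen.object> menu</instruction>
    <result>The text is placed on the clipboard</result>
    <screen.shot file="n:\graphics\copy.gif" />
  </step>
  <tip>You can also press Ctrl+C</tip>
</procedure>"""


def figure_3_structure(procedure):
    """Return (kind, content) pairs in rendition order; a group holds a list of texts."""
    blocks = [("title", procedure.findtext("title"))]
    tip = procedure.findtext("tip")
    for step in procedure.iter("step"):
        instruction = step.find("instruction")
        blocks.append(("instruction", "".join(instruction.itertext())))
        result = step.findtext("result")
        if result is not None:
            # Aggregation and reordering: the tip (a sister of the step) is pulled
            # next to the result (a daughter of the step) and placed before it.
            group = [text for text in (tip, result) if text is not None]
            blocks.append(("group", group))
        # screen.shot is never read here, so it stays unrealized.
    return blocks


blocks = figure_3_structure(ET.fromstring(FIGURE_1))
for kind, content in blocks:
    print(kind, content)
```

Whether the group is then signaled by similarity, proximity, or some other principle is a separate formatting decision that a style sheet would make on top of this structure.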

In Figure 4, the procedure is mapped onto a table. This conversion also entails a change in perceived organizational structure, this time explainable by the gestalt principle of common region. Enclosed in the same table cell, the last three elements are perceived as one unit, a relationship not found in the abstract representation of the procedure in Figure 1.

Finally, in Figure 5, the tabular approach is used once again. Although the rendition in Figure 5 contains almost the same information as that in Figure 4 and manifests the same kind of visual topology, it does not seem as communicatively or rhetorically appropriate. There are two reasons why. First, the typeface chosen is not suitable for a functional text type such as a procedure; it is not readable enough, and it has the wrong kind of genre connotations. Second, we are not used to having an essentially linear type of discourse presented in a table that also guides our reading in a horizontal direction. What we have here could be called an example of a violation of visual pragmatics conventions.


I have argued above that the combination of object-orientation and visual language provides a set of analytic tools for describing the relationship between XML documents and their realizations. The question is, however, whether, or to what extent, the integration of these perspectives might also be exploited more directly for design purposes in a document production paradigm based on XML. This issue is briefly, and somewhat tentatively, touched on in this closing section.

It may be said that rendering XML documents is tantamount to applying visual language to logical data structures through style sheets. The idea can be illustrated by Figure 6.

In this model the markup language specified in the DTD is the basic component: it defines the text objects--or markup constructs--of a given class of documents and their potential combinations. The markup language functions as a model or template for the generation of abstractions.

An abstraction is an explicit, hierarchically organized configuration of labeled text objects that can be transformed into a rendition, a configuration of perceptual (verbal, visual, or aural) objects. However, the transformation need not be direct. An abstraction can be mapped onto a representation of a rendition, a coded version of the rendition. This is the case when, for example, an XML document is converted to XHTML to accommodate existing browser technology; when an XML document is transformed into VOXML, an XML-based language for representing aural renditions; or when the XML vector graphics format SVG is used to (re)present visual renderings of underlying data structures.

The great advantage of having this extra layer is that it enables us to represent renditions in exactly the same way as we represent abstractions: as XML objects. Morphological elements like text blocks, graphics, and pictures, as well as visual topologies such as tables and diagrams, can be represented as XML element structures and their visual properties as attributes. Working with the same formal notation means that we can be not only much more precise in the way we define and control rendering, but also much more flexible in the way that we embed logical structures in perceptual structures. For instance, in mapping from one XML vocabulary to another, we might choose to realize the content of an abstraction object as an integrated part of an image rather than as a self-contained, formatted block of text, or present a certain set of data in an interactive chart rather than in a static tabular arrangement.
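As a rough illustration of a rendition represented as XML objects, the following Python sketch builds a small SVG structure in which each text block is an element and its visual properties (position, type size, weight) are attributes. The function name represent_rendition and all coordinate and size values are assumptions chosen for the example.

```python
import xml.etree.ElementTree as ET


def represent_rendition(title, instructions):
    """Build a coded rendition: text blocks as elements, visual properties as attributes."""
    svg = ET.Element(
        "svg",
        xmlns="http://www.w3.org/2000/svg",
        width="400",
        height=str(40 + 30 * len(instructions)),
    )
    heading = ET.SubElement(
        svg, "text", {"x": "10", "y": "25", "font-size": "18", "font-weight": "bold"}
    )
    heading.text = title
    for number, instruction in enumerate(instructions, start=1):
        # Each instruction sits 30 units below the previous text block.
        line = ET.SubElement(
            svg, "text", {"x": "10", "y": str(25 + 30 * number), "font-size": "12"}
        )
        line.text = f"{number}. {instruction}"
    return svg


coded = represent_rendition(
    "How to copy text", ["Select the text you would like to copy"]
)
print(ET.tostring(coded, encoding="unicode"))
```

Since the rendition is now an element tree like any other, its spacing and typography can be queried or rewritten with the same tools that process the abstraction.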

Mapping is done through style sheets. Style sheets constitute functions that take abstractions as input and deliver renditions, or representations of renditions, as output. As implementation models, style sheets are subject to and constrained by visual language principles and conventions. Current style sheets, however, reflect visual language only in a very indirect and implicit way. A typical style sheet simply applies low-level properties such as typeface, type size, background color, position, and so forth to an object structure, and does not, as such, convey the rationale of the specific design choices it contains.

It might be argued that a potentially more interesting and challenging application of style sheets would be to use them to capture and formalize the principles and conventions of visual language morphology, syntax, semantics, and pragmatics in rule-based form, and to make that knowledge available and operational in document design--that is, style sheets as design knowledge bases.

For instance, in addition to specifying how certain objects and object types should be formatted and spatially arranged, style sheets might also contain more general rules that explicitly take into account factors such as the information content of the objects, the contexts in which they occur, their communicative purpose, the genre conventions of the document class to which they belong, and the media in which they are to be published.

Applying visual language to object types in procedural discourse, for example, might generate rules such as these:

* A visual pragmatics rule could define permitted visual topologies for various (sub)types of procedure.

* A morphological rule might specify whether results of user actions should be realized as text strings, screen shots, or a combination of both depending on who the audience is or which type of output device the procedure is displayed on.

* A generic visual syntax rule might invoke the gestalt principle of proximity to consistently signal a visual distinction between an object denoting the result of an individual user instruction in a procedure and an object signifying the result of an entire procedure:


OBJECT = "Result"

This rule would ensure that no matter how much leading or white space is generated above the result of an instruction, there must always be twice that amount above the result of a procedure. The rule would also ensure spatial consistency across morphological realizations. That is to say, text strings (verbal descriptions) and images (screen shots) manifesting the same kind of underlying object type would always be treated the same way in terms of assigned white space above.
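The behavior described for this rule can be approximated in Python. The sketch below is not the rule notation itself; it assumes, purely for illustration, that the result of an entire procedure is marked up as a result element directly under procedure, and that the base amount of white space is passed in as a number. The name space_above_results is hypothetical.

```python
import xml.etree.ElementTree as ET


def space_above_results(procedure, instruction_result_space):
    """Return (parent tag, space above) for each result element in document order."""
    spacing = []
    for parent in procedure.iter():
        for child in parent:
            if child.tag != "result":
                continue
            # Proximity rule: a procedure-level result gets twice the white space
            # assigned above a step-level result, whatever the base amount is.
            factor = 2 if parent.tag == "procedure" else 1
            spacing.append((parent.tag, factor * instruction_result_space))
    return spacing


# Hypothetical instance with one step-level and one procedure-level result.
sample = ET.fromstring(
    "<procedure><step><instruction>Do</instruction><result>Done</result></step>"
    "<result>All done</result></procedure>"
)
spacing = space_above_results(sample, 6)
print(spacing)
```

Because the rule is keyed to the underlying object type and its parent rather than to the realization, the same spacing would apply whether a result is later rendered as a text string or as a screen shot.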

But do we really need "smart" style sheets like this? Well, there is little doubt that much publishing in the future will be of a semi-automatic nature: Vast quantities of document data will reside in databases and will be published on the fly in a growing number of media (paper, Web, handheld devices) and for many different communicative purposes. One problem with this scenario, it would seem, is that rhetorically appropriate presentation formats cannot realistically be predefined for the whole range of possible outputs. Instead, more intelligent document processing is called for. And intelligent document processing in turn requires access to explicit, formalized knowledge about documents, their structure and content, and the way users interact with them. Style sheets coupling object-orientation and visual language through XML might prove to be a powerful tool for capturing part of that knowledge in a theoretically sound way.


If we are to create genuinely flexible and user-friendly information products, it is crucial that equal importance is attached to the technological, perceptual, and rhetorical aspects of document design. As has been suggested above, one way of making progress in this area might be to exploit the synergy of object-oriented thinking, modern text theory, visual language, and XML. If we can successfully integrate these elements at various levels, we may be able to take the first step toward making document design a truly knowledge-based engineering enterprise.


I am grateful for the comments and suggestions of Jonathan Price, guest editor for this special issue, and of the two anonymous reviewers of this article.

LARS JOHNSEN is an associate professor at the University of Southern Denmark as well as an external lecturer at the Copenhagen Business School. His background is in modern languages and linguistics, but for the past 5 years, his work has mainly been within the fields of technical communication and information technology. He teaches courses and does research in document analysis and design, and is particularly interested in the integration of formal and text-theoretical approaches to information design.


Horn, R. E. 1998. Visual language: Global communication for the 21st century. Bainbridge Island, WA: MacroVU, Inc.

Kay, M. 2000. XSLT programmer's reference. Chicago, IL: Wrox Press Ltd.

Mann, W. C., and S. A. Thompson. 1988. "Rhetorical structure theory: Towards a functional theory of text organization." Text 8, no. 3:234-281.

Martin, J., and J. J. Odell. 1998. Object-oriented methods: A foundation. Upper Saddle River, NJ: Prentice Hall, PTR.


Boumphrey, F., C. Greer, D. Raggett, J. Raggett, S. Schnitzenbaumer, and T. Wugofski. 2000. Beginning XHTML. Chicago, IL: Wrox Press Ltd.

Campbell, K. S. 1995. Coherence, continuity, and cohesion. Hillsdale, NJ: Lawrence Erlbaum Associates, Inc.

Goldfarb, C. F., and P. Prescod. 2000. The XML handbook. Upper Saddle River, NJ: Prentice Hall, PTR.

Johnsen, L. 1999. "Designing perceptually cohesive text objects in technical communication." In Document design: Linking writers' goals to readers' needs. Proceedings of the First International Conference on Document Design, ed. A. Maes, H. Hoeken, L. Noordman, and W. Spooren. Tilburg, Netherlands, 17-18 December, pp. 79-87.

Price, J. 1997. "Introduction: Special issue on structuring complex information for electronic publication." IEEE transactions on professional communication 40, no. 2:69-77.

Schriver, K. A. 1997. Dynamics in document design. New York, NY: John Wiley & Sons, Inc.
COPYRIGHT 2001 Society for Technical Communication
Publication: Technical Communication
Date: Feb 1, 2001