The need for interoperability arises in many contexts. Generally, the desire to combine programs written in different languages springs from the availability of specific capabilities in some particular language, processor or existing program. For example, the number-crunching power of a vector processor and the availability of a particular numerical analysis routine in Fortran might entice a programmer to attempt interoperation of a Lisp program running on a workstation and Fortran code running on the vector processor.
A central issue in supporting interoperability is achieving type compatibility so that entities, such as data objects or procedures, used in one program can be shared by another program that may be written in a different language or running on a different kind of processor. While most previous approaches to interoperability have provided support for type compatibility at the representation level, we are pursuing an approach that will support compatibility defined at the type specification level. Representation-level interoperability (RLI) defines type compatibility in terms of the structure (representation) of objects and provides a means for overcoming differences in the ways that different programming languages or machines implement simple types. Thus, RLI hides such differences as byte orders, floating-point precisions or array accessing mechanisms. Specification-level interoperability (SLI) extends RLI by defining type compatibility in terms of the properties (specification) of objects and hiding representation differences for abstract types as well as simple types. For instance, where RLI would hide the byte orders of array elements used to represent a stack object, SLI would hide the fact that the stack was represented as an array. Hence with SLI, representation of the stack as an array or as a linked list or both is made irrelevant to the interoperability of the programs sharing the stack.
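The stack example can be made concrete with a small sketch. It is written in Python purely for illustration, and every name in it (Stack, ArrayStack, LinkedStack, drain) is hypothetical rather than part of any system described in this article. The client function depends only on the specified operations, so either representation can be substituted without changing it.

```python
from typing import Any, List, Optional, Protocol


class Stack(Protocol):
    """Specification-level view: the operations only, no representation."""

    def push(self, item: Any) -> None: ...
    def pop(self) -> Any: ...
    def is_empty(self) -> bool: ...


class ArrayStack:
    """Stack represented as an array (a Python list)."""

    def __init__(self) -> None:
        self._items: List[Any] = []

    def push(self, item: Any) -> None:
        self._items.append(item)

    def pop(self) -> Any:
        if not self._items:
            raise IndexError("pop from empty stack")
        return self._items.pop()

    def is_empty(self) -> bool:
        return not self._items


class _Node:
    """One cell of the linked representation."""

    def __init__(self, value: Any, below: "Optional[_Node]") -> None:
        self.value = value
        self.below = below


class LinkedStack:
    """Stack represented as a singly linked list of nodes."""

    def __init__(self) -> None:
        self._top: Optional[_Node] = None

    def push(self, item: Any) -> None:
        self._top = _Node(item, self._top)

    def pop(self) -> Any:
        if self._top is None:
            raise IndexError("pop from empty stack")
        node = self._top
        self._top = node.below
        return node.value

    def is_empty(self) -> bool:
        return self._top is None


def drain(stack: Stack) -> List[Any]:
    """Client code: uses only the specified operations."""
    items = []
    while not stack.is_empty():
        items.append(stack.pop())
    return items
```

Under an RLI approach, by contrast, a client such as drain would have to know which of the two layouts it had been handed.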
Thus, SLI extends RLI in ways that offer several significant advantages. Most importantly, it facilitates integration of interoperating programs by allowing them to communicate directly in terms of higher-level, more abstract types. It also increases the degree of information hiding, thereby reducing the extent to which interoperating programs depend on low-level details of each other's data representations. Moreover, it prevents the misuse of shared data objects (i.e., violations of the intended abstractions), which could easily occur under the RLI approach. SLI allows much greater flexibility in implementation approaches and hence more opportunities for optimization. Finally, SLI increases the range of languages and types that can participate in interoperation.
In this article we present the SLI approach, give a general model of support for SLI and describe our prototype realization of the SLI approach. We begin by discussing our motivating example, namely interoperability in software environments. We then describe the representation-level and specification-level approaches and related work in this area. Next, we present our general model of support for SLI, a description of our initial prototype, and a report of some actual experiences with SLI and our prototype. We conclude with a discussion of future directions for work on SLI.
Our work on support for interoperability originated as part of our research on object management for next-generation software development environments [9, 34]. This research is being done as part of the Arcadia project, a collaborative software environment research program encompassing groups at several universities and industrial organizations. The objective of Arcadia is to develop advanced software environment technology and to demonstrate this technology through prototype environments.
Three important goals of next-generation software development environments, such as those envisioned by Arcadia, are extensibility, integration and broad scope. In particular, Arcadia environments are intended to be extensible in order to support experimental investigation of software process models and evaluation of novel tools in the context of a complete environment. At the same time, Arcadia environments must remain integrated, both externally, to aid users of the extended functionality, and internally, to facilitate tool cooperation, environment maintenance and further extension. Arcadia environments are also intended to be broad in scope (i.e., to support a wide variety of development activities, not merely monolingual program development and execution), and therefore to include many different kinds of tools and objects.
These goals require that Arcadia environments facilitate the addition, modification and replacement of any and all kinds of environment components, including tools, management data or process descriptions. This, in turn, leads us directly to the need for interoperability support as an important component of the Arcadia object management infrastructure.
For example, building prototype components for Arcadia has frequently led to a need for interoperation between programs written in different languages. Most often this has arisen when a tool written in Ada has needed the capabilities available in a utility written in C--such as a window manager or an object storage manager. The situation illustrated by a prototype version of the CEDL Constrained Expression Toolset, which implements a technique for analyzing behavioral properties of concurrent software systems, is slightly more complicated. The prototype toolset, as shown in Figure 1, includes a deriver, written in Ada, that produces constrained expression representations of concurrent system behavior from system descriptions in an Ada-like design language called CEDL. The constrained expressions that it produces are then operated upon by either an inequality generator or a behavior generator, both of which are written in Lisp. The inequalities generated by the former are input to an integer programming package, which is written in Fortran.
We expect that such multilingual interoperating sets of tools will be very common in the coming generation of integrated, extensible, broad-scope environments like those envisioned by Arcadia. The need to import existing tools or utilities will be an especially significant reason for supporting this kind of interoperability.
Interoperation between programs running on different processors in a distributed computing system is also likely to be a necessity in the next generation of environments. Like other complex, multifaceted software, environments will need to exploit the specialized capabilities of powerful numerical processors, graphics stations, storage servers or other such hardware in a distributed system. In building prototype components for Arcadia we have already encountered several situations where interoperation across multiple machines in a distributed system was necessary. For example, the prototype Constrained Expression Toolset cannot currently be run on any single processor in our computing system; this is because the Ada system needed for running the deriver and the Lisp system needed for running the inequality generator and behavior generator do not coexist on any of our workstations. In the future we anticipate even more opportunities for interoperation across different processors to arise within Arcadia. For instance, future versions of the Constrained Expression Toolset might use specialized numerical processors for running the integer programming component.
Two Classes of Interoperability
Any approach to supporting interoperability is based on achieving compatibility between interoperating components. There are two broad classes of concerns:
* execution model (control) issues: How is the execution of the interoperating programs coordinated?
* type model (data) issues: How are correspondences established among the ways interoperating programs manipulate a given shared entity?
Execution model issues include simple incompatibilities, such as the difference between using functions or procedures as the primary, or sole, construct for subprograms. More serious execution model issues arise in trying to interoperate sequential and concurrent programs or programs based on different concurrent communication constructs (e.g., synchronous vs. asynchronous or symmetric vs. asymmetric communication primitives). Execution model issues become most problematic if the interoperating programs are based on very different underlying execution models, such as a dataflow model and a logic programming model.
Recently, a variety of interesting approaches to these execution model issues have begun to be explored. For instance, the selective-broadcast communication mechanism of Field  is expressly aimed at addressing execution model aspects of integration in programming environments with its simple model of program interaction based on message-passing semantics. Similarly, the software bus or toolbus mechanism in Polylith  offers an encapsulation of interprocess communication protocols that is intended to simplify interconnection of components in multilingual software systems.
While execution model issues pose some potentially challenging interoperability problems, our work to date has focused on type model issues. We have implicitly relied upon the adequacy of the ubiquitous procedure call as a fundamental building block from which most sequential or concurrent control constructs can be synthesized. This seems reasonable, since most existing approaches to support for interoperability have been based on the use of the remote procedure call (RPC)  for coordinating the execution of the interoperating programs. Such mechanisms are in common use today (e.g., [5, 22, 29, 33]). Of course, RPC mechanisms involve some type model issues as well.
Existing Approaches to Type
Interoperability depends fundamentally on determining and achieving type compatibility. That is, when two interoperating components are sharing or communicating via some data object, they must have consistent views of whatever properties they mutually rely upon that are associated with objects of that type. (2) Note, however, that the need for compatible type definitions does not necessarily imply a requirement for identical type definitions. In the Constrained Expression Toolset example, for instance, the deriver, inequality generator and behavior generator must all have compatible views of the type of the constrained expression objects they share. Their definitions of that type need not be identical, however, but simply sufficiently similar to allow them to communicate correctly and unambiguously.
There are a variety of existing approaches to the type model aspects of interoperability. Most of these approaches have been based on establishing compatibility (and usually identity) of data types at the representation level. As indicated earlier, we believe there are significant advantages to addressing these issues at the type specification level rather than only at the type representation level. In the remainder of this section, we survey existing approaches to type model aspects of interoperability.
Single Type Model
An obvious approach to type compatibility in interoperating programs is to impose a single type model on all the languages in which interoperating programs are to be written. Since the type definitions of all entities to be shared by all interoperating programs are then necessarily directly comparable, establishing the necessary type compatibilities (e.g., by insisting on identical type definitions) is straightforward. Of course, such an approach is tantamount to imposing a single programming language on the implementers of all potentially interoperating programs. To apply it to our Constrained Expression Toolset example, we might recode all of the toolset's components in a single language. Once the whole toolset was written in Ada (or Lisp, or Fortran, or any other language), it would be trivial to establish type compatibility of the shared data objects.
This approach has generally been proposed in the context of multimachine interoperability. For example, Herlihy and Liskov  have taken this approach using the CLU language, though they suggest that it can be used with any language that supports abstract data types. Similarly, Emerald  is a language that supports multiple representations of objects across different machines, but again the assumption is that all interoperating programs are written in the same language, namely Emerald. While this approach clearly solves the type compatibility problem, it obviously does not address our goal of interoperability in multilingual systems, such as next-generation software development environments.
Single Universal Representation
Perhaps the most widespread, traditional approach to type model aspects of interoperability is based on explicitly translating shared objects to and from a single universal representation, such as characters or bytes. The earliest form of this approach involved interoperation through ASCII representation of data--where the data were communicated between the interoperating programs via files. This required the programs themselves to translate the data either into or out of the ASCII representation. Of course, languages often provided some automated support for this processing (e.g., the Fortran format statement).
The Unix[TM] operating system supports interoperability via pipes (untyped byte streams) through which two interoperating programs can communicate. The byte stream can encode any type of data. So, as long as the interoperating programs agree on how to interpret the bytes, they can share data of any type. Again, this requires the programs at either end of the pipe to translate a shared object from its actual type into the byte stream representation and back again. For instance, in our Constrained Expression Toolset example, the deriver might encode a constrained expression as a sequence of characters such as: (*(V E1, E2, E3)) which would then be decoded by the inequality generator or the behavior generator.
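A minimal sketch of this style of interoperation follows, assuming a POSIX-like system and a deliberately simplified character encoding. The encoding used here (the letter V followed by a parenthesized, comma-separated list of event names) is a hypothetical convention chosen for illustration. It is not the constrained expression notation itself, and all function names are assumptions.

```python
import os
from typing import List


def encode_alternation(names: List[str]) -> bytes:
    """Sender side: translate a list of event names into agreed-upon bytes."""
    return ("V(" + ",".join(names) + ")").encode("ascii")


def decode_alternation(data: bytes) -> List[str]:
    """Receiver side: translate the bytes back, using the same convention."""
    text = data.decode("ascii")
    if not (text.startswith("V(") and text.endswith(")")):
        raise ValueError("byte stream does not follow the agreed encoding")
    body = text[2:-1]
    return body.split(",") if body else []


def send_through_pipe(payload: bytes) -> bytes:
    """Push a small payload through an untyped pipe and read it back.

    Both ends live in one process here only to keep the sketch runnable;
    the payload must be small enough to fit in the pipe buffer.
    """
    read_fd, write_fd = os.pipe()
    try:
        os.write(write_fd, payload)
    finally:
        os.close(write_fd)  # closing the write end lets the reader see EOF
    chunks = []
    try:
        while True:
            chunk = os.read(read_fd, 4096)
            if not chunk:
                break
            chunks.append(chunk)
    finally:
        os.close(read_fd)
    return b"".join(chunks)
```

The pipe itself carries no type information. If the decoder assumed a different separator or prefix than the encoder, nothing in the pipe would detect the mismatch.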
Under this approach, data type compatibility is determined by the compatibility of the two translations (into and out of the universal representation).
Standardized Basic Types
A more recent and more sophisticated version of the single universal representation approach is the use of a standardized set of definitions for a set of basic types, such as integer, float, and string. Approaches of this kind were associated with early RPC mechanisms, which sometimes leads to a confusion between RPC (one approach to address execution model concerns) and interoperability. Because the main problem addressed by early RPC mechanisms involved different hardware representations of data on different machines (e.g., byte orderings, floating-point formats, character encodings, etc.), it is not surprising that they focused on basic, directly hardware-supported, types. Later versions of this approach have added support for standard aggregate type constructors such as arrays or records.
When this approach was employed in conjunction with the earliest RPC-based systems, it only allowed data types within a single, fixed language domain to be passed from process to process. Thus, these RPC mechanisms not only abstracted away details of the communication protocol (e.g., below the ISO transport layer), but they also abstracted away representational discrepancies among machines and within the fixed language domain (e.g., byte-orderings, floating-point precisions). Support for this independence of representation was not totally automated in some earlier efforts such as those cited; the code that translates between physical representations needed to be written by hand.
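The kind of hand-written translation code described above can be sketched with Python's standard struct module. The wire format chosen here (a big-endian 32-bit integer followed by a 64-bit IEEE 754 double) and the function names are assumptions made for illustration. They do not reproduce the format of any particular RPC system cited in this article.

```python
import struct
from typing import Tuple

# Assumed standard representation: big-endian, 32-bit signed integer
# followed by a 64-bit double, with no alignment padding.
WIRE_FORMAT = ">id"


def to_wire(count: int, weight: float) -> bytes:
    """Translate native values into the standardized representation."""
    return struct.pack(WIRE_FORMAT, count, weight)


def from_wire(data: bytes) -> Tuple[int, float]:
    """Translate the standardized representation back into native values."""
    count, weight = struct.unpack(WIRE_FORMAT, data)
    return count, weight


def misread_as_little_endian(data: bytes) -> int:
    """What a receiver sees if it ignores the agreed byte order."""
    return struct.unpack("<i", data[:4])[0]
```

A receiver that reads the integer 1 with the wrong byte order obtains 16777216, which is the discrepancy that standardized basic types are meant to hide.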
More recent RPC mechanisms, such as NCS and HRPC, automate the generation of code that maps the representations of data types within a program from one machine to another via an intermediate interface description language such as HP/Apollo's NIDL or Xerox's Courier. Although those two systems can only generate interface code in a single language (C in both cases), their approach seems to lend itself well to supporting additional languages without too much extra effort.
Several systems, such as MLP, Q, Horus, and Matchmaker, have taken the additional step of providing RPC-based support for mixed-language programming. For example, MLP defines a Universal Type System (UTS) for describing objects that are passed among programs in various language domains. UTS builds up types from primitives such as "integer" and "float" via constructors such as "array" and "record" in a manner similar to what would be done in a language like Pascal. It communicates data among language domains by providing a small set of standard routines that must be implemented for each language. Programmers use these routines within their code to translate from their language domains to UTS and vice versa. MLP was not designed with a particular set of languages in mind and, therefore, it appears to be applicable to a fairly wide variety of languages. The same is true of Horus and Matchmaker.
Q, on the other hand, is designed specifically to support interoperation between C and Ada programs across a heterogeneous network. It is an explicit extension of Sun's XDR/RPC, an existing RPC mechanism that only supports C data types. Like MLP, Q also uses a "constructive" approach to building up the values that are transmitted between programs. Another, similar attempt at multilingual interoperability is the Mercury project, which is designed to support interoperability among the C, Lisp and Argus language domains.
Under the Standardized Basic Types approach, whether or not it is extended with aggregate type constructors, data type compatibility is determined via the equivalence of the structures defined for a type in all of the programs that share that type. In our Constrained Expression Toolset example, the deriver might define the constrained expression type to be an Ada record structure; the behavior generator and inequality generator might define it with a Common Lisp defstruct; and the compatibility of these definitions would depend upon the field-by-field equivalence of the record and defstruct definitions.
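Field-by-field equivalence of this kind can be sketched as a recursive comparison of structure descriptors. The descriptor encoding below (strings for basic types, tagged tuples for arrays and records) and the field names are hypothetical. The sketch also assumes that field names are ignored and only field order and field types are compared, which is one possible reading of "equivalence of structures."

```python
from typing import List, Tuple, Union

# A type descriptor is either a basic type name such as "integer",
# ("array", element_descriptor), or ("record", [(field_name, descriptor), ...]).
TypeDesc = Union[str, Tuple[str, object]]


def structurally_equivalent(a: TypeDesc, b: TypeDesc) -> bool:
    """Compare two descriptors position by position, ignoring field names."""
    if isinstance(a, str) or isinstance(b, str):
        return a == b  # basic types must match exactly
    kind_a, body_a = a
    kind_b, body_b = b
    if kind_a != kind_b:
        return False
    if kind_a == "array":
        return structurally_equivalent(body_a, body_b)
    if kind_a == "record":
        return len(body_a) == len(body_b) and all(
            structurally_equivalent(field_a[1], field_b[1])
            for field_a, field_b in zip(body_a, body_b)
        )
    raise ValueError("unknown type constructor: " + kind_a)


# Hypothetical descriptors for the same type as seen from two languages.
ada_record: TypeDesc = (
    "record", [("Name", "string"), ("Events", ("array", "integer"))])
lisp_defstruct: TypeDesc = (
    "record", [("name", "string"), ("events", ("array", "integer"))])
reordered: TypeDesc = (
    "record", [("events", ("array", "integer")), ("name", "string")])
```

Here the Ada and Lisp descriptors compare as equivalent, while the reordered record does not, even though it carries the same information. That brittleness is characteristic of compatibility defined purely on structure.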
The Common Theme:
The Single Universal Representation and Standardized Basic Types approaches are paradigmatic cases of representation-level interoperability. In an RLI approach:
* type modeling is based on the structure (representation) of objects, and
* type compatibility is based on comparison (or explicit translation) of structures (representation).
Representation-level support for interoperability is both useful and necessary. In our view, however, it has several shortcomings. Chief among these is the fact that RLI is only applicable to low-level simple types (e.g., integers) or compound simple types (e.g., arrays of integers). In particular, RLI does not support abstract types, such as "stack" or "abstract syntax tree." This not only makes RLI awkward to use in conjunction with modern languages having rich and extensible typing mechanisms (Ada, C++, CLOS, etc.), but also leads to low-level dependencies on type representations between interoperating programs. For example, if two programs employ an RLI approach to sharing a stack data object, both programs would be forced to use the same representation of the stack, such as an array with an integer index pointing to the top item, or as a linked list of records of a particular form.
Furthermore, RLI limits the flexibility and extensibility available in interoperating systems. Its reliance on isomorphism of low-level structures inhibits interoperation through similar but not identical types of entities and eliminates any possibility of using different underlying representations for different instances of the same type--thus foreclosing opportunities for optimization of representations.
Single Standardized Submodel
A final approach to type compatibility in interoperating programs is typified by a database style of interaction among components. In the Single Standardized Submodel approach, all interoperating programs must use a single type model, distinct from those found in the languages in which the programs are written, to describe any shared objects. But unlike the Single Type Model approach, unshared objects can be described using the type model(s) of the host language(s) of the interoperating programs. In traditional forms of this approach, the shared submodel provides some basic types, such as numbers and strings, and a very limited set of aggregation constructors, such as tuple or relation. Interoperating programs then carry out all manipulations of shared objects through a foreign language, by utilizing type definitions formulated using the shared submodel, such as an embedded query language (e.g., ). IDL represents another form of this approach, in which the constructors are attributed graph, set, sequence, and node. More recently, object-oriented databases (e.g., , , , ) have begun to offer richer type submodels.
The Single Standardized Submodel approach to type compatibility has been at the core of several proposals explicitly aimed at supporting interoperability in software development environments. In particular, CAIS, PCTE, and the Atherton backplane have all been based on the Single Standardized Submodel approach.
For abstract types, such as the constrained expression type in our Constrained Expression Toolset example, the Single Standardized Submodel approach still provides representation-level support for interoperability. Now compatibility is based on a single type submodel and hence a single structure, but abstract types will still have to be encoded using the basic types and aggregation constructors supplied by that submodel. Thus the Single Standardized Submodel approach has some of the characteristics of the Single Type Model approach and some characteristics of the Standardized Basic Types approach.
The SLI approach is motivated by our belief that developers of potentially interoperating programs should have the maximum possible flexibility, convenience and range of expressive power available when defining types for the objects their programs will manipulate. In particular, they should be free to create those type definitions in terms of the type models found in the languages in which their programs are written. They should have full power to develop and use appropriate abstract types, and maximum freedom to ignore the representation (implementation) details associated with those type definitions. Finally, the distinction between shared and unshared objects should have minimal impact on the program and its developer. Specifically, neither the fact that an object is expected to be shared (or not to be shared) among interoperating programs, nor a later change in that status should affect the interface to that object as seen by the rest of the program that manipulates the object. An interoperability approach lacking these features will severely impede integration and extensibility. Unfortunately, RLI approaches are deficient in all of these areas.
Specification-level interoperability overcomes the shortcomings associated with representation-level interoperability. Rather than focusing on the mapping between different representations of a type, SLI focuses on support for common definitions of a type's properties. The SLI approach thereby attains the benefits of abstraction and information hiding for interoperating programs, encouraging the use of entity descriptions (i.e., type definitions) that promote the overall organization of a software system. By raising the level of cooperation from isomorphism of representation to equivalence of overlapping properties of shared types, SLI eliminates low-level dependencies among interoperating programs, enables interoperation through similar but not identical entity types, and permits differing representations for different instances of a type--thus allowing for optimized representations. Of course, SLI depends upon RLI mechanisms, essentially subsuming RLI in those cases involving simple types.
In pursuing the goal of specification-level interoperability, we began by developing a general model of the support required to realize SLI. Guided by the model, we then assembled a prototype realization of SLI as a demonstration of its feasibility and usefulness.
A Model of Support for SLI
Our model distinguishes four components necessary to fully support specification-level interoperability.
A Unifying Type Model (UTM): A UTM is a notation for describing the types of entities that are to be shared by interoperating programs. UTM type definitions supplement but do not replace the type definitions for the shared entities that are expressed in the language(s) in which the interoperating programs are written. A UTM must be a unifying model, in the sense that it is sufficient for describing those properties of an entity's type that are relevant from the perspective of any of the interoperating programs that share instances of that type. Hence, a UTM should be capable of expressing high-level, abstract descriptions of the properties of a broad range of types, but need not adhere too closely to the syntax or type definition style of any particular programming language.
Language Bindings: Given a UTM and a particular programming language, there must be a way to relate the relevant parts of a type definition given in the language to a definition as given in the UTM. Each such mapping between a UTM and a particular language is referred to as a language binding. Note that not all aspects of a UTM must be mappable to a given language; mappings must exist only for those aspects that are, or could be, relevant to programs in that language. Hence, a set of different bindings could be defined for a given language, each providing mappings for only those UTM aspects relevant to a particular interoperating program written in that language.
Underlying Implementations: The combination of a UTM type definition and a language binding induces an interface through which an interoperating program written in that language can manipulate instances of the entity type. Underneath the interface will be one or more representations for data objects and code to implement procedures, such as the procedures (i.e., operations) that the interface provides for manipulating the data objects. A major benefit of the SLI approach is that all such details of implementation are hidden from the interoperating programs by the interface. This permits experimentation with alternative implementations, and "rapid prototyping" development styles in which "quick-and-dirty" implementations can later be improved without affecting the interoperating program. It even allows heterogeneous representations for the same type so that different instances can be optimized for different kinds of manipulations (e.g., navigational access vs. associative access to different collections of data of some particular kind).
Automated Assistance: Although SLI can be beneficially employed using entirely manual methods, its value is greatly increased through automated support. In particular, someone creating a UTM definition would be greatly aided by a library of preexisting UTM type definitions, language bindings and underlying implementations, plus a browser for exploring that library. An automated generation tool would also be valuable. For example, such a tool would take a UTM type definition, plus specifications for the desired language binding and underlying implementation (possibly indicated interactively through a selection capability in the browser), and generate the corresponding interface.
Following is a scenario that illustrates how we envision a full-scale realization of this model being used. The next section describes our initial prototype, which provides a subset of the capabilities listed above.
Suppose a newly developed program is to be integrated with an existing set of programs. The implementer of the new program might begin by determining which objects used by which programs in the existing set would need to be shared with the new program. In the case of objects that are already shared among the existing programs or whose sharing had been anticipated by their developers, UTM definitions for their types might be found by browsing the type definition library. In other cases, new type definitions might be created using the UTM notation, possibly by finding and modifying or extending existing definitions from the library. In a similar fashion, the implementer would find or create appropriate language bindings of the language(s) in which the program is to be implemented, and find or create suitable implementations.
Using the automated generation tool, the implementer would then produce the necessary interface in the implementation language selected for the program, plus any representation and code needed to effect the implementation of object instances. The representation and code could be as simple as RPCs to operations that were part of the existing implementations of the object types in the existing set of programs. In such a case, the automated generation tool could produce them completely, given only the specification stating that this was the desired implementation. Situations of much greater complexity could also be supported. For example, a completely new implementation could be defined with corresponding modifications to the representation or code attached to the preexisting programs' interfaces to objects of this type. Alternatively, two parallel implementations could be utilized--one associated with each of the programs sharing objects of this type, with consistency maintenance mechanisms and bidirectional translation linkages joining the two implementations. In such situations, the automatic generation tool's role would be more limited, but the role of the library and browser in aiding future implementers wishing to interoperate with objects of this type would be correspondingly more important.
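The simplest case in this scenario, an implementation consisting entirely of remote calls to existing operations, can be sketched as follows. This is an illustrative assumption about how such a generator might work, not a description of our tool. The names generate_proxy and InProcessServer, and the transport calling convention, are all hypothetical, and the "remote" side is simulated in the same process to keep the sketch runnable.

```python
from typing import Any, Callable, List, Sequence

# Assumed transport convention: (type name, operation name, arguments) -> result.
Transport = Callable[[str, str, tuple], Any]


def generate_proxy(type_name: str,
                   operations: Sequence[str],
                   transport: Transport) -> type:
    """Build an interface class whose operations all forward to a transport."""

    def make_method(op: str) -> Callable[..., Any]:
        # A separate factory so each method captures its own operation name.
        def method(self: Any, *args: Any) -> Any:
            return transport(type_name, op, args)
        method.__name__ = op
        return method

    namespace = {op: make_method(op) for op in operations}
    return type(type_name + "Proxy", (), namespace)


class InProcessServer:
    """Stand-in for an existing implementation reached by remote call."""

    def __init__(self) -> None:
        self._items: List[Any] = []

    def __call__(self, type_name: str, op: str, args: tuple) -> Any:
        if op == "push":
            self._items.append(args[0])
            return None
        if op == "pop":
            return self._items.pop()
        if op == "is_empty":
            return not self._items
        raise ValueError("unknown operation: " + op)


# The generated class exposes the specified operations and nothing else.
StackProxy = generate_proxy("Stack", ["push", "pop", "is_empty"],
                            InProcessServer())
```

A client of StackProxy calls push, pop and is_empty exactly as it would on a local implementation, so replacing the transport, or replacing the proxy with a new local implementation, would not affect the client.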
An Initial Prototype Realization of SLI
To demonstrate the feasibility of SLI and to support experimentation with the model of SLI presented earlier, we have developed an initial prototype realization of the model that provides a subset of its capabilities. This prototype realization consists of a first approximation to a unifying type model, called UTM-0, bindings for the Lisp and Ada programming languages, one implementation strategy, and the UTM-0 automated generator shown in Figure 2. The automated generator accepts type definitions in the UTM-0 notation, plus binding and implementation information, and produces a standard interface specification (SIS) and a corresponding implementation for each of the entities described in the UTM-0 input. By standard we mean that, for a given target programming language, the UTM-0 automated generator will always produce the same interface specification from the same UTM-0 input. In fact, due to the bindings defined in this prototype, the Lisp and Ada SISs that are generated from a given UTM-0 description are essentially identical, differing only in the syntax used to express them. The implementations attached to those SISs by the automated generator may vary widely, depending upon the implementation information that was provided with the UTM-0 definitions. The implementations, of course, are completely invisible to the interoperating programs, which need only be concerned with the SISs.
Here we describe each of the components of our initial prototype realization of SLI. We illustrate the use of these components through an example in which we applied the UTM-0 automated generator to achieve interoperability between some of the components of the Constrained Expression Toolset previously described.
Definition of UTM-0
UTM-0 consists of a set of type definition primitives, a set of special types, and some semantics for manipulation of instances of types. We present a brief overview of UTM-0 here (a more detailed description can be found in ) and an example of its use will follow.
The type definition primitives of UTM-0 are based on the approach used in the OROS type model  and are similar to those found in other recently proposed type models (e.g., [2, 10, 32, 39]). Like OROS, UTM-0 distinguishes three basic classes of types: object types, relationship types and operation types. Intuitively, object types are used to describe things whose state is their most interesting property, operation types are used to describe things that manipulate or transform other things, and relationship types are used to describe things that represent connections among other things. UTM-0 uses the word "entity" to encompass all things; thus object types, relationship types and operation types are all also entity types.
Using UTM-0, types are defined in terms of a set of properties and in terms of their relationships to other types (intertype relationships). The properties of a type are the operations that can be applied to its instances, relationships in which its instances can participate, and possibly a signature. (3) The intertype relationships include inheritance of properties, explicit differentiation of properties relative to those of other types and subtyping. Types defined using UTM-0 have names and can be parameterized.
UTM-0 supports multiple inheritance, with the parents field of a type definition listing the inherited types. The UTM-0 rules for inheritance are fairly simple. They are as follows: if type T lists type P as a parent, T inherits P's parents (recursively) and all of P's associated operations and relationships. If P has a signature, it is prepended to T's signature (if any). If there is a naming conflict among multiple parents, the parent listed first prevails. The UTM-0 model also supports the concept of subtyping, whereby if type S is a subtype of T (S conforms to T), an instance of type S can be used wherever an instance of T is called for. The UTM-0 subtyping rules, which are based on those of OROS, are detailed in .
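The inheritance rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the UTM-0 notation or the prototype's code: the `TypeDef` class, its field names and the `Printable` type are hypothetical, and the choice to let a type's own operations override inherited ones is an assumption that the text does not address.

```python
from dataclasses import dataclass, field


@dataclass
class TypeDef:
    """A toy type definition; the field names are illustrative only."""
    name: str
    parents: list = field(default_factory=list)     # ordered list of TypeDef
    operations: dict = field(default_factory=dict)  # operation name -> origin
    signature: list = field(default_factory=list)   # this type's own entries


def ancestors(t):
    """Every type that t inherits from, recursively, in listing order."""
    seen, order = set(), []
    for p in t.parents:
        for a in [p] + ancestors(p):
            if a.name not in seen:
                seen.add(a.name)
                order.append(a)
    return order


def resolved_operations(t):
    """Operations visible on t after applying the inheritance rules."""
    ops = {}
    for p in t.parents:
        for name, op in resolved_operations(p).items():
            ops.setdefault(name, op)  # the parent listed first wins a conflict
    ops.update(t.operations)  # assumption: own definitions override inherited
    return ops


def resolved_signature(t):
    """Each parent's signature is prepended to t's own signature."""
    sig = []
    for p in t.parents:
        sig.extend(resolved_signature(p))
    return sig + t.signature


entity = TypeDef("Entity")
node = TypeDef("Node", parents=[entity],
               operations={"create": "Node", "kind": "Node"})
printable = TypeDef("Printable",  # hypothetical second parent
                    operations={"kind": "Printable", "show": "Printable"})
ceast_node = TypeDef("CEAST_Node", parents=[node, printable],
                     operations={"get_attr": "CEAST_Node"})
```

Here `resolved_operations(ceast_node)` takes `kind` from `Node` rather than `Printable`, because `Node` is listed first among the parents.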
The remainder of the UTM-0 definition consists of definitions of some special types, criteria for type equivalence, and semantics for instance manipulation.
The most important special types in UTM-0 are the four primitive types in terms of which all other types are defined--namely entity, object, relationship and operation. Their definitions, in turn, involve some further operation and relationship types. Since they are used in defining the primitive types, these types are also considered "special" and are referred to as primitive operation and primitive relationship types. The final category of special types consists of the simple object types, which can include such commonly used object types as integer, character and real. Just as the primitive operations are subtypes of operation, the simple object types are subtypes of object. Simple object types differ from all others in that they have more limited semantics.
UTM-0 defines a fairly simple semantics for manipulation of instances of types. A variable of a simple object type holds a value (e.g., the integer 17), while variables of other entity types hold pointers to the entities themselves. Semantics for assignment and entity equivalence follow directly from this dichotomy. Assignment for simple objects is done by copying their values, while assignment for other entities is done by copying pointers. Similarly, equivalence among simple objects implies equal values, while among other entities it implies equal pointers.
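The value/pointer dichotomy can be illustrated with a small Python sketch. Python is used only as a convenient stand-in: its immutable integers behave like values here, while instances of a user-defined class behave like pointers to entities. The `Entity` class and the `utm_equivalent` helper are hypothetical names introduced for this illustration.

```python
class Entity:
    """Stands in for any non-simple entity; variables hold references to it."""

    def __init__(self, label):
        self.label = label


def utm_equivalent(a, b):
    """Equal values for simple objects, equal pointers for other entities."""
    simple = (int, float, str)  # assumption: stand-ins for integer, real, character
    if isinstance(a, simple) and isinstance(b, simple):
        return a == b
    return a is b


x = 17
y = x   # simple object: assignment copies the value
y += 1  # x is unaffected and still holds 17

e1 = Entity("stack")
e2 = e1                 # other entity: assignment copies only the pointer
e2.label = "renamed"    # the change is visible through e1 as well
e3 = Entity("renamed")  # same state as e1, but a different entity
```

Under these rules `utm_equivalent(e1, e2)` holds, while `utm_equivalent(e1, e3)` does not, even though the two entities carry the same state.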
Finally, the UTM-0 definition includes a library of useful type definitions. This "standard library" includes simple object types such as integer, character and real. It also includes several parametric types that define commonly used aggregates, such as array[T], relation[T] and sequence[T]. Hence the UTM-0 standard library provides a superset of the type definition capabilities available in an RLI type description language such as NIDL or Courier.
The Constrained Expression Toolset is a collection of tools that must interoperate to perform their individual tasks. As discussed earlier, the deriver component, which is written in Ada, accepts a CEDL description of a concurrent system and produces a constrained expression representation of the system in the form of a constrained expression abstract syntax tree (CEAST). The behavior generator and inequality generator components are written in Common Lisp, and each of these tools uses the information stored in the CEAST. Because the CEAST created by the deriver is an Ada object, however, it had been impossible for the Lisp tools to manipulate the tree directly.
Originally, as an interim solution, another Ada tool was written to translate the CEAST into an ASCII representation of a Lisp S-expression encoding of the tree, which was written to a file. The Lisp tools then read the file and manipulated the S-expression. This translation was a minor variation on the Single Universal Representation approach to RLI. As an experiment in applying the SLI approach, we decided to use our initial prototype realization of SLI to generate a replacement for the interim solution. The first step in this process was to create the appropriate UTM-0 description for the CEAST. Part of that UTM-0 description appears in Figure 3.
The description includes definitions of two object types: Node, which describes properties of graph nodes in general, and CEAST_Node, which describes the properties of the specific kinds of nodes that appear in CEAST graphs. Note that CEAST_Node inherits from Node, so that the operations that apply to Node objects (creation, deletion, and determining what kind a node is) plus the additional operations that apply to nodes of type CEAST_Node (for manipulating the specific kinds of attributes that this kind of node contains) all apply to CEAST_Node objects. Also included in the UTM-0 description of CEAST nodes are definitions of the various operation types that are part of these two object type definitions.
The Language Bindings
As previously mentioned, there must be at least one language binding for each programming language in which interoperating programs are written. Each language binding maps UTM definitions to syntactic constructs within a particular language. This essentially comes down to determining which language constructs correspond most closely with object, operation, and relationship type definitions. We have defined two language bindings for our prototype so far: one for Ada and one for Common Lisp.
Because Ada is so supportive of abstract type definitions, it was relatively easy to produce an Ada language binding. We define all object and relationship types to be private types. Operation types are represented as either procedure or function declarations, depending on how the user specifies the code that implements them. Since object and relationship types are declared to be private, their representations are completely hidden from the user, which makes providing low-level representational optimizations transparent. Figure 4 illustrates how the Ada binding applies to the UTM-0 description of the CEAST example.
Our language binding for Common Lisp is similar to the Ada binding. We define object and relationship types with the predefined function deftype, and use defun to implement all operation types. Figure 5 illustrates how the Lisp binding applies to the UTM-0 description of the CEAST example. Because abstract data typing is not a part of the Common Lisp language definition, however, it is not possible to achieve the same degree of enforced encapsulation as in Ada. With disciplined use of a Lisp SIS, it is nevertheless possible to obtain the same degree of representational independence. We believe that CLOS  provides better support for the definition of abstract types, and it is likely that we will implement a CLOS binding in the near future.
Of course, both the Ada and the Lisp bindings include bindings for simple object types as well. Our initial prototype realization uses the obvious binding for the integer and character simple object types, which are all that are needed for our first set of experimental applications. Future prototypes will include a more thorough treatment of bindings for simple object types, integrating more of the features found in existing RLI type definition mechanisms.
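To make the idea of a language binding concrete, the following Python sketch maps one small type description to declaration text for each language. It is a toy under stated assumptions, not the UTM-0 automated generator, and its output is not a reproduction of Figures 4 and 5. The operation names (`Create_Node`, `Kind_Of`, `Delete_Node`), the tuple layout of the description, the rule that an operation with a result becomes an Ada function, the lowercase-and-hyphen Lisp naming, and the placeholder `deftype` expansion are all assumptions made for illustration.

```python
def ada_sis(obj_type, operations):
    """Emit declaration lines for the visible part of a toy Ada interface."""
    lines = [f"type {obj_type} is private;"]
    for name, params, result in operations:
        plist = "; ".join(f"{p} : in {t}" for p, t in params)
        args = f" ({plist})" if plist else ""
        if result is None:  # assumption: no result means a procedure
            lines.append(f"procedure {name}{args};")
        else:
            lines.append(f"function {name}{args} return {result};")
    return lines


def lisp_name(name):
    """Assumed naming rule: lowercase, underscores become hyphens."""
    return name.lower().replace("_", "-")


def lisp_sis(obj_type, operations):
    """Emit deftype/defun headers for a toy Lisp interface (bodies left empty)."""
    lines = [f"(deftype {lisp_name(obj_type)} () 't)"]  # placeholder expansion
    for name, params, _result in operations:
        args = " ".join(lisp_name(p) for p, _t in params)
        lines.append(f"(defun {lisp_name(name)} ({args}))")
    return lines


# Each operation is (name, [(parameter, type), ...], result type or None).
node_ops = [
    ("Create_Node", [], "Node"),
    ("Kind_Of", [("N", "Node")], "Integer"),
    ("Delete_Node", [("N", "Node")], None),
]

ada_lines = ada_sis("Node", node_ops)
lisp_lines = lisp_sis("Node", node_ops)
```

Because both functions are pure functions of the description, the same input always yields the same interface text, which is the sense in which the generated interface is "standard."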
Generating the SISs and
Using the straightforward mapping rules implied by each language binding, it is not difficult to generate a standard interface specification to the entities specified in a UTM-0 description. However, it is considerably more difficult to generate the actual implementations of those entities (i.e., the underlying representations of object and relationship types, and the code that implements operation types). In our initial prototype realization, entities may be implemented in one of two ways: with user-specified source code, or by interfacing to an existing SIS for the entity that has already been processed by the UTM-0 automated generator.
In the former situation, producing an implementation corresponding to a given SIS is a relatively simple process. The UTM-0 automated generator contains an additional language recognizer component for each of the relevant implementation languages; this component is used to parse user-supplied implementations and turn them into a standard internal representation. The parsed implementations are then used, along with the output of the language-binding component, by the language code generator to produce the SIS and its corresponding implementation.
Figure 6a illustrates this case. To produce the Ada SIS and corresponding implementation for the CEAST type, we provided the UTM-0 automated generator with the UTM-0 description of the CEAST type and an Ada package body that implemented the types and operations specified in that description. The package body itself had been previously generated from a declarative description of the CEAST graph type, using our PGraphite tool . The automated generator then produced the appropriate Ada SIS (Figure 4) and an implementation of that SIS, using the package body that we had provided.
Generating a corresponding implementation for a given SIS when entities that it defines are to be implemented by existing SISs is handled somewhat differently. When the existing SIS is written in the same language as the one being generated, it is not difficult to "import" the existing definitions and make the appropriate operation calls. The real advantage of the automated generator, however, becomes apparent when the existing SIS is written in a different language, since this is when serious type compatibility issues arise.
Figure 6b illustrates this case. To produce the Lisp SIS and corresponding implementation for the CEAST type, we provided the UTM-0 automated generator with the UTM-0 description of the CEAST type and a specification of the correspondence between various parts of the UTM-0 description and parts of the Ada SIS. Although we created the specification of the correspondence manually, a more powerful automated generator with a library capability certainly could have automated the creation of this information. The UTM-0 automated generator then produced the appropriate Lisp SIS (Figure 5) and an implementation of that SIS consisting of calls to the operations provided in the Ada SIS (using the call-out function supported in several implementations of Common Lisp).
To complete our SLI experiment, we modified the existing behavior and inequality generators to manipulate the CEAST through the Lisp standard interface, instead of through S-expressions. (4) The resulting system configuration is shown in Figure 7. It is important to note that the behavior and inequality generators never "know" that they are manipulating Ada objects, or that they are calling Ada operations. The underlying implementations are hidden from the tools, having been completely abstracted away by the SIS. This allows us the freedom to experiment with a wide variety of different implementation schemes.
At present, we employ only the implementation strategy illustrated by this example, wherein object and relationship types that do not exist locally are implemented as "foreign references" to the types defined in the SIS where they are locally defined, and operation types are implemented as foreign operation calls. We recognize this strategy forces an interlanguage call to be made upon each and every object/relationship/operation type access--and this would probably be too expensive a solution for many applications. For example, if a Lisp tool manipulates an Ada binary search tree, our implementation strategy would force an Ada operation call every time the Lisp tool wanted to either examine or set values of attributes of nodes in the tree. In some cases it might be more efficient, for example, to copy the information from the Ada object to a local Lisp object, which the Lisp tool would then access. We have begun exploring ways to let the user select alternative, optimized implementations; but, in all cases, the alternative selected would be hidden from the tool.
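The cost trade-off described here can be sketched as follows. The Python classes below only simulate the two strategies: `ForeignStore` stands in for objects owned by another language runtime and counts each simulated interlanguage call, `ForeignRef` models the foreign-reference strategy, and `LocalCopy` models the copying alternative. All three names, and the attribute names used in the example, are hypothetical.

```python
class ForeignStore:
    """Stands in for objects that live in another language runtime."""

    def __init__(self):
        self._objects = {}
        self.calls = 0  # number of simulated interlanguage calls

    def new(self, **attrs):
        handle = len(self._objects) + 1
        self._objects[handle] = dict(attrs)
        return handle

    def get(self, handle, attr):
        self.calls += 1
        return self._objects[handle][attr]

    def set(self, handle, attr, value):
        self.calls += 1
        self._objects[handle][attr] = value

    def snapshot(self, handle):
        self.calls += 1
        return dict(self._objects[handle])


class ForeignRef:
    """Foreign-reference strategy: every access is one foreign call."""

    def __init__(self, store, handle):
        self._store, self._handle = store, handle

    def get(self, attr):
        return self._store.get(self._handle, attr)

    def set(self, attr, value):
        self._store.set(self._handle, attr, value)


class LocalCopy:
    """Copying strategy: one foreign call up front, then local reads."""

    def __init__(self, store, handle):
        self._attrs = store.snapshot(handle)

    def get(self, attr):
        return self._attrs[attr]


store = ForeignStore()
h = store.new(key=5, left=None, right=None)

ref = ForeignRef(store, h)
for _ in range(3):
    ref.get("key")
calls_via_reference = store.calls  # one call per read

copy = LocalCopy(store, h)
for _ in range(3):
    copy.get("key")
calls_after_copy = store.calls  # only the snapshot added a call
```

The copy is cheaper to read but can go stale once the foreign object changes, so a real implementation of that alternative would need some consistency maintenance between the two representations.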
Specification-level interoperability provides a high-level, representation-independent approach to combining software components that are written in different languages or that are run on different machines. Unlike most other approaches to supporting interoperability--which focus on implementation concerns--SLI is aimed at maximizing the flexibility, convenience and expressive power available to developers of interoperating programs. At the very least, SLI can serve as a basis for the disciplined and orderly marshaling of interoperable components. If fully realized and properly used, SLI can be a type-safe, extensible mechanism offering some automated assistance in solving an important problem.
Our experiences with an initial prototype are extremely encouraging. SLI, even in the limited form delivered by the prototype, provided us with useful support in constructing a large system made up of diverse components. This suggests that the SLI approach can yield an important enabling technology for integrated, extensible, broad-scope environments in particular and large, evolving, heterogeneous software systems in general.
Our experience with using SLI and the prototype has also suggested a number of important directions for future work. Foremost among these is the need for improved UTMs and corresponding capabilities for establishing semantic equivalence between a UTM definition and the type definitions that supposedly correspond to it in each interoperating program. Where UTM-0 relies on name and signature conformance of operations and relationships in establishing such equivalence, a UTM based on richer semantic constructs (e.g., those used in Larch ) would provide better assurance of type compatibility. Such a UTM would also admit better automated assistance, both in compatibility checking and in automated generation, for users of SLI.
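A check based on name and signature conformance might look like the following Python sketch. It is an illustration of the general idea rather than the prototype's algorithm: the `conforms` function, the dictionary layout, and the operation names are assumptions, with each signature written as a tuple of (mode, type) pairs in the spirit of footnote 3.

```python
def conforms(required, provided):
    """Compare operations by name and signature; return a list of mismatches."""
    problems = []
    for name, signature in required.items():
        if name not in provided:
            problems.append(f"missing operation {name}")
        elif provided[name] != signature:
            problems.append(f"signature mismatch for {name}")
    return problems


# Operations as a type definition might require them (names are hypothetical).
utm_ops = {
    "Kind_Of": (("in", "Node"),),
    "Set_Attr": (("in", "Node"), ("in", "Integer")),
}

# Operations offered by two candidate implementations.
good_impl = {
    "Kind_Of": (("in", "Node"),),
    "Set_Attr": (("in", "Node"), ("in", "Integer")),
    "Extra_Op": (),  # additional operations do not break conformance
}
bad_impl = {
    "Kind_Of": (("in", "Node"),),
    "Set_Attr": (("in", "Node"), ("in", "Character")),
}

good_report = conforms(utm_ops, good_impl)
bad_report = conforms(utm_ops, bad_impl)
```

A check of this kind accepts any implementation whose names and signatures line up, whatever the operations actually do, which is why a UTM with richer semantic constructs would give stronger assurance.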
Additional work is also needed in the area of implementation strategies. We felt that the single, straightforward strategy used in our prototype was a good one for experimentation, since it simplified the problem of determining type compatibility. However, a large number of possible alternative implementations exist, including those that would make use of RPC/XDR, NCS, HRPC, Q, IDL, Field or Polylith.
Finally, further work is also needed on language bindings and automated assistance. We are currently working on bindings for C, C++, and Prolog. C++ in particular should exercise the UTM notions of inheritance and subtyping. We are also looking into the development of a library of UTM definitions, language bindings, and underlying implementations, together with a browser for that library. Having a library and browser should make it easier to rapidly pull together the pieces needed to effect the interoperability of a given collection of components.
(1) It could be argued that both of these examples, as well as such mechanisms as "active data" or "triggers," represent intermingling of execution model and data model issues. It is possible that some perspective that separated those components would make such mechanisms fit more smoothly into the framework presented here.
(2) The object-oriented type model notion of conformance between a subtype and its supertype(s) is a special case of the notion of type compatibility.
(3) The signature of a relationship type describes the number, types and modes of the entities connected by an instance of the relationship, while the signature of an operation type describes the number, types, and modes of parameters to the operation. Object types do not have signatures in UTM-0.
(4) This step was necessary because the two Lisp tools had not originally been written in a style employing data abstraction. SLI is most helpful when the interoperating programs do make use of data abstraction; but different choices of language bindings and implementation strategies in our prototype could have obviated the need for these modifications of the behavior and inequality generators.
 ACM SIGPLAN Not 22, 11 (Nov. 1987). Special Issue on the Interface Description Language IDL.
 Andrews, T. and Harris, C. Combining language and database advances in an object-oriented development environment. In OOPSLA Conference Proceedings (Oct. 1987), pp. 430-440. Published as ACM SIGPLAN Not. 22, 12 (Dec. 1987).
 Apollo Computer Inc., Chelmsford, Mass. Network Computing System: A Technical Overview, 1989.
 Avrunin, G.S., Dillon, L.K., and Wileden, J.C. Experiments in automated analysis of concurrent software systems. In Proceedings of ACM Software Testing, Analysis and Verification Symposium (Dec. 1989), pp. 124-130.
 Balkovich, E., Lerman, S., and Parmelee, R.P. Computing in higher education: The Athena experience. Commun. ACM 28, 11 (Nov. 1985), 1214-1224.
 Bershad, B.N., Ching, D.T., Lazowska, E.D., Sanislo, J., and Schwartz, M. A remote procedure call facility for interconnecting heterogeneous computer systems. IEEE Trans. Softw. Eng. SE-13, 8 (Aug. 1987), 880-894.
 Birrell, A.D., and Nelson, B.J. Implementing remote procedure calls. ACM Trans. Comput. Syst. 2, 1 (Feb. 1984), 39-59.
 Black, A., Hutchinson, H., Jul, E., and Levy, H. Object structure in the Emerald system. Tech. Rep. 86-04-03, University of Washington, Department of Computer Science, Apr. 1986.
 Clarke, L.A., Wileden, J.C., and Wolf, A.L. Object management support for software development environments. In Proceedings of the 1987 Appin Workshop on Persistent Object Stores (July 1987), pp. 363-381.
 Dearle, A., Connor, R., Brown, F., and Morrison, R. Napier88--A database programming language? In Proceedings of the Second International Workshop on Database Programming Languages (June 1989), pp. 213-229.
 Deux, O. et al. The story of o2. Tech. Rep. 37-89, Altair, France, Oct. 1989.
 Gibbons, P.B. A Stub Generator for Multilanguage RPC in Heterogeneous Environments. IEEE Trans. Softw. Eng., SE-13, 1 (Jan. 1987), 77-87.
 Hayes, R., Manweiler, S.W., and Schlichting, R.D. A simple system for constructing distributed, mixed-language programs. Softw.--Practice and Exper. 18, 7 (July 1988), 641-660.
 Herlihy, M., and Liskov, B. A value transmission method for abstract data types. ACM Trans. Prog. Languages and Syst., 4, 4 (Oct. 1982), 527-551.
 Jones, M.B., Rashid, R.F., and Thompson, M.R. Matchmaker: An interface specification language for distributed processing. In Proceedings of the 12th ACM Symposium on Principles of Programming Languages (Jan. 1985).
 Keene, S.E. Object-Oriented Programming in Common LISP: A Programmer's Guide to CLOS. Addison-Wesley, Reading, Mass. 1989.
 Kim, Won, Ballou, N., Banerjee, J., Chou, Hong-Tai, Garza, J.F., and Woelk, D. Integrating an object-oriented programming system with a database system. In OOPSLA Conference Proceedings, vol. 23, no. 11, San Diego, Calif., Nov. 1988, pp. 142-152.
 Liskov, B., Bloom, T., Gifford, D., Scheifler, R., and Weihl, W. Communication in the Mercury system. Programming Methodology Group Memo 59-1, Laboratory for Computer Science, Massachusetts Institute of Technology, Cambridge, Mass., Apr. 1988.
 Liskov, B. and Scheifler, R. Guardians and actions: Linguistic support for robust, distributed programs. ACM Trans. Prog. Languages and Syst. 5, 3 (July 1983), 381-404.
 Maier, D., Stein, J., Otis, A., and Purdy, A. Development of an object-oriented DBMS. In OOPSLA Conference Proceedings, SIGPLAN Not. 21, 11 (Nov. 1986), ACM, N.Y., 472-482.
 Maybee, M., and Sykes, S.D. Q: Towards a multi-lingual interprocess communications model. Tech. Rep. University of Colorado, Boulder, Colo., Feb. 1989.
 Morris, J.H., Satyanarayanan, M., Conner, M.H., Howard, J.H., Rosenthal, D.S.H., and Smith, F.D. Andrew: A distributed personal computing environment. Commun. ACM 29, 3 (Mar. 1986), 184-201.
 Oberndorf, P.A. The common Ada programming support environment (APSE) interface set (CAIS). IEEE Trans. Softw. Eng. SE-14, 6 (June 1988), 742-748.
 Paseman, W.G. Architecture of an integration and portability platform. In Proceedings of the 1988 CompCon (Mar. 1988).
 Purtilo, J.M., and Jalote, P. An environment for prototyping distributed applications. In Proceedings of the Ninth International Conference on Distributed Computing Systems (June 1989), pp. 588-594.
 Reiss, S.P. Connecting tools using message passing in the Field environment. IEEE Softw. (July 1990), 57-67.
 Rosenblatt, W.R., Wileden, J.C., and Wolf, A.L. OROS: Toward a type model for software development environments. In OOPSLA Conference Proceedings (Oct. 1989) ACM SIGPLAN Not., 24, 10 (Oct. 1989), 297-304.
 Stonebraker, M., Wong, E., Kreps, P., and Held, G. The design and implementation of Ingres. ACM Trans. Database Syst. 1, 3 (Sept. 1976), 189-222.
 Sun Microsystems, Inc., Mountain View, Calif. External Data Representation Reference Manual, Jan. 1985.
 Taylor, R.N., Belz, F.C., Clarke, L.A., Osterweil, L.J., Selby, R.W., Wileden, J.C., Wolf, A.L., and Young, M. Foundations for the Arcadia environment architecture. In Proceedings of the Third ACM SIGSOFT/SIGPLAN Symposium on Practical Software Development Environments (Dec. 1988). SIGPLAN Not. 24, 2 (Feb. 1989), ACM, N.Y.
 Thomas, I. PCTE interfaces: Supporting tools in software engineering environments. IEEE Softw. 6, 6 (Nov. 1989), 15-23.
 Vines, D. and King, T. Gaia: An object-oriented framework for an Ada environment. In Proceedings of the Third International IEEE Conference on Ada Applications and Environments (May 1988), pp. 81-92.
 Walker, B., Popek, G., English, R., Kline, C., and Thiel, G. The LOCUS distributed operating system. In Proceedings of the Ninth ACM Symposium on Operating System Principles (Oct. 1983), pp. 49-70.
 Wileden, J.C., and Wolf, A.L. Object management technology for environments: Experiences, opportunities and risks. In Proceedings of the International Workshop on Environments (Sept. 1989).
 Wileden, J.C., Wolf, A.L., Fisher, C.D., and Tarr, P.L. PGraphite: An experiment in persistent typed object management for environments. In Proceedings of the Third ACM SIGSOFT/SIGPLAN Symposium on Practical Software Development Environments (Dec. 1988), 130-142. Published as SIGPLAN Not. 24, 2 (Feb. 1989), ACM, N.Y.
 Wileden, J.C., Wolf, A.L., Rosenblatt, W.R., and Tarr, P.L. UTM-0: Initial proposal for a unified type model for Arcadia environments. Arcadia Design Document UM-89-01, Department of Computer and Information Science, University of Massachusetts, Amherst, Mass., Feb. 1989.
 Wing, J.M. Writing Larch interface language specifications. ACM Trans. Prog. Languages and Syst. 9, 1 (Jan. 1987), 1-25.
 Xerox Corp., Palo Alto, California. Courier: The remote procedure call protocol. Tech. Rep. XSIS 038112, Dec. 1981.
 Zdonik, S.B., and Wegner, P. Language and methodology for object-oriented database environments. In Proceedings of the 19th Annual Hawaii International Conference on System Sciences (1986), pp. 378-387.
CR Categories and Subject Descriptors: D.1.0 [Programming Techniques]: General; D.2.2 [Software Engineering]: Tools and Techniques; D.2.6 [Software Engineering]: Programming Environments; D.3.3 [Programming Languages]: Language Constructs -- abstract data types; D.3.4 [Programming Languages]: Processors; E.2 [Data Storage Representations]: General; H.2.3 [Database Management]: Heterogeneous Databases
General Terms: Interoperability, Heterogeneous Software Systems, Multilingual Programs
Additional Key Words and Phrases: Prototype automated support, representation level, specification level, type compatibility, unifying type model
About the Authors:
JACK C. WILEDEN is a professor in the Department of Computer and Information Science at the University of Massachusetts, Amherst, and a director of the Software Development Laboratory there. His research interests center on integrated software development environments, especially object management capabilities for environments, and on development tools and techniques applicable to concurrent software systems.
ALEXANDER L. WOLF is a member of the technical staff at AT&T Bell Laboratories in Murray Hill, New Jersey. His research is concerned with improving the development of large software systems through software development environments and tools. He is also designing and building an object database system to support environment object management as well as to support other complex applications. Author's Present Address: AT&T Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974. email: wolf@research.att.com
WILLIAM R. ROSENBLATT is a Ph.D. candidate in the Department of Computer and Information Science at the University of Massachusetts, Amherst. His research interests include integrated software development environments, object-oriented systems and programming language type theory.
PERI L. TARR is a Ph.D. candidate in the Department of Computer and Information Science at the University of Massachusetts, Amherst. Her research interests include integrated software development environments, object management and persistent typed object systems.
Authors' Present Address for Wileden, Rosenblatt, Tarr: Software Development Laboratory, Computer and Information Science Department, University of Massachusetts, Amherst, MA 01003, wileden@cs.umass.edu; rosenblatt@cs.umass.edu
Title Annotation: International Conference on Software Engineering special report; one of five articles on technical material presented at ICSE-12
Author: Wileden, Jack C.; Wolf, Alexander L.; Rosenblatt, William R.; Tarr, Peri L.
Publication: Communications of the ACM
Date: May 1, 1991