# Dynamic Typing for Distributed Programming in Polymorphic Languages.

While static typing is widely accepted as being necessary for secure program execution, dynamic typing is also viewed as being essential in some applications, particularly for distributed programming environments. Dynamics have been proposed as a language construct for dynamic typing, based on experience with languages such as CLU, Cedar/Mesa, and Modula-3. However proposals for incorporating dynamic typing into languages with parametric polymorphism have serious shortcomings. A new approach is presented to extending polymorphic languages with dynamic typing. At the heart of the approach is the use of dynamic type dispatch, where polymorphic functions may analyze the structure of their type arguments. This approach solves several open problems with the traditional approach to adding dynamic typing to polymorphic languages. An explicitly typed language [XML.sup.dyn] is presented; this language uses refinement kinds to ensure that dynamic type dispatch does not fail at run-time. Safe dynamics axe a new form of dynamics that use refinement kinds to statically check the use of run-time dynamic typing. Run-time errors are isolated to a separate construct for performing run-time type checks.Categories and Subject Descriptors: D.3.1 [Programming Languages]: Formal Definitions and Theory--semantics; D.3.3 [Programming Languages]: Language Constructs and Features--data types; structures

General Terms: Languages

Additional Key Words and Phrases: Dynamic typing, marshalling, parametric polymorphism, static typing

1. INTRODUCTION

Statically typed languages are now widely accepted as being a necessity for secure program execution, verifying at compile-time that programs satisfy certain decidable invariants in terms of the correct use of data structures, and yielding information about data usage which allows data representations to be optimized [Appel 1992]. Languages such as Ada [1979] and ML [1990] improve the flexibility of static typing by providing parametric polymorphism, that allows static type-checking of reusable software components which operate generically over data structures such as lists, sets, trees, etc. A further attraction of languages such as ML and Haskell [Hudak et al. 1992] is that these languages are "implicitly typed": a program may be submitted to the compiler without programmer-supplied type annotations, and the type-checker is able to infer a "principal type" for the program, of which all other possible types are instances. Static typing, parametric polymorphism, and type inference combine to provide a useful framework for improved programmer productivity and the construction of reliable reusable software components.

Despite the aforesaid advantages of static typing, some form of dynamic (runtime) typing has been proposed for statically typed languages. Modula-3 [Nelson 1991] in particular contains interesting facilities for dynamic typing, benefiting from experience with languages such as Simula-67, CLU, Cedar/Mesa, Amber, and Modula-2+ [Birtwistle et al. 1979; Liskov et al. 1981; Lampson 1983; Cardelli 1986; Rovner 1986]. "Traced" pointer values may be "widened" to type REFANY, the type of all traced pointers, under the language's subtyping rules. Modula-3 provides a TYPECASE statement, which allows for run-time branching based on the type of a value:

[is less than]TB[is greater than] PROCEDURE Print (X : REFANY) = BEGIN TYPECASE X OF INTEGER(Xi) ?? Wr.put (Stdio.stdout, Fmt.Int (Xi)) | TEXT(Xs) ?? Wr.put (Stdio.stdout, Xs) | .... | DEFAULT ?? RAISE PrintError END END Print; [is less than]/TB[is greater than]

A disadvantage of the Modula-3 scheme is the overhead it places on programs in terms of needing to maintain ubiquitous type tags in the run-time; every traced pointer value must have a type tag for the above operations. This is counter to the philosophy in optimizing Haskell and ML compilers [Appel 1992], where much work has been done on removing the need for tagging in the run-time environment [Appel 1989; Morrison et al. 1991; Peyton-Jones and Launchbury 1991; Goldberg 1991; Goldberg and Gloger 1992; Peyton-Jones 1992; Aditya and Caro 1993; Tolmach 1994; Tarditi et al. 1996].

An alternative has been proposed for incorporating dynamic typing into a statically typed language, based on an explicit type dynamic for dynamically typed values [Abadi et al. 1991]. If e is a value with type [Tau], then the expression dynamic ([Tau], e) is a value of type dynamic. This corresponds to bundling a value with an explicit tag for its type into a data value. This proposal includes a construct similar to Modula-3's TYPECASE construct for examining the type tag in such a value. The most often cited applications of such a construct are pickling values to persistent storage and transmitting values over heterogeneous distributed systems.

For example, the SRC implementation of Modula-3 includes the Pickle interface for pickling values; this interface finds considerable use in persistent storage and distributed communication of program values, such as in the implementation of Network Objects [Birrell et al. 1993]. The OMG Common Object Request Broker Architecture (CORBA) specification contains a similar facility, a type ANY for the type of pickles that can be transmitted between address spaces [Object Management Group 1995]. The CORBA specification includes facilities for using run-time type information, in the client-side dynamic invocation interface (DII) and server-side dynamic skeleton interface (DSI). Some of the motivation for these facilities is in implementing gateway objects, e.g., between Microsoft COM and OMG CORBA object request brokers. Without such facilities, a gateway object would have to be recompiled with fresh stubs every time a new object interface was added on either side of the gateway.

An obvious issue in the design of dynamic typing mechanisms is how to incorporate it into languages with parametric polymorphism. Proposals for adding dynamic typing to languages such as ML (for the purpose of type-safe distributed communication with which we are primarily concerned) essentially consist of adding the dynamic construct, with minor extensions. However, this construct interacts very badly with parametric polymorphism. For example ML has a single list type constructor list which indexes a family of list types int list, string list, int list, etc. It should be possible to transmit a list value, encapsulated as a dynamic, to a server which can "unbundle" the dynamic in such a way that it can operate on the elements of the list. Leroy and Mauny [1993] and Abadi et al. [1992] have considered an extension of dynamics to allow this, allowing type variables in the patterns of the rules of the typecase

[is less than]TB[is greater than] fun print (x : dynamic) = typecase x of int(ival) ?? output (intToString (ival)) | string(str) ?? output (str) | (`a list)(xs) ?? (if null xs then output "[]" else (output "["; print (dynamic (`a, hd xs)); app (fn x ?? (output ","; print (dynamic ('a,x)))) (tl xs); output "]")) [is less than]/TB[is greater than]

where output is an operation for printing a string. There is a very serious problem with this approach: the only way to recurse over the elements of the list is to rebundle each element as a dynamic, so that the typecase can be reapplied on the recursive call. There are several difficulties with this approach. For one thing, the failure point due to run-time type errors no longer occurs once, when a dynamic is unbundled, but is distributed over the code which recursively traverses the data structure. Because dynamics essentially associate type tags with individual data values, this approach also makes some fairly natural compiler optimizations impractical (as discussed in Section 2.5). Even worse, it is impossible to transform dynamic data structures with mutable components (as discussed in Section 2.5). Finally if we think of the dynamic and typecase constructs as being the bundling and unbundling of pickled values [Duggan 1998b], then there is a significant performance penalty associated with the repeated bundling and unbundling of the data structure.

In this article we present a new approach to incorporating dynamic typing into statically typed languages, particularly languages with parametric polymorphism such as ML. We introduce a new construct for computing with dynamics that allows pickled data structures such as lists and trees to be unbundled once and then traversed. There are two components to our semantics. First, a dynamic type is translated as an existential type [exists]'a.'a [Mitchell and Plotkin 1988]. This is intuitively the type of a first-class abstract data type, where 'a is a type variable denoting the opaque type, and the ADT has a single value of type 'a. A dynamic is then an implementation (called a data algebra) for this ADT, comprising an encapsulated witness type and the value of that type. The second component of our semantics is dynamic type dispatch. For the above example, we intuitively define a recursive function print that takes two parameters, a type and a value of that type:

[is less than]TB[is greater than] print [int] (ival) = output (intToString (ival)) print [string] (str) = output (str) print [('a list)] (xs) = (if null xs then output "[]" else (output "["; print ['a] (hd (xs); app (fn x ?? (output ","; print ['a] (x))) (tl xs); output "]")) [is less than]/TB[is greater than]

That is, print recurses over its type argument, the type of the data structure, rather than recursing over the data structure itself. Computing with dynamics then consists of building such a function that uses dynamic type dispatch, extracting the witness type and value from the dynamic, and then applying the function to the type and value. In the example just presented, the unbundling of the pickled list is done once, at the point where the print function is applied, rather than repeatedly on each recursive call to print. This overcomes all of the aforesaid problems with recursing over dynamic data structures.

Harper and Morrisett [1995] and Morrisett [1995] advocate an approach to using dynamic type dispatch internally in a compiler as a basis for type-directed compilation and optimization. Our work (done independently of their work) extends their framework with a notion of refinement kinds that describe the domain of applicability of an operator that uses dynamic type dispatch. Our refinement kinds are the basis for reifying dynamic type dispatch to the programmer level, ensuring statically that type failures do not occur at run-time. This is in contrast with the work of Dubois et al. [1995], e.g., where dynamic type dispatch is advocated as a programming mechanism, but without any type system for checking its use. We also introduce a new form of dynamics, called safe dynamics, which we translate as existentials [exists]'a [is less than]: [Rho].'a. Here [Rho] is a refinement kind revealing some structure of the opaque type 'a. Provided this description [Rho] is contained within the domain of print, then applying print to the dynamic is guaranteed not to fail due to runtime type errors. Safe dynamics allow uses of dynamics to be checked statically, avoiding any run-time failure due to an unsuccessful type match. Run-time type failures are isolated to another construct.

In Section 2 we describe [XML.sup.dyn], an explicitly typed language that combines polymorphism, dynamic type dispatch, and our approach to computing with dynamic types. In Section 3 we translate [XML.sup.dyn] to [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], a more primitive calculus underlying [XML.sup.dyn], and provide a dynamic semantics for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. In Section 4 we describe [OXML.sup.dyn], an extension of [XML.sup.dyn] with "recursive overloading." This provides an alternative approach to providing dynamic type dispatch in the source language. We give a semantics for [OXML.sup.dyn] by translating it back into [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. Finally Section 5 considers related work, while Section 6 provides our conclusions.

Dynamics in our work bear some relationship to objects: both have a notion of widening (based on subtyping) and a narrowing construct, and both have a common semantic foundation in terms of existential types. We comment further on this analogy between dynamics and objects, in Section 5.

2. DYNAMIC TYPING IN A POLYMORPHIC LANGUAGE

In this section we describe [XML.sup.dyn], an explicitly typed polymorphic language with dynamics and dynamic typing. In Section 2.1 we describe the types of [XML.sup.dyn]. This includes a notion of refinement kinds that refine the "kind" of all types; refinement kinds describe the domains of polymorphic functions that can examine the structure of their type arguments. In Section 2.2 we describe the core language of functions and polymorphic functions for [XML.sup.dyn]. In Section 2.3 we give the construct for dynamic type dispatch in [XML.sup.dyn]. In Section 2.4 we augment this core language with dynamic typing in the form of safe dynamics. In Section 2.5 we provide our approach to computing with dynamics, using a recursive typecase that is defined using dynamic type dispatch and refinement kinds.

2.1 Types and Kinds

Types in our approach are stratified into simple types [Tau] and polymorphic types [Sigma]:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Our type system is based on the two-level stratified type system used by Harper and Mitchell [1993] to explain ML's polymorphic type discipline. In this approach we have the usual collection of monomorphic types or monotypes (closed under the [right arrow] type constructor and any other type constructors), and a second level of polymorphic types or polytypes, based on the closure of the collection of monotypes under the universal type quantifier [inverted]A. Type constructors t denote both base types, such as int and real, as well as type constructors such as list and the product and function type constructors ([Tau] * [Tau]' and [Tau] [right arrow] [Tau]', respectively). In our formal system we denote compound type expressions using a prefix syntax t([[Tau].sub.1], ..., [[Tau].sub.n]); in the concrete syntax of ML [Milner et al. 1990], compound type expressions are denoted using a postfix syntax (for example, int list list). Polymorphic types abstract over both type variables [Alpha] ([inverted]A[Alpha] [is less than]: [Rho].[Sigma] and [inverted]A[Alpha] [is less than]: [Tau].[Sigma]) and kind variables [Kappa] ([inverted]A([[Rho].sub.1] [is less than]: [Kappa] [is less than]: [[Rho].sub.2).[Sigma]).

Kinds are regular tree expressions denoting (possibly infinite) sets of types. The syntax of kinds is

[Rho] ::= [perpendicular to] | ?? | [Kappa] | t([[Rho].sub.1], ..., [[Rho].sub.n]) | [[Rho].sub.1] [union] [[Rho].sub.2] | [Mu][Kappa].[Rho]

where [Rho] denotes kinds, and [Kappa] denotes kind variables. The kind operator [Mu] denotes union, and [Mu] is the fixed-point operator. Kinds intuitively form a lattice of sets of types, with [perpendicular to] and ?? as the bottom and top of the lattice, respectively. Each type constructor t has an associated kind constructor t of the same arity; a kind expression t([[Rho].sub.1], ..., [[Rho].sub.n]) denotes the set of types with outermost type constructor t applied to types [[Tau].sub.1] [element of] [[Rho].sub.1], ..., [[Tau].sub.n] [element of] [[Rho].sub.n]. Then for example the kind int [union] real denotes the set of types {int, real}, while the recursive kind [Mu][Kappa].int [union] list(k) denotes the infinite set of types {int, list (int), list (list (int)), ...}.

We let [Tau] [is less than]: [Rho] denote that [Tau] is contained in the set of types Denoted by [Rho], and we refer to this as the containment relation. So for example int [is less than]: (int [union] real). The subset relation between the interpretation of kinds induces a subkinding relationship [Rho] [is less than]: [Rho]' between kinds. For example we have

list(list(int)) [is less than]: [Mu][Kappa].int [union] list(k).

Polmorphic types of the form [inverted]A[Alpha] [is less than]: [Rho].[Sigma] constrain instantiations of a type variable [Alpha] to the set of types described by [Rho]. Polymorphic types of the form [inverted]A([[Rho].sub.1] [is less than]: [Kappa] [is less than]: [[Rho].sub.2]).[Sigma] constrain the kind variable [Kappa] with both a lower bound [[Rho].sub.1] and an upper bound [[Rho].sub.2]. Some motivation for upper bound constraints on kind variables is provided in Section 2.4, while some motivation for lower bound constraints is provided in Section 4. Kind constraints include recursive constraints, e.g., list ([Kappa]) [is less than]: [Kappa] [is less than]: T; this facility is used in Section 4. A type of the form [inverted]A[Kappa] [is less than]: [Rho].[Sigma] abbreviates [inverted]A([perpendicular to] [is less than]: [Kappa] [is less than]: [Rho]).[Sigma].

Our type system involves constraints of the form [Tau] [is less than]: [Rho] (denoting that [Tau] is included in the kind [Rho]), [[Rho].sub.1] [is less than]: [[Rho].sub.2] (denoting that the set of types described by [[Rho].sub.1] is included in the corresponding set for [[Rho].sub.2]), and [[Sigma].sub.1] [is less than]: [[Sigma].sub.2] (denoting that [[Sigma].sub.1] is a subtype of [[Sigma].sub.2]). We use [Gamma], [Delta] to denote both type variables [Alpha], [Beta] and kind variables [Kappa]. Furthermore we use [Psi] to denote both types [Sigma] and kinds [Rho]. Then [[Psi].sub.1] [is less than]: [[Psi].sub.2] stands generically for any of the above three forms of constraints. We have the judgement forms

Judgement Meaning [Gamma] ?? [[Rho].sub.1] = [[Rho].sub.2] Kind equality [Gamma] ?? [[Rho].sub.1] [is less than]: [[Rho].sub.2] Kind containment [Gamma] ?? [Tau] [is less than]: [Rho] Kind membership [Gamma] ?? [[Sigma].sub.1] [is less than]: [[Sigma].sub.2] Subtyping

where [Gamma] is a context of constraints on type and kind variables:

[Gamma] ::= {} | {[[Rho].sub.1] [is less than]: [Kappa] [is less than]: [[Rho].sub.2]} | {[Alpha] [is less than]: [Rho]} | {[Alpha] [is less than]: [Tau])} | [[Gamma].sub.1] [union] [Gamma].sub.2]

This context includes both type and kind upper bound constraints on Type variables. The context also includes both lower and upper bound constraints on kind variables. These constraints are discharged by polymorphic abstraction over type variables.

Figure 1 gives the equality and ordering rules for kinds. The eighth rule expresses the uniqueness of fixed points for recursive expressions of the form [Mu][Kappa].[Rho]. It carries the antecedent that [Rho] be contractive ink. This is standard and is defined by

(1) [Kappa] is contractive in [Kappa] if [is not equal to] [Kappa]';

(2) [perpendicular to] and ?? are contractive in k;

(3) t([[Rho].sub.1], ..., [[Rho].sub.n]) is contractive in [Kappa];

(4) [[Rho].sub.1] [union] [[Rho].sub.2] is contractive in [Kappa] if [[Rho].sub.2] is contractive in [Kappa] and [[Rho].sub.2] is contractive in [Kappa]; and

(5) [Mu][Kappa]'.[Rho] is contractive in [Kappa] if [Rho] is contractive in [Kappa] and [Kappa]'.

Fig. 1. Equality and ordering rules for kinds.

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

A kind [Rho] is contractive if, for every subexpression of the form [Mu][Kappa].[Rho]', [Rho]' is contractive in k.

Figure 2 provides the inference rules for the subtype relation, and for containment of a type in a kind. In these rules, and throughout the sequel, the constraint [Sigma] = [Sigma]' denotes the two constraints [Sigma] [is less than]: [Sigma]' and [Sigma] [is less than]: [Sigma]. The SUBKTYC rule provides the rule for containment of a nonvariable type in a kind. The SUBHYP rule includes the case when ([Alpha] [is less than]: [Rho]) [element of] [Gamma], while the SUBTRANS rule includes the following kind subsumption rule:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Fig. 2. Subtyping and containment rules for [XML.sup.dyn] and [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Subtyping and containment rules for [XML.sup.dyn] and [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The SUBKTRANS rule uses a matching relation [[Tau].sub.1] [equivalence] [[Tau].sub.2] between types. This is formulated similarly to the subtype relation, but in addition includes symmetry: [[Tau].sub.1] [equivalence] [[Tau].sub.2] if [[Tau].sub.2] [equivalence] [[Tau].sub.1]. This relies on the fact that there is no subtyping relation between simple types; the subtyping relation holds between safe dynamic types, introduced in Section 2.4. For example the constraint set {[Alpha] [is less than]: [Beta], [Alpha] [is less than]: int} entails the constraint [Beta] [is less than]: int; since int is not in the subtype relation with any other type constructor, both [Alpha] and [Beta] must be instantiated to int by any satisfying substitution.

The SUBKTYC rule states that subkinding between kind constructor expressions is covariant in all arguments of the kind constructor. This is true for the arrow kind constructor ??, even though subtyping is contravariant in the domain type of [right arrow]. This demonstrates an important distinction between subkinding and subtyping. In this type system we conservatively require that subtyping for all type constructors except [right arrow] is invariant in all arguments (in the SUBTYC rule).

We require that kinds are contractive. This allows us to assume that kinds have the following form:

[Rho] ::= [Kappa] | ?? | [Mu][Kappa].[t.sub.1]([[bar][Rho].sub.1]) [union] ... [union] [t.sub.n] ([[bar][Rho].sub.n])

The last clause includes the case where n = 0 and [Mu][Kappa].[perpendicular to] = [perpendicular to]. We furthermore require that kinds are discriminative: in a union kind, the outermost type constructors [t.sub.1], ..., [t.sub.n] are required to be distinct. The unrestricted syntax of kinds allows us to manipulate kind expressions, while reasoning about them using the kind rules. This shorter syntax provides for some economy in considering the forms of kind expressions in types, while the contractiveness and discriminativity restrictions ensure that any kind expression can be manipulated to have this restricted form.

2.2 Core language Type Rules

This subsection describes the core language of [XML.sup.dyn]. The abstract syntax for this core language is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

This is essentially the predicative second-order [Lambda]-calculus of Harper and Mitchell's Core-XML [Harper and Mitchell 1993], but with a form of bounded type quantification, and with lambda-abstraction over kinds as well as types(1).

Figure 3 gives the type rules for this core of the minilanguage. A judgement in the type rules has the form [Gamma]; A ?? e:[Sigma] a where e is a program and [Sigma] is its type. [Gamma] is a type environment, mapping program variables x to types [Sigma]. [Gamma] is a context of subtype and kind constraints on type and kind variables. The context constraint [Kappa] [is less than]: [Rho] abbreviates the constraint [perpendicular to] [is less than]: [Kappa] [is less than]: [Rho]. The type [inverted]A[Kappa] [is less than]: [Rho].[Sigma] and expression [Lambda][Kappa] [is less than]: [Rho].e abbreviate [inverted]A([perpendicular to] [is less than]: [Kappa] [is less than]: [Rho]) and [Lambda]([perpendicular to] [is less than] [Kappa] [is less than].e, respectively. The SUB rule introduces type subsumption into our minilanguage. This rule makes use of a subtyping rule between type environments that includes a form of weakening of the hypotheses:

[Gamma] ?? A [is less than]: A' ?? dom(A') [subset or equal to] dom(A) and [Gamma] ?? A(x) [is less than]: A'(x) for all x [element of] dom(A')

Fig. 3. Core language type rules for [XML.sup.dyn] and [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

This subsumption rule is necessary in Section 4.

2.3 Dynamic Type Dispatch

The construct for dynamic type dispatch in [XML.sup.dyn] is provided by the typerec construct:

e ::== ... | typerec f : [Sigma] of [t.sub.1]([[bar][[Alpha].sub.1]) ?? [e.sub.1] | ... | [t.sub.k]([[bar][[Alpha].sub.k]) ?? [e.sub.k]

The novelty of this construct is that it defines a function that recurses over a type rather than over a value. The fixed point of this function is given by the variable f that is introduced by the typerec. Such a function specifies a form of type-safe dynamic type dispatch, wherein a polymorphic function dispatches at run-time on the basis of type arguments. There are now several terminologies for dynamic type dispatch, including structural polymorphism [Gunter et al. 1993], effective polymorphism [Longo et al. 1993; Longo 1993] and intensional polymorphism [Harper and Morrisett 1995]. We use the terminology of Morrisett [1995].

The type rule for this construct is provided by the TYREC type rule in Figure 4. The rule demonstrates that the typerec defines a polymorphic function of type [inverted]A[Alpha] [is less than]: [Rho].[Sigma] The kind constraint on [Alpha] restricts the domain of applicability of this function, to types for which the cases in dynamic type dispatch are defined. Since the typerec in general defines a recursive function, this domain kind constraint must also be recursive.

Fig. 4. Type rule for dynamic type dispatch in [XML.sup.dyn].

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

For example, the following defines a function that can be applied to integers, references (no matter their element type), lists of integers, lists of references, lists of lists of integers, and so on:

typerec f: ([inverted]A[Alpha] [is less than]: ([Mu][Kappa].int [union] ref(T) [union] list([Kappa])).[Alpha] [right arrow] [Alpha]) of int ?? [Lambda]x : int.intPlus (x, x) | ref ([Alpha]) ?? [Lambda]x : ref ([Alpha]). x | list([Alpha]) ?? [Lambda]xs : list ([Alpha]).map (f [[Alpha]]) xs

The first clause defines a function of type int [right arrow] int. The second clause defines a function of type [inverted]A[Alpha] [is less than]: T.ref([Alpha]) [right arrow] ref ([Alpha]), where the type variable [Alpha] is unconstrained because no operations are defined on the element type of the reference cell. The third clause defines a function of type [inverted]A[Alpha] [is less than]: ([Mu][Kappa].int [union] ref(T) [union] list([Kappa])).list ([Alpha]) [right arrow] list ([Alpha]. In this third clause, the fixed point f of the typerec is applied to the list element type, and hence the element type [Alpha] in this clause is constrained by the declared domain kind of f. The domain of the typerec is then int [union] ref(T) [union] list([Mu] [Kappa].int [union] ref(T) [union] list([Kappa])). By the fixed-point unrolling rule for kinds ([Mu][Kappa].[Rho] = {([Mu][Kappa].p)/ [Kappa]}[Rho]), this is equal to [Mu][Kappa].int [union] ref(T) [union] list([Kappa]).

Harper and Morrisett [1995] and Morrisett [1995] present a calculus, [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], that includes a typerec construct for dynamic type dispatch by recursing over type descriptions. The motivation for their framework is in using dynamic type dispatch for type-based compilation based on transformations of data representations, and they only propose using their mechanisms internally in a compiler. They characterize this as a target-level approach, as opposed to the source-level approach presented here where we reify this functionality to the programmer level. Their typerec construct is primitive recursive, whereas the recursion in our typerec is more general. This generality is necessary for programming purposes. For example the function above is nonstrict in the element type of reference types, whereas the typerec of [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is strict in all arguments. The more general typerec of [XML.sup.dyn] is also necessary for mutual recursion, consider for example

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Harper and Morrisett include a type-level Typerec construct for computing type transformations; we omit it because it is unnecessary for our purposes, and because we are concerned with type inference for an implicitly typed version of our language [Duggan 1998a]. Type inference for the Typerec construct would require solving equations over expressions involving primitive recursive functions, which appears difficult or impossible.

The most important difference in the approaches is our provision of "refinement kinds" that refine the structure of ??, the set of all simple monotypes. Harper and Morrisett assume that all uses of dynamic type dispatch are total (defined for all types). Morrisett [1995] describes an approach where a "characteristic function" F can be defined for the domain of an operation that uses dynamic type dispatch, using Typerec. F([Tau]) = void, the empty type, if [Tau] = contains any type constructor outside the domain of the operation, and F([Tau]) = [Tau] otherwise. Beyond the fact that type inference is hard or impossible with the Typerec construct, there is also the problem that this approach does not prevent instantiation of an operator with a type outside its domain kind. For the example above, the function would be instantiated to type void [right arrow] void when applied to the string type, under the approach described by Morrisett. This is not as precise as preventing the erroneous instantiation of the function in the first place. Another disadvantage of the approach of Morrisett is that the typerec must include clauses for all type constructors, regardless of whether or not they are in the intended domain.

Dubois et al. [1995] have considered another approach to dynamic type dispatch, with different guarantees of type correctness relative to this and other work. Essentially they use dynamic type dispatch to provide unsafe operations such as a C-like printf function and variable-arity procedures in ML, as an alternative to Haskell-style parametric overloading. Their type system only distinguishes between "static" and "dynamic" type variables, the latter being variables which need to be instantiated with run-time type arguments. They also provide a static check for type-safety. However this check is not formalized in a type system. Furthermore it requires abstract interpretation of the entire program, and so is inapplicable for separate compilation of reusable software components; type-checking of uses of overloaded operations is done at link-time. Finally they do not propose any new elimination rules for dynamics, as we provide in Section 2.5.

2.4 Safe Dynamic Types

Safe dynamic types are a refinement of dynamic types, that allow the uses of dynamics to be statically type-checked in many instances. The refinement kinds introduced in Section 2.1 are at the heart of safe dynamic types. We introduce a construct dynamic and a corresponding type constructor Dynamic ([Rho]):

[Tau] ::== ... | Dynamic([Rho])

e :: == | [dynamic.sub.[Rho]]([Tau], e) | [narrow.sub.[Rho],[Rho]'](e)

We refer to Dynamic ([Rho]) as a safe dynamic type. Figure 5 gives the type rules for dynamics. The introduction rule for dynamics, DINTRO, requires an application of the subtype logic to lift the monotype [Tau] of an expression to a safe dynamic type Dynamic ([Rho]). For example, we have

(if true then [dynamic.sub.int] (int,3) else [dynamic.sub.bool] (bool,true)) : Dynamic (int [union] bool)

Fig. 5. Type rules for dynamics in [XML.sup.dyn].

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

This typing rule induces a subtyping relation between dynamics, as given by the DWID type rule.

Full dynamics are obtainable in our framework using safe dynamics with an unconstrained witness type: define the type dynamic to be Dynamic (??). Since the type is unconstrained, there are no operations which may be defined over the value component. To work with a value of type dynamic, we must first of all coerce it to a value of safe dynamic type. The narrow construct is introduced exactly to perform this coercion. The type rule for the narrow construct is provided by the DNAR rule in Figure 5. An advantage of this approach is that the narrow construct isolates the dynamic type coercion associated with dynamics to a separate construct; the loss of granularity in type checking associated with coercing dynamics only occurs where narrow is used. The narrow construct specifies both the lower bound to which a safe dynamic is coerced, and the upper bound which it must statically satisfy. This latter bound is added for fine-grained type checking, otherwise any safe dynamic type could be lifted by type subsumption to Dynamic(??) and then dynamically coerced to any lower bound.

An immediately obvious advantage of our scheme is that our approach gives tighter guarantees of type-correctness. In fact the typing rules for dynamic dispatching presented in Section 3 guarantee, based on the kind of the type being discriminated, that pattern-matching with this construct will always succeed. To see the usefulness of this, consider for example if a client wishes to open a TCP/IP connection to a server process and communicate a stream of dynamically-typed values to that server (such a course would be preferable, for efficiency purposes, to opening separate TCP/IP connections for each possible stream type). Under our approach, the well-typedness of this connection may be checked once at the point where that connection is established. The server exports a connection type Dynamic ([[Rho].sub.s]), and registers this with a nameserver as part of its interface. The client process, in requesting a connection, specifies a safe dynamic type Dynamic ([[Rho].sub.c]), which it will accept as a constraint on the values it may send to the server. The connection is well-typed provided [[Rho].sub.c] [is less than]: [[Rho].sub.s] under the subkinding theory given in Section 2.1.

An advantage of this scheme is that clients and servers may be modified and recompiled independently without affecting each others' implementations. For example if the server gains or loses functionality, it is recompiled and reregisters its interface with the nameserver. An exception is raised for any client that tries to communicate using an old connection, requiring it to reconnect to the server. Such a reconnection is established provided the new server interface still provides the functionality required by the client. In this way safe dynamics may be seen as a useful compromise between name equivalence for type checking local communications and structural equivalence for type checking distributed communications (a central issue in the design of the type system for the Modula-3 distributed programming language [Nelson 1991]).

The approach of dynamics was first suggested in the literature by Abadi et al. [1991]. Leroy and Mauny [1993] actually implemented a related facility for a dialect of ML (CAML). They state that "the main novelty of [their] work is the combination of dynamics with a polymorphic type discipline." As noted, their approach amounts to allowing type variables in the patterns of a typecase. Abadi et al. [1992] propose a more powerful, but also more complicated, scheme for incorporating the type dynamic into polymorphic languages. All of the motivating examples provided in their paper involve composing functions, particularly polymorphic functions, that have been bundled in dynamics. However as far as we are aware no practical applications have been found for the complicated mechanisms developed by Abadi et al. [1992].

Buneman and Ohori [1996] have considered another extension of dynamics, which they refer to as partial dynamics. Superficially partial dynamics appear very similar to safe dynamics; however they are in fact quite different in motivation and mechanism. A partial dynamic contains a value of some record type, and exports a record kind that characterizes the possible record types for the bundled value. Partial dynamics then contain operations for joining two dynamics (and thereby unioning their kinds), and coercing a partial dynamic to a record type (extracting the value in the dynamic). The motivation for partial dynamics is in heterogeneous collections in languages without subtyping; for example, partial dynamics might be used to give an alternative version of the example in Figure 10 in Section 5. As such partial dynamics are more akin to objects with a narrowing operation.

Fig. 10. Dynamic typing in object oriented languages

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

It is important to realize that safe dynamics are not simply the combination of partial dynamics with recursive types. A key in our approach is the use of recursion at the kind level. As explained more fully in Section 5, this makes type checking undecidable if we try to add multiparameter pattern-matching [Duggan and Ophel 1997b]; this would be relatively easy to add if we only had recursion at the type level. On the other hand, recursion at the type level does not overcome a single one of the problems with recursing over dynamic data structures, which are our concern.

In a sequel to this work [Duggan 1998b], we consider, a refinement of this framework where the domain of the dynamic operation is constrained by a domain kind (for example, dynamic might not be defined for "native" procedures that are compiled to native machine code). In that work we define dynamic as an operation with type

dynamic : [inverted]A[Kappa] [is less than]: [Rho].[inverted]A[Alpha] [is less than]: [Kappa].[Alpha] [right arrow] Dynamic([Kappa])

where [Rho] is the domain kind of dynamic. For example, suppose dynamic is only defined for integers and reals. Then the following type-checks:

(dynamic [int] [int] 3) : Dynamic(int [union] string)

This provides some motivation for upper bound constraints on kind variables in types of the form [inverted]A([[Rho].sub.l] [is less than]:[Kappa] [is less than]: [[Rho].sub.2]).[Sigma] in [XML.sup.dyn].

2.5 Computing with Dynamics

The elimination construct for (safe) dynamics is obtained with a construct for unpacking a dynamic:

e ::= ... | [unpack.sub.[Rho]]([e.sub.1], [e.sub.2])

The type rule for this construct is provided by the DELIM rule in Figure 5. In the expression unpack([e.sub.1],[e.sub.2]), [e.sub.2] is a safe dynamic of type Dynamic([Rho]), while [e.sub.1] is a polymorphic function with type [inverted]A[Alpha] [is less than]: [Rho]'.[Sigma]. In the semantics of the unpack, the dynamic value [e.sub.2] is unbundled and the polymorphic function [e.sub.1] applied to both the type and value components of the dynamic. Provided [Rho] [is less than]: [Rho]' (which can be checked statically), this application of dynamic type dispatch is guaranteed not to encounter run-time type failure.

The recursive typecase construct is defined in terms of unpack and typerec:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII],

The following is one of the standard examples that is used to illustrate the use of dynamics, as proposed by Abadi et al. [1991]:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

As already noted in Section 1, the interesting issue faced by Leroy and Mauny [1993] and Abadi et al. [1992] in extending the dynamic construct of Abadi et al. [1991] to polymorphic languages is its interaction with type variables and polymorphic values. The problem is that there is insufficient information available at the server side to resolve the output operations for the elements of the list. The solution proposed by the aforesaid authors is to individually rebundle these components (and their type tags) as dynamics and recursively call the print operation for dynamics on the results. This approach has all the problems cited in Section 1.

Using our recursive typecase construct, we can write the server code as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

This typecase defines a recursive function of type [inverted]A[Alpha] [is less than]: [[Rho].sup.Printable].[Alpha] [right arrow] unit. The fixed point of this recursive function is given by the variable f introduced by the typecase. For the case where the component type is a product type [Alpha] * [Beta], f is applicable to the component types (and hence the components of the pair (x, y)) because [Alpha] and [Beta] are suitably constrained. For the case where the component type is a function type [Alpha] [right arrow], [Beta], f is not applicable to the component types because they are not suitably constrained ([Alpha] [is less than]: ?? [is not less than]: [[Rho].sup.Printable] and similarly for [Beta]).

We see several advantages to this approach. Both the Modula-3 approach to dynamic typing [Nelson 1991] and the approach of dynamics [Abadi et al. 1991; Leroy and Mauny 1993] are based on associating type tags with individual data items. In a language with parametric polymorphism, there are several problems with this approach. It is not possible to determine the type of a data structure without examining the individual elements of the data structure (to examine the type tags on those elements); this is essentially how polymorphic equality is implemented in Standard ML compilers. With the approach of dynamics, which explicitly wraps a data value with its type tag, an operation that relies on these type tags must continually rewrap values with their type tags as it recurses over the data structure.

In our approach, based on the application of dynamic type dispatch to the type component of a dynamic, we separate the type of a dynamic data structure from the actual components of that data structure. An operation that needs to examine the individual components of a dynamic data structure does so at the point where the function defined by a recursive typecase is dispatched; the dispatched function then operates uniformly over the elements of the data structure. For example, with the approach of tags on data items, polymorphic equality for lists continually rechecks the tags on the elements of the lists as it compares those elements for equality, continually redispatching to the implementation of equality for the element type. With dynamic type dispatch, the type description for the entire list is separated from the list itself, so that the operation that compares two list elements for equality can be determined once and for all before polymorphic equality recurses over the lists.

Our approach has several consequences. First, the combination of safe dynamics with our elimination rule for dynamics (based on the recursive typecase) ensures that there are no run-time failures due to dynamic type dispatch. The narrowing operation allows the possibility of run-time type failure, but isolates this failure point to the point where the dynamic is unbundled. Safe dynamics allow this single failure point to be detected statically, in applications where it is possible to place some upper bound on the types that may be bundled in a dynamic.

Second, since type information is separated from data values, a compiler can specialize the code for an operation according to the types, and verify that the type matches the specialized code's expectations at the point where the operation is applied. As a simple example, a compiler might generate a quick, specialized loop for printing integer lists and have the slower, more general loop for other sorts of lists. On entry to the loop, the program can verify that the components are integers and branch to the quick loop, or fall back to the more general loop if the components are not integers. This would not be practical if it was necessary to verify that each component of the list was an integer before proceeding with the specialized code. This is essentially the motivation for the approach of Harper and Morrisett [1995] to compiler optimization; our approach to computing with dynamics provides a similar facility for dynamic values.

Third, transformations of dynamic data structures with tags do not work in the presence of state. For example if a transformation is applied to a mutable reference cell or array bundled as a dynamic, then the transformation must be recursively applied to the contents of the mutable data structure. Since these elements must be rebundled as dynamics to facilitate this recursion, this leads to a coherence problem in the transformation of the bundled data structure: updates to the bundled copy must be reflected in updates to the original, and vice versa [Morrisett 1995]. This is related to the coherence problems associated with copying mutable values between address spaces, with the copy-in/copy-out semantics of many remote procedure call systems. For this reason it is not possible to transform dynamically typed values that contain mutable references or arrays using the approach of standard dynamics, unless we assume that all values are tagged; recall from Section 1 that one of the motivations for dynamics is to avoid having to tag all values. Our approach overcomes this problem.

Finally, we can consider the dynamic operation as essentially a marshalling operation for distributed programming, with the typecase construct providing the corresponding unmarshalling operation. This interpretation of dynamics is made explicit in a sequel paper [Duggan 1998b]. In the latter semantics, bundling (applying dynamic) and unbundling (using some form of typecase) involve the application of marshalling operations to the value being bundled or unbundled. In this scenario, we obtain considerable benefit from extracting the type tag once, when we do the unbundling, and thereafter pattern-matching against this tag. We do not need to continually construct and deconstruct dynamics in the process of recursively walking over the bundled data structure.

3. DYNAMIC SEMANTICS

In this section we provide a dynamic semantics for [XML.sup.dyn]. We first give a syntactic transformation from [XML.sup.dyn] into [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], where the constructs for dynamic type dispatch are decomposed into more primitive constructs. In Section 3.1 we provide the static semantics for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. In Section 3.2 we give a reduction semantics for this language, and verify the semantic soundness of the type system.

3.1 Static Semantics for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

We give a semantics for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] [XML.sup.dyn], a variation of [XML.sup.dyn] in which the constructs for dynamic type dispatch are decomposed into more primitive constructs. This is not essential for giving a semantics for [XML.sup.dyn], but does reveal a more primitive calculus underlying [XML.sup.dyn].

The core language of [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is exactly the same as the core language of [XML.sup.dyn].

We repeat it here for convenience:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

For dynamic type dispatch we add two constructs for dynamic type dispatch, a typecase construct and a fixed-point operation for polymorphic functions:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The type rules for these constructs are provided in Figure 6. The typecase construct builds a polymorphic function with non-T domain kind that uses run-time type discrimination with respect to its single type argument, [tyrec.sub.([inverted]A [is less than]:[Rho].[Sigma])](e) denotes the fixed point of a recursively defined polymorphic function.

[Figure 6 ILLUSTRATION OMITTED]

The definition of dynamics requires the addition of existential types [Mitchell and Plotkin 1988] with the witness type constrained by a kind, and the addition of the narrow construct:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The type rules for existentials and narrowing are given in Figure 7. The pack operation bundles a type and a value as a first-class ADT implementation (a data algebra). The open construct allows the contents of a data algebra to be used locally (bound to local variables [Alpha] and x in e' in such a way that encapsulation is not violated).

Fig. 7. Type rules for existentials and narrowing in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The translation of the types of [XML.sup.dyn] into [MATHEMATICAL NOT REPRODUCIBLE IN ASCII] is the Obvious homomorphic extension (to monotypes ([Tau] and polytypes [Sigma]) of

[[Dynamic([Rho])]] = [exists][Alpha] [is less than]: [Rho].[Alpha].

The translation of kinds is simply the identity: [[[Rho]]] = [Rho]. The Translation of type environments A and constraint environments [Gamma] is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The translation of the typerec construct is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

where [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. The translation of dynamics reveals them to be data algebras of a particularly simple form:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

THEOREM (TYRE PRESERVATION). If [Gamma]; A ?? e:[Sigma] in [XML.sup.dyn], then [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

3.2 Dynamic Semantics for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

In this subsection we consider a dynamic semantics for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], assuming call-by-value evaluation. We provide the semantics in the style of reduction semantics [Wright and Felleisen 1991]. Configurations of the semantics are expressions of [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]; function evaluation is modeled by substitution rather than introducing environments. To identify completed evaluation, we identify a subset of the terms of [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] as canonical tetras:

Definition 3.1. A term of [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is in canonical form if it is of the form:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Then define the notion of an evaluation context for the semantics by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Definition 3.2. The run-time type checking algorithm, for closed types and kinds, is given by

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Given [Tau] [is less than]: [Rho], where r and p are closed, let II([Tau] [is less than]: [Rho]) denote the partial type check, the partial derivation tree which has [Tau] [is less than]: [Rho] as its root, and which results from repeatedly applying the above rules until no such rule is applicable. This type check is complete if all leaf vertices in this tree are labeled by axioms.

Definition 3.3. The reduction semantics for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is defined by the computation rules in Figure 8, augmented with the contextual rule

E[e] [right arrow] E[e'] if e [right arrow] e'.

Fig. 8. Computation rules for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

e ?? denotes that the evaluation of e loops infinitely, i.e., there is an infinite sequence [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. Let ?? denote the reflexive transitive closure of the transition relation [right arrow]. Semantic soundness for dynamic semantics usually involves demonstrating that a well-typed configuration never "gets stuck." Our notion of semantic soundness allows the possibility of a stuck well-typed configuration, but only resulting from the failure of the run-time type check for the narrow construct.

Definition 3.4. A term e is failed if e [equivalent] E[[e.sub.0]] where [e.sub.0] is of the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

where the partial type-check [Pi]([Tau] [is less than]: [[Rho].sub.2]) is incomplete. A term e is faulty if it is not a value and not failed, and e [equivalent] E[[e.sub.0]] where no computation rule is applicable to [e.sub.0].

LEMMA 3.1. Given {} ?? [[Sigma].sub.1] [is less than]: [[Sigma].sub.2]. Then one of the following holds:

(1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(2) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(3) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(4) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(5) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

LEMMA 3.2. A closed faulty term is not typable.

Lemma 3.1 is verified by permuting structural rules with the transitivity rules. Lemma 3.2 is verified by reasoning about the possible cases for a faulty term, and using Lemma 3.1.

LEMMA 3.3. A canonical derivation is one without consecutive applications of the SUB rule. If [Gamma]; A ?? e: [sigma] is derivable, then there is a canonical derivation for this judgement.

The notion of an arbitrary context C[] can be defined for terms similarly to the notion of evaluation contexts E[] in a completely standard way.

LEMMA (REPLACEMENT LEMMA). If

(1) [Pi] is a derivation for [Gamma]; A ?? C[[e.sub.1]]: [Sigma];

(2) [[Pi].sub.1] is a subderivation of [Pi] concluding with [Gamma]'; A' ?? [e.sub.1]: [Sigma]';

(3) [[Pi].sub.1] occurs in [Pi] in the position corresponding to the hole ([]) in C; and

(4) [Gamma]'; A' ?? [e.sub.2]: [Sigma]',

then [Gamma]; A ?? C[[e.sub.2]]: [Sigma].

LEMMA (SUBSTITUTION LEMMA).

(1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

(2) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

(3) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

PROOF. By induction on derivations. For example given [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], and the conclusion follows by an application of SUB, and [[Gamma].sub.1] ?? [Tau] [is less than]: [Psi]. By induction we have derivations for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], and the results follows by an application of SUB. [square]

LEMMA (SUBJECT REDUCTION). If [Gamma]; A ?? e: [Sigma] and e [right arrow] e', then [Gamma]; A ?? e': [Sigma].

PROOF. Assume e [equivalent] E[[e.sub.0]] and e' [equivalent] E[[e'.sub.0]], and e [right arrow] e' because [e.sub.0] [right arrow] [e'.sub.0]. We verify that if [Gamma]; A ?? [e.sub.0]: [Sigma] and [e.sub.0] [right arrow] [e'.sub.0], then [Gamma]; A ?? [e'.sub.0]: [Sigma], and the result follows by the replacement lemma. We consider some representative cases:

--[e.sub.0] = ([[Lambda]x: [[Sigma]".sub.2].[e.sub.1]) [e.sub.2]: We have a type derivation ending with:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

We have a derivation for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], so we must have derivations for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] and [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. An application of SUB gives [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. Since [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], we apply the substitution lemma to obtain [Gamma]; [A.sub.1] ?? {[e.sub.2]/x} [e.sub.1]: [[Sigma].sub.1]. Then another application of SUB gives [Gamma]; A ?? {[e.sub.2]/x} [e.sub.1]: [[Sigma].sub.1].

--[e.sub.0] = v[[Tau]], so we have a derivation ending with

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where v is either a A-abstraction or is a typecase. Suppose the former case, so v = [Lambda][Alpha] [is less than]: [[Psi]'.sub.1].e' and [[Sigma]'.sub.1] = [inverted]A[Alpha] [is less than]: [[Psi]'.sub.1].[[Sigma]".sub.1] (for some [[Sigma]".sub.1] and [Gamma] ?? ([inverted]A[Alpha] [is less than]: [[Psi]'.sub.1]. [[Sigma]".sub.1] [is less than]: ([inverted]A[Alpha] [is less than]: [[Psi]'.sub.1].[[Sigma].sub.1]. So there must exist derivations for [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. The application of SUB must be preceded by a derivation ending with

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Applying SUB to the premise, we obtain [Gamma], [Alpha] [is less than]: [[Psi]'.sub.1]; A ?? e': [[Sigma].sub.1]. Applying the substitution lemma we obtain [Gamma]; A ?? {[Tau]/[Alpha]} e': {[Tau]/[Alpha]} [[Sigma].sub.1]. Suppose v was typed with an application of the TYCASE rule, so [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], and we have the inference step

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [Rho] = [t.sub.1] ([[bar][Rho].sub.1] [union] ... [union] [t.sub.n]([[bar][Rho].sub.n]). So in the application of the PAPP rule above, we have [[Sigma]'.sub.1] = ([inverted]A[Alpha] [is less than]: [Rho]. [Sigma]' and [[Psi].sub.1] = [Rho]' for some [Rho]', and furthermore [Gamma] ?? ([inverted]A[Alpha] [is less than]: [Rho].[Sigma]') [is less than]: ([inverted]A[Alpha] [is less than]: [Rho]'.[[Sigma].sub.1]) and [Gamma] ?? [Tau] [is less than]: [Rho]'. Since [Gamma] ?? [Tau] [is less than]: [Rho]' (the second premise to the PAPP rule) and [Gamma] ?? [Rho]' [is less than]: [[bar][[Rho], then [Tau] must have the form t([bar][Tau]') where t = [t.sub.i] for some i, and we must have [Gamma] ?? [Tau]' [is less than]: [[bar][Rho].sub.i]. The ith clause is typed with a premise of the form [Gamma], [[bar][Alpha].sub.i] [is less than]: [[bar][Rho].sub.i]; A' ?? [e.sub.i]: {t([[bar][Alpha].sub.i])/[Alpha]} [Sigma]', where {[[bar][Alpha].sub.i] [intersection] FV([Gamma])= {}. Applying the substitution lemma we have [Gamma]; A' ?? {[bar][Tau]'/ [bar][Alpha]} [e.sub.i]: {t([bar][Tau]')/[Alpha]} [Sigma]'. Finally since [Gamma], [Alpha] [is less than]: [Rho]' ?? [Sigma]' [is less than]: [[Sigma].sub.1] (from the fact that [Gamma] ?? ([inverted]A[Alpha] [is less than]: [Rho].[Sigma]' [is less than]: ([inverted]A[Alpha] [is less than]: [Rho]'.[[Sigma].sub.1])), we apply the SUB rule and the Substitution Lemma to obtain [Gamma]; A ?? {[bar] [Tau]'/[bar] [[Alpha].sub.i]} [e.sub.i]: {t([bar] [Tau]')/[Alpha]} [[Sigma.sub.1].

--[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]: so we have a derivation ending with

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

From the run-time type check ?? [Tau]' [is less than]: [[Rho].sub.2] that is complete, we can build a derivation for [Gamma] ?? [Tau]' [is less than]: [[Rho].sub.2]. So we obtain the typing

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

which provides the subject reduction property for the result of evaluating the narrow.

[square]

The last part of the proof demonstrates that the narrow operation essentially tries to construct a type derivation at run-time to satisfy the narrow check. If this succeeds, then the resulting type derivation is itself justification for the well- typedness of the result. Combining Lemma 3.6 with Lemma 3.2 gives us:

THEOREM (SEMANTIC SOUNDNESS). Assume {}; {} ?? e: [Sigma]. Then e ??, or e ?? e' where e' is failed or e' is some value v.

4. DYNAMICS AND RECURSIVE OVERLOADING

In this section we consider an alternative to the recursive typecase for computing with dynamics. This approach is based on the addition of a form of recursive overloading to [XML.sup.dyn]; we name the resulting language [OXML.sup.dyn]. The semantics of recursive overloading in [OXML.sup.dyn], translated into [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], is exactly the same as the semantics for the typerec in [XML.sup.dyn]. This approach reveals further insights into our approach to computing with dynamics.

4.1 Overloading in [OXML.sup.dyn]

The approach of [OXML.sup.dyn] involves the addition of recursively overloaded functions to [XML.sup.dyn], and an elimination rule for dynamics based on applying such a function to the witness type in a dynamic. An overloaded function can be considered as form of dynamic type dispatch, that dispatches to a particular instance based on the types of the arguments. For example, a + function might dispatch to one of two different implementations, depending on whether its type argument is integer or real.

Overloaded functions require a template, parameterized by a single type parameter. The single-parameter restriction is crucial. Section 5 considers what is involved in allowing several parameters in a template. Each instance then has a type which is an instantiation of this template with a type expression. We introduce a separate namespace for overloaded function names, denoted by a, and constructs for defining templates and instances:

e ::= ... | template a:([inverted]A[Alpha].[Tau]) in e | instance a: [Kappa].[Sigma] = [e.sub.1] in [e.sub.2] | a | typecase (a,e)

An instance is required to have a type which is some instance of the template, for example

template print: [inverted]A[Alpha].[Alpha] [right arrow] unit

instance print: [Kappa].(string [right arrow] unit) = output (* output : string [right arrow] unit *)

instance print: [Kappa].(int [right arrow] unit) = [Lambda]x:int. output (intToString x)

After these declarations, print is applicable to strings and integers.

The type environment A now also contains overload types of the form (a: ([inverted]A[Alpha] [is less than]: [Rho] [Tau])), where [Rho] is a recursive kind describing the set of types over which an overloaded operation is defined. We refer to [Tau] as the signature, and we refer to a as the index variable for the overload type. Figure 9 gives the type rules for template and instance declarations. The OTY rule for the template construct specifies the signature for an overloaded operation. A template declaration that introduces a function with template [inverted]A[Alpha].[Tau] corresponds to an initial type assertion [inverted]A[Alpha] [is less than]: [perpendicular to].[Tau] where [perpendicular to] denotes the empty kind. The ODEF rule type-checks the definition of instances, tc denotes the outermost type constructors of a kind: [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]; the overloaded operation must not already have an instance defined for the type constructor of the current instance type. If the variable a has template [inverted]A[Alpha].[Tau], then the instance must have type [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] for some type constructor t. Upper bound constraints [[Rho].sub.1], ..., [[Rho].sub.n] on the type variables [[Alpha].sub.1], ..., [[Alpha].sub.n] constrain these [[Alpha].sub.i] variables in order to allow recursive applications of a (and applications of other variables) to these type variables. The kinds [[Rho].sub.i] may contain occurrences of the kind variable [Kappa], denoting the final fixed point of the domain kind of the overloaded operation. This fixed point is extended by definitions of new instances.

Fig. 9. Type rules for [OXML.sup.dyn].

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The overload type for a reflects, in its domain kind, all of the instances which are defined for it. For example, if print has type [inverted]A[Alpha] [is less than]: (string [union] int).[Alpha] [right arrow] unit, then the possible instances for print have types string [right arrow] unit and int [right arrow] unit, respectively. Suppose we then define

instance print : [Kappa]. ([inverted]A[Alpha] [is less than]: [Kappa]. [inverted]A[Beta][is less than]:[Kappa].[Alpha]* [Beta] [right arrow] unit) = [Lambda]((x,y):[Alpha] *[Beta]). (... print [[Alpha]] x; ... print [[Beta]] y; ...)

The recursive applications of print to [Alpha] and [Beta] are allowed because [Alpha] and [Beta] are constrained by the kind variable [Kappa], and the recursive reference to print has type [inverted]A[Alpha] [is less than]: [Kappa].[Alpha] [right arrow] unit. After this declaration, the type for print is [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Suppose we define another instance of print, for functions:

instance print : [Kappa]. ([inverted]A[Alpha] [is less than]: T.[inverted]A[Beta] [is less than]: T.([Alpha] [right arrow] [Beta]) [right arrow] unit) = [Lambda]f:[Alpha] [right arrow] [Beta]. output "<fun>"

Then print will be given the type [inverted]A[Alpha] [is less than]: ([Mu][Kappa].string [union] int [union] ([Kappa]*[Kappa]) [union] (T ?? T)).[Alpha] [right arrow] unit. The last component of the domain kind, T ?? T, denotes the set of all ground arrow types.

An instance has type [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], where the kinds [[Rho].sub.i] may contain occurrences of the kind variable [Kappa]. [Kappa] is constrained in the kind environment [Gamma] by the binding ([[Rho].sup.new] [is less than]: [Kappa] [is less than]: T), where [[Rho].sup.new] denotes a domain kind that includes the domain of the instance itself, and the domains of all of the other instances that have been defined for this operation. The lower bound constraint on n allows the recursive application of the overloaded operation a to types for which instances have already been defined, for example

template a: ([inverted]A[Alpha].[Alpha] [right arrow] [Alpha]) in

instance a: [Kappa].(int [right arrow] int) = ... in

instance a: [Kappa].([inverted]A[Alpha][is less than]:[Kappa].list([Alpha]) [right arrow] list ([Alpha])) = ... a [list([Alpha])] ... in ...

Within this last instance definition, recursive uses of a have the type [inverted]A[Alpha] [is less than]: [Kappa].[Alpha] [right arrow] [Alpha], where the kind environment in this definition has the local binding nit [union] list ([Kappa]) [is less than]: [Kappa] [is less than]: T. The recursive application of a within the instance is well-typed because list([Alpha]) [is less than]: list([Kappa]) (using the upper bound constraint on [Alpha]), and list([Kappa]) [is less than]: (nit [union] list([Kappa])) [is less than]: [Kappa], where the latter inclusion comes from the lower bound constraint on [Kappa] introduced by the ODEF rule.

We compute with a dynamic by opening it and then applying an overloaded function to the type and value component of the safe dynamic. This is provided by the typecast (a, e) construct that applies the function a to the type and value wrapped in the dynamic e. The type rule for this is provided by the OAPP rule in Fig. 9. The function dispatches to one of its instances based on the type tag and applies this instance to the value component of the dynamic. For example, using the definitions in Sect. 2.5, and with print now defined as a recursively overloaded function, we have the following definition of the print server loop:

printServer = [Lambda]chan: channel. ( typecase (print, intern(chart)); printServer chan)

4.2 Translation of Overloaded Functions into [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Unlike [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] does not contain overloaded functions. We assume a collection of distinct program variables [f.sub.a,t] in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], one for type constructor t and each overloaded variable a of [OXML.sup.dyn]. We also assume a collection of distinct program variables [f.sub.a] in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], to denote the fixed points in the recursive declarations of instances for overloaded variables. The definition of an instance for a for the type constructor t is translated to the let-definition of a variable [f.sub.a,t] in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. The use of an overloaded variable a is translated to an expression of the form tyrec([Lambda][f.sub.a]....).

The translation of environments provided in Sect. 3.1 is extended with

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The metafunction INSTS gives the translation of an overloaded variable type declaration in [OXML.sup.dyn], to a set of variable type declarations in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

In the first case, there is one type declaration in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] for each instance defined for an operation a. Each instance definition is translated as a functional, abstracting over the recursive definition of the overloaded variable. The fixed point of the definition of an overloaded function is left open in the environment to allow it to be incrementally extended by further instance definitions.

For the second case, the assertion (a : ([inverted]A[Alpha] [is less than]: [Kappa].[Tau])) is used inside a derivation for an instance type, where [Kappa] is treated as an explicit kind variable. We assume a collection of program variables [f.sub.a] in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] to denote the fixed points of the recursive definitions of the overloaded functions a. The conclusion of this derivation is a premise to the ODEF rule; therefore, we never need to consider translating an instantiation of this derivation (where [Kappa] has been instantiated).

The translation of an overloaded function definition at its use site is given by:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

if [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

CLOSE (A, a) = [f.sub.a] if A (a) = [inverted]A[Alpha] [is less than]: [Kappa].[Tau]

The first case of this definition, when a has type [inverted]A[Alpha] [is less than]: ([Mu].[Rho]).[Tau], is the operation of closing up the definition of an overloaded operation at its use sites. An instance of a for the type constructor t is represented in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] by the program variable

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [Rho] = [t.sub.l]([[bar][Rho].sub.1]) [union] ... [union] [t.sub.n]([[bar] [Rho].sub.n]). Instantiating this function with the final fixed point of the domain kind gives

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The final fixed point f has type ([inverted]A[Alpha] [is less than]: ([Mu][Kappa].[Rho]). [[Tau]]). Applying the instance to this fixed point gives

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Therefore we have:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Abstracting over the fixed point and applying the fixed-point operator tyrec gives the final use-site definition:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

If an overloaded operation a has type [inverted]A[Alpha] [is less than]: [Kappa].[Tau] in the environment in [OXML.sup.dyn], then this denotes an instance which is being recursively defined. Within the overload definition, the fixed point of this recursive definition is represented by a parameter (the program variable [f.sub.a) of type [inverted]A[Alpha] [is less than]: [Kappa].[[Tau]] (since the fixed point is left open to further extensions). This parameter is used internally by simply applying it as a polymorphic function. Accordingly we translate such a function type as a simple polymorphic function type [inverted]A[Alpha] [is less than]: [Kappa]. [[Tau]]. Within an instance definition, the kind variable n is constrained by [Rho] [is less than]: [Kappa] [is less than]: T, where [Rho] denotes the set of types for which the overloaded operation is defined.

In contrast with the syntax-directed translation of [XML.sup.dyn] into [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] (in Sect. 3.1), the translation of [OXML.sup.dyn] into [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is defined by induction on type derivations in [OXML.sup.dyn]. This is necessary because the information in the type derivation is required in order to translate uses of overloaded variables. Given [OXML.sup.dyn] type judgments [Gamma]; A ?? e: [Tau], we write the translation as inference rules for sequents of the form [[Gamma]]; [A] ?? e'[[Tau]]. [[Gamma]], [[A]], and [[Tau]] are inputs to the translation, while e' is the output.

Template definitions are used only to provide information for type-checking in [OXML.sup.dyn]; therefore, such definitions are translated away:

(OTY) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Note that INSTS (a :([inverted]A[Kappa] [is less than]: [perpendicular to].[Tau])) = {}.

An instance definition is translated to a let-definition binding the instance definition to a program variable:

(ODEF) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

We compute with dynamics by closing up the definition of an overloaded operation and applying the resulting polymorphic function to the witness type of the dynamic and the value packaged with that type in the dynamic. Therefore the translation of typecase (a, e) is provided by:

(OAPP) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

THEOREM (TYPE PRESERVATION). If [Gamma]; A ?? e :[Sigma] in [OXML.sup.dyn], let e' be the result of applying the above translation into [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. Then [[Gamma]]; [[A]] ?? e' : [[Sigma]] in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

There is a subtlety in the well-typedness of the translation. Assume a is defined already for the type constructors int and list, and say for example that A contains the binding (a: ([inverted]A[Alpha] [is less than]: ([Mu][Kappa].int [union] list([Kappa])).[Alpha] [right arrow] [Alpha])). Then [[A]] contains the instance types

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

If an instance of a is added for arrays, then the translation of

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

contains three instance types:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

The SUB type subsumption rule in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], in particular the (contravariant) subsumption of the type environment A, is necessary in order to widen the two original instance definition types in the type environment. This widening reflects that recursive applications of a to arrays are now allowed within these instances, following the addition of an instance for arrays.

We conjecture that the translation is coherent (in the sense that different type derivations for the same program have "equivalent" translations), although we do not provide a formal justification for this. The "degree of freedom" in the translation is the translation of uses of overloaded operations, which depends on the domains of these operations in the type environment in [OXML.sup.dyn]. These domains may be narrowed using the SUB rule in Fig. 3 in Sect. 2. For any program in [OXML.sup.dyn] there is a "canonical" derivation where the SUB rule is only used where necessary to narrow the domains of overloaded operations, where a new instance is defined that overrides an already defined instance for an operation. In the translation of this canonical derivation, the domains of overloaded operations at their use sites contain the domains at the corresponding use sites in any other derivations. However in all derivations the use-site translations compute the same result for types that are in their domain.

The system of recursive overloading presented here bears some relationship to Haskell type classes [Hall et al. 1996]. Our type system uses a "closed-world" assumption. Haskell type classes use an "open-world" assumption. For example if in Haskell we define

class Plus [Alpha] where + :: [Alpha] [right arrow] [Alpha] [right arrow] [Alpha]

double x = x + x

then double is defined over instances for + that are defined after the definition of double. While the semantics of type classes is usually expressed in terms of call-site closure construction, we can provide a semantics for type classes in terms of dynamic type dispatch using our framework. Assume for example we have the class, instance, and type declarations

class Plus [Alpha] where + :: [Alpha] [right arrow] [Alpha] [right arrow] [Alpha] instance Plus (int) instance Plus([Alpha]) ?? Plus(list ([Alpha])) double :: ??[Alpha].Plus([Alpha]) ?? [Alpha] [right arrow] [Alpha]

Then these declarations are represented in a type judgment [Gamma];A ?? e: [Tau] by

--in the kind environment [Gamma], a global kind variable binding int [union] list([Kappa]) [is less than]: [Kappa] [is less than]: T and

--in the type environment A, the bindings (+ : ([inverted]A[Alpha] [is less than]: [Kappa].[Alpha] [right arrow] [Alpha] [right arrow] [Alpha])) and (double: ([inverted]A[Alpha] [is less than]: [Kappa].[Alpha] [right arrow] ([Alpha])).

The final fixed point of an overloaded variable is not formed until "program startup." The double function, which uses +, abstracts over the final fixed point of this operation. This allows the domain of double to be extended by the addition of new instances for +. In contrast with [OXML.sup.dyn], all class and instance declarations in Haskell must be provided at the top-level because of scoping problems associated with type classes [Duggan and Ophel 1997a].

Thatte [1994] and Duggan and Ophel [1997a] propose semantics for Haskellstyle parametric overloading based on overloaded functions that implicitly use a typecase construct to dispatch based on run-time type parameters. Thatte's semantics requires a global program analysis to build the semantic definitions of overloaded functions, so it is not applicable to the separate compilation of software components. Thatte's semantics is also based on a greatest fixed point construction, whereas our semantics is based on a least fixed point semantics where the fixed point is left open to incremental extension. The type system presented by Duggan and Ophel [1997a] incorporates both the "open-world" assumption associated with Haskell type classes and the "closed-world" assumption implicit in the type system of [OXML.sup.dyn].

5. RELATED WORK

Although static typing is widely regarded as useful and necessary for large-scale programming, the need for some degree of dynamic typing has also been recognized. One motivation has been to allow any program to be expressed in a statically typed language which is expressible in a dynamically typed language (such as Lisp or Scheme); if a program in the latter language cannot be statically type-checked in certain places, then insert run-time checks into the code. This approach is exemplified by approaches such as "partial types" [Thatte 1989; 1990] and "soft typing" [Cartwright and Fagan 1991; Aiken et al. 1994; Wright and Cartwright 1994]. Thatte [1990] reports however

Cardelli [personal communication] points out that automatic generation of run-time checks could be dangerous since users might not be aware of the points where they are being used. A slight change to a statically welltyped program may unwittingly cause a whole set of automatic dynamic checks to be inserted. This could reduce the robustness of the program below the programmers intentions.

In the case of soft typing at least, the approach appears mainly applicable to compiler optimization, using the result of compile-time type checking to remove as much of the run-time type checking as possible from a dynamically typed language. Freeman and Pfenning [1991] pursue an interesting variant on this approach, where refinement types are subtypes of programmer-defined datatypes.

The idea of a type dynamic for statically typed languages appears to have originated with Gordon [1980]. Mycroft [1983] extended Gordon's ideas, including adding type variables to the patterns in the typecase. Unfortunately none of this work saw publication. We have already described the earlier approaches to dynamics [Abadi et al. 1991; Abadi et al. 1992; Leroy and Mauny 1993; Buneman and Ohori 1996]. Leroy and Mauny relate several interesting applications of the dynamic construct, beyond low-level operations such as marshalling and unmarshalling. For example, the CAML language contains an evaluation function eval_syntax : ML [right arrow] dynamic which compiles and evaluates an arbitrary CAML abstract syntax tree. Applications of this facility include macro evaluation (where macros may be arbitrary CAML functions operating on abstract syntax trees) and tactics evaluation in the Coq theorem prover (where tactics are CAML functions, entered by the proof editor and dynamically compiled and evaluated) [Leroy and Mauny 1993].

All of the aforesaid approaches suffer from the problems cited in Section 1 with the unbundling of data structures in polymorphic languages. This reflects a difference in emphasis in the approaches. Our emphasis is only on the very practical question of bundling and unbundling data structures in polymorphic languages. The composition of functions bundled in dynamics, which is the main concern of the work of Abadi et al. [1992], is irrelevant to the concerns of the current work.

In the recursive typecase, type patterns match a single type and have the form t([bar][Alpha]). We might consider generalizing this by allowing tuples of type patterns and by allowing type variables as type patterns. For example, Leroy and Mauny [1993] and Abadi et al. [1992] are able to give the following definition of a function which expects a function and its argument, bundled in dynamics, extracts the function and argument and applies the function, and then rebundles the result as another dynamic:

fun Ap (f,x) = typecase (f,x) of (([Alpha] [right arrow] [Beta])(f), [Alpha](x)) ?? dynamic ([Beta], f x)

This is by far the most radical change to consider making to the type system presented here. Duggan and Ophel [1997b] demonstrate that type-checking is undecidable for any such type system.

Subtyping and narrowing suggest an analogy between dynamics and objects, and we comment on this analogy briefly. Narrowing is necessary in typed object-oriented languages because of heterogeneous collections, and is now found in languages such as Modula-3 [Nelson 1991], Java [Gosling et al. 1997], and the OMG CORBA specification [Object Management Group 1995]. In object-oriented languages, subtyping and narrowing replace the use of variant types: an abstract base class defines the variant type, and the constructors for concrete derived subclasses provide injections into this variant type. Dynamic method dispatching then replaces the case construct associated with variant types. Figure 10 illustrates this use of narrowing in Java, in an object-oriented implementation of a language evaluator. In this example abstract syntax trees are implemented as objects, with the particular "type" of a node given by the class of the corresponding object.

This analogy can be made deeper, by noting that there are now several semantics for objects that use existential types, and usually recursive types, to define object types [Bruce 1994; Pierce and Turner 1994; Abadi and Cardelli 1996]. In its simplest flavor, an object type Object{[a.sub.1] : [T.sub.1], ..., [a.sub.k] : [[Tau].sub.k}} is interpreted as the existential type

[exists][Alpha] [is less than]: T.{state : [Alpha], methods : {[a.sub.1]: [Alpha] [right arrow] [[Tau].sub.1], ..., [a.sub.k] : [Alpha] [[Tau].sub.k}}.

Method invocation e.a is then interpreted as opening the data algebra for the object and applying the method a to the encapsulated state

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Similarly, dynamic types in [XML.sup.dyn] are translated into existential types in [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], in the semantics provided in Section 3. So our approach suggests a unified framework that combines objects and dynamics, based on existential types, subtyping, and a primitive narrowing operation,

As already noted in Section 2.3, Harper and Morrisett [1995] have proposed an approach to compilation with dynamic type dispatch. An optimizing compiler based on the formal system developed by Harper and Morrisett [1995] has been implemented [Tarditi et al. 1996]. Harper and Morrisett have also noticed some of the other applications of dynamic type dispatch stated here. They demonstrate that dynamic type dispatch can be used to implement the marshalling and unmarshalling of values in distributed programming, as outlined by Ohori and Kato [1993], and that dynamics can be viewed as existential types combined with dynamic type dispatch. The latter observation was in fact already made by Abadi et al. [1991]. However, Harper and Morrisett do not make the connection between these two and do not consider using dynamic type dispatch to provide a new elimination rule for dynamics. Harper and Morrisett do not consider subtyping and do not have a narrow construct. Narrowing can be defined using their type-level typecase, with type errors represented by the empty type. However, this cannot be done when the language has subtyping: the typecase cannot be defined over dynamic types (or object types, if they are added), because of the coherence problems caused by the combination of subtyping and dynamic type dispatch.

6. CONCLUSION

We have presented an approach to dynamic typing for polymorphic languages, based on the use of dynamic type dispatch. This approach solves several outstanding problems with computing with dynamic data structures, and has particular application to marshalling in distributed programming environments.

Consideration of safe dynamics has led us to consider languages with subtyping. Our approach can be incorporated into a language without subtyping, by replacing subtyping with explicit coercions. Another alternative is to only incorporate full (unsafe) dynamics, with the recursive typecase implicitly narrowing its dynamic argument to the domain of the typecase. With this latter approach, the only addition required to an ML-like type system is a facility for constraining a type variable by (no more than one) kind variable, [Alpha] [is less than]: [Kappa]. In such a system the only kinds are ?? and explicit kind variables [Kappa]. This is necessary in type checking the arms of the typecase, as can be seen by examining the DELIM rule in Figure 5.

An aspect of marshalling that is related to the current work is the need to support user-defined marshalling operations. Birrell et al. [1993] cite this as a critical need in distributed programming environments. In a sequel paper [Duggan 1998b] we demonstrate how the framework presented here can be extended to support user-defined marshalling and unmarshalling operations that are then attached to dynamics.

ACKNOWLEDGMENTS

The article was improved immensely by detailed comments and suggestions from the reviewers. The author appreciates their help and feedback. Thanks to Luca Cardelli, Greg Morrisett and Zhong Shao for helpful discussions.

(1) Strictly speaking, to be consistent with Harper and Mitchell's XML, we should make use of a third level of types, above simple types [Tau] and polymorphic types [inverted]A[Kappa].[Sigma], for types of the form [inverted]A[Alpha].[Sigma]. This is consistent with the predicative type hierarchy of Martin-Lof type theory underlying XML. We choose to combine the second and third levels for economy. Polymorphic types can be partitioned into two levels without affecting the results presented here, since we do not have type variables ranging over either form of polymorphic type.

REFERENCES

ABADI, M. AND CARDELLI, L. 1996. A Theory of Objects. Springer-Verlag.

ABADI, M., CARDELLI, L., PIERCE, B., AND PLOTKIN, G. 1991. Dynamic typing in a statically typed language. ACM Transactions on Programming Languages and Systems 13, 2, 237-268.

ABADI, M., CARDELI, L., PIERCE, B., AND REMY, D. 1992. Dynamic typing in polymorphic languages. In P. LEE (Ed.), Proceedings of the ACM SIGPLAN Workshop on ML and its Applications, San Francisco, California. Carnegie-Mellon University Technical Report CMU-CS-93-105.

ADITYA, S. AND CARO, A. 1993. Compiler-directed type reconstruction for polymorphic languages. In Proceedings of ACM Symposium on Functional Programming and Computer Architecture, Copenhagen, Denmark, pp. 74-82. ACM Press.

AIKEN, A., WIMMERS, E. L., AND LAKSHMAN, T. K. 1994. Soft typing with conditional types. In proceedings of ACM Symposium on Principles of Programming Languages, pp. 163-173. ACM Press.

APPEL, A. 1989. Run-time tags aren't necessary. Lisp and Symbolic Computation 19, 7 (July), 703-705.

APPEL, A. 1992. Compiling with Continuations. Cambridge University Press.

BIRRELL, A., NELSON, G., OWICKI, S., AND WOBBER, E. 1993. Network objects. In Symposium on Operating Systems Principles, pp. 217-230. ACM Press.

BIRTWISTLE, G. M., DAHL, O.-J., MYHRHAUG, B., AND NYGAARD, K. 1979. Simula Begin. Studentlitteratur (Lund, Sweden), Bratt Institute Fuer Neues Lerned (Goch, FRG), Chartwell-Bratt Ltd (Kent, England.

BRUCE, K. 1994. A paradigmatic object-oriented programming language: Design, static typing and semantics. Journal of Functional Programming J, 127-206.

BUNEMAN, P. AND OHORI, A. 1996. Polymorphism and type inference in database programming. ACM Transactions on Database Systems. To appear.

CARDELLI, L. 1986. Amber. In P.-L. C. GUY COUSINEAU AND B. ROBINET (Eds.), Combinators and Functional Programming Languages, pp. 22-70. Springer-Verlag. Lecture Notes in Computer Science 242.

CARTWRIGHT, R. AND FAGAN, M. 1991. Soft typing. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 278-292. ACM Press.

DUBOIS, C., ROUAIX, F., AND WEIS, P. 1995. Extensional polymorphism. In Proceedings of ACM Symposium on Principles of Programming Languages, San Francisco, California. ACM Press.

DUGGAN, D. 1998a. Finite subtype inference with explicit polymorphism. In Static Analysis Symposium, Pisa, Italy.

DUGGAN, D. 1998b. A type-based semantics for user-defined marshalling in polymorphic languages. In Workshop on Types in Compilation, Kyoto, Japan.

DUGGAN, D. AND OPHEL, J. 1997a. Scoped parametric overloading. Submitted for publication. DUGGAN, D. AND OPHEL, J. 1997b. Type-checking multi-parameter type classes. Submitted for publication.

FREEMAN, T. AND PFENNING, F. 1991. Refinement types for ML. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, pp. 268-277. ACM Press.

GOLDBERG, B. 1991. Tag-free garbage collection for strongly-typed programming languages. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, Toronto, Ontario, pp. 165-176. ACM Press.

GOLDBERG, B. AND GLOGER, M. 1992. Polymorphic type reconstruction for garbage collection. In Proceedings of ACM Symposium on Lisp and Functional Programming, San Francisco, California, pp. 53-65. ACM Press.

GORDON, M. 1980. Adding eval to ml. Private communication to an author of Abadi et al. [1991].

GOSLING, J., JOY, B., AND STEELE, G. 1997. The Java Language Specification. The Java Series. Addison-Wesley.

GUNTER, C. A., GUNTER, E. L., AND MACQUEEN, D. B. 1993. Using abstract interpretation to compute ML equality kinds. Information and Computation 107, 303-323.

HALL, C., HAMMOND, K., PEYTON-JONES, S., AND WADLER, P. 1996. Type classes in Haskell. ACM Transactions on Programming Languages and Systems 18, 2 (March), 109-138.

HARPER, R. AND MITCHELL, J. C. 1993. On the type structure of Standard ML. ACM Transactions on Programming Languages and Systems 15, 2, 211-252.

HARPER, R. AND MORRISETT, G. 1995. Compiling polymorphism using intensional type analysis. In Proceedings of ACM Symposium on Principles of Programming Languages, San Francisco, California. ACM Press.

HUDAK, P., PEYTON-JONES, S., WADLER, P., BOUTEL, B., FAIRBAIRN, J., FASEL, J., GUZMAN, M. M., HAMMOND, g., HUGHES, J., JOHNSSON, T., KIEBURTZ, R., NIKHIL, R., PARTAIN, W., AND PETERSON, J. 1992. Report on the programming language Haskell, a non-strict purely functional language, Version 1.2. ACM SIGPLAN Notices 27, 5.

ICHBIACH, J. D., BARNES, J. G. P., FIRTH, R. J., AND WOODGER, M. 1979. Rationale for the design of the Ada programming language. SIGPLAN Notices 14, 6B. Special issue.

LAMPSON, B. 1983. A description of the Cedar language. Technical Report CSL-83-15, Xerox Palo Alto Research Center.

LEROY, X. AND MAUNY, M. 1993. Dynamics in ML. Journal of Functional Programming 3, 4, 431-463.

LISKOV, B., ATKINSON, R., BLOOM, T., MOSS, r., SCHAFFERT, J. C., SCHEIFLER, R.., AND SNYDER, A. 1981. CLU Reference Manual. Lecture Notes in Computer Science. Springer-Verlag.

LONGO, G. 1993. Parametric and type-dependent polymorphism. Technical Report LIENS-93-21 (November), Laboratoire d'Informatique de l'Ecole Normale Superieure. To appear in Fundamenta Informatica.

LONGO, C., MILSTED, K., AND SOLOVIEV, S. 1993. The genericity theorem and the notion of parametricity in the polymorphic [Lambda]-calculus. Theoretical Computer Science. Special issue in honour of C. Bohm. Abstract appeared in Proceedings of IEEE Symposium on Logic in Computer Science, Montreal, June 1993.

MILNER, R., TORTE, M., AND HARPER, R. 1990. The Definition of Standard ML. The MIT Press.

MITCHELL, J. AND PLOTKIN, G. 1988. Abstract types have existential types. ACM Transactions on Programming Languages and Systems 10, 3, 470-502.

MORRISETT, J. G. 1995. Compiling With Types. Ph.D. thesis, Carnegie-Mellon University. MORRISON, R., DEARLE, A., CONNOR, R,. C. H., AND BROWN, A. L. 1991. An ad-hoc approach to the implementation of polymorphism. ACM Transactions on Programming Languages and Systems 13, 3 (July), 342-371.

MYCROFT, A. 1983. Dynamic typing in ML. Unpublished technical report.

NELSON, G. 1991. Systems Programming in Modula-3. Prentice-Hall Series in Innovative Technology. Prentice-Hall. Object Management Group 1995. The Common Object Request Broker: Architecture and Specification 2.0. Object Management Group.

OHORI, A. AND KATO, K. 1993. Semantics for communication primitives in a polymorphic language. In Proceedings of ACM Symposium on Principles of Programming Languages, pp. 99-112. ACM Press.

PEYTON-JONES, S. 1992. Implementing lazy functional languages on stock hardware: the spineless tag-less g-machine. Journal of Functional Programming 2, 2 (July), 127-202.

PEYTON-JONES, S. AND LAUNCHBURY, J. 1991. Unboxed values as first-class citizens. In Proceedings of ACM Symposium on Functional Programming and Computer Architecture, Volume 529 of Lecture Notes in Computer Science, Cambridge, Massachusetts, pp. 636-666. Springer-Verlag.

PIERCE, B. AND TURNER, D. 1994. Simple type-theoretic foundations for object-oriented programming. Journal of Functional Programming 4, 2 (April), 207-247.

ROVNER, P. 1986. On extending Modula-2+ to build large, integrated systems. IEEE Software 3, 6 (November), 46-57.

TARDITI, D., MORRISETT, G., CHENG, P., STONE, C., HARPER., R., AND LEE, P. 1996. TIL: A type-directed optimizing compiler for ML. In Proceedings of ACM SIGPLAN Conference on Programming Language Design and Implementation, Philadelphia, Pennsylvania. ACM Press.

THATTE, S. 1989. Type inference with partial types. In Proceedings of the International Conference on Automata, Languages and Programming, Volume 317 of Lecture Notes in Computer Science, pp. 615-629. Springer-Verlag.

THATTE, S. 1990. Quasi-static typing. In Proceedings of ACM Symposium on Principles of Programming Languages, pp. 367-381. ACM Press.

THATTE, S. 1994. Semantics of type classes revisited. In Proceedings of ACM Symposium on Lisp and Functional Programming, pp. 208-219. ACM Press.

TOLMACH, A. 1994. Tag-free garbage collection using explicit type parameters. In Proceedings of

WRIGHT, A. AND CARTWRIGHT, R. 1994. A practical soft type system for Scheme. In Proceedings of ACM Symposium on Lisp and Functional Programming, Orland, Florida, pp. 250-262. ACM Press.

WRIGHT, A. AND FELLEISEN, M. 1991. A syntactic approach to type soundness. Technical Report TR91-160 (April), Rice University.

Received January 1995; revised August 1997; accepted May 1998

Author's address: Department of Computer Science, Stevens Institute of Technology, Castle Point on the Hudson, Hoboken, NJ 07030; e-mail: dduggan@cs.stevens-tech.edu.

Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.

DOMINIC DUGGAN

Stevens Institute of Technology

Printer friendly Cite/link Email Feedback | |

Author: | DUGGAN, DOMINIC |
---|---|

Publication: | ACM Transactions on Programming Languages & Systems |

Date: | Jan 1, 1999 |

Words: | 15102 |

Previous Article: | On Failure of the Pruning Technique in "Error Repair in Shift-Reduce Parsers". |

Next Article: | Specification and Verification of Fault-Tolerance, Timing, and Scheduling. |

Topics: |