Printer Friendly

From System F to Typed Assembly Language.

1. INTRODUCTION

Compilers that manipulate statically typed intermediate languages have compelling advantages over traditional compilers that manipulate untyped program representations. An optimizing compiler for a high-level language such as ML may make as many as 20 passes over a single program, performing sophisticated analyses and transformations such as CPS conversion, closure conversion, unboxing, subsumption elimination, or region inference. Many of these optimizations require type information in order to succeed, and even those that do not, often do benefit from the additional structure supplied by a typing discipline. Moreover, the essence of many of these program transformations can be specified by the corresponding type translation. Types provide concise and yet precise documentation of the compilation process that can be automatically verified by a type checker. In practice, this technique has been invaluable for debugging new transformations and optimizations [Tarditi et al. 1996; Morrisett et al. 1996].

Today a small number of compilers work with typed intermediate languages in order to realize some or all of these benefits [Leroy 1992; Peyton Jones et al. 1993; Birkedal et al. 1993; Tarditi et al. 1996; Lindholm and Yellin 1996; Shao 1997; Dimock et al. 1997]. However, in all of these compilers, there is a conceptual line where types are lost. For instance, the TIL/ML compiler preserves type information through approximately 80% of compilation, but the remaining 20% is untyped. We show how to recode the untyped portions of a compiler to maintain type information through all phases of compilation and, in so doing, extend the paradigm of compiling with typed intermediate languages to compiling with typed target languages.

The target language in this article is a strongly typed assembly language (TAL) based on a generic RISC instruction set. The type system for TAL is surprisingly standard; supporting tuples, polymorphism, existential packages, and a restricted form of function pointer, yet it is sufficiently powerful that we can automatically generate well-typed code from high-level ML-like languages.

The TAL framework admits most conventional low-level optimizations such as global register allocation, copy propagation, constant folding, and dead-code elimination. Except for a small number of atomic code patterns, TAL also supports code motion optimizations such as instruction scheduling, common-subexpression elimination, and loop-invariant removal. Some more advanced implementation techniques are not supported by the simple typed assembly language we present here, including run-time code generation, intensional polymorphism, and array bounds check elimination. In Section 8 we discuss how to extend the type system presented here to support such techniques.

TAL not only allows us to reap the benefits of types throughout a compiler, but it also enables a practical system for executing untrusted code safely and efficiently. For example, as suggested by the SPIN project [Bershad et al. 1995], operating systems could allow users to download TAL extensions into the kernel. The kernel would type check the TAL extension, thereby ensuring that it never accesses hidden resources within the kernel, always calls kernel routines with the right number and types of arguments, and so forth. After the type checker has verified the extension, the kernel can safely assemble it and dynamically link it in. Such a TAL-based system has a number of advantages. Currently, SPIN requires that extensions be written in a single high-level language (Modula-3) and use a single trusted compiler (along with cryptographic signatures) in order to ensure their safety. In contrast, a kernel based on a typed assembly language could support extensions written in a variety of high-level languages using a variety of untrusted compilers, since the safety of the resulting assembly code can be checked independently of the source code or the compiler. Furthermore, critical inner loops could be hand-written in assembly language in order to achieve better performance. TAL could also be used to support extensible Web browsers, extensible servers, active networks, or any other extensible system where security, performance, and language independence are desired. Of course, while type safety implies many important security properties such as memory safety, neither SPIN nor TAL can enforce other important security properties, such as termination, that do not follow from type safety.

Another framework for verifying safety properties in low-level programs, proposed by Necula and Lee, is called proof-carrying code (PCC) [Necula and Lee 1996; Necula 1997; 1998]. Necula and Lee encode the relevant operational content of simple type systems using extensions to first-order predicate logic, and automatically verify proofs of security properties such as memory safety [Necula 1997]. Because Necula and Lee use a general-purpose logic, they can encode more expressive security properties and permit some optimizations that are impossible in TAL. TAL, on the other hand, provides compiler writers with a higher-level set of abstractions than PCC. These abstractions make compiling languages with features such as higher-order functions and data types simpler. In order to do the same, a PCC programmer must build such abstractions from the low-level logical primitives, and it is not always obvious how to obtain a compact logical encoding of these language constructs that preserves the necessary security properties. Another benefit of the TAL abstractions is that they make it possible to automatically reconstruct the proof of type safety; TAL binaries can be more compact than PCC binaries because they do not need to contain a complete proof. Clearly, however, the ideal system contains both the compiler support and compact annotations of TAL and the flexibility of PCC. We leave this long-term goal to future research; here we focus on the theoretical framework for automatic compilation of high-level languages to type-safe assembly language.

2. OVERVIEW

The goals of this work are twofold: first, to define a type system for a conventional assembly language and to prove its soundness, and, second, to demonstrate the expressiveness of the resulting language by showing how to automatically compile a high-level language to type-correct assembly code. In this section, we give a brief overview of our typed assembly language and the structure of our compiler.

2.1 TAL

The primary goal of the TAL type system is to provide a fully automatic way to verify that programs will not violate the primitive abstractions of the language. In a conventional untyped assembly language, all values are represented as wordsized integers, and the primitive operations of the language apply to any such values. That is, in an untyped assembly language, there is only one abstraction: the machine word. In contrast, TAL provides a set of built-in abstractions, such as (word-sized) integers, pointers to tuples, and code labels, for each of which only some operations are applicable. For example, arithmetic is only permitted on integer values; dereferencing is only permitted for pointer values; and control transfer is only permitted for code labels. We say that a program becomes stuck if it attempts to perform an unpermissible operation. Hence, the primary goal of the TAL type system is to ensure that well-typed programs do not become stuck.

Because TAL treats integers as a separate abstraction from pointers or code labels, and because arithmetic is only permitted on integers, it is possible to show, that, in addition to never becoming stuck, a well-typed TAL program satisfies a number of important safety policies relevant to security. For instance, it is possible to conclude that programs cannot manufacture pointers to arbitrary objects, or that programs cannot jump to code that has not been verified.

In addition to providing a set of built-in abstractions, TAL provides a set of type constructors that may be used by programmers or compilers to build new abstractions. For example, in the functional language compiler that we sketch, closures (a high-level language abstraction) are encoded as TAL-level abstractions using the existential type constructor. In the high-level language, it is impossible for a program to apply any primitive operation to a closure except for function application. For instance, it is impossible for a program to inspect the environment of the closure. At the TAL level, closures are represented as a pair of a label (to some code) and an environment data structure (intended to hold the free variables of the code). We use an existential type to hide the type of the environment data structure and to connect it to the code. The resulting object prevents malicious or faulty code from reading the environment, or passing anything but the closure's environment to the closure's code.

In other work, we have extended the type system to support many more abstractions, such as modules [Glew and Morrisett 1999] and the run-time stack [Morrisett et al. 1998]. Here, we have attempted to keep the type system simple enough that the formalism may be understood and proven sound, yet powerful enough that we can demonstrate how a high-level ML-like language may be compiled to type-correct TAL code automatically.

The typed assembly language we present here is based on a conventional RISC-style assembly language. In particular, all but two of the instructions are standard assembly operations. In an effort to simplify the formalism, we have omitted many typical instructions, such as a jump-and-link, that may be synthesized using our primitives. Figure 1 gives an example TAL program that, when control is transferred to the label 1_main, computes 6 factorial and then halts with the result in register r1. The code looks and behaves much like standard assembly code, except that each label is annotated with a code precondition that associates types with registers. The precondition specifies, that, before control can be transferred to the corresponding label, the registers must contain values of the specified types. Hence, before allowing a jump to 1_fact as in 1_main, the type checker ensures that an integer value resides in register r1.
Fig. 1. A TAL program that computes 6 factorial.

1_main:
  code[]{}.                % entry point
    mov r1,6
    jmp 1_fact
1_fact:
  code[]{r1:int}.          % compute factorial of r1
    mov r2,r1              % set up for loop
    mov r1,1
    jmp 1_loop
1_loop:
  code[]{r1:int,r2:int}.   % r1: the product so far,
                           % r2: the next number to be multiplied
    bnz r2,1_nonzero       % branch if not zero
    halt[int]              % halt with result in r1
1_nonzero:
  code[]{r1:int,r2:int}.
    mul r1,r1,r2           % multiply next number
    sub r2, r2,1           % decrement the counter
    jmp 1_loop


2.2 A Type-Preserving Compiler

In order to motivate the typing constructs in TAL and to justify our claims about its expressiveness, we spend a large part of this article sketching a compiler from a variant of the polymorphic lambda-calculus to TAL. Our compiler is structured as five translations between six typed calculi:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Each calculus is a first-class programming language in the sense that each translation operates correctly on any well-typed program of its input calculus. The translations do not assume that their input is the output from the preceding translation. This fact frees a compiler to optimize code aggressively between any of the translation steps. The inspiration for the phases and their ordering is derived from SML/NJ [Appel and MacQueen 1991; Appel 1992] (which is in turn based on the Rabbit [Steele 1978] and Orbit [Kranz et al. 1986] compilers) except that our compiler uses types throughout compilation.

The rest of this article proceeds by describing each of the languages and translations in our compiler in detail. We give the syntax and static semantics of each language as well as type-preserving translations between them. The middle calculi ([[Lambda].sup.K], [[Lambda].sup.C], [[Lambda].sup.H], and [[Lambda].sup.A]) have many features in common. Therefore, we only describe [[Lambda].sup.K] in full, and each successive calculus is defined in terms of its differences from the preceding one.

We begin by presenting the compiler's source language, [[Lambda].sup.F], in Section 3. Section 4 describes the first intermediate language, [[Lambda]. sup.K], and gives a typed CPS translation to it based on Harper and Lillibridge [1993]. The CPS translation fixes the order of evaluation and names intermediate computations. Section 5 presents [[Lambda].sup.C] and gives a typed closure translation to it based on, but considerably simpler than, that of Minamide et al. [1996]. The closure translation makes the construction of functions' environments and closures explicit, thereby rendering all data structures explicit. This is followed by a simple hoisting translation that lifts the (now closed) code to the top level.

Section 6 presents [[Lambda].sup.A], in which the allocation and initialization of data structures is made explicit, and gives a translation from [[Lambda].sup.H] to [[Lambda].sup.A]. At this point in compilation, the intermediate code is essentially in a lambda-calculus syntax for assembly language (following the ideas of Wand [1992]). Section 7 presents the formal details of our typed assembly language. We show type safety for TAL, and also define a translation from [[Lambda].sup.A] to TAL. Finally, in Section 8 we discuss extensions to TAL to support additional language constructs and optimizations. We also describe our current implementation of TAL and discuss some directions for future investigation.

3. SYSTEM F

The source language for our compiler, [[Lambda].sup.F], is a variant of System F [Girard 1971; 1972; Reynolds 1974] (the polymorphic [Lambda]-calculus) augmented with integers, products, and recursion on terms. The syntax for [[Lambda].sup.F] appears below:
types            [Tau], [Sigma] ::= [Alpha] | int | [[Tau].sub.1]
                                    [right arrow] [[Tau].sub.2] |
                                    [inverted]A[Alpha].[Tau] |
                                    <[Tau]>

annotated terms  e ::= [u.sup.[Tau]]

terms            [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE
                 IN ASCII]

primitives       p ::= + | - | x

type contexts    [Delta] ::= [[Alpha].sub.1], ..., [[Alpha].sub.n]

value contexts   [Gamma] ::= [x.sub.1]:[[Tau].sub.1], ...,
                             [x.sub.n]:[[Tau].sub.n]


Integers, the only base type, are introduced by integer literals (i), operated on by by arithmetic primitives p, and eliminated by a conditional expression if0. The term if0 ([e.sub.1], [e.sub.2], [e.sub.3]) evaluates to [e.sub.2] when [e.sub.1] evaluates to zero, and otherwise evaluates to [e.sub.3]. We use the notation E to refer to a vector of syntactic objects drawn from the syntactic class E. For example, <e> is shorthand for a tuple <[e.sub.1], ..., [e.sub.n]). The elimination form for tuples, [[Pi].sub.i](e), evaluates to the ith field of the tuple e. Recursive functions are written fix x([x.sub.1]:[[Tau].sub.1]):[[Tau].sub.2].e, where x is the name of the recursive function and may appear free in the body; [x.sub.1] is its argument (with type [[Tau].sub.1]); and e is its body (with type [[Tau].sub.2]). Polymorphic functions are written [Lambda][Alpha].e, where [Alpha] is the abstracted type, and e is the body of the polymorphic function. For example, the polymorphic identity function may be written as [Lambda][Alpha].fix id(x:[Alpha]):[Alpha].x. Instantiation of a polymorphic expression e is written e [[Tau]]. As usual, we consider types and expressions that differ only in the names of bound variables to be identical. We write the capture-avoiding substitution of E for X in E' as E'[E/X].

We interpret [[Lambda].sup.F] with a conventional call-by-value operational semantics (which is not presented here). The static semantics (given in Figure 2) is specified as a set of inference rules for concluding judgments of the form [Delta]; [Gamma] [??.sub.F] e: [Tau], where [Delta] is a context containing the free type variables of [Gamma], e, and [Tau]; [Gamma] is a context that assigns types to the free variables of e; and [Tau] is the type of e. A second judgment [Delta] [??.sub.F] [Tau] asserts that type [Tau] is well-formed under type context [Delta]. In later judgments, we will use 0 to denote an empty type or value context.

[Figure 2 ILLUSTRATION OMITTED]

To simplify the presentation of our translations, we use type-annotated terms (e), which are unannotated terms (u) marked with their types. This decision allows us to present our translations in a simple, syntax-directed fashion, rather than making them dependent on the structure of typing derivations. The typing rules ensure that all annotations in a well-formed term are correct. In the interest of clarity, however, we will omit the type annotations in informal discussions and examples.

As a running example, we will consider compiling a term that computes 6 factorial:

(fix f(n:int):int, if0(n,1,n x f(n - 1)))6

4. CPS CONVERSION

The first compilation stage is conversion to continuation-passing style (CPS). This stage names all intermediate computations and eliminates the need for a control stack. All unconditional control transfers, including function invocation and return, are achieved via function call. The target calculus for this phase is called [[Lambda].sup.K]:
types              [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE
                   IN ASCII]

annotated values   v ::= [u.sup.[Tau]]

values             u ::= x | i | fix x[[Alpha]]([x.sub.1]:[[Tau].
                         sub.1], ..., [x.sub.n]:[[Tau].sub.n]).e
                         | <v>

primitives         P ::= + | - | x

declarations       d ::= x = v | x = [[Pi].sub.i] v | x = [v.sub.1]
                         p [v.sub.2]

terms              e ::= let d in e
                         | v[[Tau]](v)
                         | if0(v, [e.sub.1],[e.sub.2])
                         | halt[[Tau]]v

type contexts      [Delta] ::= [[Alpha].sub.1], ...,
                               [[Alpha].sub.n]

value contexts     [Gamma] ::= [x.sub.1]:[[Tau].sub.1], ...,
                               [x.sub.n]:[[Tau].sub.n]


Code in [[Lambda].sup.K] is nearly linear: it consists of a series of let bindings followed by a function call. The exception to this is the if0 construct, which forms a tree containing two subexpressions.

In [[Lambda].sup.K] there is only one abstraction mechanism (fix), which abstracts both type and value variables, thereby simplifying the rest of the compiler. The corresponding [inverted]A and [right arrow] types are also combined. We abbreviate [inverted]A [].([Tau]) [right arrow] void as ([Tau]) [right arrow] void; we abbreviate fix f[[Alpha]]([x.sub.1]:[[Tau].sub.1], ..., [x.sub.n]:[[Tau].sub.n]).e as [Lambda] [[Alpha]]([x.sub.1]:[[Tau].sub.1], ..., [x.sub.n]:[[Tau].sub.n).e, when f does not appear free in e; and we omit empty type argument brackets in both the fix and A forms and in applications.

In [[Lambda].sup.K], unlike [[Lambda].sup.F], functions do not return values, so function calls are just jumps. The function notation "[right arrow] void" is intended to suggest this fact. If control is to be returned to the caller, the caller must pass the callee a continuation function for it to invoke. Execution is completed by the construct halt[[Tau]]v, which accepts a result value v of type [Tau] and terminates the computation. Typically, this construct is used by the top-level continuation.

Since expressions never return values, only typing judgments for values state types. The new judgment [Delta]; [Gamma] [??.sub.K] e indicates that the term e is well formed under type and value contexts [Delta] and [Gamma]. Aside from these issues, the static semantics for [[Lambda].sup.K] is standard and appears in Figure 3.

[Figure 3 ILLUSTRATION OMITTED]

4.1 Translation

The CPS translation that takes [[Lambda].sup.F] to [[Lambda].sup.K] is based on that of Harper and Lillibridge [1993] and appears in Figure 4. The type translation is written K [[ [multiplied by] ]]. The principal translation for terms, [K.sub.exp] [[e]], takes a continuation k, computes the value of e and hands that value to k. A second term translation for full programs, [K.sub.prog] [[e]], calls the principal translation with a special top-level continuation that accepts a final answer and halts. In the translation, the variables c and x are assumed to be fresh in order to avoid variable capture.

[Figure 4 ILLUSTRATION OMITTED]

An important property of the translation is that it translates well-formed [[Lambda].sup.F] expressions to well-formed [[Lambda].sup.K] expressions:

LEMMA (CPS CONVERSION TYPE CORRECTNESS). If 0;0 [??.sub.F] e: [Tau] then 0;0 [??.sub.K] [K.sub.prog] [[e]].

In this translation, and in those that follow, no particular effort is made to optimize the resulting code. A realistic compiler based on these type systems, such as the one we discuss in Section 8, would integrate optimizations into these translations. For instance, a realistic CPS-converter would eliminate "administrative" redices and optimize tail recursion [Danvy and Filinski 1992].

The factorial example coded in [[Lambda].sup.K] is given below. This code would be obtained by [K.sub.prog] [[ [multiplied by] ]] in conjunction with two optimizations mentioned above.
(fix f (n:int, k:(int) [right arrow] void).
      if0(n, k(1),
          let x = n - 1 in
               f(x,[Lambda](y:int), let z = n x y in k(z))))
(6, [Lambda](n:int). halt[int]n)


5. SIMPLIFIED POLYMORPHIC CLOSURE CONVERSION

The second compilation stage is closure conversion, which makes closures explicit, thereby separating program code from data. This is done in two steps. The first and main step, closure conversion proper, rewrites all functions so that they contain no free variables. Any variables that appear free in a function must be taken as additional arguments to that function. Those additional arguments are collected in an environment that is paired with the (now closed) code to create a closure. Function calls are performed by extracting the code and the environment from the closure, and then calling that code with the environment as an additional argument.

In the second step, hoisting, the code blocks are lifted to the top of the program, achieving the desired separation between code and data. Since those code blocks are closed, hoisting can be done without difficulty. We begin with closure conversion proper; the hoisting step is considered in Section 5.2.

Although the operational explanation of closure conversion is quite simple, there are a number of subtle issues involved in type-checking the resulting code. In the absence of polymorphic functions, our approach to typing closure conversion is based on Minamide et al. [1996], who observe that if two functions with the same type but different free variables (and therefore different environment types) were naively typed after closure conversion, the types of their closures would not be the same. To prevent this, they use existential types [Mitchell and Plotkin 1988] to abstract the types of environments, thereby hiding the fact that the closures' environments have different types.

In the presence of polymorphism, functions may have free type variables as well as free value variables, and, just as for free value variables, closure conversion must rewrite functions to take free type variables as additional arguments. Our approach for dealing with this issue diverges from that of Minamide et al., who desire a type-passing interpretation of polymorphism in which types are constructed and passed as data at run time. In such a type-passing interpretation, those additional type arguments must be collected in a type environment, which is the type-level equivalent of the value environment discussed earlier. Type environments necessitate two complex mechanisms: abstract kinds, to hide the differences between type environments, and translucent types, to ensure that code blocks are called with the correct type environments.

We propose a considerably simpler approach to polymorphic closure conversion. To avoid the complexities of type environments, we adopt a type-erasure interpretation of polymorphism as in The Definition of Standard ML [Milner et al. 1997]. In a type-erasure interpretation, we need not save the contents of free type variables in a type environment; instead, we substitute them directly into code blocks. Semantically, this amounts to making copies of code blocks in which the relevant substitutions have been performed. However, as types will ultimately be erased, these "copies" are represented by the same term at run time, resulting in no runtime cost.

Formally this means, that, in a type-erasure interpretation, we consider the partial application of a function to a type argument to be a value. For example, suppose v has the type [inverted]A[[Alpha], [Beta]].([Tau]) [right arrow] void where the type variables [Alpha] stand for the function's free type variables, and the type variables [Beta] are the function's ordinary type arguments. If [Sigma] are the contents of those free type variables, then the partial instantiation v[[Sigma]] is considered a value and has type [inverted]A[[Beta]].([Tau][[Sigma]/[Alpha]]) [right arrow] void. This instantiation takes the place of the construction of a type environment.

The work of Minamide et al. arose from the TIL compiler [Morrisett et al. 1996], which uses run-time type information to optimize data layout [Tarditi et at. 1996]. At first, it seems that a type-erasure semantics precludes these optimizations. However, recent work of Craw et al. [1998; 1999] shows how to encode run-time type information in a type-erasure language. Rather than manipulating types directly, programs manipulate values that represent types. Using this device, the type environment can become part of the value environment, and closure conversion may be performed in a similar fashion as described here. These mechanisms can be added to TAL, and the optimizations above can be used in a compiler targeting it.

Figure 5 presents the differences between [[Lambda].sup.C] and [[Lambda].sup.K]. The principal difference is that the body of a function must type check in a context containing only its formal arguments. In other words, code blocks must be closed, as desired. As discussed above, we also make type instantiation a value form. Finally, we add existential types [Mitchell and Plotkin 1988] to support the typing of closures. Note that in a type-erasure interpretation, the type portion of an existential package (all but v of pack [[Tau], v] as [exists][Alpha].[Tau]') is erased at run time, and hence the creation of such a package has no run-time cost.

[Figure 5 ILLUSTRATION OMITTED]

5.1 Translation

The closure conversion algorithm is formalized in Figure 6. The translation for types is denoted by C[[ [multiplied by] ]], the only interesting rule of which is the one for function types:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

[Figure 6 ILLUSTRATION OMITTED]

The existentially quantified variable [Beta] represents the type of the value environment for the closure. The closure itself is a pair consisting of a piece of code instantiated with types for its free type variables, and a value environment. The instantiated code takes as arguments its original type and value arguments, as well as the value environment. Closures are invoked by extracting the code and environment from the closure and then applying the code to the environment and the function's arguments.

The term translation has three parts: one for terms, [C.sub.exp] [[ [multiplied by] ]] one for declarations, [C.sub.dec] [[ [multiplied by] ]], and one for values, [C.sub.val] [[ [multiplied by] ]]. For uniformity with other translations, we also provide a whole program translation ([C.sub.prog] [[ [multiplied by] ]]), which in this case simply invokes the term translation. To avoid variable capture, the variables z and [Gamma] are assumed to be fresh.

Again, we may show that the translation preserves well-formedness of programs:

LEMMA (CLOSURE CONVERSION TYPE CORRECTNESS). If 0;0 [??.sub.K] e then 0;0 [??.sub.C] [C.sub.prog][[e]].

5.2 Hoisting

After closure conversion, all functions are closed and may be hoisted out to the top level without difficulty. In a real compiler, these two phases would be combined, but we have separated them here for simplicity. The target of the hoisting translation is the calculus [[Lambda].sup.H], in which fix is no longer a value form. Instead, code blocks are defined by a letrec prefix, which we call a heap in anticipation of the heaps of [[Lambda].sup.A] and TAL. This change is made precise in Figure 7.

[Figure 7 ILLUSTRATION OMITTED]

Programs are translated from [[Lambda].sup.C] to [[Lambda].sup.H] by replacing all fix expressions with fresh variables and binding those variables to the corresponding code expressions in the heap. This translation, denoted by [H.sub.prog][[ [multiplied by] ]], is straightforward to formalize, so we omit the formalization in the interest of brevity.

LEMMA (HOISTING TYPE CORRECTNESS). If 0;0 [??.sub.c] e then 0;0 [??.sub.H] [H.sub.prog]??e??.

Some examples of closure conversion and hoisting appear in Figures 8 and 9. Figure 8 gives the factorial example after closure conversion, hoisting, and few simplifying optimizations (beta reduction and copy propagation). To illustrate polymorphic closure conversion we consider another example in Figure 9, the polymorphic, higher-order function twice that takes a function and composes it with itself. The twice function contains two nested functions, twicef and oncef, each of which contains the free type variable [Alpha], and therefore, after closure conversion, becomes part of the type environment for these functions. Consequently, the type argument to [twice.sub.code] is an ordinary type argument, but the type arguments to the code blocks [twicef.sub.code] and [oncef.sub.code] stand for free type variables and are instantiated appropriately whenever closures are formed from those code blocks.

[Figures 8-9 ILLUSTRATION OMITTED]

6. EXPLICIT ALLOCATION

The [[Lambda].sup.H] intermediate language still has an atomic constructor for forming tuples, but machines must allocate space for a tuple and fill it out field by field; the allocation stage makes this process explicit. To do so, we eliminate the value form for tuples, and introduce new declaration forms for allocating and initializing tuples, as shown in Figure 10. The creation of an n-element tuple becomes a computation that is separated into an allocation step and n initialization steps. For example, if [v.sub.1] and [v.sub.2] are integers, the pair <[v.sub.1], [v.sub.2]> is created as follows (where types have been added for clarity):
let [x.sub.1]:([int.sup.0], [int.sup.0]) = malloc[int, int]
    [x.sub.2]:([int.sup.1], [int.sup.0]) = [x.sub.1][1]
                                           [left arrow] [v.sub.1]
            x:([int.sup.1], [int.sup.1]) = [x.sub.2][2]
                                           [left arrow] [v.sub.2]
    .
    .
    .


[Figure 10 ILLUSTRATION OMITTED]

The "[x.sub.1] = malloc[int, int]" step allocates an uninitialized tuple and binds [x.sub.1] to the address of the tuple. The "0" superscripts on the types of the fields indicate that the fields are uninitialized, and hence no projection may be performed on those fields. The "[x.sub.2] = [x.sub.1][1] [left arrow] [v.sub.1]" step updates the first field of the tuple with the value [v.sub.1] and binds [x.sub.2] to the address of the tuple. Note that [x.sub.2] is assigned a type where the first field has a "1" superscript, indicating that this field is initialized. Finally, the "x = [x.sub.2][2] [left arrow] [v.sub.2]" step initializes the second field of the tuple with [v.sub.2] and binds x to the address of the tuple, which is assigned the fully initialized type <[int.sup.1], [int.sup.1]>. Hence, both [[Pi].sub.1] and [[Pi].sub.2] are allowed on x.

The code sequence above need not be atomic; it may be rearranged or interleaved with projections in any well-typed manner. The initialization flags on the types ensure that a field cannot be projected unless it has been initialized. Moreover, a syntactic value restriction ensures there is no unsoundness in the presence of polymorphism. Operationally, the declaration x[i] [left arrow] v is interpreted as an imperative operation, and thus at the end of the sequence, [x.sub.1], [x.sub.2], and x are all aliases for the same location, even though they have different types. Consequently, the initialization flags do not prevent a field from being initialized twice. It is possible to use monads [Wadler 1990a; Launchbury and Peyton Jones 1995] or linear types [Girard 1987; Wadler 1990b; 1993] to ensure that a tuple is initialized exactly once, but we have avoided these approaches in the interest of a simpler type system. The presence of uninitialized values also raises some garbage collection issues; in Section 8 we discuss how our implementation deals with these issues.

6.1 Translation

The translation of types from [[Lambda].sup.H] to [[Lambda].sup.A] is simple; it amounts to adding initialization flags to each field of tuple types:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The term translation is formalized in Figure 11 as five translations: full programs ([A.sub.prog][[ [multiplied by] ]]), heap values ([A.sub.hval][[ [multiplied by] ]]), expressions ([A.sub.exp][[ [multiplied by] ]]), declarations ([A.sub.dec][[multiplied by] ]]), and values ([A.sub.val][[ [multiplied by] ). The focus of the translation is on the last rule, which generalizes the informal translation of tuples given in the previous section. This rule returns the sequence of declarations to allocate and initialize a tuple. Although the other (nontuple) values are more or less unchanged by the translation, they too must return sequences of declarations needed to construct those values. Such sequences will be empty unless the value in question contains a tuple. Similarly, the declaration translation produces a sequence of declarations. To avoid variable capture, the variable y is assumed to be fresh.

[Figure 11 ILLUSTRATION OMITTED]

LEMMA (ALLOCATION TYPE CORRECTNESS). If [??.sub.H] P then [??.sub.A] [A.sub.prog]??P??.

The factorial example after application of the explicit allocation translation appears in Figure 12.

[Figure 12 ILLUSTRATION OMITTED]

7. TYPED ASSEMBLY LANGUAGE

The final compilation stage, code generation, converts [[Lambda].sup.A] to TAL. All of the major typing constructs in TAL are present in [[Lambda].sup.A] and, indeed, code generation is largely syntactic. To summarize the type structure at this point, there is a combined abstraction mechanism that may simultaneously abstract a type environment, a set of type arguments, and a set of value arguments. Values of these types may be partially applied to type environments and remain values. There are existential types to support closures and other data abstractions. Finally, there are n-tuples with flags on the fields indicating whether the field has been initialized.

A key technical distinction between [[Lambda].sup.A] and TAL is that [[Lambda].sup.A] uses alpha-varying variables, whereas TAL uses register names, which, like labels on records, do not alpha-vary.(1) We assume an infinite supply of registers. Mapping to a language with a finite number of registers may be performed by spilling registers into a tuple and reloading values from this tuple when necessary.

One of the consequences of this aspect of TAL is that a calling convention must be used in code generation, and calling conventions must be made explicit in the types. Hence TAL includes the type [inverted]A[[Alpha]]. {r1:[[Tau].sub.1] ,..., rn:[[Tau].sub.n]}, which is used to describe entry points of code blocks and is the TAL analog of the [[Lambda]. sup.A] function type, [inverted]A[[Alpha]]. ([[Tau].sub.1] ,..., [[Tau].sub.n]) [right arrow] void. The key difference is that we assign fixed registers to the arguments of the code. Intuitively, to jump to a block of code of this type, the type variables [Alpha] must be suitably instantiated, and registers r1 through rn must contain values of type [[Tau].sub.1] through [[Tau].sub.n], respectively.

Another distinction between [[Lambda].sup.A] and TAL is, that, while [[Lambda].sup.A] has one mechanism (variables) for identifying values, TAL follows real machines and distinguishes between labels (which may be thought of as pointers) and registers. Registers may contain only word values, which a re integers or pointers. As in [[Lambda].sup.A], tuples and code blocks are large values and must be heap allocated. Heap objects are identified by labels, which may reside in registers. In this manner, TAL makes the layout of data in memory explicit.

In the remainder of this section, we present the syntax of TAL (Section 7.1), its dynamic semantics (Section 7.2), and its full static semantics (Section 7.3). Finally, we present the translation from [[Lambda].sup.A] to TAL (Section 7.4).

7.1 TAL Syntax

We present the syntax of TAL in Figure 13. A TAL abstract machine state, or program, is a triple consisting of a heap (H), a register file (R), and a sequence of instructions (I). The heap is a mapping of labels (l) to heap values (tuples and code blocks). The register file is a mapping of registers (r) to word values. Heaps, register files, and their respective types are not syntactically correct if they repeat labels or registers. When r appears in R, the notation R{r ?? w} represents the register file R with the r binding replaced with w; if r does not appear in R, the indicated binding is simply added to R. Similar notation is used for heaps, register file types, and heap types.

[Figure 13 ILLUSTRATION OMITTED]

Although heap values are not word values, the labels that point to them are. The other word values are integers, instantiations of word values, existential packages, and junk values (?[Tau]), which are used by the operational semantics to represent uninitialized data. A small value is either a word value, a register, or an instantiated or packed small value. We draw a distinction between word values and small values because a register must contain a word, not another register. Code blocks are linear sequences of instructions that abstract a set of type variables and state their register assumptions. The sequence of instructions is always terminated by a jmp or halt instruction. Expressions that differ only by alpha-variation of bound type variables are considered identical, as are expressions that differ only in the order of the fields in a heap, a register file, or a heap or register file type.

7.2 TAL Operational Semantics

The operational semantics of TAL is presented in Figure 14 as a deterministic rewriting system P ?? P' that maps programs to programs. Although, as discussed above, we ultimately intend a type-erasure interpretation, we do not erase the types from the operational semantics presented here, so that we may more easily state and prove a subject reduction theorem (Lemma 1). If we erase the types from the instructions, then their meaning is intuitively clear, and there is a one-to-one correspondence with conventional assembly language instructions. The two exceptions to this are the unpack and malloc instructions, which are discussed below.

[Figure 14 ILLUSTRATION OMITTED]

Intuitively, the ld [r.sub.d], [r.sub.s] [i] instruction loads the ith component (counting from 0) of the tuple bound to the label in [r.sub.s], and places this word value in [r.sub.d]. Conversely, st [r.sub.d][i], [r.sub.s] places the word value in [r.sub.s] into the ith position of the tuple bound to the label in [r.sub.d]. The instruction jmp v, where v is a value of the form l[[Tau]], transfers control to the code bound to the label l, instantiating the abstracted type variables of that code with [Tau]. The bnz r, v instruction tests the value in r to see if it is zero. If so, control continues with the next instruction; otherwise control is transferred to v as with the jmp instruction.

The instruction unpack[[Alpha], [r.sub.d]], v, where v is a value of the form pack [[Tau]', v'] as [Tau], is evaluated by substituting [Tau]' for [Alpha] in the remainder of the sequence of instructions currently being executed and by binding the register [r.sub.d] to the value v'. If types are erased, the unpack instruction reduces to a simple mov instruction.

As in [[Lambda].sup.A], malloc [r.sub.d] [[[Tau].sub.1]] ,..., [[Tau].sub.n]] allocates a fresh, uninitialized tuple in the heap and binds [r.sub.d] to the address of this tuple. Of course, real machines do not provide a primitive malloc instruction. Our intention is, that, when types are erased, malloc is expanded into a fixed instruction sequence that allocates a tuple of the appropriate size. Because this instruction sequence is abstract, it prevents optimization from reordering and interleaving these underlying instructions with the surrounding TAL code. However, this is the only instruction sequence that is abstract in TAL.

7.3 TAL Static Semantics

The purpose of the static semantics is to specify when programs are well formed and to ensure that well-formed programs do not get stuck. As programs are closed and self-contained, this is expressed by the judgment [??.sub.TAL] P. The well-formedness of a program is defined by the well-formedness of its three components: the heap, the register file, and the instruction stream. Consequently, formation judgments are required for heaps, register files, and instruction sequences, which in turn require judgments for the various sorts of values and types. The static semantics for TAL appears in Figures 16-18 and consists of 13 judgments, summarized in Figure 15 and elaborated on below. The large number of judgments is a reflection more of the large number of syntactic classes, than of any inherent semantic complexity. The static semantics is inspired by and follows the conventions of Morrisett and Harper's [1997] [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

[Figures 15-18 ILLUSTRATION OMITTED]

The first five judgments in Figure 15 specify the well-formedness conditions for types and define subtyping relationships. Four of the five judgments include a type context that indicates which type variables are in scope. Heaps and heap types must be closed, and as a result, their judgments do not include type contexts.

The subtyping judgments are not intended to support subtyping in the usual generality, although they could be expanded to do so. Instead, they are used to allow the forgetting of information. The judgment [Delta] [??.sub.TAL] [[Tau]sub.1] [is less than or equal to] [[Tau].sub.2], for instance, makes it possible to forget that a field of a tuple has been initialized. This is used in the subject reduction argument (Lemma 1) where it is sometimes necessary that references to an initialized tuple be given the old uninitialized type. The register file subtyping judgment makes it possible to forget about the contents of some registers. This makes it possible to jump to a code block when too many registers are defined.

The rest of the judgments check the well-formedness of the term constructs. Neither heaps nor register files may contain free type variables, so their judgments do not include a type context. Since values in the heap are mutually recursive, the heap's own type is used while typing the heap; to make this sound we separately require that heap types be well formed. The next four judgments are for assigning types to values. In addition to one judgment for each sort of value, there is a judgment for assigning flagged types to word values: the junk value ?[Tau] may not be assigned any regular type, but it may be assigned the flagged type [[Tau].sup.0]. Each sort of value may contain references to the heap; all but heap values may contain free type variables, but only small values may contain registers. Consequently, heap value formation requires only a heap type; word value formation adds a type context; and small value formation adds a type context and a register file type.

The central result is the type safety of TAL programs: well-formed programs never get "stuck." The well-formed terminal configurations of the operational semantics have the form (H, R{r1 ?? w}, halt[[Tau]]). All other terminal configurations are considered stuck. Type safety follows from the usual Subject Reduction and Progress theorems. Their proofs are given in the Appendix.

THEOREM (SUBJECT REDUCTION). If [??.sub.TAL] P and P ?? P', then [??.sub.TAL] P'.

THEOREM (PROGRESS). If [??.sub.TAL] P, then either there exists P' such that P ?? P' or P is of the form (H,R{r1 ?? w},halt[[Tau]]).

COROLLARY (TYPE SAFETY). If [??.sub.TAL] P, then there is no stuck P' such that P ??* P'.

7.4 Code Generation

The translation from [[Lambda].sup.A] to TAL appears in Figures 19 and 20. The type translation, [Tau][[ [multiplied by] ]], from [[Lambda].sup.A] to TAL is straightforward. The only point of interest is the translation of function types, which must assign registers to value arguments:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

[Figures 19-20 ILLUSTRATION OMITTED]

The most interesting part of the term translation is the translation of declarations. Informally, declarations are translated to instruction sequences as follows:

-- x = v is mapped to mov [r.sub.x], v.

-- x = [[Pi].sub.i](v) is mapped to the sequence mov [r.sub.x],v; ld[r.sub.x], [r.sub.x] [i- 1].

-- x = [v.sub.1] p [v.sub.2] is mapped to the sequence mov [r.sub.x], [v.sub.1]; arith [r.sub.x], [r.sub.x], [v.sub.2], where arith is the appropriate arithmetic instruction.

-- [[Alpha], x] = unpack v is mapped to unpack[[Alpha], [r.sub.x]], v.

-- x = malloc[[Tau]] is mapped to malloc [r.sub.x] [[Tau]].

-- x = v[i] [left arrow] v' is mapped to the sequence

mov [r.sub.x], v; mov [r.sub.temp], v'; ... ; st [r.sub.x][i - 1], [r.sub.temp].

-- v([v.sub.1] ,..., [v.sub.n]) is mapped to the sequence

mov [r.sub.temp], v; [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], [v.sub.1]; ... ; [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII], [v.sub.n];

mov [r.sub.1], [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]; ...; mov [r.sub.n], [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]; jmp [r.sub.temp]

Note that the arguments cannot be moved immediately into the registers [r.sub.1] ,..., rn because those registers may be used in later arguments.

-- if0(v, [e.sub.1], [e.sub.2]) is mapped to the sequence

mov [r.sub.temp], v; bnz [r.sub.temp], l[[Alpha]]; [I.sub.1] where l is bound in the heap to code[[Alpha]][Gamma].[I.sub.2] the translation of [e.sub.i] is [I.sub.i]; the free type variables of [e.sub.2] are contained in [Alpha] and [Gamma] is the register file type corresponding to the free variables of [e.sub.2].

--halt[[Tau]]v is mapped to the sequence movr1, v; halt[[Tau]]

The formal translation uses a mapping [Gamma] that tracks what label or register is used to implement each term variable. As discussed above, if0 terms are implemented by a conditional branch to a new code block representing the else-clause. These new code blocks must be heap allocated, so translations of terms (and translations of heap values, which can contain terms) must return an addition to the heap as well as an instruction sequence. Also, the translation of terms must track the current type context [Delta] and register file type F in order to place that information into new code blocks resulting from if0 terms.

LEMMA (CODE GENERATION TYPE CORRECTNESS). If [??.sub.A] P then [??.sub.TAL] [T.sub.prog][[P]].

By composing the five translations (CPS conversion, closure conversion, hoisting, allocation, and code generation), we obtain a translation from [[Lambda].sup.F] to TAL. The type correctness of the composite translation follows from the preceding type correctness lemmas.

COROLLARY (COMPILER TYPE CORRECTNESS). If [??.sub.F] e : [Tau] then [??.sub.TAL] ([T.sub.prog] o [A.sub.prog] o [H.sub.prog] o [C.sub.prog] o [K.sub.prog])[[e]].

7.5 TAL Factorial

The factorial computation translated into TAL appears in Figure 21. To obtain the code shown, a few standard optimizations were applied; in particular, a clever (but automatable) register allocation and the removal of redundant moves. Were the efficiency of this version unsatisfactory, a more efficient version could be obtained by improving the [[Lambda].sup.F] source program (e.g., by using tail recursion), by optimizing intermediate language programs (e.g., by eliminating unnecessary closure creation), or by hand-coding a highly optimized version directly in TAL, such as the one in Figure 1.

[Figure 21 ILLUSTRATION OMITTED]

8. EXTENSIONS AND PRACTICE

The previous sections provide a theoretical basis for compiling high-level languages to typed assembly language. In this section we discuss some issues that arise when putting this technology into practice.

8.1 Implementation

In order to investigate the applicability of our approach to realistic modern programming languages, we have implemented a version of TAL for the Intel 32-bit Architecture (IA32) [Intel 1996], and have compilers for a number of different source languages including a safe C-like language [Morrisett et al. 1999] and a higher-order, dynamically typed language (a subset of Scheme). Compilers for Standard ML and a small object-oriented language are also in development.

TALx86, the target language for our compilers, is a strongly typed version of the IA32 assembly language. Our type checker verifies standard MASM assembly code in which type annotations and complex instructions such as malloc are assembly language macros. The MASM assembler processes this annotated code as it would any other assembly code, expanding the instruction macros as their definitions dictate and erasing types as it translates the assembly code into concrete machine instructions. We have also implemented our own assembler and are extending it to produced typed object files. Such typed object files include code/data segments and a type segment similar to Necula and Lee's code and proof segments in their PCC binaries [Necula 1997]. We have implemented a tool that reads TALx86 files, type checks them, and assembles them into object files or invokes MASM.

The TALx86 type system is based on the type system described in this article but enriched with a variety of standard constructs including floats, sums, arrays, references, recursive types, and higher-order type constructors. In order to deal with floating-point values correctly in the presence of polymorphism, we use a kind structure that distinguishes types of objects that are 32 bits wide (such as pointers and integers) from types of objects possibly of other sizes. If a polymorphic type variable [Alpha] has the 32-bit kind, then objects of type [Alpha] can be passed in general-purpose registers, and tuple offsets may be computable for fields appearing after a field of type [Alpha]. If, on the other hand, [Alpha] has the more general kind "Type," the type checker cannot tell how large objects of type [Alpha] are, and these operations are disallowed.

To support separate compilation and type-safe linking, we have also augmented our typed assembly language with a module system [Glew and Morrisett 1999]. A TAL interface file specifies the types and terms that a TAL implementation file defines. The types may either be opaque to support information hiding and modularity, or transparent to allow information sharing and admit some cross-module optimizations. Our system performs a series of checks to ensure that implementations are well formed and that their interfaces are compatible and complete. Once interface compatibility and completeness have been verified, we assemble the code as described above and invoke a standard untyped linker.

To deal with the creation and examination of exception packets TALx86 includes a type-tagging mechanism [Glew 1999]. The basic idea is that freshly created he appointers may be associated with a type, and that a tag for an unknown type [Alpha] can be tested against a tag for a known type [Tau]. If the test succeeds, the unknown type is refined to the known type. Using these tags, we implement an exception packet as an existentially packaged pair containing a tag of the hidden type (serving as the exception name) and a value of that type.

TALx86 also contains some support for compiling objects. The type system has a more general notion of subtyping than this article that includes the usual contravariant rule for code, right-extension and depth subtyping for tuples, and a variance mechanism for arrays and references. Furthermore, the type-tagging mechanism can also be used to tag objects with their class. This mechanism provides us with a way to implement down-casting. However, while TALx86 contains the necessary constructs to admit some simple object encodings, we are still developing the theoretical and practical tools we need to admit efficient object encodings.

Although this article describes a CPS-based compiler, all of the compilers we have built use a stack-based compilation model. Both standard continuations and exceptions are allocated on the stack, and the stack is also used to store spilled temporary values. The details of our stack typing discipline are discussed in Morrisett et al. [1998]. The primary mechanisms are as follows. The size of the stack and the types of its contents are specified by stack types, and code blocks indicate stack types describing the state of the stack they expect. Since code is typically expected to work with stacks of varying size, functions may quantify over stack type variables, resulting in stack polymorphism. The combination of stack types and our register typing discipline allows us to model almost any standard calling convention. Arguments, results, and continuations (or return addresses) may be placed in registers, on the stack, or both, and the implementer may specialize conventions for known functions for better register allocation.

Real machines also have a finite amount of heap space. It is straightforward to link TALx86 to a conservative garbage collector [Boehm and Weiser 1988] and reclaim dead heap values. It is worth noting that our use of conservative collection is sound. Conservative collectors make assumptions about the way pointers can be used in programs that untyped assembly language programs can violate. However, the TAL type system guarantees that these assumptions do hold because labels are a strong abstraction; labels cannot be synthesized out of integers, and operations like pointer arithmetic are disallowed. TAL guarantees that other GC constraints hold because values that disobey the constraints cannot be constructed. For example, TAL disallows pointers into the middle of objects and ensures alignment constraints are obeyed.

Support for an accurate collector would require introducing tags so that we may distinguish pointers from integers, or else require a type-passing interpretation [Tolmach 1994; Morrisett and Harper 1997]. In the former case, we would have to ensure that all values are properly tagged and fully initialized at every potential garbage collection site. We believe that it is feasible to devise a type system to ensure these constraints, but we have not seriously investigated this option.

8.2 Future Work

There remain several directions in which TAL could be improved. One of the most important is to make array manipulation efficient. In order to ensure safe access to arrays, TALx86 uses complex instructions (which expand into three real instructions) that perform subscript and update operations after checking that the array offset is in bounds. These bounds checks cannot be eliminated in the current TAL framework. As a result, array-intensive applications will suffer the same performance penalties that they do in Java just-in-time compilers where there is no time to perform analyses to eliminate the checks. However, Xi [1999] and Xi and Pfenning [1998; 1999] have shown how to eliminate array bounds checks effectively using dependent types. TALx86 can be extended with similar constructs. We have implemented a prototype version in which these checks can be eliminated, but we have not yet added compiler support for generating code with unchecked array subscripts.

Another important direction is to augment our compiler with data-layout optimizations such as those used in the TIL compiler [Tarditi et al. 1996]. As discussed in Section 5, such optimizations require programs to have the ability to analyze types at run-time, which is not directly compatible with the type-erasure interpretation adopted here. To make such optimizations permissible, we are augmenting the TALx86 language so that TAL programs can construct values that represent types and analyze those values when necessary, following the work of Crary et al. [1998; 1999].

Although we believe our translations are operationally correct, we are still searching for robust proofs of correctness. Similar CPS [Danvy and Filinski 1992] and closure conversion [Minamide et al. 1996] translations have already been proven correct, but these results do not easily extend to languages that include recursive types or objects. The principal problem is that these arguments are based on inductively defined, type-indexed logical relations between source and target language terms. Extending this framework so that it supports recursive types or objects is difficult because the relations can no longer be constructed in a simple inductive fashion. A syntactic proof of correctness seems possible (we have constructed such arguments for the CPS and closure conversion phases), but the proofs are overly specific to the details of the translation. Moreover, security-conscious applications might require translations that are not only operationally correct but also fully abstract. We hope further research on the proof theory of similar systems will eventually allow us to construct these arguments.

Other avenues of future research include extension of our type system to the same level of generality as PCC through the use of a dependent type theory, an investigation of the support required to compile Java classes and objects into TAL, and an exploration of type-theoretic mechanisms for performing explicit memory management.

9. SUMMARY

We have given a compiler from System F to a statically typed assembly language. The type system for the assembly language ensures that source-level abstractions such as closures and polymorphic functions are enforced at the machine-code level. Furthermore, although the type system may preclude some advanced optimizations, many common compiler-introduced, low-level optimizations, such as register allocation, instruction selection, or instruction scheduling, are largely unaffected. Furthermore, programmers concerned with efficiency can hand-code routines in assembly, as long as the resulting code passes the type checker. Consequently, TAL provides a foundation for high-performance computing in environments where untrusted code must be checked for safety before being executed.

ACKNOWLEDGMENTS

We are grateful to Dan Grossman, Dexter Kozen, Frederick Smith, Stephanie Weirich, Steve Zdancewic, and the anonymous referees for their many helpful comments and suggestions.

REFERENCES

APPEL, A. 1992. Compiling with Continuations. Cambridge University Press, New York, NY, USA.

APPEL, A. W. AND MACQUEEN, D. B. 1991. Standard ML of New Jersey. In 3rd International Symposium on Programming Language Implementation and Logic Programming, M. Wirsing, Ed. Springer-Verlag, New York, NY, USA, 1-13. Volume 528 of Lecture Notes in Computer Science.

BERSHAD, B., SAVAGE, S., PARDYAK, P., SIRER, E., FIUCZYNSKI, M., BECKER, D., CHAMBERS, C., AND EGGERS, S. 1995. Extensibility, safety and performance in the SPIN operating system. In 15th ACM Symposium on Operating Systems Principles. ACM Press, New York, NY, USA, 267-284.

BIRKEDAL, L., ROTHWELL, N., TOFTE, M., AND TURNER, D. 1993. The ML Kit (version 1). Tech. Rep. 93/14, Department of Computer Science, University of Copenhagen.

BOEHM, H. AND WEISER, M. 1988. Garbage collection in an uncooperative environment. Softw. Pract. Exper. 18, 9 (Sept.), 807-820.

CRARY, K. AND WEIRICH, S. 1999. Flexible type analysis. In 1999 ACM SIGPLAN International Conference on Functional Programming. ACM Press, New York, NY, USA. To appear.

CRARY, K., WEIRICH, S., AND MORRISETT, G. 1998. Intensional polymorphism in type-erasure semantics. In 1998 ACM SIGPLAN International Conference on Functional Programming. ACM Press, New York, NY, USA, 301-312.

DANVY, O. AND FILINSKI, A. 1992. Representing control: a study of the CPS transformation. Math. Struct. Comput. Sci. 2, 4 (Dec.), 361-391.

DIMOCK, A., MULLER, R., TURBAK, F., AND WELLS, J. B. 1997. Strongly typed flow-directed representation transformations. In 1997 ACM SIGPLAN International Conference on Functional Programming. ACM Press, New York, NY, USA, 85-98.

GIRARD, J.-Y. 1971. Une extension de l'interpretation de Godel a l'analyse, et son application l'elimination de coupures dans l'analyse et la theorie des types. In Proceedings of the Second Scandinavian Logic Symposium, J. E. Fenstad, Ed. North-Holland, 63-92.

GIRARD, J.-Y. 1972. Interpretation fonctionelle et elimination des coupures de l'arithmetique d'ordre superieur. Ph.D. thesis, Universite Paris VII.

GIRARD, J.-Y. 1987. Linear logic. Theor. Comput. Sci. 50, 1-102.

GLEW, N. 1999. Type dispatch for named hierarchical types. In 1999 ACM SIGPLAN International Conference on Functional Programming. ACM Press, New York, NY, USA.

GLEW, N. AND MORRISETT, G. 1999. Type-safe linking and modular assembly language. In 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, New York, NY, USA, 250-261.

HARPER, R. AND LILLIBRIDGE, M. 1993. Explicit polymorphism and CPS conversion. In 20th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, New York, NY, USA, 206-219.

INTEL. 1996. Intel Architecture Software Developer's Manual. Intel Corporation, P.O. Box 7641, Mt. Prospect IL 60056-7641.

KRANZ, D., KELSEY, R., REES, J., HUDAK, P. R., PHILBIN, J., AND ADAMS, N. 1986. ORBIT: An optimizing compiler for Scheme. In Proceedings of the ACM SIGPLAN '86 Symposium on Compiler Construction. ACM Press, New York, NY, USA, 219-233.

LAUNCHBURY, J. AND PEYTON JONES, S. 1995. State in Haskell. J. Lisp Symb. Comput. 8, 4 (Dec.), 293-341.

LEROY, X. 1992. Unboxed objects and polymorphic typing. In 19th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, New York, NY, USA, 177-188.

LINDHOLM, T. AND YELLIN, F. 1996. The Java Virtual Machine Specification. Addison-Wesley, Menlo Park, CA, USA.

MILNER, R., TOFTE, M., HARPER, R., AND MACQUEEN, D. 1997. The Definition of Standard ML (Revised). MIT Press, Cambridge, MA, USA.

MINAMIDE, Y., MORRISETT, G., AND HARPER, R. 1996. Typed closure conversion. In 23rd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, New York, NY, USA, 271-283.

MITCHELL, J. AND PLOTKIN, G. 1988. Abstract types have existential type. ACM Trans. Program. Lang. Syst. 10, 3 (July), 470-502.

MORRISETT, G., CRARY, K., GLEW, N., GROSSMAN, D., SAMUELS, R., SMITH, F., WALKER, D., WEIRICH, S., AND ZDANCEWIC, S. 1999. TALx86: A realistic typed assembly language. In ACM SIGPLAN Workshop on Compiler Support for System Software. INRIA Research Report, vol. 0228. INRIA, Centre de Diffusion, INRIA, BP 105-78153 Le Chesnay Cedex, France.

MORRISETT, G., CRARY, K., GLEW, N., AND WALKER, D. 1998. Stack-based typed assembly language. In Workshop on Types in Compilation. Lecture Notes in Computer Science, vol. 1473. Springer-Verlag, Berlin, Germany, 28-52.

MORRISETT, G. AND HARPER, R. 1997. Semantics of memory management for polymorphic languages. In 1st Workshop on Higher Order Operational Techniques in Semantics, A. Gordon and A. Pitts, Eds. Publications of the Newton Institute. Cambridge University Press, Cambridge, UK.

MORRISETT, G., TARDITI, D., CHENG, P., STONE, C., HARPER, R., AND LEE, P. 1996. The TIL/ML compiler: Performance and safety through types. In ACM SIGPLAN Workshop on Compiler Support for System Software.

NECULA, G. 1997. Proof-carrying code. In 24th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, New York, NY, USA, 106-119. NECULA, G. 1998. Compiling with proofs. Ph.D. thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213-3891, USA. Also available as CMU technical report CMU-CS-98-154.

NECULA, G. AND LEE, P. 1996. Safe kernel extensions without run-time checking. In 2nd USENIX Symposium on Operating System Design and Implementation. USENIX Assocation, Berkeley, CA, USA, 229-243.

PEYTON JONES, S., HALL, C., HAMMOND, K., PARTAIN, W., AND WADLER, P. 1993. The Glasgow Haskell compiler: a technical overview. In Proc. UK Joint Framework for Information Technology (JFIT) Technical Conference. 249-257.

REYNOLDS, J. 1974. Towards a theory of type structure. In Programming Symposium. Lecture Notes in Computer Science, vol. 19. Springer-Verlag, New York, NY, USA, 408-425.

SHAO, Z. 1997. An overview of the FLINT/ML compiler. In Workshop on Types in Compilation. Boston College Computer Science Department Technical Report, vol. BCSS-97-03. Boston College, Fulton Hall, Room 460, Chestnull Hill, MA 02467-3808, USA.

STEELE, JR., G. 1978. Rabbit: A compiler for Scheme. M.S. thesis, MIT.

TARDITI, D., MORRISETT, G., CHENG, P., STONE, C., HARPER, R., AND LEE, P. 1996. TIL: A type-directed optimizing compiler for ML. In ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM Press, New York, NY, USA, 181-192.

TOLMACH, A. 1994. Tag-free garbage collection using explicit type parameters. In 1996 ACM Conference on Lisp and Functional Programming. ACM Press, New York, NY, USA, 1-11.

WADLER, P. 1990a. Comprehending monads. In 1990 ACM Conference on Lisp and Functional Programming. ACM Press, New York, NY, USA, 61-78.

WADLER, P. 1990b. Linear types can change the world! In Programming Concepts and Methods, M. Broy and C. Jones, Eds. North-Holland, Sea of Galilee, Israel. IFIP TC 2 Working Conference.

WADLER, P. 1993. A taste of linear logic. In Mathematical Foundations of Computer Science. Lecture Notes in Computer Science, vol. 711. Springer-Verlag, Gdansk, Poland.

WAND, M. 1992. Correctness of procedure representations in higher-order assembly language. In Proceedings Mathematical Foundations of Programming Semantics '91, S. Brookes, Ed. Lecture Notes in Computer Science, vol. 598. Springer-Verlag, New York, NY, USA, 294-311.

XI, H. 1999. Dependent types in practical programming. Ph.D. thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213-3891, USA.

XI, H. AND PFENNING, F. 1998. Eliminating array bound checking through dependent types. In ACM SIGPLAN Conference on Programming Language Design and Implementation. ACM Press, New York, NY, USA, 249-257.

XI, H. AND PFENNING, F. 1999. Dependent types in practical programming. In 26th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages. ACM Press, New York, NY, USA, 214-227.

Received December 1997; revised December 1998; accepted March 1999

APPENDIX

LEMMA (CONTEXT STRENGTHENING). If [Delta] [subset or equal to] [Delta]' then

(1) If [Delta] [??.sub.TAL] [Tau] then [Delta]' [??.sub.TAL] [Tau].

(2) If [Delta] [??.sub.TAL] [[Tau].sub.1] [is less than or equal to] [[Tau].sub.2] then [Delta]' [??.sub.TAL] [[Tau].sub.1] [is less than or equal to] [[Tau].sub.2].

PROOF. Part 1 is immediate by (type). Part 2 is by induction on derivations. [square]

LEMMA (SUBTYPING REGULARITY). If [Delta] [??.sub.TAL] [Tau] [is less than or equal to] then [Tau]' [Delta] [??.sub.TAL] [Tau] and [Delta] [??.sub.TAL] [Tau]'.

PROOF. By induction on derivations. [square]

LEMMA (HEAP EXTENSION). If [??.sub.TAL] H : [Psi], 0 [??.sub.TAL] [Tau], [Psi] {l:[Tau]} [??.sub.TAL] h:[Tau] hval, and l [is not an element of] H then

(1) [??.sub.TAL] [Psi] {l : [Tau]}

(2) [??.sub.TAL] H {l ?? h} : [Psi] {l : [Tau]}

(3) If [Psi] [??.sub.TAL] R: [Gamma] then [Psi] {l : [Tau]} [??.sub.TAL] R : [Gamma]

(4) If [Psi]; [Delta]; [Gamma] [??.sub.TAL] I then [Psi] {l : [Tau]}; [Delta]; [Gamma] [??.sub.TAL] I

(5) If [Psi] [??.sub.TAL] h: [Sigma] hval then [Psi] {l : [Tau]}; [??.sub.TAL] h : [Sigma] hval

(6) If [Psi]; [Delta] [??.sub.TAL] w : [[Sigma].sup.[Psi]] then [Psi] {l : [Tau]}; [Delta] [??.sub.TAL] w : [[Sigma].sup.[Psi]]

(7) If [Psi]; [Delta] [??.sub.TAL] w : [Sigma] wval then [Psi] {l : [Tau]}; [Delta] [??.sub.TAL] w : [Sigma] wval

(8) If [Psi]; [Delta]; [Gamma] [??.sub.TAL] v : [Sigma] then [Psi] {l : [Tau]}; [Delta]; [Gamma] [??.sub.TAL] v : [Sigma].

PROOF. Part 1 is immediate by (htype). Part 2 follows from parts 1 and 5. Parts 3-8 are by induction on derivations. [square]

LEMMA (HEAP UPDATE). If [??.sub.TAL] H : [Psi], 0 [??.sub.TAL] [Tau] [is less than or equal to] [Psi] (l), and [Psi] {l : [Tau]} [??.sub.TAL] h : [Tau] then

(1) [??.sub.TAL] [Psi] {l : [Tau]}

(2) [??.sub.TAL] H {l ?? h} : [Psi] {l : [Tau]}

(3) If [Psi] [??.sub.TAL] R : [Gamma] then [Psi] {l : [Tau]} [??.sub.TAL] R : [Gamma]

(4) If [Psi]; [Delta]; [Gamma] [??.sub.TAL] I then [Psi] {l : [Tau]}; [Delta]; [Gamma] [??.sub.TAL] I

(5) If [Psi] [??.sub.TAL] h : [Sigma] hval then [Psi] {l : [tau]} [??.sub.TAL] h : [Sigma] hval

(6) If [Psi]; [Delta] [??.sub.TAL] w : [[Sigma].sup.[Psi]] then [Psi] {l : [Tau]}; [Delta] [??.sub.TAL] w : [[Sigma].sup.[Psi]]

(7) If [Psi]; [Delta] [??.sub.TAL] w : [Sigma] wval then [Psi] {l : [Tau]}; [Delta] [??.sub.TAL] w : [Sigma] wval

(8) If [Psi]; [Delta]; [Gamma] [??.sub.TAL] v : [Sigma] then [Psi] {l : [Tau]}; [Delta] [Gamma] [??.sub.TAL] v : [Sigma].

PROOF. Part 1 is immediate by (htype) and Subtyping Regularity. Part 2 follows from parts 1 and 5. Parts 3-8 are by induction on derivations. The only interesting case is the case for the rule (label). The derivation must end with the following:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

If l [is not equal to] l' then clearly the inference also holds for [Psi] {l : [Tau]}. Suppose l = l'. By hypothesis and Context Strengthening, we deduce [Delta] [??.sub.TAL] [Tau] [is less than or equal to] [Sigma]'. Then the conclusion may be proven with the (trans) rule, as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] [square]

LEMMA (REGISTER FILE UPDATE). If [Psi] [??.sub.TAL] R : [Gamma] and [Psi]; 0 [??.sub.TAL] w : [Tau] wval then [Psi] [??.sub.TAL] R{r ?? w} : [Gamma]{r : [Tau]}.

PROOF. Suppose R is {[r.sub.1] ?? [w.sub.1], ..., [r.sub.n] ?? [w.sub.n]} and [Gamma] is {[r.sub.i] ?? [[Tau].sub.1], ..., [r.sub.m] ?? [[Tau].sub.m]} where r may or may not be in {[r.sub.1], ..., [r.sub.n]}. Since [Psi] [??.sub.TAL] R : [Gamma], by the rule (reg) it must be the case that n [is greater than or equal to] m and [Psi]; 0 [??.sub.TAL] [w.sub.i] : [[Tau].sub.i] wval (for all 1 [is less than or equal to] i [is less than or equal to] n and some [[Tau].sub.m+1], ..., [[Tau].sub.n]). So certainly for i such that [r.sub.i] [is not equal to] r, we have [Psi]; 0 [??.sub.TAL] [w.sub.i] : [[Tau].sub.i] wval, and by hypothesis we have [Psi]; 0 [??.sub.TAL] w : [Tau] wval; so by rule (reg) [Psi] [??.sub.TAL] R {r ?? w} : [Gamma] {r ?? [Tau]}. [square]

LEMMA (CANONICAL HEAP FORMS). If [Psi] [??.sub.TAL] h : [Tau] hval then

(1) If [Tau] = [inverted]A[[Alpha]].[Gamma] then

(a) h = code[[Alpha]] [Gamma].I

(b) [Psi]; [Alpha]; [Gamma] [??.sub.TAL] I.

(2) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. By inspection. [square]

LEMMA (CANONICAL WORD FORMS). If [??.sub.TAL], H : [Psi] and [Psi]; 0 [??.sub.TAL] w : [Tau] wval then

(1) If [Tau] = int then w = i.

(2) If [Tau] = [inverted]A [[[Beta].sub.1], ..., [[Beta].sub.m]].[Gamma] then

(a) w = l [[Sigma].sub.1], ..., [[Sigma].sub.n]]

(b) H (l) = code [[Alpha].sub.1], ..., [[Alpha].sub.n], [[Beta].sub.1], ..., [[Beta].sub.m]][Gamma]'.I

(c) [Gamma] = [Gamma]'[[Alpha]/[Alpha]]

(d) [Psi]; [[Alpha].sub.1], ..., [[Alpha].sub.n], [[Beta].sub.1], ..., [[Beta].sub.m; [Gamma]' [??.sub.TAL] I.

(3) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(4) If [Tau] = [exists][Alpha].[Tau] then w = pack [[Tau]', w'] as [exists][Alpha].[Tau] and [Psi];0 [??.sub.TAL] w' : [Tau][[Tau]'/[Alpha]] wval.

PROOF. (1) By inspection.

(2) By induction on the derivation of [Psi]; 0 [??.sub.TAL] w : [Tau] wval: The derivation must end with either the (label) or the (tapp-word) rule. Suppose the former. Then we have w = l, [Psi] (l) = r', and 0 [??.sub.TAL] [Tau]' [is less than or equal to] [inverted]A [[Beta]].[Gamma]. Inspection of the subtyping rules then reveals that [Tau]' = [inverted]A[Beta].[Gamma]. Since [??.sub.TAL] H : [Psi], we may deduce that [Psi] [??.sub.TAL] H (l): [inverted]A[Beta].[Gamma] hval. The conclusion follows by Canonical Heap Forms.

Alternatively, suppose the derivation ends with (tapp-word). Then w = w'[[Sigma]] and [Psi]; 0 [??.sub.TAL] w' : [inverted]A[[Alpha], [Beta]].[Gamma]' wval with [Gamma] = [Gamma]' [[Sigma]/[Alpha]]. The conclusion follows by induction.

(3) The derivation [Psi]; 0 [??.sub.TAL] w : [Tau] wval must be shown by use of the (label) rule. Thus, w = l, [Psi] (l) = [Tau]', and 0 [??.sub.TAL]. [Tau]' [is less than or equal to] [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]. Let us say that [Psi] [is less than or equal to] [Psi] and 1 [is less than or equal to] 0. Then inspection of the subtype rules reveals that [Tau] must be of the form [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] with [Psi]' [is less than or equal to] [Psi].sub.i] (for each 0 [is less than or equal to] i [is less than or equal to] n - 1). Since [??.sub.TAL] H : [Psi], we may deduce that [Psi] [??.sub.TAL] H (l) : [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] hval. Thus H (l) = <[w.sub.0], ..., [w.sub.n-1]> and [Psi]; 0 [??.sub.TAL] [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] by Canonical Heap Forms. It remains to show that [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] for all 0 [is less than or equal to] n - 1. Suppose [[Psi]'.sub.i] = 1 and [Psi].sub.i] = 0 (otherwise the conclusion is immediate). Then [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] is shown by the (init) rule, which also permits the deduction of [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

(4) By inspection. [square]

LEMMA (R TYPING). If [Psi] [??.sub.TAL] R : [Gamma] and [Psi];0;[Gamma] [??.sub.TAL] v : [Tau] then [Psi];0 [??.sub.TAL] R(v) : [Tau] wval.

PROOF. The proof is by induction on the syntax of v. Consider the cases for v:

Case v = w. Immediate.

Case v = r. The only rule that can type v is (reg-val), and this rule requires [Tau] = [Gamma](r). The only rule that can type R is (reg), and this rule requires [Psi];0 [??.sub.TAL] R(r):[Tau] wval. The conclusion follows, since R(r) = R(r).

Case v = v'[[Sigma]]. The only rule that can type v is (tapp-val), so [Tau] = [inverted]A[[Beta]].[Gamma]'[[Sigma]/[Alpha]] and [Psi]; 0; [Gamma] [??.sub.TAL] v': [inverted]A[[Alpha], [Beta]].[Gamma]'. By induction we deduce [Psi]; 0 [??.sub.TAL] R(v') : [inverted]A[[Alpha], [Beta]].[Gamma]' wval, and then the rule (tapp-word) proves [Psi]; 0 [??.sub.TAL] R(v'): [inverted]A[[Alpha], [Beta]].[Gamma]' wval, and then the rule (tapp-word) proves [Psi]; 0 [??.sub.TAL] R(v') [Sigma] : [inverted]A[[Beta]].[Gamma]]'[[Sigma]/[Alpha]] wval. The result follows, since R(v'[[Sigma]]) = R(v')[[Sigma]].

Case v = pack [[Sigma], v'] as [exists][Alpha].[Tau]'. The only rule that can type v is (pack-val), so [Tau] = [exists][Alpha].[Tau] and [Psi]; 0; [Gamma][??.sub.TAL] v' : [Tau]'[[Sigma]/[Alpha]]. By induction we deduce [Psi]; 0 [??.sub.TAL] R(v') : [Tau]'[[Sigma]/[Alpha]] wval, and then the rule (pack-word) proves [Psi]; 0 [??.sub.TAL] pack [[Sigma], R(v')] as [exists][Alpha].[Tau]' : [exists][Alpha].[Tau]' wval. The result follows, since R (pack [[Sigma], v'] as [exists][Alpha].[Tau]') = pack [[Sigma], R(v')] as [exists][Alpha].[Tau]'. [square]

LEMMA (CANONICAL FORMS). If [??.sub.TAL] H : [Psi] hval, [Psi] [??.sub.TAL] R : [Gamma], and [Psi]; 0; [??.sub.TAL] v : [Tau] then

(1) If [Tau] = int then R(v) = i.

(2) If [Tau] = [inverted]A[[[Beta].sub.1], ..., [[Beta].sub.m]]].[Gamma] then

(a) R(v) = l[[Sigma].sub.1], ..., [[Sigma].sub.n]]

(b) H(l) = code [[[Alpha].sub.1], ..., [[Alpha].sub.n], [[Beta].sub.1], ..., [[Beta].sub.m]]][Gamma]'.I

(c) [Gamma] = [Gamma]'[[Sigma]/[Alpha]]

(d) [Psi]; [[Alpha].sub.1], ..., [[Alpha].sub.n], [[Beta].sub.1], ..., [[Beta].sub.m];[Gamma] [??.sub.TAL] I.

(3) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(4) If [Tau] = [exists][Alpha].[Tau] then R(v) = pack [[Tau]', w] as [exists][Alpha].[Tau] and [Psi]; 0 [??.sub.TAL] w : [Tau][[Tau]'/[Alpha]] wval.

PROOF. Immediate from R Typing and Canonical Word Forms. [square]

LEMMA (TYPE SUBSTITUTION). If [Beta] [??.sub.TAL] [[Tau].sub.i] then

(1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(2) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(3) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(4) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(5) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(6) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. By induction on derivations. The only interesting case is the case for the rule (type):

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The hypothesis must also be proven with the rule (type), so FTV ([[Tau].sub.i]) [subset of equal to] {[Beta]}. Consequently

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Hence we may prove [Beta] [??.sub.TAL] [Tau][[Tau]/[Alpha]] using the (type) rule. []

LEMMA (REGISTER FILE WEAKENING). If [Delta] [??.sub.TAL] [[Gamma].sub.1] [is less than or equal to] [[Gamma].sub.2] and [Psi]; [Delta] [??.sub.TAL] R : [[Gamma].sub.1] then [Psi]; [Delta] [??.sub.TAL], R : [[Gamma].sub.2].

PROOF. By inspection of the rules (weaken) and (peg). []

THEOREM (SUBJECT REDUCTION). If [??.sub.TAL], P and P, ?? P' then [??.sub.TAL], P'.

PROOF. P has the form (H, R, [Iota]; I) or (H, R, jmp v). Let TD be the derivation of [??.sub.TAL] P. Consider the following cases for jmp or [Iota]:

Case jmp. TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By the operational semantics, P' = (H,R,I[[Sigma]/[Alpha]]) where R(v) = l[[Sigma]] and H(l) = code[[Alpha]][Gamma]".I. Then

(1) [??.sub.TAL], H: [Psi] is in TD.

(2) From 0 [??.sub.TAL] [Gamma] [is less than or equal to] [Gamma]' and [Psi] [??.sub.TAL] it follows by Register File Weakening that [Psi] [??.sub.TAL] R : [Gamma]'.

(3) By Canonical Forms it follows from [Psi];0;[Gamma] [??.sub.TAL], v : inverted] A[].[Gamma]' that [Gamma]' = [Gamma]"[[Sigma]/[Alpha]] and [Psi]; [Alpha]; [Gamma]" [??.sub.TAL] I. By Type Substitution we conclude [Psi]; 0; [Gamma]' [??.sub.TAL] I[[Sigma]/[Alpha]].

Case add, mul, sub. TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [Gamma]' = [Gamma]([r.sub.d] : int}. By the operational semantics, P' = (H,R',I) where R' = R([r.sub.d] ?? R([r.sub.s]) p R(v)). Then

(1) [??.sub.TAL] H : [Psi] is in TD.

(2) By Canonical Forms it follows that R([r.sub.s]) and R(v) are integer literals, and therefore [Psi]; 0 [??.sub.TAL] R([r.sub.s])p R(v) : int wval. We conclude [Psi] [??.sub.TAL] R' : [Gamma]' by Register File Update.

(3) [Psi]; 0; [Gamma]' [??.sub.TAL] I is in TD.

Case bnz. TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

If R(r) = 0 then P' = (H, R, I) and [??.sub.TAL] P' follows, since [Psi]; 0; [Gamma] [??.sub.TAL] I is in TD. Otherwise the reasoning is exactly as in the case for jmp.

Case 1d. TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [Gamma]' = [Gamma]{[r.sub.d] : [[Tau].sub.i]}. By the operational semantics, P' = (H, R', I) where R' = R{[r.sub.d] ?? [w.sub.i]}, R([r.sub.s]) = l, H(l) = <[w.sub.0], ..., [w.sub.m]-1}, and 0 [is less than or equal to] i [is less than] m. Then

(1) [??.sub.TAL] H : [Psi] is in TD.

(2) By Canonical Forms it follows from [Psi];0;[Gamma] [??.sub.TAL] [r.sub.s] : [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] that m = n and [Psi];0 [??.sub.TAL] [w.sub.i] [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] for 0 [is less than or equal to] j [is less than] n. Since [[Psi].sub.i] = 1 it must be the case (by inspection of the (init) rule) that [Psi]; 0 [??.sub.TAL] [w.sub.i] : [[Tau].sub.i] wval. By Register File we conclude [Psi] [??.sub.TAL] R' : [Gamma]'.

(3) [Psi]; 0; [Gamma]' [??.sub.TAL] I is in TD.

Case malloc. TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [Sigma] = [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] [Psi]' = [Psi] {l : [Sigma]}, and [Gamma]' = [Gamma]{[r.sub.d] : [Sigma]}. By the operational semantics, P'= (H',R',I) where H'= H{l ?? <?[[Tau].sub.1], ..., ?[[Tau].sub.n]>}, R'= R{[r.sub.d] ?? l}, and l [is not element of] H. Then

(1) By the (tuple) and (uninit) rules we may deduce [Psi]' [??.sub.TAL] <[[Tau].sub.1], ..., ?[[Tau].sub.n]>) : [Sigma] hval.

By Heap Extension it follows that [??.sub.TAL] H' : [Psi]'.

(2) By the (type), (reflex), and (label) rules we may deduce that [Psi]'; 0 [??.sub.TAL] l : [Sigma] wval. By Heap Extension we deduce that [Psi]';0 [??.sub.TAL] R: [Gamma], and it follows by Register File Update that [Psi]' [??.sub.TAL] R': [Gamma]'.

(3) By Heap Extension, [Psi]'; 0; [Gamma]'[??.sub.TAL] I.

Case mov. TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [Gamma]' = [Gamma]{r : [Tau]}. By the operational semantics, P' = (H, R', I) where R' = R{r ?? R(v)}. Then

(1) [??.sub.TAL] H : [Psi] is in TD.

(2) By R Typing it follows from [Psi]; 0; [Gamma] [??.sub.TAL] v : [Tau] that [Psi]; 0 [??.sub.TAL] R(v): [Tau] wval.

Using Register File Update we conclude that [Psi] [??.sub.TAL] R': [Gamma]'.

(3) [Psi]; 0; [Gamma]' [??.sub.TAL] I is in TD.

Case st. TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By the operational semantics, P' = (H', R, I) where

H'= H{l ?? <[w.sub.0], ..., [w.sub.i-1], R([r.sub.s]), [w.sub.i+1], ..., [w.sub.m-1]>}

and R([r.sub.d]) = l, H(l) = <[w.sub.0], ..., [w.sub.m]>, and 0 [is less than or equal to] i [is less than] m. Then

(1) Since [Psi]; 0; [Gamma] [??.sub.TAL] [r.sub.d] = [[Sigma].sub.0], it must be the case that [Gamma]([r.sub.d]) = [[Sigma].sub.0], and thus since [Psi] [??.sub.TAL] R : [Gamma] and R([r.sub.d]) = l we may deduce [Psi]; 0 [??.sub.TAL], l : [[Sigma].sub.0] wval. The latter judgment must be proven with the (label) rule; hence 0 [??.sub.TAL] [[Sigma]'.sub.0] <[is less than or equal to] [[Sigma].sub.0] where [Psi] (l) = [[Sigma]'.sub.0]. Note that it follows from Subtyping Regularity and the definition of [[Sigma].sub.0] that 0 [??.sub.TAL] [[Tau].sub.j] for each 0 [is less than or equal to] j [is less than] n.

Let us say that [Psi] [is less than or equal to] [Psi] and 1 [is less than or equal to] 0. Inspection of the subtyping rules reveals that [[Sigma].sub.0]' must be of the form [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] with [[Psi]'.sub.j] [is less than or equal to] [[Psi].sub.j]. Let

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Then 0 [??.sub.TAL] [[Sigma]'.sub.1] [is less than or equal to] and 0 [??.sub.TAL] [[Sigma]'.sub.1] [is less than or equal to] [[Sigma]'.sub.0]. Since [??.sub.TAL] H : [Psi], we may deduce that m = n and [Psi];0 [??.sub.TAL] [w.sub.j] [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII] for 0 [is less than or equal to] j [is less than] n. Let [Psi]' = [Psi]{l : [[Sigma]'.sub.1]}. By

Heap Update it follows that [Psi]'; 0 [??.sub.TAL] [w.sub.j]: [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII].

Using R Typing and Heap Update, we may deduce that [Psi]; 0 [??.sub.TAL] R([r.sub.s]) : [[Tau].sub.i] wval, and by applying the (init) and (tuple) rules we may conclude that

[Psi]'[??.sub.TAL] <[w.sub.0], ..., [w.sub.i-1], R([r.sub.s]), [w.sub.i+1], ..., [w.sub.m-1]) : [[Sigma]'.sub.1] hval.

Hence [??.sub.TAL] H': [Psi]' by Heap Update.

(2) By Heap Update we may deduce that [Psi] [??.sub.TAL] R : [Gamma]. Recall that 0 [??.sub.TAL] [[Sigma]'.sub.1] [is less than or equal to] [[Sigma].sub.1]. Thus, [Psi]; 0 [??.sub.TAL] l : [[Sigma].sub.1] wval, and by Register File Update we may conclude that [Psi]' [??.sub.TAL] R: [Gamma]' (since R = R{[r.sub.d] ?? l}).

(3) By Heap Update, [Psi]'; 0; [Gamma]' [??.sub.TAL] I.

Case unpack. TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By the operational semantics, P' = (H, R',I') where R' = R{r ?? w}, I' = I [[Tau]/[Alpha]] and R(v) - pack [[Tau], w] as [exists][Alpha].[Tau]'. Then

(1) [??.sub.TAL] H: [Psi] is in TD.

(2) By Canonical Forms it follows from [Psi];0 [Gamma] [??.sub.TAL]F v : [exists][Alpha].[Tau]' that [Psi]; 0 [??.sub.TAL] w : [Tau]'[[Tau]/[Alpha]] wval. Let [Gamma]'= [Gamma]{r: [Tau]'[[Tau]-/[Alpha]]}. By Register File Update if follows that [Psi] [??.sub.TAL] R': [Gamma]'.

(3) By Type Substitution it follows from [Psi];[Alpha]; [Gamma]{r:[Tau]'} [??.sub.TAL] I that [Psi];0;[Gamma]' [??.sub.TAL] I'.

[square]

THEOREM (PROGRESS). If [??.sub.TAL] P then either there exists P' such that P ?? P', or P is of the form (H,R{r1 ?? w},halt[[Tau]]) (and, moreover, [Psi];0 [??.sub.TAL] w : [Tau] wval for some [Psi] such that [??.sub.TAL] H : [Psi]).

PROOF. Suppose P = (H, R, [I.sub.full]). Let TD be the derivation of [??.sub.TAL] P. The proof is by cases on the first instruction of [I.sub.full]. Case halt TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By R Typing we may deduce that R(r1) is defined and [Psi]; 0 [??.sub.TAL] R(r1): [Tau] wval.

In other words, R = R'{r1 ?? w} and [Psi]; 0 [??.sub.TAL] w : [Tau] wval.

Case add, mul, sub TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By Canonical Forms, R([r.sub.s]) and R(v) each represent integer literals. Hence P ?? (H, R{[r.sub.d] ?? R([r.sub.s]p R(v)}, I).

Case bnz TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By Canonical Forms, R(r) is an integer literal and R(v) = l[[Sigma].sub.1], ..., [Sigma].sub.n]] with H(l) = code[[Alpha].sub.1], ..., [[Alpha].sub.n]].[Gamma]". I'. If R(r) - 0 then P [right arrow] (H,R,I). If R(r) [is not equal to] 0 then P [right arrow] (H, R, I'[??/[Alpha]]).

Case jmp TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By Canonical Forms, R(v) = l[[[Sigma].sub.1], ..., [Sigma].sub.n]]] with H(l) = cod,[[Alpha].sub.1], ..., [Alpha].sub.n]]].[Gamma]".I'. Hence P ?? (H,R,I'[[Sigma]/[Alpha]]).

Case 1d TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By Canonical Forms, R([r.sub.d]) = l with H(l) = <[w.sub.0], ...,[w.sub.n-1]>. Hence P ?? (H,R{[r.sub.d]??[w.sub.i]}, I).

Case malloc Suppose that [I.sub.full] is of the form malloc r[[[Tau].sub.1], ..., [[Tau].sub.n]; I. Then P ?? (H{l ?? <[[[Tau].sub.1], ..., ?[[Tau].sub.n]]}, R{r ?? l}, I) for some l [is not an element of] H.

Case mov TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By R Typing, R(v) is defined. Hence P ?? (H,R{r ?? R(v)},I).

Case st TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By Canonical Forms, R([r.sub.d]) = l with H(l) = <[w.sub.0], ..., [w.sub.n-1>. By R Typing, R([r.sub.s]) is defined. Hence P ?? (H{l ?? {[w.sub.0], ..., [w.sub.i-1], R([r.sub.s]), [w.sub.i+1], ..., [w.sub.n-1]>}, R,I).

Case unpack TD has the form

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By Canonical Forms, R(v) = pack [[Tau]', w] as [exists][Alpha].[Tau]. Hence P ?? (H,R{r ?? w},I[[Tau]'/[Alpha]]). [square]

This material is based on work supported in part by the AFOSR grant F49620-97-1-0013, ARPA/RADC grant F30602-1-0317, and NSF grants CCR-9317320 and CCR-9708915. Any opinions, findings, and conclusions or recommendations expressed in this article are those of the authors and do not reflect the views of these agencies.

Authors's addresses: G. Morrisett, D. Walker, and N. Glew: 4130 Upson Hall, Ithaca, NY 14853-7501, USA; K. Crary: School of Computer Science, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA.

Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.

GREG MORRISETT and DAVID WALKER Cornell University

KARL CRARY Carnegie Mellon University and

NEAL GLEW Cornell University
COPYRIGHT 1999 Association for Computing Machinery, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 1999 Gale, Cengage Learning. All rights reserved.

 
Article Details
Printer friendly Cite/link Email Feedback
Author:MORRISETT, GREG; WALKER, DAVID; CRARY, KARL; GLEW, NEAL
Publication:ACM Transactions on Programming Languages & Systems
Date:May 1, 1999
Words:14572
Previous Article:Should Your Specification Language Be Typed?
Next Article:Efficient Logic Variables for Distributed Computing.

Terms of use | Privacy policy | Copyright © 2018 Farlex, Inc. | Feedback | For webmasters