# A Type System for Java Bytecode Subroutines.

Java is typically compiled into an intermediate language, JVML, that is interpreted by the Java Virtual Machine. Because mobile JVML code is not always trusted, a bytecode verifier enforces static constraints that prevent various dynamic errors. Given the importance of the bytecode verifier for security, its current descriptions are inadequate. This article proposes using typing rules to describe the bytecode verifier because they are more precise than prose, clearer than code, and easier to reason about than either. JVML has a subroutine construct which is used for the compilation of Java's try-finally statement. Subroutines are a major source of complexity for the bytecode verifier because they are not obviously last-in first-out and because they require a kind of polymorphism. Focusing on subroutines, we isolate an interesting, small subset of JVML. We give typing rules for this subset and prove their correctness. Our type system constitutes a sound basis for bytecode verification and a rational reconstruction of a delicate part of Sun's bytecode verifier.Categories and Subject Descriptors: D.3.1 [Programming Languages]: Formal Definitions and Theory--semantics; D.4.6 [Operating Systems]: Security and Protection--invasive software; F.3.2 [Logics and Meanings of Programs]: Semantics of Programming Languages--operational semantics; program analysis; F.3.3 [Logics and Meanings of Programs]: Studies of Program Constructs--type structure

General Terms: Languages, Security, Theory, Verification

Additional Key Words and Phrases: Bytecode verification, Java

1. BYTECODE VERIFICATION AND TYPING RULES

The Java language is typically compiled into an intermediate language that is interpreted by the Java Virtual Machine (VM) [Lindholm and Yellin 1996]. This intermediate language, which we call JVML, is an object-oriented language similar to Java. Its features include packages, classes with single inheritance, and interfaces with multiple inheritance. However, unlike method bodies in Java, method bodies in JVML are sequences of bytecode instructions. These instructions are fairly high-level but, compared to the structured statements used in Java, they are more compact and easier to interpret.

JVML code is often shipped across networks to Java VMs embedded in Web browsers and other applications. Mobile JVML code is not always trusted by the VM that receives it. Therefore, a bytecode verifier enforces static constraints on mobile JVML code. These constraints rule out type errors (such as dereferencing an integer), access control violations (such as accessing a private method from outside its class), object initialization failures (such as accessing a newly allocated object before its constructor has been called), and other dynamic errors.

Figure 1 illustrates how bytecode verification fits into the larger picture of Java security. The figure represents trusted and untrusted code. At the base of the trusted code is the Java VM itself--including the bytecode verifier--plus the operating system, which provides access to privileged resources. On top of this base layer is the Java library, which provides controlled access to those privileged resources. Java security depends on the VM correctly interpreting JVML code, which in turn depends on the verifier rejecting illegal JVML code. If the verifier were broken but the rest of the VM assumed it was correct, then JVML code could behave in ways not anticipated by the Java library, circumventing the library's access controls.

[Figure 1 ILLUSTRATION OMITTED]

For example, the current implementation of the library contains private methods that access privileged resources without performing access control checks. This implementation depends on the verifier preventing untrusted code from calling those private methods. But if the verifier were broken, an attacker might be allowed to treat an instance of a library class as an instance of a class defined by the attacker. In this situation, the class defined by the attacker could be used as a back door. At runtime, most VMs use dispatch-table indices to name methods. Successful invocation of a method depends only on this index and on passing the right number and type of arguments to the method. If the verifier allowed the attacker to treat an instance of the library class as an instance of the back-door class, then the attacker could call the instance's methods, including the private ones that do not perform access control checks.

As this example shows, a successful attack requires more than a broken verifier that accepts rogue JVML code. If, in our example, a VM had some runtime mechanism for preventing the use of an instance of an unexpected class, then the attack would fail. However, this mechanism might be expensive. Thus, the purpose of the verifier is to permit implementations of JVML that are fast without sacrificing safety. The primary customer of the verifier is the rest of the VM implementation: instruction interpretation code, JIT-compiler routines, and other code that assumes that illegal input code is filtered out by the verifier.

Given its importance for security, current descriptions of the verifier are deficient. The official specification of the verifier is a prose description [Lindholm and Yellin 1996]. Although good by the standards of prose, this description is ambiguous, imprecise, and hard to reason about. In addition to this specification, Sun distributes what could be considered a reference implementation of the verifier. As a description, this implementation is precise, but it is hard to understand and, like the prose description, is hard to reason about. Furthermore, the implementation disagrees with the prose description.

This article proposes using typing rules to describe the verifier. Typing rules are more precise than prose, easier to understand than code, and they can be manipulated formally. Such rules would give implementors of the verifier a systematic framework on which to base their code, increasing confidence in its correctness. Such rules would also give implementors of the rest of the VM an unambiguous statement of what they can and cannot assume about legal JVML code.

From a typing perspective, JVML is interesting in at least two respects:

--In JVML, a location can hold different types of values at different program points. This flexibility allows locations to be reused aggressively, allowing interpreters to save space. Thus, JVML contrasts with most typed languages, in which a location has only one type throughout its scope.

--JVML has subroutines to help compilers generate compact code, for example for Java try-finally statements. JVML subroutines are subsequences of the larger sequence of bytecode instructions that make up a method's body. The JVML instruction jsr jumps to the start of one of these subsequences, and the JVML instruction ret returns from one. Subroutines introduce two significant challenges to the design of the bytecode verifier: the verifier should ensure that ret is used in a well-structured manner, and it should support a certain

kind of polymorphism. We describe these challenges in more detail in Section 2.

This article addresses these typing problems. It defines a small subset of JVML, called JVML0, presents a type system and a dynamic semantics for JVMLO, and proves the soundness of the type system with respect to the dynamic semantics. JVML0 includes only nine instructions and ignores objects and several other features of JVML. This restricted scope allows us to focus on the challenges introduced by subroutines. Thus, our type system provides a precise, sound approach to bytecode verification, and a rational reconstruction of a delicate part of Sun's bytecode verifier.

We present the type system in three stages:

(1) The first stage is a simplified version for a subset of JVML0 that excludes jsr and ret. This simplified version provides an introduction to our notation and approach, and illustrates how we give different types to locations at different program points.

(2) The next stage considers all of JVML0 but uses a structured semantics for jsr and ret. This structured semantics has an explicit subroutine call stack for ensuring that subroutines are called on a last-in, first-out basis. In the context of this structured semantics, we show how to achieve the polymorphism desired for subroutines.

(3) The last stage uses a stackless semantics for jsr and ret in which return addresses are stored in random-access memory. The stackless semantics is closer to Sun's. It admits more efficient implementations, but it does not dynamically enforce a last-in, first-out discipline on calls to subroutines. Because such a discipline is needed for type safety, we show how to enforce it statically.

The next section describes JVML subroutines in more detail and summarizes our type system. Section 3 gives the syntax and an informal semantics for JVML0. Sections 4-6 present our type system in the three stages outlined above. Section 7 states and proves the main soundness theorem for our type system. Sections 8 and 9 discuss related work, including Sun's bytecode verifier. Section 8 also considers how our type system could be extended to the full JVML. Section 10 concludes. Several appendices contain proofs.

2. OVERVIEW OF JVML SUBROUTINES AND OUR TYPE SYSTEM

JVML subroutines are subsequences of a method's larger sequence of instructions; they behave like miniature procedures within a method body. Subroutines are used for compiling Java's try-finally statement.

Consider, for example, the Java method named bar at the top of Figure 2. The try body can terminate in one of three ways: immediately when i does not equal 3, with an exception raised by the call to foo, or with an execution of the return statement. In all cases, the compiler must guarantee that the code in the finally block is executed when the try body terminates. Instead of replicating the finally code at each escape point, the compiler can put the finally code in a JVML subroutine. Compiled code for an escape from a try body executes a jsr to the subroutine containing the finally code.

[Figure 2 MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Figure 2 illustrates the use of subroutines for compiling try-finally statements. It contains a possible result of compiling the method bar into JVML, putting the finally code in a subroutine in lines 13-16.

Figure 2 also introduces some of JVML's runtime structures. JVML bytecode instructions read and write three memory regions. The first region is an object heap shared by all method activations; the heap does not play a part in the example code of Figure 2. The other two regions are private to each activation of a method. The first of these private regions is the operand stack, which is intended to be used on a short-term basis in the evaluation of expressions. For example, the instruction iconst_3 pushes the integer constant 3 onto this stack, while ireturn terminates the current method and returns the integer at the top of the stack. The second private region is a set of locations known as local variables, which are intended to be used on a longer-term basis to hold values across expressions and statements (but not across method activations). Local variables are not operated on directly. Rather, values in local variables are pushed onto the stack and values from the stack are popped into local variables via the load and store instructions respectively. For example, the instruction aload_0 pushes the object reference in local variable 0 onto the operand stack, while the instruction istore_2 pops the top value off the operand stack and saves it in local variable 2.

Subroutines pose two challenges to the design of a type system for JVML:

--Polymorphism. Subroutines are polymorphic over the types of the locations they do not touch. For example, consider how variable 2 is used in Figure 2. At the jsr on line 7, variable 2 contains an integer and is assumed to contain an integer when the subroutine returns. At the jsr on line 18, variable 2 contains a pointer to an exception object and is assumed to contain a pointer to an exception object when the subroutine returns. Inside a subroutine, the type of a location such as variable 2 can depend on the call site of the subroutine. (Subroutines are not parametric over the types of the locations they touch; the polymorphism of JVML is thus weaker than that of ML.)

--Last-in, first-out behavior. In most languages, when a return statement in procedure P is executed, the dynamic semantics guarantees that control will return to the point from which P was most recently called. The same is not true of JVML. The ret instruction takes a variable as a parameter and jumps to whatever address that variable contains. This semantics means that, unless adequate precautions are taken, the ret instruction can transfer control to almost anywhere.

Using ret to jump to arbitrary places in a program is inimical to static typing, especially in the presence of polymorphism. For example, consider the illegal JVML method body in Figure 3. This method body has two calls to the subroutine that starts at line 13, which is polymorphic over variable 0. The first time this subroutine is called, it stores the return address into variable 2 and returns. The second time this subroutine is called, it returns to the old return address stored away in variable 2. This causes control to return to line 4, at which point the code assumes that variable 0 contains a pointer. However, between the first and second calls to the subroutine, an integer has been stored into variable 0. Thus, the code in Figure 3 will dereference an integer.

[Figure 3 MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Our type system allows polymorphic subroutines and enforces last-in, first-out behavior. It consists of rules that relate a program (a sequence of bytecode instructions) to static information about types and subroutines. This information maps each memory location of the VM to a type at each program point, identifies the instructions that make up subroutines, indicates the variables over which subroutines are polymorphic, and gives static approximations to the dynamic subroutine call stack.

Our type system guarantees the following properties for well-typed programs:

--Type safety. An instruction will never be given an operand stack with too few values in it, or with values of the wrong type.

--Program counter safety. Execution will not jump to undefined addresses.

--Bounded operand stack. The size of the operand stack will never grow beyond a static bound.

3. SYNTAX AND INFORMAL SEMANTICS OF JVML0

In JVML0, our restricted version of JVML, a program is a sequence of instructions:

P ::= instruction*

We treat programs as partial maps from addresses to instructions. We write ADDR for the set of all addresses. Addresses are very much like positive integers, and we use the constant 1 and the function + on addresses. However, to provide more structure to our semantics, we treat numbers and addresses as separate sets. When P is a program, we write Dom(P) for the domain of P (its set of addresses); P[i] is defined only for i [element of] Dom(P). We assume that 1 [element of] Dom(P) for every program P.

In JVML0, there are no classes, methods, or objects. There is no object heap, but there is an operand stack and a set of local variables. We write VAR for the set of names of local variables. (In JVML, these names are actually natural numbers, but we do not require this.) Local variables and the operand stack both contain values. A value is either an integer or an address.

JVML0 has only nine instructions:

instruction ::= inc | pop | push0 | load x | store x | if L | jsr L | ret x | halt

where x ranges over VAR, and L ranges over ADDR. Informally, these instructions behave as follows:

--The inc instruction increments the value at the top of the operand stack if that value is an integer. The pop instruction pops the top off the operand stack. The push0 instruction pushes the integer 0 onto the operand stack.

--The load x instruction pushes the current value of local variable x onto the operand stack. The store x instruction pops the top value off the operand stack and stores it into local variable x.

--The if L instruction pops the top value off the operand stack and either falls through when that value is the integer 0 or jumps to L otherwise.

--At address p, the jsr L instruction jumps to address L and pushes return address p + 1 onto the operand stack. The ret x instruction jumps to the address stored in x.

--The halt instruction halts execution.

4. SEMANTICS WITHOUT SUBROUTINES

This section introduces our approach and some notation. It presents static and dynamic semantics for the subset of JVMLO that excludes jsr and ret.

We use (partial) maps extensively throughout this article. When g is a map, Dom(g) is the domain of g; for x [element of] Dom(g), g[x] is the value of g at x, and g[x ?? v] is the map with the same domain as g defined by the following equation, for all y [element of] Dom(g):

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

That is, g[x ?? v] has the same value as g at all points except x, where g[x ?? v] has value v. We define equality on maps as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

That is, f = g exactly when f and g have the same domains and have equal values at all points.

We often use maps with domain ADDR. We call those maps vectors. When F is a vector and i is an address, we may write [F.sub.i] instead of F[i].

We also use strings. The constant [Epsilon] denotes the empty string. If s is a string, then v [multiplied by] s denotes the string obtained by prepending v to s. Sometimes we treat strings as maps from indices to integers. Then Dom(s) is the set of indices of s, and s[i] is the ith element of s from the right. Concatenation adds a new index for the new element but does not change the indices of the old elements; that is:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

4.1 Dynamic Semantics

We model a state of an execution as a tuple <pc, f, s>, where pc is the program counter, f is the current state of local variables, and s is the current state of the operand stack.

--The program counter pc is an address, that is, an element of ADDR.

--The current state of local variables f is a total map from VAR to the set of values.

--The current state of the stack s is a string of values.

All executions start from states of the form <1, f, [Epsilon]>, where f is arbitrary and where [Epsilon] represents the empty stack.

Figure 4 contains a small-step operational semantics for all instructions other than jsr and ret. This semantics relies on the judgment

P ?? <pc, f, s> [right arrow] <pc', f', s'>

[Figure 4 [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

which means that program P can, in one step, go from state <pc, f, s> to state <pc', f', s'>. By definition, the judgment P ?? <pc, f, s> [right arrow] <pc', f', s'> holds only when P is a program and <pc, f, s> and <pc', f', s'> are states. In Figure 4 (and in the rest of this article), n matches only integers while v matches any value. Thus, for example, the pattern "n [multiplied by] s" represents a nonempty stack whose top element is an integer. Note that there is no rule for halt--execution stops when a halt instruction is reached. Execution may also stop as the result of a dynamic error, for example attempting to execute a pop instruction with an empty stack.

4.2 Static Semantics

Our static semantics employs the following typing rules for values:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The type TOP includes all values. The type INT is the type of integers. Types of the form (ret-from L) include all addresses. However, the typing rules for programs of Sections 5 and 6 make a more restricted use of address types, preserving strong invariants. As the syntax (ret-from L) suggests, we use L as the name for the subroutine that starts at address L, and use (ret-from L) as the type of return addresses generated when L is called. Collectively, we refer to the types TOP, INT, and (ret-from L) as value types.

Types are extended to stacks as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where T is a value type and a is a string of value types. The empty stack is typed by the empty string; a stack with n values is typed by a string with n types, each type correctly typing the corresponding value in the string of values.

A program is well-typed if there exist a vector F of maps from variables to types and a vector S of strings of types satisfying the judgment:

F , S ?? P

The vectors F and S contain static information about the local variables and operand stack, respectively. The map [F.sub.i] assigns types to local variables at program point i. The string [S.sub.i] gives the types of the values in the operand stack at program point i. (Hence the size of [S.sub.i] gives the number of values in the operand stack.) For notational convenience, the vectors F and S are defined on all of ADDR even though P is not; [F.sub.j] and [S.sub.j] are dummy values for out-of-bounds j.

We have one rule for proving F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

where [Epsilon] is the map that maps all variables to TOP and [Epsilon] is (as usual) the empty string. The first two hypotheses are initial conditions; the third is a local judgment applied to each program point. Figure 5 has rules for the local judgment F, S, i ?? P. These rules constrain F and S at point i by referring to [F.sub.j] and [S.sub.j] for all points j that are control-flow successors of i.

[Figure 5 [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

4.3 One-Step Soundness Theorem

The rules above are sound in that any step from a well-typed state leads to another well-typed state:

THEOREM 1. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

We do not prove this theorem, but we do prove similar theorems for programs including jsr and ret. (See the appendices.)

5. STRUCTURED SEMANTICS

This section shows how to handle jsr and ret with the kind of polymorphism described in Section 2. To isolate the problem of polymorphism from the problem of ensuring that subroutines are used in a well-structured manner, this section presents what we call the structured semantics for JVML0. This semantics is structured in that it defines the semantics of jsr and ret in terms of an explicit subroutine call stack. This section shows how to achieve polymorphism in the context of the structured semantics. The next section shows how to ensure that subroutines are used in a well-structured manner in the absence of an explicit call stack.

5.1 Dynamic Semantics

In the structured semantics, we augment the state of an execution to include a subroutine call stack. This call stack holds the return addresses of subroutines that have been called but have not yet returned. We model this call stack as a string [Rho] of addresses. Thus, the state of an execution is now a four-tuple <pc, f, s, [Rho]>.

Figure 6 defines the structured dynamic semantics of JVML0. The rules of Figure 6 use the subroutine call stack to communicate return addresses from jsr instructions to the corresponding ret instructions. Although ret takes an operand x, the structured dynamic semantics ignores this operand; similarly, the structured dynamic semantics of jsr pushes the return address onto the operand stack as well as onto the subroutine call stack. These definitions enable us to transfer the properties of the structured semantics of this section to the stackless semantics of the next section.

[Figure 6 [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

5.2 Static Semantics

The structured static semantics relies on a new typing judgment:

F, S [??.sub.s] P

This judgment is defined by the rule:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The new, auxiliary judgment

R, i ?? P labeled

is used to define what it means to be "inside" a subroutine. Unlike in most languages, where procedures are demarcated syntactically, in JVML0 and JVML the instructions making up a subroutine are identified by constraints in the static semantics. For the instruction at address i, [R.sub.i] is a subroutine label that identifies the subroutine to which the instruction belongs. These labels take the form of either the empty set or a singleton set consisting of an address. If an instruction's label is the empty set, then the instruction belongs to the top level of the program. If an instruction's label is the singleton set {L}, then the instruction belongs to the subroutine that starts at address L. Figure 7 contains rules for labeling subroutines. These rules do not permit subroutines to have multiple entry points, but they do permit multiple exits.

[Figure 7 [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

For some programs, more than one R may satisfy both [R.sub.1] = {} and the constraints of Figure 7 because the labeling of unreachable code is not unique. It is convenient to assume a canonical R for each program P, when one exists. (The particular choice of R does not matter.) We write [R.sub.P] for this canonical R, and [R.sub.P,i] for the value of [R.sub.P] at address i.

Much as in Section 5, the judgment

F, S, i [??.sub.s] P

imposes local constraints near program point i. For the instructions considered in Section 5 (that is, for all instructions but jsr and ret), the rules are the same as in Figure 5. Figure 8 contains rules for jsr and ret.

[Figure 8 [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The elements of F need not be defined on all variables. For an address i inside a subroutine, the domain of [F.sub.i] includes only the variables that can be read and written inside that subroutine. The subroutine is polymorphic over variables outside Dom([F.sub.i]).

Figure 9 gives an example of polymorphism. (Simpler examples could be found if the typing rules allowed a more liberal use of TOP.) The code, in the rightmost column of Figure 9, computes the number 2 and leaves it at the top of the stack. The code contains a main body and three subroutines, which we have separated with horizontal lines. The subroutine that starts at line 6 increments the value in variable 0. The subroutine that starts at line 11 also increments the value in variable 0 but leaves its result on the stack. These subroutines are implemented in terms of the one that starts at line 15, which increments the value at the top of the stack. This last subroutine is polymorphic over variable 0.

[Figure 9 [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

5.3 One-Step Soundness Theorem

To state the one-step soundness theorem for the structured semantics, two definitions are needed.

In the previous section, where programs do not have subroutines, the type of local variable x at program point pc is denoted by F [sub.pc][x]. However, for program points inside subroutines, this definition is not quite adequate. For x [element of] Dom([F.sub.pc]), the type of x is still [F.sub.pc][x]. For other x, the type of x depends on where execution has come from. A subroutine is polymorphic over local variables that are not in Dom(F pc): different call sites of the subroutine may have values of different types in those local variables. Therefore, we construct an assignment of types to local variables that takes into account where execution has come from. The expression:

F(F, pc, [Rho])[x]

denotes the type assigned to local variable x when execution reaches point pc with a subroutine call stack [Rho]. The type F(F, pc, [Rho])[x] is defined by the rules:

(tt0) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(tt1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Note that F(F, pc, [Rho])[x] is not defined when x is not in Dom([F.sub.pc]) or any of the Dom([F.sub.p]) for the p's in [Rho]. However, the theorems and lemmas in the following subsections consider only situations in which F is defined.

The second auxiliary definition for our theorem concerns an invariant on the structure of the subroutine call stack. The theorem of the previous section says that, when a program takes a step from a well-typed state, it reaches a well-typed state. With subroutines, we need to assume that the subroutine call stack of the starting state is well-formed. For example, if the address p is in the call stack, then it must be the case that the instruction at p - 1 is a jsr instruction. The new theorem says that if the program takes a step from a well-typed state with a well-formed subroutine call stack, then it reaches a well-typed state with a well-formed subroutine call stack. The following judgment expresses what we mean by a well-formed subroutine call stack:

WFCallStack(P, F, pc, [Rho])

This judgment is defined by the rules:

(wf0) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(wf1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Informally, a subroutine call stack [Rho] is well-formed when:

--given a return address p in [Rho], the instruction just before p is a jsr instruction that calls the routine from which p returns;

--the current program counter pc and all return addresses in [Rho] have the correct subroutine label associated with them;

--the return address at the bottom of the call stack is for code that can access all local variables;

--any variable that can be read and written by a callee can also be read and written by its caller.

With these two definitions, we can state a one-step soundness theorem for the structured semantics. The proof of this theorem is in Appendix A.

THEOREM 2. For all P, F, and S such that F, S [??.sub.s] P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

6. STACKLESS SEMANTICS

The remaining problem is to eliminate the explicit subroutine call stack of the previous section, using instead the operand stack and local variables to communicate return addresses from a jsr to a ret. As discussed in Section 2, when the semantics of jsr and ret are defined in terms of the operand stack and local variables, uncontrolled use of ret combined with the polymorphism of subroutines is inimical to type safety. This section presents a static semantics that rules out problematic uses of ret. This static semantics restricts programs to operate as if an explicit subroutine call stack like the one from the previous section was present. In fact, the soundness argument for the stackless semantics relies on a simulation between the structured semantics and the stackless semantics.

The static and dynamic semantics described in this section are closest to typical implementations of JVML. Thus, we consider these the official semantics of JVML0.

6.1 Dynamic Semantics

The stackless dynamic semantics consists of the rules of Figure 4 plus the rules for jsr and ret of Figure 10. The jsr instruction pushes the return address onto the operand stack. To use this return address, a program must first pop it into a local variable and then reference that local variable in a ret instruction. The ret x instruction extracts an address from local variable x; if x does not contain an address, then the program gets stuck (because <f[x], f, s> is not a state).

[Figure 10 [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

6.2 Static Semantics

To define the stackless static semantics, we revise the rule for F, S ?? P of Section 4. The new rule is:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The new, auxiliary judgment

C, i ?? P strongly labeled

constrains C to be an approximation of the subroutine call stack. Each element of C is a string of subroutine labels. For each address i, the string [C.sub.i] is a linearization of the subroutine call graph to i. Of course, such a linearization of the subroutine call graph exists only when the call graph is acyclic, that is, when subroutines do not recurse. (We believe that we can prove our theorems while allowing recursion, but disallowing recursion simplifies our proofs and agrees with Sun's specification [Lindholm and Yellin 1996, p. 124].) Figure 11 contains the rules for this new judgment; note that a subsequence is not necessarily a consecutive substring. Figure 12 shows an application of the rules of Figure 11 to the code from Figure 9. In this example, the order of 6 and 11 could be reversed in [C.sub.15], [C.sub.16], and [C.sub.17].

[Figure 11 [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

[Figure 12 [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

As with R, more than one C may satisfy both [C.sub.1] = [Epsilon] and the constraints in Figure 11. We assume a canonical C for each program P, when one exists. We write [C.sub.P] for this canonical C, and [C.sub.P,i] for the value of [C.sub.P] at address i. Programs that satisfy the constraints in Figure 11 also satisfy the constraints in Figure 7; we define [R.sub.P] from [C.sub.P] as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

On the other hand, programs with recursive subroutines may satisfy the constraints in Figure 7 but never those in Figure 11.

Figure 13 contains the rules that define F, S, i ?? P for jsr and ret; rules for other instructions are in Figure 5. The rule for jsr L assigns the type (ret-from L) to the return address pushed onto the operand stack. This type will propagate to any location into which this return address is stored, and it is checked by the following hypotheses in the rule for ret:

x [element of] Dom([F.sub.i])

[F.sub.i][x] = (ret-from L)

[Figure 13 [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Typing return addresses helps ensure that the return address used by a subroutine L is a return address for L, not for some other subroutine. By itself, ensuring that the return address used by a subroutine L is a return address for L does not guarantee last-in, first-out behavior. One also has to ensure that the only return address for L available inside L is the most recent return address, not one tucked away during a previous invocation. This is achieved by the following hypotheses in the rule for jsr:

(ret-from L) [is not an element of] [S.sub.i]

[inverted]Ay [element of] Dom([F.sub.L). [F.sub.L][[y]] [is not equal to] (ret-from L)

These hypotheses guarantee that the only value of type (ret-from L) available inside L is the most recent value of this type pushed by the jsr instruction. (These hypotheses might be redundant for reachable code; we include them because our rules apply also to unreachable code.) Except for the lines discussed above, the rules for jsr and ret are the same as those of the structured static semantics. As an example, we may simply reuse the one of Figure 9, since the typing given there remains valid in the stackless static semantics.

6.3 One-Step Soundness Theorem

The one-step soundness theorem for the stackless semantics requires some auxiliary definitions. We develop those definitions top-down, after stating the theorem. The proof of the theorem is in Appendix B.

THEOREM 3. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

In this theorem, the judgment Consistent performs several functions:

--Like WFCallStack in the one-step soundness theorem for the structured semantics, Consistent implies an invariant on the structure of the subroutine call stack.

--In the stackless semantics, the subroutine call stack is not explicit in the dynamic state. The judgment Consistent relates an implicit subroutine call stack ([Rho] or [Rho]') to the state of an execution.

--The implicit subroutine call stacks of the stackless semantics are closely related to the explicit subroutine call stacks of the structured semantics. In particular, the following implication holds:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Thus, Consistent helps us relate the structured and the stackless semantics. The judgment Consistent is the conjunction of four other judgments:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The first judgment is the judgment WFCalIStack from the previous section. The next judgment adds more constraints on the subroutine call stack. The last two judgments constrain the values of return addresses in s (the operand stack) and f (local variables) to match those found in [Rho] (the implicit subroutine call stack).

The auxiliary function [Lambda] returns the sequence of caller labels associated with a sequence of return addresses. It is a partial function defined by two rules:

(10) (Empty hypothesis)/[Lambda] P ([Epsilon]) = [Epsilon]

(11) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

According to this definition, [Lambda]P ([Rho]) is the string of labels of subroutines that have been called in [Rho]. The function [Lambda]P([Rho]) is used only when P is well-typed and WFCallStack holds, in which case [Lambda]P([Rho]) is always defined.

The extensions of WFCallStack made by WFCalIStack2 are defined by two rules:

(wf20) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(wf21) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

These rules ensure that no recursive calls have occurred. They also ensure that the dynamic subroutine call stack is consistent with the static approximation of the subroutine call stack.

The judgment GoodStackRets constrains the values of return addresses in the operand stack:

(gsr0) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

(gr0) (Empty hypothesis)/GoodRet(K, p, [Epsilon], [Epsilon])

(gr1) (Empty hypothesis)/GoodRet(K, p, K [multiplied by] [Lambda], p [multiplied by] [Rho])

(Gr2) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

In these definitions, we introduce GoodRet as an auxiliary judgment. These definitions say that, when a subroutine L has been called but has not returned, the values of return addresses for L found in s (the operand stack) agree with the return address for L in [Rho] (the implicit subroutine call stack).

Similarly to the judgment GoodStackRets, the judgment GoodFrameRets constrains the values of return addresses in local variables:

(gr0) (Empty hypothesis) / GoodFrameRets(F, pc, [micro], f, [Epsilon], [Epsilon])

(gr1) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Because subroutines may not be able to read and write all variables, GoodFrameRets cannot constrain all variables simultaneously in the way that GoodStackRets constrains all elements of the operand stack. Instead, the rules for GoodFrameRets must work inductively on the subroutine call stack, looking at the return addresses available to each subroutine in turn.

7. MAIN SOUNDNESS THEOREM

Our main soundness theorem is:

THEOREM 4. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

This theorem says that if a computation stops, then it stops because it has reached a halt instruction, not because the program counter has gone out of bounds or because a precondition of an instruction does not hold. This theorem also says that the operand stack is well-typed when a computation stops. This last condition is important because, when a JVML method returns, its return value is on the operand stack.

The proof of the main soundness theorem is in Appendix C.

8. SUN'S RULES

Sun has published two descriptions of the bytecode verifier, a prose specification and a reference implementation. This section compares our rules with both of these descriptions.

8.1 Scope

While our rules simply check static information, Sun's bytecode verifier infers that information. Inference may be important in practice, but only checking is crucial for type safety (and for security). It is therefore reasonable to study checking apart from inference.

JVML has around 200 instructions, while JVML0 has only 9. A rigorous treatment of most of the remaining JVML instructions should pose only minor problems. In particular, many of these instructions are for well-understood, arithmetic operations; small difficulties may arise because of their exceptions and other idiosyncrasies. The other instructions (around 20) concern objects and concurrency. Their rigorous treatment would require significant additions to our semantics--for example, a model of the heap. Fortunately, some of these additions are well understood in the context of higher-level, typed languages. As discussed below, Freund and Mitchell are currently extending our rules to a large subset of JVML.

8.2 Technical Differences

Our rules differ from Sun's reference implementation in the handling of recursive subroutines. Sun's specification disallows recursive subroutines, as do our rules, but Sun's reference implementation allows recursion in certain cases. We believe that recursion is sound in the sense that it does not introduce security holes. However, recursion is an unnecessary complication since it is not useful for compiling Java. Therefore, we believe that the specification should continue to disallow recursion and that the reference implementation should be corrected.

Our rules deviate from Sun's specification and reference implementation in a few respects.

--Sun's rules forbid load x when x is uninitialized or holds a return address. Our rules are more general without compromising soundness.

--Sun's rules allow at most one ret instruction per subroutine, while our rules allow an arbitrary number.

--According to Sun's rules, "each instruction keeps track of the list of jsr targets needed to reach that instruction" [Lindholm and Yellin 1996, p. 135]. Using this information, Sun's rules allow a ret to return not only from the most recent call but also from calls further up the subroutine call stack. Adding this flexibility to our rules should not be difficult, but it would complicate the structured dynamic semantics and would require additional information in the static semantics.

Finally, our rules differ from Sun's reference implementation on a couple of other points. Sun's specification is ambiguous on these points and, therefore, does not provide guidance.

--Sun's reference implementation does not constrain unreachable code. Our rules put constraints on all code. Changing our rules to ignore unreachable code would not require fundamental changes.

--When it comes to identifying what subroutine an instruction belongs to, our rules are more restrictive than the rules implicit in Sun's reference implementation. The flexibility of Sun's reference implementation is important for compiling finally clauses that can throw exceptions. Changing our rules to capture Sun's approach would not be difficult, but changing our soundness proof to support this approach may be.

9. OTHER RELATED WORK

In addition to Sun's, there exist several implementations of the bytecode verifier. Only recently has there been any systematic attempt to understand all these implementations. In particular, the Kimera project has tested several implementations, pointing out some mistakes and discrepancies [Sirer et al. 1997]. We take a complementary approach, based on rigorous reasoning rather than on testing. Both rigorous reasoning and testing may affect our confidence in bytecode verification. While testing does not provide an adequate replacement for precise specifications and proofs, it is a cost-effective way to find certain flaws and oddities.

More broadly, there have been several other implementations of the Java VM. Of particular interest is a partial implementation developed at Computational Logic, Inc. [Cohen 1997]. This implementation is defensive, in the sense that it includes strong (and expensive) dynamic checks that remove the need for bytecode verification. The implementation is written in a formal language and is intended as a model rather than for production use. Ultimately, one may hope to prove that the defensive implementation is equivalent to an aggressive implementation plus a sound bytecode verifier (perhaps one based on our rules).

There have also been typed intermediate languages other than JVML. Several have been developed for ML and Haskell [TIC 1997]. Here we discuss the TIL intermediate languages [Morrisett 1995; Morrisett et al. 1996] as representative examples. The TIL intermediate languages provide static guarantees similar to those of JVML. Although these languages have sophisticated type systems, they do not include an analogue to JVML subroutines; instead, they include constructs as high-level as Java's try-finally statement. Therefore, the main problems addressed in this article do not arise in the context of TIL.

Finally, the literature contains many proofs of type soundness for higher-level languages, and in particular proofs for a fragment of Java [Drossopoulou and Eisenbach 1997; Nipkow and von Oheimb 1998; Syme 1997]. Those proofs have not had to deal with JVML peculiarities (in particular, with subroutines); nevertheless, their techniques may be helpful in extending our work to the full JVML.

In summary, there has not been much work closely related to ours. We do not find this surprising, given that the handling of subroutines is one of the most original parts of the bytecode verifier; it was not derived from prior papers or systems [Yellin 1997]. However, interest in the formal treatment of bytecode verification is mounting; several approaches are currently being pursued [Freund and Mitchell 1998; Goldberg 1998; Hagiya and Tozawa 1998; Qian 1998; Saraswat 1997]. Goldberg, Qian, and Saraswat all develop other formal frameworks for bytecode verification, basing them on constraints and dataflow analysis; their work is rather broad and not focused on subroutines. Hagiya and Tozawa generalize our rules for subroutines. Building on our type system, Freund and Mitchell study object initialization and its problematic interaction with subroutines; work is under way on a subset of JVML that includes objects, classes, constructors, interfaces, and exceptions.

10. CONCLUSIONS

The bytecode verifier is an important part of the Java VM; through static checks, it helps reconcile safety with efficiency. Common descriptions of the bytecode verifier are ambiguous and contradictory. This article suggests the use of a type system as an alternative to those descriptions. It explores the viability of this suggestion by developing a sound type system for a subset of JVML. This subset, despite its small size, is interesting because it includes JVML subroutines, a source of substantial difficulty in the design of a type system.

Our results so far support the hypothesis that a type system is a good way to describe the bytecode verifier. Significant problems remain in scaling up to the full JVML, such as handling objects and concurrency. However, we believe that these problems will be no harder than those posed by subroutines, and that a complete type system for JVML could be both tractable and useful.

ACKNOWLEDGMENTS

We thank Luca Cardelli, Drew Dean, Sophia Drossopoulou, Stephen Freund, Cynthia Hibbard, Mark Lillibridge, Greg Morrisett, George Necula, Frank Yellin, and anonymous referees for useful information and suggestions.

REFERENCES

COHEN, R.. M. 1997. Defensive Java Virtual Machine version 0.5 alpha release. Web pages at http://www.cli.com/.

DROSSOPOULOU, S. AND EISENBACH, S. 1997. Java is type safe--probably. In Proceedings of ECOOP '97. Springer-Verlag, 389-418.

FREUND, S. N. AND MITCHELL, J. C. 1998. A type system for object initialization in the Java bytecode language. In OOPSLA '98 Conference Proceedings: Object-Oriented Programming, Systems, Languages, and Applications. ACM, New York, 310-327.

GOLDBERG, A. 1998. A specification of Java loading and bytecode verification. In 5th ACM Conference on Computer and Communications Security. ACM, New York, 49-58.

HAGIYA, M. AND TOZAWA, A. 1998. On a new method for datafiow analysis of Java Virtual Machine subroutines. In Static Analysis, 5th International Symposium, SAS `98. Springer-Verlag, 17-32.

LINDHOLM, T. AND YELLIN, F. 1996. The Java Virtual Machine Specification. Addison-Wesley.

MORRISETT, G. 1995. Compiling with types. Ph.D. thesis, Carnegie Mellon University.

MORRISETT, G., TARDITI, D., CHENG, P., STONE, C., HARPER, R., AND LEE, P. 1996. The TIL/ML compiler: Performance and safety through types. In Workshop on Compiler Support for Systems Software.

NIPKOW, T. AND YON OHEIMB, D. 1998. [Java.sub.light] is type-safe--definitely. In Proceedings of the 25th ACM Symposium on Principles of Programming Languages. ACM, New York, 161-170.

QIAN, Z. 1998. A formal specification of Java Virtual Machine instructions for objects, methods and subroutines. In Formal Syntax and Semantics of Java[TM], J. Alves-Foss,Ed. Springer-Verlag. To appear.

SARASWAT, V. 1997. The Java bytecode verification problem. Web page at http://www.research .att.com/~vj/bcv.html.

SIRER, E. G., MCDIRMID, S., AND BERSHAD, B. 1997. Kimera: A Java system security architecture. Web pages at http://kimera.cs.washington.edu/.

SYME, D. 1997. Proving Java type soundness. Tech. Rep. 427, University of Cambridge Computer Laboratory. June.

TIC 1997. ACM SIGPLAN Workshop on Types in Compilation (TIC97).

YELLIN, F. 1997. Private communication.

APPENDIX

A. PROOF OF SOUNDNESS FOR THE STRUCTURED SEMANTICS

First, we prove that a step of execution preserves WFCallStack. Next, we prove some lemmas about types. Finally, we prove the soundness theorem for the structured semantics (Theorem 2), first by showing that well-typing of the stack is preserved and then by showing that well-typing of local variables is preserved.

A.1 Preservation of WFCallStack

The following lemma states certain insensitivities of WFCallStack to the exact value of pc:

LEMMA 1. For all P, F, and S such that F, S [??.sub.s] P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that P, F, S, pc, pc, and [Rho] satisfy the hypotheses of the lemma. We do a case split on [Rho]:

(1) Case: [Rho] = [Epsilon]. In this case, the assumption WFCallStack(P, F, pc, [Rho]) must be true by rule (wf0), so Dom(F pc) = VAR. Given this and the assumption that Dom(F pc') = Dom(F pc), it follows that Dom(F pc') = VAR. Thus, by rule (wf0), we can conclude that WFCalIStack(P, F, pc', [Rho]).

(2) Case: [Rho] [is not equal to] [Epsilon]. In this case, we know that [Rho] = p [multiplied by] [Rho]' for some p and [Rho]'. Also, the assumption WFCallStack(P, F, pc, [Rho]) must be true by rule (wf1), so, for some L, we know that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Substituting R P, pc, for R P, pc and Dom(F pc') for Dom(F pc), we have all we need to establish WFCallStack(P, F, pc', [Rho]) by rule (wf1). [square]

Now we show that the structured dynamic semantics preserves WFCallStack:

RESTATEMENT, PART OF THEOREM 2. For all P, F, and S such that F, S [??.sub.s] P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that P, F, S, pc, f, s, [Rho], pc', f', s', and [Rho]' satisfy the hypotheses of the lemma. The assumption that a step of execution is possible starting from <pc, f, s, [Rho]> implies that P[pc] is defined; it also implies that P[pc] [is not equal to] halt. We do a case split on the instruction possible at P[pc], proceeding in three cases:

(1) The first case is instructions that do not affect the subroutine call stack: inc, if, load, store, pop, and push0. For these instructions, their structured dynamic semantics implies that [Rho]' = [Rho] and their structured static semantics implies that R P, pc = R P, pc, and Dom(F pc) = Dom(F pc'). Given these equalities, we can conclude WFCallStack(P, F, pc', [Rho]') by Lemma 1.

(2) The second case is P[pc] = jsr K for some K. In this case, the structured dynamic semantics implies that [Rho]' = (pc + 1) [multiplied by] [Rho], so we want to show that:

WFCallStack(P, F, K, (pc + 1) [multiplied by] [Rho])

To prove this judgment using rule (wf1), it suffices to show:

(a) that P[(pc + 1) - 1] = jsr K, which we are assuming;

(b) that K [element of] [R.sub.P, K], which is true because [R.sub.P,K] = {K} by the structured static semantics of jsr (specifically, line 3 of the jsr rule in Figure 7);

(c) that Dom([F.sub.K]) [subset or equal to] Dom(F (pc+1)), which also follows from the structured static semantics of jsr K (specifically, lines 2 and 3 of the jsr rule in Figure 8);

(d) that WFCallStack(P, F, pc + 1, [Rho]). We are assuming that

WFCallStack(P, F, pc, [Rho])

From the structured static semantics of jsr K we know that [R.sub.P,(pc+1)] = [R.sub.P, pc] (line 2, jsr rule, Figure 7) and Dom([F.sub.(pc+1)]) = Dom ([F.sub.pc]) (line 2, jsr rule, Figure 8). Under these conditions,

WFCallStack(P, F, pc + 1, [Rho])

follows from Lemma 1.

(3) The third case is P[pc] = ret x for some x. The assumption that a step of execution is possible starting from <pc, f, s, [Rho]> implies that [Rho] is not empty. If [Rho] is not empty, then by the structured dynamic semantics of ret we know that [Rho] = pc' [multiplied by] [Rho]'. Given the assumption that

WFCallStack(P, F, pc, [Rho])

by substitution we can conclude that

WFCallStack(P, F, pc, pc' [multiplied by] [Rho]')

This can be true only by rule (wf1), so

WFCallStack(P, F, pc', [Rho]')

must be true. [square]

A.2 Lemmas about Types

A few lemmas about types are needed. The first lemma helps us reason about the types of locations at program points that dynamically follow a ret instruction:

LEMMA 2. For all P, F, and S such that F, S [??.sub.s] P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that P, F, S, pc, f, s, p, [Rho]', and x satisfy the hypotheses of the lemma. The assumption WFCallStack(P, F, pc, p [multiplied by] [Rho]') must hold by rule (wf1), so there exists an L such that L [element of] [R.sub.P,pc] and P[p - 1] = jsr L. Given such an L, the conjuncts of the conclusion are instantiations of the quantified term found in the structured static semantics of ret (Figure 8). [square]

The following lemma states an insensitivity of F to the exact value of pc: LEMMA 3.

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that F, [Rho], y, T, pc, and pc' satisfy the hypotheses of the lemma. If y [element of] Dom(F pc), then the assumption that F(F, pc, [Rho]) [y] = T must be true by rule (tt0), so we can conclude that T = F pc[y]. Given y [element of] Dom(F pc), by assumption F pc'[y] = F pc[y], so F pc'[y] = T. Therefore, by rule (tt0)

F(F, pc', [Rho])[y] = T

If y [is not an element of] Dom(F pc), then F(F, pc, [Rho])[y] = T must be true by rule (tt1), so we can conclude, for some p and [Rho]', that [Rho] = p [multiplied by] [Rho] and that

(1) F(F, p, [Rho]')[y] = T

Given the assumptions that y [is not an element of] Dom(F pc) and Dom(F pc) = Dom(F pc'), it follows that y [is not an element of] Dom(F pc'). Given this and (1),

(1) F(F, pc', [Rho])[y] = T

follows from rule (tt1). [square]

The next two lemmas say that F does not change as a result of executing a jsr or ret instruction. The lemma for jsr is:

LEMMA 4. For all P, F, and S such that F, S [??.sub.s] P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that P, F, S, pc, f, [Rho], L, y, and T satisfy the hypotheses of the lemma. We do a case split on y:

--Case: y [element of] Dom([F.sub.L]), that is, y is one of the variables accessible in the subroutine being called. By the structured static semantics of jsr (line 3 of jsr rule in Figure 8), Dom([F.sub.L]) [subset or equal to] Dom(F pc), so y is an element of Dom(F pc) as well. Given that y [element of] Dom(F pc), the assumption that

F(F, pc, [Rho])[y] = T

could be true only by rule (tt0), so we know that T = F pc[Y]. From y [element of] Dom([F.sub.L]) and the structured static semantics of jsr (Figure 8, line 5) it follows that [F.sub.L][y] = F pc[y], so T [element of] [F.sub.L] [y]. From y [element of] Dom([F.sub.L]) and T = [F.sub.L][y], it follows from rule (tt0) that

F(F, L, (pc + 1) [multiplied by] [Rho][y] = T

--Case: y [is not an element of] Dom([F.sub.L]). We know that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The first two lines follow from the structured static semantics of jsr (Figure 8, lines 2 and 4); the last we are assuming. From Lemma 3 it follows that

F(F, pc + 1, [Rho])[y] = T

Since y [is not an element of] Dom([F.sub.L]), it follows from rule (tt1) that

F(F, L, (pc + 1) [multiplied by] [Rho]) [y] = T

[square]

The next lemma says for ret what the previous lemma says for jsr. Here, however, we need to assume that we have a well-formed subroutine call stack.

LEMMA 5. For all P, F, and S such that F, S [??.sub.s] P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that P, F, S, pc, f, p, [Rho]', x, y, and T satisfy the hypotheses of the lemma. We do a case split on y:

--Case: y [element of] Dom(F pc). Given the WFCalIStack assumption plus the assumptions that P[pc] = ret x and y [element of] Dom(F pc), we can conclude that [F.sub.p][y] = F pc[y] by Lemma 2. Given that y [element of] Dom(F pc), the assumption that

F(F, pc, p [multiplied by] [Rho]')[y] = T

must hold by rule (tt0), so T = F pc[y] and thus T = [F.sub.p][y]. The WFCallStack assumption could be true only by rule (wf1), so Dom(F pc) [subset or equal to] Dom([F.sub.p]) and y [element of] Dom([F.sub.p]). Given T = [F.sub.p][y] and y [element of] Dom([F.sub.p]):

F(F, p, [Rho]')[y] = T

follows from rule (tt0).

--Case: y [is not an element of] Dom(F pc). Given y [is not an element of] Dom(F pc), the assumption that

F(F, pc, p [multiplied by] [Rho]')[y] = T

could be true only by rule (tt1), so we can conclude

F(F, p, [Rho]')[y] = T

[square]

The final lemma says that F is defined when WFCallStack holds:

LEMMA 6. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. This lemma holds because WFCallStack ensues that Dom([F.sub.p]) = VAR for some p in [Rho], so we are guaranteed that rule (tt0) will apply for at least this [Rho].

More formally, assume an arbitrary y in VAR and P, F, S such that F, S ?? P. We proceed by induction on [Rho]:

--In the base case, [Rho] = [Epsilon]. Assume pc such that WFCalIStack(P, F, pc, [Rho]). Because [Rho] = [Epsilon], this WFCallStack assumption could be true only by rule (wf0), so Dom(F pc) = VAR and thus y [element of] Dom(F pc). Therefore, by rule (tt0), F(F, pc, [Rho])[y] = F pc[y].

--In the inductive step, [Rho] = p [multiplied by] [Rho]' for some p and [Rho]'. Assume pc such that WFCallStack(P, F, pc, [Rho]).

If y [element of] Dom(F pc), then we can conclude F(F, pc, [Rho])[y] = F pc[y] by rule (tt0).

If y [is not an element of] Dom(F pc), then we need to use the induction hypothesis:

??q. WFCallStack(P, F, q, [Rho]') ?? [exists]T. F(F, q, [Rho]')[y] = T

Because [Rho] is not empty, the assumption WFCallStack(P, F, pc, [Rho]) could be true only by rule (wf1), so WFCallStack(P, F, p, [Rho]'). This matches the antecedent of the induction hypothesis, so we can conclude that F(F, p, [Rho]')[y] = T for some T. Thus, by rule (tt1), we can conclude that F(F, pc, [Rho])[y] = T. [square]

A.3 Preservation of Stack Typing

We now turn our attention back to the soundness theorem. This subsection shows that the typing of the operand stack is preserved; the next shows that the typing of variables is preserved.

RESTATEMENT, PART OF THEOREM 2. For all P, F, and S such that F, S [??.sub.s] P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that P, F, S, pc, f, s, [Rho], pc', f', s', and [Rho]' satisfy the hypotheses of the lemma. The assumption that a step of execution is possible starting from <pc, f, s, [Rho]> implies that P[pc] is defined; it also implies that P[pc] [is not equal to] halt. We do a case split on the instruction possible at P[pc]:

(1) The first case is ret. We know from the structured dynamic semantics of ret that s' = s and [Rho] = pc' [multiplied by] [Rho]'. Given the latter equality and the WFCallStack assumption, [S.sub.pc]' = S pc follows from Lemma 2. Given [S.sub.pc]' = [S.sub.pc], s' = s, and the assumption that s : [S.sub.pc], it follows that s' : [S.sub.pc]'.

(2) The second case is inc, pop, push0, if, jsr, and store. Consider inc, for example. We know from the structured dynamic semantics of inc that s = n [multiplied by] s" for some integer n and value string s", and that s' = (n + 1) [multiplied by] s". We know from the structured static semantics of inc that

[S.sub.pc] = [S.sub.pc]' = INT [multiplied by] [Alpha]

for some type string [Alpha]. Given s: [S.sub.pc], s = n [multiplied by] s", and S pc = INT [multiplied by] [Alpha], it follows that s" : [Alpha]. Given st" : [Alpha], s' = (n + 1) [multiplied by] s", and [S.sub.pc]' = INT [multipliedby] [Alpha], it follows that s' : [S.sub.pc]'.

As another example, consider store x. We know from the structured dynamic semantics of store that s = v [multiplied by] s' for some v, and we know from the structured static semantics that S pc = T [multiplied by] S pc' for some T. Given these equations and the assumption that s' : S pc, it follows that s': S pc'. The proofs for the other instructions in this category are along similar lines.

(3) The last case is load x. In this case, we use the assumptions to infer that f [x]: T for some T such that F(F, pc, [Rho])[x] = T. The structured static semantics for load says that [x] [element of] Dom(F pc), so F(F, pc, [Rho])[x] = T must be true by rule (tt0), so that T = F pc[x]. We know from the structured static semantics of load x that Spc' = F pc[x] [multiplied by] S pc' so S pc' = T [multiplied by] S pc. We know from the structured dynamic semantics of load x that s' = f[x] [multiplied by] s. Given f[x] : T, s : S pc, s' = f[x] [multiplied by] s, and S pc' = T [multiplied by] S pc, it follows that s': S pc'. [square]

A.4 Preservation of Local Variable Typing

RESTATEMENT, PART OF THEOREM 2. For all P, F, and S such that F, S [??.sub.s] P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

This statement is stronger than that in Section 5.3 in that the quantifier for y has been moved to the top level.

PROOF. Assume that P, F, S, pc, f, s, [Rho], pc', f', s', [Rho]', and y satisfy the hypotheses of the lemma. Let T be a type such that F(F, pc, [Rho])[y] = T and f[y]: T; by assumption, such a T exists. The assumption that a step of execution is possible starting from <pc, f, s, [Rho]> implies that P[pc] is defined; it also implies that P[pc] [is not equal to] halt. We do a case split on the instruction possible at P[pc]:

(1) The first case is instructions that do not modify y or [Rho]: inc, if, load, pop, pusho, and store x for x [is not equal to] y. For these instructions, it follows from their structured static semantics that Dom(F pc') = Dom(F pc) and that F pc'[y] = F pc[y] when y [element of] Dom(F pc). It follows from their structured dynamic semantics that [Rho]' = [Rho]. Thus, by Lemma 3, F(F, pc', [Rho]')[y] = T. What remains to be shown is f'[y] : T. It follows from the structured dynamic semantics of these instructions that f[y] = f'[y]; from f[y]: T and f[y] = f'[y] it follows that f'[y]: T.

(2) The next case is store y. By the structured static semantics of store y, y [element of] Dom(F pc'). Thus, by rule (tt0), F(F, pc', [Rho]')[y] = F pc'[y]. What remains to be shown is f'[y]: F pc'[y]. The structured static semantics for store y says that S pc = F pc'[y] [multiplied by] S pc'; the structured dynamic semantics for store y says that s = f[y] [multiplied by] S pc'. Given these equations and the assumption s: S pc, it follows that f'[y]: F pc'[y].

(3) The last case is jsr L and ret x. We can conclude that F(F, pc',[Rho]')[y] = T by Lemma 4 when P[i] is jsr L and by Lemma 5 when P[i] is ret x. What remains to be shown is f'[y]: T. From the structured dynamic semantics of these instructions we know that f' = f. Given fly]: T and f[y] = f'[y], it follows that f'[y] : T. [square]

B. PROOF OF SOUNDNESS FOR THE STACKLESS SEMANTICS

This section proceeds roughly top-down, first stating a proposition and two lemmas, next using these to prove the soundness theorem for the stackless semantics, then proving the two lemmas.

If F and S are a typing for P in the stackless semantics, then they are a typing for P in the structured semantics:

PROPOSITION 1. For all P, F, S, if F, S ?? P, then F, S [??.sub.s] P.

The proofs below implicitly use this proposition when they apply lemmas and theorems that have F, S [??.sub.s] P in their hypotheses.

The first lemma establishes a simulation between the structured semantics and the stackless semantics:

LEMMA 7. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The second lemma states that Consistent is preserved by the structured dynamic semantics:

LEMMA 8. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Given these two lemmas and the soundness theorem for the structured semantics, the soundness theorem of the stackless semantics follows directly.

RESTATEMENT, THEOREM 3. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that P, F, S, pc, f, s, [Rho], pc', f', and s' satisfy the hypotheses of the theorem. Use Lemma 7 to pick [Rho]' such that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The first conjunct of the conclusion follows from our selection for [Rho]' and Lemma 8. The last two conjuncts follow from our selection for [Rho]' and Theorem 2. [square]

B.1 Analogous Steps

RESTATEMENT, LEMMA 7. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that P, F, S, pc, f, s, [Rho], pc', f', and s' satisfy the hypotheses of the lemma. The assumption that a step of execution is possible starting from <pc, f, s> implies that P[pc] is defined; it also implies that P[pc] [is not equal to] halt. We do a case split on the instruction possible at P[pc]:

--For all instructions except jsr and ret, pick [Rho]' -- [Rho]. A comparison of the structured and stackless dynamics in Figure 10 and Figure 6 shows that the conclusion of our lemma follows directly from our assumptions.

--For jsr L, pick [Rho]' - (pc + 1) [multiplied by] [Rho]. Again, a comparison of Figures 10 and 6 shows that the conclusion of our lemma follows from our assumptions in this case as well.

--The case for the ret x instruction is the only complicated one. First, we show (by contradiction) that our assumptions imply that p has at least one element. If p had no elements, then the WFCallStack2 part of the Consistent assumption would have to be true by rule (wf20). This would require that Cp,pc be empty, but the stackless static semantics for ret x (line 2, ret rule, Figure 13) together with the relationship between [R.sub.P] and [C.sub.P] imply that [C.sub.P,pc] must have at least one element. Thus, WFCallStack2 cannot hold by rule (wf20). Instead, it must hold by rule (wf21), so [Rho] must have at least one element. Therefore, [Rho] = p. [Rho]" for some p and [Rho]". We let [Rho]' = [Rho]". To prove that this selection of [Rho]' satisfies the conclusion, we need to show that:

P [??.sub.s] <pc, f, s, [Rho]> [right arrow] <pc', f', s', [Rho]">

In order to establish this judgment, we must show that pc'= p. By the stackless dynamic semantics of ret x, pc' = f[x], so this reduces to showing that f[x] = p. Because [Rho] has at least one element, the WFCallStack assumption must be true by rule (wf1). From this we can infer that, for some L, P[p - 1] = jsr L and L [element of] [R.sub.P, pc].

--Given P[p - 1] = jsr L and [Rho] = p [multiplied by] [Rho]', we know by rule (l1) that [Lambda]p([Rho]) = L [multiplied by] [Lambda]' for some [Lambda]'. (As mentioned in Section 6.3, [Lambda] p([Rho]) is always defined for well-typed programs when WFCallStack holds.)

--Given that L [element of] R P,pc, we know by the stackless static semantics of ret x that F pc[x] = (ret-from L) (line 4, ret rule, Figure 13). We also know from the stackless static semantics of ret x that x [element of] Dom(F pc) (line 3).

Because [Rho] is not empty, the GoodFrameRets component of the Consistent assumption must be true by rule (gfr1). Given this, x [element of] Dom(Fpc), and

F pc[x] = (ret-from L)

it must be the case that

GoodRet(L, f[x], [Lambda]p([Rho]), p [multiplied by] [Rho]')

We know that [Lambda]p([Rho]) = L [multiplied by] [Lambda]', so by substitution:

GoodRet(L, f[x], L [mutliplied by] [Lambda]',p [multiplied by] [Rho]')

This could be true only by rule (gr1), so we can conclude that f[x] = p. [square]

B.2 Preservation of Consistent

Expanding the definition of Consistent, we want to prove the following lemma:

RESTATEMENT, LEMMA 8. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

We prove this by assuming the hypotheses and proving separately each conjunct of the conclusion. We prove the first conjunct (preservation of WFCallStack) in Section A.1. We prove the remaining conjuncts below after we state and prove some miscellaneous lemmas.

B.2.1 Miscellaneous Lemmas. The following lemma describes how a program step can change [Lambda]p([Rho]):

LEMMA 9. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that P, F, S, pc, f, s, [Rho], pc', f', s', and [Rho]' satisfy The hypotheses of the lemma. The assumption that a step of execution is possible starting from <pc, f, s, [Rho]> implies that P[pc] is defined; it also implies that P[pc] [is not equal to] halt. We do a case split on the instruction possible at P[pc]:

--jsr L: From the structured dynamic semantics of jsr L, [Rho]' = (pc + 1) [multiplied by] [Rho]. By assumption, P[pc] = jsr L. Thus, by rule (l1), [Lambda]p([Rho]') = L [multiplied by] [Lambda]p([Rho]).

--ret x: From the structured dynamic semantics of ret x, [Rho] = pc' [multiplied by] [Rho]'. Thus, the WFCallStack assumption must be true by rule (wf1), so P[pc' - 1] = jsr L for some L. Thus, by rule (l1), [Lambda]p([Rho]) = L [multiplied by] [Lambda]p([Rho]'). [square]

LEMMA 10. For all P, pc, [Lambda], and [Rho]:

WFCallStack2(P, pc, [Lambda], [Rho]) ?? [Rho] and [Lambda] have the same length

This can be proven by induction on the derivation of WFCallStack2(P, pc, [Lambda], [Rho]).

LEMMA 11. For all K, p, [Lambda], [Rho]:

K [is not an element of] [Lambda] [conjunction] [Lambda] and [Rho] have the same length

?? GoodRet(K, p, [Lambda], [Rho])

This lemma can be proven by induction on [Lambda] using rule (gr0) in the base case and rule (gr2) in the inductive step.

The following lemma states an insensitivity of GoodRet:

LEMMA 12.

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume P, F, pc, p, [Rho]', L, [Lambda]', v, and K satisfy the hypotheses of our lemma. The WFCallStack2 assumption implies that [Rho]' and [Lambda]' have the same length (by Lemma 10). We proceed with a case split on [Rho]':

--When [Rho]' is empty, so is [Lambda]', thus GoodRet(K, v, [Lambda]', [Rho]') follows from rule (gr0).

--When [Rho]' is not empty, we proceed with a case split on K. When K [not equal to] L, the GoodRet assumption must be true by rule (gr2), so

GoodRet(K, v, [Lambda]', [Rho]')

The next case is K = L. Because [Rho]' is not empty, the WFCallStack2 Assumption must be true by rule (wf21), so L [is not an element of] [Lambda]' and thus K [is not an element of] [Lambda]'. Given this and the fact that [Lambda]' and [Rho]' have the same length, we can conclude by Lemma 11 that

GoodRet(K, v, [Lambda]', [Rho]')

[square]

The next lemma is used to prove the following lemma:

LEMMA 13.

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. We proceed by induction on [Rho]:

--In the base case, [Rho] = [Epsilon]. Pick arbitrary P, F, pc, [micro], f, f' x, and v. In this case, GoodFrameRets(F, pc', [micro], f', [Lambda]p ([Rho]), [Rho]) follows from rule (gfr0).

--In the inductive step, [Rho] = p [multiplied by] [Rho]' for some p and [Rho]'. Pick arbitrary P, F, pc, [micro], f, f' x, and v. Assume all hypotheses of Lemma 13 hold. The GoodFrameRets assumption could be true only by rule (gfr1), so

(2) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

and

(3) GoodFrameRets(F, p, Dom(F pc), f', [Lambda]P([Rho]'), [Rho]')

First, we want to show that

(4) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Assume y and L such that y is in Dom(F pc)\[Mu] and F pc[y] = (ret-from L). Because x [element of] [Rho], it must be that y [is not equal to] x, so by our assumption relating f to f', f[y] = f'[y]. Given this equality, GoodRet(L, f'[y], [Lambda]P([Rho]), [Rho]) follows from (2). Next, we want to show that

(5) GoodFrameRets(F, p, Dom(F pc), f', [Lambda]P([Rho]'), [Rho]')

We prove this using the induction hypothesis, which we instantiate as follows:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The first two antecedents of this instantiation of the induction hypothesis follow from the fact that the WFCallStack assumption must be true by rule (wf1). The next antecedent follows from the assumptions that x [element of] [micro] and [micro] [subset or equal to] Dom(F pc). The fourth antecedent we are assuming, and the last is (3). Thus, we can use the induction hypothesis to conclude (5). Given (4) and (5),

GoodFrameRets(F, pc, [micro], f', [Lambda]P([Rho]), [Rho])

follows from rule (gfr1). [square]

The final lemma in this set states an insensitivity of GoodFrameRets:

LEMMA 14.

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Pick arbitrary P, F, pc, f, [Rho], pc', f', x, v, and T. We proceed by a case split on [Rho]:

--The first case is [Rho] = [Epsilon]. In this case, GoodFrameRets(F, pc', {}, f', [Lambda]P([Rho]), [Rho]) follows from rule (gfr0).

--The second case is [Rho] = p [multiplied by] [Rho]' for some p and [Rho]'. Assume all hypotheses of Lemma 14 hold. The GoodFrameRets assumption could be true only by rule (gfr1), so

(6) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

and

(7) GoodFrameRets(F, p, Dom(F pc), f, [Lambda]P([Rho]'), [Rho]')

First, we want to show that

(8) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

By assumption, Dom(F pc') = Dom(F pc), so (8) is equivalent to

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Assume y and L such that y is in Dom(F pc) and F pc' [y] = (ret-from L). For y [is not equal to] x, GoodRet(L, f'[y], [Lambda]P([Rho]), [Rho]) follows from (6) and our assumptions relating f to f' and F pc to F pc'. For y = x, GoodRet(L, f'[y], [Lambda]P([Rho]), [Rho]) follows from the assumption

??K. T = (ret-from K) ?? GoodRet(K, v, [Lambda] p([Rho]), [Rho])

Next, we want to show that

(9) GoodFrameRets(F, p, Dom(F pc,), f', [Lambda]P([Rho]'), [Rho]')

By assumption, Dom(F pc) = Dom(F pc'), so (9) is equivalent to

GoodFrameRets(F, p, Dom(F pc'), f', [Lambda] p([Rho]'), [Rho]')

We prove this using Lemma 13. Our assumptions imply the following conclusions:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The first two conclusions follow from the fact that [Rho] is not empty and thus the WFCallStack assumption can only be true by rule (wf1). The next two we are assuming directly. The last conclusion is (7). These conclusions are the antecedents of Lemma 13, so (9) follows. Given (8) and (9),

GoodFrameRets(F, pc', {}, f', [Lambda]p([Rho]), [Rho])

follows from rule (gfr1). [square]

B.2.2 Preservation of WFCallStack2

RESTATEMENT, PART OF LEMMA 8. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

This restatement omits unneeded conjuncts in the hypotheses.

PROOF. Assume that P, F, S, pc, f, s, [Rho], pc', f', s', and [Rho]' satisfy The hypotheses of the lemma. The assumption that a step of execution is Possible starting from <pc, f, s, [Rho]> implies that P[pc] is defined; it also implies that P[pc] [is not equal to] halt. We do a case split on the instruction possible at P[pc]:

--For all instructions except jsr and ret, we have [Rho]' = [Rho] (by the structured dynamic semantics), so [Lambda]p([Rho]') = [Lambda]p([Rho]) (by substitution). Also, by the stackless static semantics of these instructions, [C.sub.P,pc'] = [C.sub.P,pc'] Given these equations,

WFCallStack2(P, pc', [Lambda]p([Rho]'), [Rho]')

follows from the WFCallStack2 assumption, whether that assumption is true by rule (wf20) or rule (wf21).

--In the case of jsr L, we have L = pc' (by the structured dynamic semantics). We first prove that the WFCallStack2 assumption implies that

(10) [Lambda]P([Rho]) is a subsequence of [C.sub.P,pc]

If [Rho] [is not equal to] [Epsilon], then WFCallStack2 must hold by rule (wf21), so (10) must be true in this case. If [Rho] = [Epsilon], then [Lambda] P([Rho]) = [Epsilon] (by rule (10)) and [C.sub.P,pc] = [Epsilon] (because WFCallStack2 must hold by rule (wf20)), so (10) must be true in this case as well.

We use rule (wf21) to show that WFCallStack2(P, pc', [Lambda]P([Rho]'), [Rho]') is true. According to this rule, it suffices to establish the following three intermediate results:

(1) L [is not an element of] [Lambda]P([Rho]). This follows from (10) and L [is not an element of] [C.sub.P,pc], which follows from the stackless static semantics of jsr L (line 2, jsr rule, Figure 11).

(2) [Lambda] P([Rho]') is a subsequence of [C.sub.P,pc']. From Lemma 9 we know that [Lambda]P ([Rho]') = L [multiplied by] [Lambda]P([Rho]). Together with (10), this implies that

(11) [Lambda] P ([Rho]') is a subsequence of L [multiplied by] [C.sub.P,pc]

From the stackless static semantics of jsr L, we know that

(12) L [multiplied by] [C.sub.P,pc] is a subsequence of [C.sub.P,pc']

(lines 4 and 5, jsr rule, Figure 11). Given (11) and (12), we can conclude that

[Lambda] P ([Rho]') is a subsequence of [C.sub.P,pc'],

(3) WFCallStack2(P, pc + 1, [Lambda]P ([Rho]), [Rho]). We prove this in two cases, the first when the WFCallStack2 assumption is true by rule (wf20), the second when the WFCallStack2 assumption is true by rule (wf21). In both of these cases we have [C.sub.P,pc+1] = [C.sub.P,pc], which follows from the stackless static semantics of jsr L (line 3, jsr rule, Figure 11).

If the WFCallStack2 assumption is true by rule (wf20), then [Rho], [[Lambda].sub.P]([Rho]), and [C.sub.P,pc] must be empty. Given [C.sub.P,pc+1] = [C.sub.P,pc], [C.sub.P,pc+1] must be empty too. Thus,

WFCallStack2(P, pc + 1, [[Lambda].sub.P]([Rho]), [Rho])

follows from rule (wf20).

If the WFCalIStack2 assumption is true by rule (wf21), then there exist p, [Rho]", K, and [Lambda]" such that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Because [C.sub.P,pc+1] = [C.sub.P,pc], K [multiplied by] [Lambda]" is a subsequence of [C.sub.P,pc+1] as well. Since K [is not an element of] [Lambda]", K [multiplied by] [Lambda]" is a subsequence of [C.sub.P,pc+1], and WFCallStack2( P, p, [Lambda]", [Rho]"),

WFCalIStack2(P, pc + 1, K [multiplied] [Lambda]", p [multiplied by] [Rho]")

follows from rule (wf21). Since [Rho] = p [multiplied by] [Rho]" and [[Lambda].sub.P]([Rho]) = K [multiplied by] [Lambda]", we obtain WFCallStack2(P, pc + 1, [[Lambda].sub.P]([Rho]), [Rho])

-In the case of ret x, we know from Lemma 9 and from the structured dynamic semantics that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

for some L. Applying these equations to the WFCallStack2 assumption, we can conclude that

WFCallStack2(P, pc, L [multiplied by] [[Lambda].sub.P]([Rho]'), pc' [multiplied by] [Rho]')

This judgment can be true only by rule (wf21), so we can conclude

WFCalIStack2(P, pc', [[Lambda].sub.P] ([Rho]'), [Rho]')

[square]

B.2.3 Preservation of GoodStackRets

RESTATEMENT, PART OF LEMMA 8. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Again, this restatement omits unneeded conjuncts in the hypotheses. Also, for convenience, this restatement expands the definition of GoodStackRets.

PROOF. Assume that P, F, S, pc, f, s, [Rho], pc', f', s', and [Rho]' satisfy the hypotheses of the lemma. The assumption that a step of execution is possible starting from <pc, f, s, [Rho]> implies that P[pc] is defined; it also implies that P[pc] [is not equal to] halt. We do a case split on the instruction possible at P[pc]:

--First we look at all instructions except jsr and ret. For these instructions, [Rho] = [Rho]' by the structured dynamic semantics of the individual instructions; also, by substitution, [[Lambda].sub.P]([Rho]) = [[Lambda].sub.p] ([Rho]') as well.

Assume j and L such that j [element of] Dom(S pc') and S pc'[j] = (ret-from L); we need to show that

GoodRet( L, s'[j], [[Lambda].sub.P]([Rho]'), [Rho]')

This follows from rule (gr0) when [Rho]' = [Epsilon]. Assume that [Rho]' = [is not equal to] [Epsilon] and proceed with a case split on j:

--The first case is stack slots that are unchanged, that is, stack slots that are not written by the instruction but rather are carried over. The exact set is going to vary by instruction, but values of j in this set will satisfy the following:

j [element of] Dom(S pc) [conjunction] S pc'[j] = S pc[j] [conjunction] s'[j] = s[j]

For these stack elements, if S pc'[j] equals (ret-from L), then S pc[j] also equals (ret-from L). This implies GoodRet(L, s[j], [[Lambda].sub.P]([Rho]), [Rho]), which by substitution means GoodRet(L, s'[j], [[Lambda].sub.P] ([Rho]'), [Rho]').

--The second case is stack slots that are written by the instruction. It turns out that this case applies only to load for the following reasons:

--The instructions pop, if, and store do not write the stack, so this second case does not apply to these instructions.

--For inc and push0, if j is the element written (that is, the top element of s'), we know from the stackless static semantics that the type S pc' [j] is INT, which violates our assumption that S pc'[j] equals (ret-from L), so this second case does not apply to these instructions either.

For load x, we know from the stackless static semantics of load x that x [element of] Dom([F.sub.pc]) and [F.sub.pc][x] = S pc'[j]. Given the assumption that [Rho]' [is not equal to] [Epsilon], the GoodFrameRets assumption must be true by rule (gfr1). The antecedents of this rule and the assumptions that x [element of] Dom([F.sub.pc]) and [S.sub.pc]'[j] = F pc[x] = (ret-from L) imply that

GoodRet(L, f[x], [[Lambda].sub.P]([Rho]), [Rho])

By the structured dynamic semantics of load x, s'[j] = f[x] and [Rho]' = [Rho], so by substitution

GoodRet(L, s'[j], [[Lambda].sub.P]([Rho]'), [Rho]')

--For jsr and ret, assume j and L such that j [element of] Dom(S pc') and S pc'[j] = (ret-from L); we need to show that

(13) GoodRet(L, s'[j], [[Lambda].sub.P]([Rho]'), [Rho]')

But first we prove a helpful fact. For both jsr and ret, when j [element of] Dom(S pc) we can assume that S pc[j] = S pc'[j]. For jsr, this follows from the stackless static semantics (line 6, jsr rule, Figure 13). For ret, this follows from Lemma 2; to apply Lemma 2, we need that [Rho] = pc' [multiplied by] [Rho]', which follows from the structured dynamic semantics of ret.

Given S pc[j] = S pc'[j] for j [element of] Dom(S pc), S pc'[j] = (ret-from L), and the GoodStackRets assumption, we can conclude:

j [element of] Dom(S pc) ?? GoodRet(L, s[j], [[Lambda].sub.P]([Rho]),[Rho])

Given this and the fact that s[j] = s'[j] for j [element of] Dom(S pc) (which follows from the structured dynamic semantics of jsr and ret), we can conclude:

(14) j [element of] Dom(S pc) ?? GoodRet(L, s'[j], [[Lambda].sub.P]([Rho]), [Rho])

Now we return to proving (13) for the cases of interest.

--In the case of jsr K, we do a case split on j:

(1) Case: j [is not an element of] Dom(S pc), that is, j is the index of the top of the stack. In this case we know from the stackless static semantics of jsr K that K = L (line 6, jsr rule, Figure 13), we know from the structured dynamic semantics that s'[j] = pc + 1 and [Rho]' = (pc + 1) [multiplied by] [Rho], and we know by Lemma 9 that [[Lambda].sub.P]([Rho]') = L [multiplied by] [[Lambda].sub.P]([Rho]). Given these equations, (13) follows by rule (gr1).

(2) Case: j [element of] Dom(S pc). In this case, we know from the stackless static semantics of jsr K that K [is not equal to] L (line 7, jsr rule, Figure 13). We know from the structured dynamic semantics that [Rho]' = (pc + 1) [multiplied by] [Rho], and we know from Lemma 9 that [[Lambda].sub.P]([Rho]') = K [multiplied by] [[Lambda].sub.P]([Rho]). Given these equations and (14), (13) follows from rule (gr2).

--In the case of ret, we know from Lemma 9 and the structured dynamic semantics that:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

for some K. Given these equations, we prove (13) by cases on L:

(1) Case: K [is not equal to] L. By the stackless static semantics, Dom(S pc') is equal to Dom(S pc), so j [element of] Dom(S pc). Given this, the conclusion of (14) must be true. Given [[Lambda].sub.P]([Rho]) = L [multiplied by] [[Lambda].sub.P]([Rho]') and K [is not equal to] L, the conclusion of (14) can be true only by rule (gr2), so we can conclude

GoodRet( K, s'[j], [[Lambda].sub.P]([Rho]'), [Rho]')

(2) Case: K = L. From the WFCalIStack2 assumption and Lemma 10 we know that [[Lambda].sub.P]([Rho]) and [Rho] have the same length; given this, [[Lambda].sub.P]([Rho]') and [Rho]' must also have the same length. The assumption that a step of execution is possible starting from <pc, f, s, [Rho]> implies that [Rho] is not empty, thus the WFCallStack2 assumption must be true by rule (wf21), so L [is not an element of] [[Lambda].sub.P]([Rho]'). Thus, (13) follows from Lemma 11. [square]

B.2.4 Preservation of GoodFrameRets

RESTATEMENT, PART OF LEMMA 8. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

This time, we need all conjuncts in the hypotheses.

PROOF. Assume that P, F, S, pc, f, s, [Rho], pc', f', s', and [Rho]' satisfy the hypotheses of the lemma. The assumption that a step of execution is possible starting from <pc, f, s, [Rho]> implies that P[pc] is defined; it also implies that P[pc] [is not equal to] halt.

When [Rho]' = [Epsilon], we have [[Lambda].sub.P]([Rho]') = [Epsilon] by rule (10). Thus, we can conclude

GoodFrameRets(F, pc', {}, f', [[Lambda].sub.P]([Rho]'), [Rho]')

by rule (gfr0). When [Rho]' is not empty, we do a case split on the instruction possible at P[pc]:

--The first case is inc, pop, push0, if, and load x for any x. It follows from the stackless static semantics of these instructions that Dom(F pc) = Dom(F pc') and, for all y in Dom(F pc), F pc[y] = F pc, [y]. It follows from the structured dynamic semantics of these instructions that f = f' and [Rho] = [Rho]'.

Because [Rho] is not empty, we can assume that [Rho]' = [Rho] [multiplied by] [Rho]' for some p and [Rho]". The GoodFrameRets assumption must be true by rule (gfr1), so we can conclude that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Using the equalities f = f', [Rho] = [Rho]', Dom(F pc') = Dom(F pc') and, for y in Dom(F pc), F pc[y] = F pc'[y], we can conclude:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

From this we can conclude

Good-FrameRets(F, pc', {}, f', [[Lambda].sub.P] ([Rho]'), [Rho]')

by rule (gfr1).

--The next case is store x. Let v and T satisfy s = v [multiplied by] s' and S pc = T [multiplied by] S pc'. We know these exist by the structured dynamic semantics and stackless static semantics of store x, respectively. We want to apply Lemma 14 to conclude:

GoodFrameRets(F, pc', {}, f', [[Lambda].sub.P]([Rho]), [Rho])

To do so, we must discharge the antecedents of Lemma 14. The first antecedent, WFCallStack, we are assuming. The second and third antecedents follow from the stackless static semantics and the structured dynamic semantics of store x, respectively. The fourth antecedent follows from the stackless static semantics. The fifth and sixth antecedents follow from the GoodStackRets and GoodFrameRets assumptions, respectively. Thus, we can indeed apply Lemma 14 to conclude

GoodFrameRets(F, pc', {}, f', [[Lambda].sub.P]([Rho]), [Rho])

The structured dynamic semantics of store x implies that [Rho] = [Rho]'. Substituting, we get

GoodFrameRets(F, pc', {}, f', [[Lambda].sub.P]([Rho]'), [Rho]')

--The next case is jsr K. From Lemma 9 and the structured dynamic semantics of jsr K we know that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

From the stackless static semantics we know that

Dom(F K) [subset or equal to] Dom(F pc) Dom(F pc+1) = Dom(F pc)

We want to prove:

GoodFrameRets(F, pc', {}, f', [[Lambda].sub.P] ([Rho]'), [Rho]')

Substituting, this is equivalent to

GoodFrameRets(F,K, {}, f,K [multiplied by] [[Lambda].sub.P]([Rho]), (pc + 1). [Rho])

To establish this via rule (gfr1), it suffices to show that

(15) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

and that

(16) GoodFrameRets(F, pc + 1, Dom(F K), f, [[Lambda].sub.P]([Rho]), [Rho])

To prove (15), we assume y and L such that y is in Dom(F K]) and F K[y] = (ret-from L) and show that

(17) GoodRet(L, f[y], K [multiplied by] [[Lambda].sub.P]([Rho]), (pc + 1) [multiplied by] [Rho])

To show this, first we show that

(18) Goodret(L, f[y], [[Lambda].sub.P]([Rho]), [Rho])

If [Rho] is empty, then (18) follows from rule (gr0). If [Rho] is not empty, then the GoodFrameRets assumption must hold by rule (gfr1). Given the first line of rule (gfr1) and the fact that y [element of] Dom(F K) and F K[y] = (ret-from L), (18) follows. We know K [is not equal to] L from the stackless static semantics of jsr K (line 8, Figure 13). Given this and (18), we can conclude (17) by rule (gr2).

Next, we need to prove (16). If [Rho] is empty, then (16) follows from rule (gfr0). Otherwise, we assume that [Rho] = [Rho] [multiplied by] [Rho]" for some [Rho] and [Rho]". Given [Rho] = p [multiplied by] [Rho]", to establish (16) via rule (gfr1), it suffices to show that

(19) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

and that

(20) GoodFrameRets(F, p, Dom(F pc+1), f, [[Lambda].sub.P] ([Rho]"), [Rho]")

Because [Rho] [is not equal to] [element of], the GoodFrameRets assumption must be true by rule (gfr1), so we can conclude that

(21) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

and that

(22) GoodFrameRets(F, p, Dom(F pc), f, [[Lambda].sub.P] ([Rho]"), [Rho]")

To prove (20), we simply observe that, because Dom(F pc+1) = Dom(F pc), (22) is equivalent to (20).

To prove (19), we assume y and L such that y is in Dom([F pc+1) = Dom(F K) and F pc+1[y] equals (ret-from L) and show that

(23) GoodRet(L, f[y], [[Lambda].sub.P]([Rho]), [Rho])

From the stackless static semantics of jsr K (line 4, jsr rule, Figure 13), because y is in Dom([F.sub.pc]+1)\Dom(F K), we can conclude that [F.sub.pc][y] equals F pc+1[y] which in turn equals (ret-from L). Also, given Dom([F.sub.pc]+1) = Dom(F pc), we can conclude that y [element of] Dom([F.sub.pc]]). Given (21), y [element of] Dom([F.sub.pc]), and [F.sub.pc][y] = (ret-from L), we can conclude (23).

--The last case is ret x. From Lemma 9 and the structured dynamic semantics of ret x we know that, for some K:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

We want to prove:

GoodFrameRets(F, pc', {}, f', [[Lambda].sub.P] ([Rho]'), [Rho]')

If [Rho]' is empty, then this follows from rule (gfr0).

Otherwise, we have that [Rho]' = p [multiplied by] [Rho]" for some p and [Rho]". We are assuming that

GoodFrameRets(F, pc, {}, f, [[Lambda].sub.P]([Rho]), [Rho])

Applying the equations above, this is equivalent to

GoodFrameRets(F, pc, {}, f', K. [[Lambda].sub.P][Rho]', pc'. [Rho]')

This must be true by rule (gfr1), so we can conclude that

(24) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

and that

(25) GoodFrameRets(F, pc', Dom([F.sub.pc]), f', [[Lambda].sub.p] ([Rho]'), [Rho]')

Because [Rho]' = p [multiplied by] p", (25) must be true by rule (gfr1), so

(26) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

and

(27) GoodFrameRets(F, p, Dom [F.sub.pc'), f', [[Lambda].sub.p] ([Rho]"), [Rho]")

Because [Rho] [is not equal to] [Epsilon], the WF CallStack assumption must be true by rule (wf1), so P[pc' - 1] = jsr M for M such that [R.sub.p], pc = {M}. Given this, by the stackless static semantics of ret x (lines 2 and 5, ret rule, Figure 13), we know that, for y in Dom([F.sub.pc]), [F.sub.pc'][y] = [F.sub.pc][y]. Combining this with (24), we get

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Using Lemma 12 and the WF CallStack2 assumption, we can conclude

(28) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Combining (26) and (28), we can conclude

(29) [MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Given (29) and (27), we can use rule (gfr1) to conclude GoodFrameRets(F, pc', {}, f', [[Lambda].sub.p](p'), p')

[square]

C. PROOF OF MAIN SOUNDNESS THEOREM

To prove Theorem 4, our main soundness theorem, we use two lemmas. This section first states these lemmas, proves the main soundness theorem, then proves the two lemmas.

The first lemma says that well-typed programs get "stuck" only at halt instructions:

LEMMA 15. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

The next lemma takes invariants stated about individual steps of execution and extends them over multistep execution sequences:

LEMMA 16. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

We use these lemmas to prove the main soundness theorem:

RESTATEMENT, MAIN SOUNDNESS THEOREM. For all P, F, and S such that F, SF P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that P, F, S, pc, f, and s satisfy the hypotheses of this theorem. Under this assumption, the second conjunct of the conclusion (s: Spc) follows directly from Lemma 16.

We prove the first conjunct of the conclusion by contradiction. Assume that P[pc] [is not equal to] halt. Our previous assumptions about P, F, S, pc, f, and s imply the hypotheses of Lemma 16, so we can conclude that there exists p such that P, F, S, pc, f, s, and p satisfy all hypotheses of Lemma 15. However, the conclusion of Lemma 15 contradicts our assumption that

??pc', f', s'. P ?? <pc, f, s> [right arrow] <pc', f', s'>

Thus, we are forced to conclude that P[pc] = halt. [square]

C.1 Making Progress

Lemma 15 says that a well-typed program does not get stuck unless it hits a halt instruction:

RESTATEMENT, LEMMA 15. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that P, F, S, pc, f, s, and p satisfy the hypotheses of the lemma. We proceed with a case split on the instruction at P[pc], constructing pc', f', and s' satisfying the stackless dynamic semantics of the instruction at P[pc].

--For load x, push0, and jsr L, progress can always be made. For load x and push0, s' = s, f' = f, and pc' = pc + 1. For jsr L, s' = (L + 1) [multiplied by] S' f' = f, and pc' = pc + 1.

--For inc, if L, pop, and store x, progress can be made when the stack has at least one value of the appropriate type in it. The assumption that s: Spc plus the static constraints on Spc for each instruction ensure that the stack does indeed have an appropriately typed value on top. For all instructions, assume s = v [multiplied by] s" for some v and s". For inc, if L, and pop, we take f' = f; for store x, we take f' = f[x ?? v]. For if L, pop, and store x, we take s' = s". For inc, v must be some integer n, and we take s' = (n + 1) [multiplied by] s". For inc, pop, and store x, pc' = pc + 1. For if L, v must be some integer n, and we take pc' = pc + 1 if n is zero and we take pc' = L for other values of n.

--For ret x, progress can be made in states where f[x] is an address. The assumption that

[exists][Tau]. F(F, pc, [Rho])[x] = [Tau] [conjunction] f[x]: [Tau]

and the stackless static semantics of ret x imply that f[x] does contain an address. Thus, progress is possible, and we take pc' = f[x], f' = f, and s'= s.

[square]

C.2 Chained Soundness Theorem

Before proving Lemma 16, we state and prove one more invariant about individual steps of execution:

LEMMA 17. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume that P, F, S, pc, f, s, and [Rho] satisfy the hypotheses of the lemma. The assumption that a step of execution is possible starting from <pc, f, s> implies that P[pc] is defined; it also implies that P[pc] [is not equal to] halt. We do a case split on the instruction possible at P[pc]:

--For pop, push0, inc, load, and store, pc' = pc + 1, and pc+ 1 is constrained by the stackless static semantics of these instructions to be in Dom(P).

--For if L, pc' equals either pc + 1 or L, depending on which way the branch goes. Both pc + 1 and L are constrained by the stackless static semantics of if L to be in Dom(P).

--For j sr L, pc' = L, and L is constrained by the stackless static semantics of jsr L to be in Dom(P).

--For ret x, we first show that our assumptions imply that [Rho] has at least one element. If [Rho] had no elements, then the WFCallStack2 part of the Consistent assumption would have to be true by rule (wf20). This would require that Cp, pc be empty, but the stackless static semantics for ret x implies that [C.sub.p,pc] must have at least one element. Thus, WFCalIStack2 cannot hold by rule (wf20). Instead, it must hold by rule (wf21), so p must have at least one element.

Let [Rho] = p [multiplied by] [Rho]' for some p and [Rho]'. Using Lemma 7, we have that pc' = p. The WF CallStack part of the Consistent assumption must be true by rule (wf1), so P [p - 1] = jsr L for some L. From the stackless static semantics of jsr L we have that (i + 1) [element of] Dom(P) for all i such that P[i] = jsr L. Thus, substituting p - 1 for i, we can conclude that p, and thus pc', is in Dom(P).

Lemma 16 applies Lemma 17 and Theorem 3 to multistep executions starting from the initial state:

RESTATEMENT, LEMMA 16. For all P, F, and S such that F, S ?? P:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

PROOF. Assume P, F, and S such that F, S ?? P and proceed by induction on the number of execution steps in P ?? <1, fo, [Epsilon]> [right arrow]* <pc, f, s>:

--In the base case, <pc, f, s> equals <1, [f.sub.0], [Epsilon]>, that is, no steps of execution are taken. In Section 3, it is assumed that P[1] is defined for all programs, so we can conclude that pc [element of] Dom(P). We pick [Rho] = [Epsilon] and observe that:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Given the values of [F.sub.1], [S.sub.1], and [C.sub.p,1] it is not hard to check this.

--In the inductive step, let <[pc.sub.n], [f.sub.n], [s.sub.n]> be a state such that

P ?? <1, [f.sub.0], [Epsilon]> [right arrow]* <[pc.sub.n], [f.sub.n], [s.sub.n]> [right arrow]* (pc, f, s)

By induction, we know that there exists a [[Rho].sub.n] such that:

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Our assumptions and these conjuncts satisfy the hypotheses of Theorem 3 and Lemma 17, so we can conclude that

[MATHEMATICAL EXPRESSION NOT REPRODUCIBLE IN ASCII]

Authors' address: Systems Research Center, Digital Equipment Corporation, 130 Lytton Avenue, Palo Alto, CA 94301.

Permission to make digital/hard copy of all or part of this material without fee for personal or classroom use provided that the copies are not made or distributed for profit or commercial advantage, the ACM copyright/server notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the ACM, Inc. To copy otherwise, to republish, to post on servers, or to redistribute to lists requires prior specific permission and/or a fee.

Received December 1997; accepted April 1998

RAYMIE STATA and MARTIN ABADI

Digital Equipment Corporation

Printer friendly Cite/link Email Feedback | |

Author: | STATA, RAYMIE; ABADI, MARTIN |
---|---|

Publication: | ACM Transactions on Programming Languages & Systems |

Date: | Jan 1, 1999 |

Words: | 17365 |

Previous Article: | Specification and Verification of Fault-Tolerance, Timing, and Scheduling. |

Next Article: | Space-Efficient Scheduling of Nested Parallelism. |