A memory-based approach to recognizing programming plans.
Our interest lies in automatically extracting object-oriented design knowledge to support translating C programs to C++. In particular, we are interested in recognizing common objects and operations in C program and replacing them with libraries containing human-generated object-oriented code written in C++. This is an appealing task for automated program understanding for several reasons. First, even if it is possible only to automatically recognize and replace a small portion of the many objects and operations implicitly present in a set of real-world programs, it is a win over attempting to revise these programs entirely by hand. Second, hand translation of C to C++ requires constructing replacement C++ classes, so it is not unreasonable to assume that information about C implementations of these classes can be provided while constructing the replacement classes.
As an example, consider the problem of translating the C program in Figure 1 into its object-oriented C++ counterpart in Figure 2. Translating even this simple C program requires understanding its purpose, its design, and the mapping between its design and its implementation. Its purpose is to print the average distance between a set of x, y points. Its design divides into two pieces. One reads pairs of points, computes the distance between them, adds this distance to a running total, and updates a running count. The other calculates the average and prints the result. Its implementation stores point coordinates as separate variables, combines a while loop and a scanf to read the input points, computes the distance using a series of arithmetic operations and assignments, and prints the average using printf and an expression which provides an average of 0 if no values are read.
All of this design information is necessary to translate Figure 1 into Figure 2, as the C++ program has the same general design, but makes explicit that the program is manipulating points, calculating distances, computing averages, and so on. The difficulty is that this design must be inferred from relationships between elements of the source code, as the program is poorly documented and uses uninformative variable names.
Our approach has been to study programmers understanding short C codes and to try to build a program that mimics their understanding process. Because most C programmers are capable of quickly inferring the underlying design of the program in Figure 1, despite the difficulties inherent in the task, it seems sensible to use their understanding process as a guide to developing program understanding algorithms. This article will describe the lessons learned from our informal studies, the resulting model of program understanding, and how this model differs from current program understanding systems.
During the past decade, much work has been done on automated program understanding. This work shares several common features.
* An internal representation of the actual program instructions and the data and control flow between them. The particular representations of the program source range from closely coupled to the source code (such as abstract syntax trees annotated with data and control flow information  to abstractions of the source code (such as flow graphs that combine data and control flow information [15, 16]).
* A library of programming plans or cliches. These plans or cliches  can be thought of a describing design elements in terms of common implementation patterns. A program contains a design element when a portion of its source code matches one of its implementation patterns. For example, READ-PROCESS-LOOP  is the plan of reading all input values and performing some set of actions to each of them. In most structured languages, this plan can be implemented in one of two general ways, as a loop containing a read within its rest and the processing actions as its body, or as a loop preceded by a read and with a body that consists of the processing actions followed by another read. The particular representations for plans are varied and include flow graphs [15, 16], source code templates , and sets of logical constraints [6, 11].
* A resulting plan hierarchy that maps general plans to particular program actions. This plan hierarchy is usually some sort of tree (or lattice) structure, with program instructions at the leaves, programming plans in the nodes, and the goals the program achieves as the root. Figure 3 shows much of the plan hierarchy capturing the design of the program in Figure 1.
Despite a myraid of differences in detail, program underscores can be classified as working top-down or bottom-up. The top-down approach starts from knowledge about the goals the program might achieve, determines what plans can achieve these goals, and attempts to connect these plans to the actual program instructions [5, 6]. Typically, this process includes using matching rules or constraints to detect how these instructions achieve various subgoals within a plan, and difference rules to recognize how they differ from the instructions expected by a plan. In contrast, the bottom-up approach starts with the instructions themselves, determines which programming plans might have these instructions as components, attempts to infer higher-level goals from these plans, and repeats the process until the programmer's actual goals are recognized or the understander runs out of candidate plans to match against the program [7, 15, 16].
Both approaches to propgram understanding have shown significant promise, but both also have obvious limitations. The top-down approach requires detailed advance knowledge about the goals the program is supposed to achieve which is often unavailable for many real-world programs. It also is not well suited for performing partial understanding, since it understands a program fragment only when it is connected to a top-level goal. On the other hand, bottom-up approaches tend to suffer from a combinatorial explosion of possibly relevant plans. Each program instruction may be part of a wide variety of plans, which themselves can be part of a wide variety of plans, and so on. This explosion greatly limits the length and complexity of the programs that can be understood with this approach, unless a method can be found for limiting the search space.
Previous program understanders have avoided worrying about the size of the search space by either performing top-down searches for a limited number of plans or by performing bottom-up searches with a library containing a limited number of largely domain-independent plans. Unfortunately, understanding realworld software requires both a bottom-up search and a large library. The reason being these programs are most naturally described in terms of domain-specific objects and operations. which necessitates recognizing not only those programming plans that carry out these operations but also the programming plans representing the objects being manipulated. As a result, a program understander cannot rely primarily on a library of general programming plans, such as READ-PROCESS-LOOP or COMPUTE-INPUT-AVERAGE, although though these are often necessary. Instead, it must also have a library that contains a variety of domain-specific objections and operations, such as COMPUTE-DISTANCE and POINT. These domain-specific operations and objects form a domain model absolutely critical for understanding the programs built on top of them. However, as soon as domain-specific plans are necessary, the plan library tends to increase dramatically in size. As the plan library grows, so does the need to limit the number of plans that are considered during the program understanding process. Unlike current automated program understanding systems, human programmers appear to do this quite successfully. That has led us to observe student programmers performing bottom-up program understanding and to create a model of program understanding based on the process we have observed. Our belief is that capturing the process by which humans focus their search for programming plans has the potential to significantly improve automated plan recognition techniques.
A Study of Student Programmers
We studied a small group of student programmers attempting to understand a set of geometrically related functions. Our goal was to discover which plan library entries were considered during program understanding and what triggered the consideration of these plans. We chose functions composed from novel compositions of plans the students were likely to have seen before. This allowed us to focus on plan retrieval and recognition techniques without worrying about the additional problems inherent in understanding unplanful code (code that is only partially constructed from stereotypical plans) . We chose relatively simple functions without global variables rather than more realistic complete programs. This allowed us to avoid complications such as delocalized plans . Finally, we chose implementations that minimized the available syntactic indices to plans , such as variables names that accurately describe the use of a variable. This allowed us to focus on conceptual indices: particular combinations of action that suggest considering a particular plan.
Our methodology was to take think-aloud protocols of students attempting to understand these functions. Students were told to verbalize any question or hypothesis they generated about the code, as well as any conclusions they drew and what led them to draw that conclusion. They were given no advance knowledge of what the functions were supposed to do, and we were told to try and provide as succint a description as possible of what the code actually did. To better understand the process, we frequently asked them to explain why or how they generated a hypothesis or reached a conclusion.
The process these students went through in understanding one of these programs is described in detail in . As a result of this study, we learned several lessons that have significantly influenced our model of the process of program understanding:
Principle 1: Program understanders require explicit indexes from combinations of program actions to entries in the plan library. They should match only indexed plans against a given code fragment, not every possible plan known to contain that code fragment.
Our student programmers did not match every program action against every programming plan that might contain it. Instead, they tried to recognize a program plan only when they were remined of it by a combination of program actions or program plans. For example, a while whose test involved a variable and a relational operator other than equality consistently caused students to consider the plan ITERATE-OVER-RANGE. Similarly, a while whose test contained an input-reading function caused them to consider the plan READ-PROCESS-LOOP and a sqrt passed a SUM-OF-SQUARES caused them to consider the plan DISTANCE-BETWEEN-POINTS. In contrast, no students immediately considered ITERATE-OVER-RANGE every time they encountered an assignment, nor COMPUTE-DISTANCE every time they noticed a subtraction, even though assignments and subtractions are steps in these plans.
Principle 2: Program understanders require a specialization hierarchy of programming plans. They should then try to specialize any recognized program action or plan as far as possible before trying to understand the role of the next program action or plan.
Our student programmers tried to specialize the programming plans and actions they encountered in the code before moving on the next program fragment. For example, when the students encountered the MULTIPLYs involving t1 and t2 in Figure 1, they immediately pointed out they were instances of COMPUTE-SQUARE. Similarly, once they recognized READ-PROCESS-LOOP, they noticed the scanf was reading numeric values, specialized the plan to READ-PROCESS-VALUE-LOOP, and then took advantage of the EOF to specialize the plan further to READ-PROCESS-ALL-VALUES.
Principle 3: Program understanders require that programming plans contain explicit knowledge about additional plans that can be inferred once they are recognized. They should make these inferences before continuing on to try recognize other plans.
Our student programmers often drew conclusions after recognizing plans. For example, after recognizing the plan COMPUTE-DISTANCE, they immediately pointed out that this meant the values being subtracted represented coordinates within POINTs and that FETCH-INPUT-VALUES was really a FETCH-PAIR-OF-POINTS. These realizations were the crucial ones in understanding the real purpose of the C function they were given. Importantly, they did not try to infer points from pairs of variables, but waited until operations on points implied their existence.
Thus, our study suggests an organization for the plan library: a collection of discrimination nets of plans at various levels of abstraction, with explicit indices from different classes of program components to entries in these nets, and a set of explicit implication links from each plan to any other plans its existence implies. The remainder of this paper shows how this organization can be implemented and presents a recognition algorithm that takes advantage of it.
The Plan Library
Our plan library extends the representation and organization used in the well-described plan library developed by Andersen Consulting for understanding Cobol programs . This work represents programming plans as data structures containing two parts: a plan definition, which list the attributes of the plan that are filled in when instances of the plan are created, and a plan recognition rule, which lists the components of a plan and the constraints on these components. An instance of the plan is recognized when all its components have been recognized without violating the constraints.
Figure 4 shows how the plan READ-PROCESS-ALL-VALUES can be represented. This plan reads an entire set of input values and performs the same action to each value. It has two attributes: VALUE, which is the input variable, and PROCESS, which is the sequence of actions done to that variable. To recognize this plan, it is necessary to find three items in the code: an action that reads input (READER) a test whether end of file was reached (EOF-TEST), and a looping construct (REPEATER). These items must meet several constraints: the input reader and the comparison against end of file must be contained in the loop's test, and that comparison has a data dependency on the input reading action.
Unfortunately, this representation of a programming plan is not sufficient to support the process we have observed. The following are some difficulties.
1. It lacks indexing information about under what circumstances it is worth checking for the presence of READ-PROCESS-ALL-VALUES. As a result, a bottom-up search must check for this plan any time it encounters an action that reads input, a test for inequality, or a loop. 2. It contains no explicit representation of the differences between plans at different levels in the hierarchy. Although READ-PROCESS-ALL-VALUES is a specialization of the more general plan READ-PROCESS-LOOP, recognizing READ-PROCESS-LOOP does not eliminate the need for doing a complete match against all of READ-PROCESS-ALL-VALUES's components and constraints to recognize it. 3. It contains no knowledge about other related plans that can be inferred once the plans is recognized. For READ-PROCESS-ALL-VALUES that is irrelevant, but once a plan such as COMPUTE-DISTANCE is recognized. it is also necessary to recognize several instances of POINTs. With this scheme, however, additional matching must be done to try to recognize these other plans.
To alleviate these difficulties, we provide three additional pieces of information with each programming plan: indices, specialization constraints, and a list of implied plans, Figure 5 provides an example of this extended representation for program plans.
Plans can be indexed by program components as well as other plans in the library. An index to a plan is represented as a triggering plan and a set of indexing tests. For example, COUNT-LOOP-ITERATIONS is defined to be indexed from an INCREMENT (the triggering plan) that is contained within a LOOP (the indexing test). This means that each time the program understander encounters an INCREMENT action, it checks whether that action is within a LOOP. If it is, there is a possible occurrence of COUNT-LOOP-ITERATIONS, with its INCR-COUNTER component bound to the INCREMENT action, the REPEATER component bound to the LOOP, and one of its constraints fulfilled. The understander can then match this partial instantiation of COUNT-LOOP-ITERATIONS against the program to find the remaining components and test the remaining constraints.
Each action of plan may index a set of other plans. For example, a LOOP construct indexes several high-level plans, each with a different test: READ-PROCESS-LOOP (if the loop contains a FETCH-INPUT), ITERATING-LOOP (if the loop test corresponds to a relational test other than inequality between a variable and a constant), and FLAG-CONTROLLED-ITERATE (if the loop test is equally or inequality between a variable and a constant).
The idea behind indexing is to perform a small set of index tests when each program action or plan is examined and to match only the successfully indexed plans against the program. This focuses the search and has the potential to greatly reduce the search space. In addition, since the index is to a partially recognized plan, only the components not already recognized need to be searched for in the code, speeding up the matching process.
Plans can be represented in two ways: as a collection of components and constraints (as with COUNT-LOOP-ITERATIONS, READ-PROCESS-LOOP, and COMPUTE-DISTANCE), or as a specialization of an existing plan (as with READ-PROCESS-VALUE-LOOP, READ-PROCESS-CHAR-LOOP, READ-PROCESS-CHAR-LOOP, and READ-PROCESS-ALL-VALUES). A specialization is represented as a set of additional constraints on components and attributes of an existing plan. The plan READ-PROCESS-VALUE-LOOP for example, specializes READ-PROCESS-VALUE when its READER component is a FETCH-INPUT-VALUE. Similarly, the plan READ-PROCESS-ALL-VALUES specializes READ-PROCESS-VALUE-LOOP when its test attribute contains a comparison against EOF and a data dependency on the result of trying to read input.
Given these explicitly representing specialization relationships, once the program understander recognizes an instance of READ-PROCESS-LOOP, it can recognize READ-PROCESS-VALUE-LOOP simply by checking whether its READER component is a FETCH-INPUT-VALUE--without having to do a complete match on this new plan. Similarly, it can then recognize READ-PROCESS-ALL-VALUES simply by testing whether its TEST attribute contains a comparison against EOF and the appropriate data dependency.
Each plan may be specialized by a set of different plans. The general plan READ-PROCESS-LOOP, for example, can be specialized into READ-PROCESS-VALUE-LOOP, READ-PROCESS-CHAR-LOOP, and READ-PROCESS-LINE-LOOP, among others. The specialization constraints involve checking whether plan components are present, whether plan attributes are equivalent to specific values, and whether additional relationships hold between plan attributes or components. For example, the plan ZERO is a specialization of an ASSIGN, where the value assigned is a 0. Similarly, the plan SQUARE is a specialization of a MULTIPLY, where the values being multiplied are equivalent.
The idea behind specialization constraints is to allow plan recognition to proceed by following an index to a general plan and then to gradually specialize that plan. This simplifies indexing to every plan, and allows plan recognition to take advantage of commonalities between different plans. But there are several tradeoffs. One is that whenever the understander recognizes a plan, it must check all of the specialization constraints to the more specialized versions of that plan, making its plan recognition algorithm more complex. The other is that the plan library now contains potentially complex connections between plans, making it more difficult to construct and maintain.
Each plan may have a list of plans that are implied by its recognition. The implied plans are usually objects that are recognized from particular operations that can only be applied to a particular class of objects. For example, recognizing COMPUTE-DISTANCE implies having recognized two new instances of POINTs. These plans are not components of COMPUTE-DISTANCE, since they do not have to be recognized to recognize an instance of COMPUTE-DISTANCE. But once an instance of COMPUTE-DISTANCE is recognized, these other plans are now known to exist.
The idea behind implied plans is to avoid having to try to recognize these additional plans bottom-up from the code. Essentially, it avoids matching them with the code by taking advantage of the fact that their existence is implied from plans that have already been recognized.
In summary, each plan in the plan library may be now indexed by particular combinations of program instructions or program plans. The program understander considers a particular plan only when it encounters an index to it during understanding--not simply whenever one of its components appears in the code. Each programming plan may also be linked to specializations of that plan, with explicit tests to determine whether that specialization is present. Whenever a program understander recognizes a particular plan, it attempts to specialize it as far as possible. This avoids the need for explicit indexes to each individual specialization. Finally, each programming plan is also linked to other plans that can be inferred once its is recognized. The understander can then recognize these plans without explicitly matching them against the code.
A Mechanism for Recognizing Programming Plans
Our recognition algorithm assumes that the program has been translated into an abstract syntax tree with frames used to represent each program action and its relationship to other actions. An action is any unit a translator is capable of recognizing from language constructs, from low-level multiplication of a pair of values to higher-level constructs such as a loop with a test and a body. The contents of the frames and their slots vary with each type of action.
The task of the recognition algorithm is to construct a plan hierarchy that maps entries from the plan library onto this abstract syntax tree.
Figure 6 contains the recognition algorithm. It make use of four distinct lists of programming plans and components:
1. L is the program components or plans that remain to be processed. This is initially the entries appearing in the abstract syntax tree.
2. I is the programming plans that have been indexed by components but whose index tests check for the presence of plans which may not have been recognized yet. (The index test from a LOOP to READ-PROCESS-LOOP checks for the existence of a FETCH-INPUT in the LOOP's body, but the FETCH-INPUT may not have been recognized when the LOOP first indexes READ-PROCESS-LOOP.)
3. H is a set of plans hypothesized to be in the code. These are plans that have been indexed by components and had their index test succeed, suggesting they may be in the code, but that may contain component plans that have not been recognized yet.
4. P is the list of plans successfully recognized. It is empty at the start.
The basic idea is to run through the program components, specializing them, seeing if they index anything, matching anything they indexed against the code, and suspending any match requiring a plan not found in the program.
We will briefly illustrate how this process recognizes the first few plans in Figure 1. The recognition process starts by examining the declarations and assignments. None of these declarations can be specialized, nor do they index any plans. However, both assignments have a set of specialization tests, one of which is whether the assigned value is zero. This test succeeds, so both assignments are specialized to ZEROs and added to P.
The recognition process next examines the while (a LOOP action in the program's abstract syntax tree). It first checks whether this action can be specialized (such as to INFINITE-LOOP, which is a LOOP with the additional constraint that its test is always true), but here no specialization constraints holds. The recognition process then checks whether any of the indexing tests associated with the LOOP succeed. Here, there are three indexed plans: READ-PROCESS-LOOP, ITERATE-OVER-RANGE, and FLAG-CONTROLLED-ITERATE. The index tests for the latter two fail, so they are removed from further consideration. The index test for READ-PROCESS-LOOP involves a FETCH-INPUT plan, which is not found, so this plan is placed on I.
The next components are the ones inside the LOOP test. A specialization of the call to scanf is FETCH-INPUT, which has a further specialization to FETCH-INPUT-VALUES for this particular use of scanf. These plans are added to P. As a result of these specializations, the index test for the previously index READ-PROCESS-LOOP now completes, so it is removed from I, successfully matched against the program, and added to P. One of the specialization tests of READ-PROCESS-LOOP is whether it contains FETCH-INPUT-VALUES, which is true, so it is specialized to READ-PROCESS-VALUE-LOOP. A specialization test of this plan is whether its test compares the result attribute of FETCH-INPUT-VALUES with EOF; it does, so it is further specialized to READ-PROCESS-ALL-VALUES. This indexes no further plans and so is simply added to P.
The program understander then proceeds through the components in the LOOP body. It skips over the subtractions, since they have no specializations nor index any plans. It then processes the arguments to the sqrt, specializing the multiplications to COMPUTE-SQUARES, and the addition to COMPUTE-SUM-OF-SQUARES. Finally, it processes the call to sqrt, which together with COMPUTE-SUM-OF-SQUARES, indexes COMPUTE-DISTANCE. After matching COMPUTE-DISTANCE against the code, it then adds the plans it implies to the program-component list (the two POINTs) followed by the COMPUTE-DISTANCE plan itself.
It then continues in a similar fashion to recognize COMPUTE-RUNNING-TOTAL (which is indexed from an ADD-TO within the loop's body and then specialized to TOTAL-INPUT-FUNC), COUNT-LOOP-ITERATIONS (which is indexed from an INCREMENT and then specialized to COUNT-INPUT-ITERATIONS), and finally COMPUTE-AVERAGE (which is indexed by DIVIDE and specialized to CALL-INPUT-FUNC-AVG). This ties the entire program together.
This article has presented a modified bottom-up approach to the recognition of programming plans. The approach relies on a highly organized plan library, where each plan has indexing, specialization, and implication links to other plans. It uses an algorithm that takes advantage of these indices to suggest general candidate plans to match top-down against the code, specializations to refine these general plans once they are recognized, and implications to recognize other, related plans without doing further matching. While others have pointed out the potential usefulness of indexes (or "beacons") [1, 3], they have used them within a top-down approach to suggest plans other than the ones found by refining and decomposing a set of initial plans presumed to be in the code.
Our approach to hybrid bottom-up/top-down recognition is based on observations of student programmers understanding code. It has the potential to greatly reduce the number of plans that a program understander must try to match against the program. But by using indexes to make guesses about which plans might actually appear in the code, it is trading completeness (the ability to recognize every instance of every plan in the plan library that appears in the program) for efficiency (the ability to quickly recognize those plans it recognizes).
How well this approach will actually work in practice will depend significantly on whether it will be possible to accurately determine indexes to plans in the plan library and whether the necessary indexing, specialization, and constraint tests can always be implemented efficiently. Our approach to determining indices for our plan library has so far been to infer them from observations of programmers understanding programs, but this clearly is too time-consuming and too ad-hoc to be practical for real-world applications. As a result, we are currently investigating statistical methods for automatically generating indices for plans.
In addition, our approach uses a wide variety of different predicates to perform tests, some of which have the potential to be hideously inefficient when implemented in their full generality. Another current focus is in trying to construct a clean library of efficient predicates to use for making these tests. Finally, our prototype recognizer and plan library simplify many of our constraint checking predicates by assuming the abstract syntax tree is in a canonical form, with for and while loops replaced with a generic LOOP construct, all decisions replaced with ifs, variables moved to the right side of communicative operations and constants to the left, and so on. This strikes us as inelegant, and we are considering how to include the canonicalization as part of the recognition process, as in .
To this point, budgetary constraints have limited us to testing our model only with a relatively small library of approximately 100 plans and on a small collection of student programs. As a result, we are now directing our efforts toward creating a sizable plan library and using it to understand a collection of textbook programs involving primitive geometric objects and operations. This should give us an idea of the effort required to apply our model to understanding a large collection of programs, as well as allowing us to evaluate its performance with a library that is more realistic in size.
[1.] Brooks, R. Toward a theory of the comprehension of computer programs. Int. J. Man-Mach. Stud. 18, 6 (Jun. 1983), 543-554.
[2.] Detienne, F. and Soloway, E. An empirically-derived control structure for the process of program understanding. Int. J. Man-Mach. Stud. 33, 3 (Mar. 1990), 323-342.
[3.] Fickas, S. and Brooks, R. Recognition in a program understanding system. In Proceedings of the Sixth IJCAI, Tokyo, Japan, 1979, pp. 266-268.
[4.] Hartman, J. Understanding natural programs using proper decomposition. In Proceedings of ICSE-91, Austin, Tex., 1991, pp. 62-73.
[5.] Johnson, W.L. Intention Based Diagnosis of Novice Programming Errors. Morgan Kaufman, Los Altos, Calif., 1986.
[6.] Kozaczynski, W., Ning, J. and Engberts, A. Program concept recognition and transformation. Trans. Softw. Eng. 18, 12 (Dec. 1992), 1065-1075.
[7.] Letovsky, S. Plan analysis of programs, Ph.D. thesis, Yale University, New Haven, Conn., 1988.
[8.] Letovsky, S. and Soloway, E. Delocalized plans and program comprehension. HEEE Softw. 19, 3 (Mar. 1986), 41-48.
[9.] Littman, D., Pinto, J., Letovsky, S. and Soloway, E. Mental models and software maintenance. In Empirical Studies of Programmers, E. Soloway and S. Iyengar, eds., Ablex, Norwood, N.J., 1986.
[10.] Murray, W.R. Automatic Program Debugging For Intelligent Tutoring Systems. Morgan Kaufman, Los Altos, Calif., 1988.
[11.] Ning, J.Q. A knowledge-based approach to automatic program analysis. Ph.D. thesis. University of Illinois, 1989.
[12.] Quilici, A. A hybrid approach to recognizing programming plans. In Proceedings of the Working Conference on Reverse Engineering, Baltimore, Md, May, 1993, pp. 126-133.
[13.] Rich, C, and Waters, R. The Programmer's Apprentice. Addison Wesley, Reading, Mass., 1990.
[14.] Soloway, E. and Erdlich, K. Empirical studies of programming knowledge. IEEE Trans. Softw. Eng. 10, 5 (May, 1984), 595-609.
[15.] Wills, L.M. Automated program recognition by graph parsing. Ph.D. dissertation, MIT, Cambridge, Mass., 1992.
[16.] Wills, L.M. Automated program recognition: A feasibility demonstration. Art. Int., 45, 2 (Feb. 1990), 113-172.
|Printer friendly Cite/link Email Feedback|
|Title Annotation:||excerpt from paper presented at the May 1993 Association for Computing Machinery/IEEE Computer Society's Working Conference on Reverse Engineering; use of automated program understanding to facilitate C to C++ migration|
|Publication:||Communications of the ACM|
|Date:||May 1, 1994|
|Previous Article:||Program understanding and the concept assignment problem.|
|Next Article:||Adaptive object-oriented programming using graph-based customization.|