The partial metrics system: modeling the stepwise refinement process using partial metrics.
The goal of the Partial Metrics System is to provide a uniform framework in which each module can be developed and its progress assessed. The basic organization of the system is shown in Figure 1. The current version of a particular module is stored in the database. It is retrieved by the ECHO reasoning subsystem when the user decides to continue refining the design. The ECHO subsystem supports a language-independent, metrics-based paradigm for module design and implementation. It attempts to elaborate on the current design based on information supplied by the user and rules for module refinement contained in its knowledge base.
The EASE subsystem supports the critique of the design at different levels of abstraction. The structural complexity of individual modules is assessed by the module critic using metric information about the module's structure. This information is used to diagnose the current state of the module and prescribe adjustments to the pseudocode when necessary. Examples of such adjustments can be found in previous articles. The project critic would then be able to pool information about the current status of individual modules to provide an assessment of the project's current status.
We will focus on the design of the ECHO subsystem. The structure of the subsystem supports a metrics-based approach to the stepwise refinement process. A particular class of metrics, called partial metrics, is used in this approach. The nature of these metrics will be discussed. Seventeen examples of the module refinement process will be described using partial metrics.
An analysis of these refinements produces a hierarchical categorization of refinement classes. Each class possesses a quantitative metric description. This hierarchical classification is used as the basis for the three-phase model of the refinement process. Only certain types of refinements are permissible within each phase, and each class is distinguishable in metric terms. Therefore, prior to the application of a refinement, its suitability is assessed in metric terms. An example of how this model can be used to guide the application of refinements to a pseudocode module in the automatic programming environment provided by ECHO is given.
PARTIAL METRICS: REPRESENTING KNOWLEDGE
ABOUT THE PSEUDOCODE DESIGN PROCESS
The principle on which the Partial Metrics System is based is that aspects of the design process for a software module are reflected by changes in the pseudocode structure for that module during the design process. This perspective of monitoring the process is analogous to that of a physician monitoring the progress of a hospitalized patient. It is epitomized by the physician studying the patient's chart at bedside. The measurements that are charted must be easy to perform and made frequently. A classic example is taking the patient's body temperature. If the observed temperature moves outside of a given range, the physician will prescribe treatment that brings it back within acceptable limits. A substantial number of inferences can be made by an experienced physician based on the change in this single numerical value within a given context.
Similarly, it is this principle of simple but frequent measurement that guides the knowledge-acquisition process in the Partial Metrics System. After each pseudocode refinement step, a number of measurements are taken to assess the impact of the new refinement decisions on the pseudocode program's complexity. The measurements need to be simple enough to be made frequently, but complex enough to sustain a variety of inferences about the programming process. A variety of metrics are based on counts of program elements and are easy to generate. The metrics relate to control structure, program size, data structures, and program interfaces.
These metrics, however, were designed to measure the structure of completed programs. In order to monitor the stepwise refinement of a pseudocode module, they must be extended to describe the complexity of a partially completed program. Such a pseudocode program contains two classes of symbolic terms.
The first class of terms, called prescribed terms, corresponds to reserved words and symbols in the target language. The other class of terms stands for inferences about aspects of the program that remain to be instantiated. These are called projected terms. A projected term corresponds to a programmer's hypothesis that particular information needs to be inserted in the pseudocode at a certain position. Associated with each hypothesis are a number of variables that describe what the programmer intends to add.
For example, the programmer may have ideas about the syntactic structures to be used, and the operands and operators to be employed, among other information. If traditional metrics were applied to the pseudocode, each projected term would be treated as a prescribed term, and its associated information would be ignored. This would underestimate the complexity of the pseudocode program.
Partial metrics make explicit use of information about projected terms to augment the computation of a pseudocode program's complexity. Currently, the partial metrics used by the system represent traditional metrics that are adjusted to explicitly incorporate design decisions made by the programmer about projected terms. These decisions serve to relate the projected term to features of the target language. Thus a syntactic category in the target language can be associated with a corresponding projected term. If one considers the Ada language as an example, the following syntactic classes can be identified:

    Projected term     Identifier class
    (1) DICT RATOR     An operator
    (2) DICT RAND      An operand
    (3) DICT STMT      A statement
    (4) DICT PROC      A procedure
    (5) DICT FUNCT     A function
    (6) DICT PACK      An Ada package
    (7) DICT TASK      An Ada task
    (8) DICT EXPR      An arithmetic expression
    (9) EXTERNAL       An external reference
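The class list above can be sketched as a simple lookup structure. This is a minimal illustration, assuming a Python dictionary representation; the names and structure are not part of the original system, only the class-to-meaning pairing comes from the text.

```python
# Projected-term syntactic classes for the Ada target language, as listed
# in the text. The dictionary encoding itself is an illustrative assumption.
SYNTACTIC_CLASSES = {
    "DICT RATOR": "an operator",
    "DICT RAND": "an operand",
    "DICT STMT": "a statement",
    "DICT PROC": "a procedure",
    "DICT FUNCT": "a function",
    "DICT PACK": "an Ada package",
    "DICT TASK": "an Ada task",
    "DICT EXPR": "an arithmetic expression",
    "EXTERNAL": "an external reference",
}

def syntactic_class(term_class: str) -> str:
    """Return the target-language meaning inferred for a projected-term class."""
    return SYNTACTIC_CLASSES[term_class]
```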
As the programmer refines a pseudocode module, he or she is able to place hypothesized types of the projected terms into a dictionary that is associated with the current refinement. This information is then used by the Partial Metrics System to compute the complexity measures. It is possible that a variety of inferences might be associated with individual projected terms. Only the syntactic class of each projected term, however, is currently placed in the dictionary. An example of a partial program in the Ada target language as it would be described in the Partial Metrics System is given below. The example corresponds to a refinement of a pseudocode program, MATCHES, developed by Myers to locate unresolved references in an external symbol table.

    $DICT MATCHES;
      UNMATCHED ITEM FOUND, POSS UNMATCH ITEM, FOUND A MATCH: DICT FUNC;
      SEARCH FOR A MATCH, IS SEARCH POSSIBLE, MARK IT AS MATCHED,
        OUTPUT UNMATCHED: DICT STMT;
    $END  --END OF DICTIONARY

    IS SEARCH POSSIBLE
    for I in 1 .. SIZE loop
      if POSS UNMATCH ITEM then
        SEARCH FOR A MATCH;
        if FOUND A MATCH then
          MARK IT AS MATCHED;
        else
          OUTPUT UNMATCHED;
        end if;
      end if;
      exit when UNMATCHED ITEM FOUND;
    end loop;
The symbols UNMATCHED ITEM FOUND, POSS UNMATCH ITEM, and FOUND A MATCH are inferred by the user to represent Ada functions, while the other terms represent one or more Ada statements.
A partial metric makes explicit use of the inferences, contained in the dictionary, about the projected terms to compute an estimate of structural complexity for a module. The partial metrics currently used by the Partial Metrics System represent extensions to traditional metrics. They make adjustments based on the inferred class of each projected term.
For example, Halstead's software science consists of a set of metrics that measure the module's functional complexity (size) in terms of basic operator and operand counts. In order to extend them to deal with projected terms, the inferred class of each projected term was used as a basis to associate a certain number of implied operators and/or operands with each instance of a projected term. The exact number of implied operators and operands associated with an instance of a given class is fixed for all modules in an application project, but can vary from project to project. If the values are not specifically given when the new project is defined, a set of default values is associated with each inferred type. These values reflect a minimum expected contribution from each occurrence of a term for a particular class. The set of default values for each syntactic class in the Ada language would be as follows:
Since Halstead's metrics depend on the counts of operands and operators in a program, the corresponding partial metric calculations will provide an estimate of program complexity as if all of the projected terms had been replaced by the target code associated with their current syntactic class. The estimate will never be less than that provided by the traditional approach, and in the early stages of the refinement process, it can be much greater. This allows the designer to anticipate problems in program complexity long before it would be possible with the traditional version. For instance, the traditional measure is described by the example for prescribed volume shown in Figure 2. For the example program given above, this estimate is calculated to be 80. The corresponding partial metrics calculation is 161, a little over twice as much. This is high for an early refinement in the design and allows one to anticipate that this module may need some reorganization in order to reduce its complexity later. An example of how partial metrics information can be used to anticipate and reorganize the design of a pseudocode module is given in .
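The partial-volume adjustment described above can be sketched as follows. This is a simplified illustration: the implied-contribution values in IMPLIED are placeholder assumptions (the system's actual per-class defaults appear in its own tables), and crediting each implied element to both the total and unique counts is a simplification of the real bookkeeping.

```python
import math

# Placeholder implied (operators, operands) contributions per projected-term
# class; these numbers are assumptions for the sketch, not the system's defaults.
IMPLIED = {
    "DICT STMT": (1, 1),
    "DICT FUNC": (1, 2),
}

def traditional_volume(N, n):
    """Classic Halstead volume V = N * log2(n): N is the total count of
    operators and operands, n the number of unique ones."""
    return N * math.log2(n)

def partial_volume(N, n, projected):
    """Partial-metric variant: each projected-term occurrence contributes the
    implied operator/operand counts of its inferred class, as if the term had
    already been replaced by typical target code of that class."""
    for term_class in projected:
        operators, operands = IMPLIED[term_class]
        N += operators + operands   # implied contribution to the total count
        n += operators + operands   # implied contribution to the unique count
    return N * math.log2(n)
```

Because the projected terms only ever add to the counts, the partial estimate is never below the traditional one, matching the behavior the text describes.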
A DESCRIPTION OF THE MODULE REFINEMENT
PROCESS IN TERMS OF PARTIAL METRICS
In order to develop a metrics-based model of module design, it is useful to define aspects of the overall process in metrics terms. Seventeen examples of the module refinement process were studied. These examples came from a variety of sources including software engineering texts, research papers, and programming texts [1-3, 6-12, 15, 16, 22-24]. A description of the 17 refinement examples used in the analysis is given in Table I.
An example module contains, on the average, 8 refinement steps. The maximum number of refinement steps is 14. Each pseudocode refinement is characterized by a set of partial metric values. A brief description of the metrics used is given in Figure 2. Details concerning the development and justification of these and other partial metrics can be found in .
A primary goal was to isolate basic phases of the design process in partial metric terms. Thus the difference in partial metric values between each refinement and its predecessor was recorded. A nearest-neighbor clustering algorithm with a Euclidean distance function was then used to group refinements together based on these differences. The partial metrics n (the number of unique pseudocode terms) and N (the total count of pseudocode terms) were given the highest weights of .4 and .3, respectively, in computing the distance. Each of the other metrics was given a relative weight of .1.
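The weighted distance and grouping step can be sketched as below. The weights come from the text; the seven-element metric ordering, the threshold parameter, and the single-linkage merging rule are assumptions for illustration, not the authors' exact algorithm.

```python
import math

# Weights from the text: .4 for n, .3 for N, .1 for each remaining metric.
WEIGHTS = (0.4, 0.3, 0.1, 0.1, 0.1, 0.1, 0.1)

def distance(a, b):
    """Weighted Euclidean distance over two 7-tuples of metric differences."""
    return math.sqrt(sum(w * (x - y) ** 2 for w, x, y in zip(WEIGHTS, a, b)))

def nearest_neighbor_cluster(points, threshold):
    """Minimal nearest-neighbor (single-linkage) grouping: a point joins any
    cluster containing a point closer than `threshold`; clusters that the new
    point bridges are merged."""
    clusters = []
    for p in points:
        near = [c for c in clusters if any(distance(p, q) < threshold for q in c)]
        for c in near:
            clusters.remove(c)
        clusters.append(sum(near, []) + [p])
    return clusters
```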
If the partial metrics are able to reflect phases in the design process, the groups produced by the cluster analysis should bear some correspondence to the refinement sequence. The three groups extracted exhibit a strong correspondence with the refinement sequence. The first group consists of the initial refinement from each of the 17 examples. The second group contains those intermediate refinements that did not initiate or terminate a sequence. The third group contains the resulting refinements after the last of the projected terms had been removed.
There were only seven refinements in this category because the refinement process was not carried to completion in 10 of the examples. Rather than finishing with completed code, each had projected terms that required elaboration. Perhaps the assumption was that the underlying reasoning process of each refinement step was the same. If this were the case, there would have been no need to complete the sequence. One of the results of this analysis, however, was to demonstrate the existence of distinctive differences between groups of refinement steps.
The refinements within each of the three groups were then clustered, using the same scheme as before. If the previous clustering had factored out most of the temporal distinctions between refinements, the remaining variability would be primarily due to the presence of alternative refinement strategies within each phase. The group of 17 initial refinements factored into two groups of 7 and 10 refinements. The group of 111 intermediate refinements factored into three subgroups of 27, 34, and 50 refinements. The group of 7 terminating refinements factored into two groups of 3 and 4 each. The hierarchy of refinement categories that resulted is given in Figure 3. The average partial metric values associated with each class can be found in Table II. We will now describe the significance of each category in terms of its partial metric values.
The average difference in metric values between the initial, intermediate, and terminal refinement classes reflects certain overall trends in the refinement sequence. The initial refinements in each sequence introduce the most unique new pseudocode program terms (n), and possess the greatest increment in projected volume and the least increment of prescribed volume of the three classes. The intermediate refinements contribute fewer new terms. The subsequent emphasis is on the refinement of the terms that are already there. As a result, the volume of the projected component increases at a rate that is less than that of the prescribed component. The terminal refinements are distinctive in that the number of unique terms added is close to zero, whereas the size of the prescribed code added is the largest of the three classes. The change in projected volume is negative, reflecting the replacement of the remaining projected terms with prescribed code in the target language. The overall contribution to program volume is highest in this class as well.
The above metric distinctions between classes suggest three basic stages in the decision-making process that underlie the stepwise refinement of a pseudocode module. This idea is reinforced by an analysis of the subclasses for each of these parent categories. For example, the class of initial refinements is divided into two subclasses. The first subclass is called problem driven. Each of the 7 refinements in this class is derived in a straightforward manner from the initial problem description. The pseudocode terms, present in the refinements, are either the same as terms used in the problem description or easily inferred to be present from the description.
To illustrate, consider an initial problem to process daily data on pollution. The pseudocode terms embedded in the first refinement might be INITIALIZE, PROCESS DAILY DATA, and TERMINATE PROGRAM. PROCESS DAILY DATA derives immediately from the problem description, whereas the other two are immediately inferred from the need to start and finish the process. The volume added to the projected component exceeds that for the prescribed component, distinguishing this subclass; this is not the case in any of the other subclasses.
The second subclass of initial refinements also gives evidence for prior reasoning about the design. The refinements in this category, however, propose particular steps toward a solution. They provide an overall skeleton of control for a selected algorithm, an abstract description of objects to be manipulated, or both. As such, the increment to the projected component is 5 times that of its sibling class; the increment to the prescribed component is 40 times greater than that of its sibling class. These differences reflect a reasoning process that goes beyond simple reasoning about the problem description. It reflects an attempt to match the problem requirements with the ability of certain known programming concepts to support it.
The class of intermediate refinements is composed of three subclasses. The first subclass represents the refinements that focus on the expansion of the projected component of the pseudocode program. In each of these refinements, projected terms constitute the majority of new terms added to the design. As expected, these refinements are usually applied near the beginning of the design sequence, when additional conceptual detail is needed, and are most likely to follow initial refinements from the problem-driven subclass. This study showed that refinements from this subclass followed initial refinements from the problem-driven subclass on six out of seven occasions. The average step number of a refinement from this subclass in a sequence is 3.
In contrast, another subclass of intermediate refinements represents the expansion of the prescribed code via the replacement of corresponding projected terms. These refinements exhibit an average decrement in the projected volume and add few new unique code elements. Each such refinement serves to implement a projected term as code in the target language. This activity is performed at later stages in the sequence, when enough conceptual detail is present to allow for such a substitution. The average step number of a refinement in this subclass is 6.5.
The third subclass of intermediate refinements, called mixed, supports the expansion of both projected and prescribed code. The projected terms that are added, however, reflect more detailed decisions about the design than for those refinements that exclusively expand the projected component. The prescribed code that is added also supports these more detailed decisions. The average step number for a refinement from this subclass is 5. This means that refinements in the mixed subclass are, on the average, applied after those that focus on the expansion of the projected component and before those that focus on the expansion of the prescribed component.
The class of terminal refinements exhibits two basic subclasses. The first subclass, called simple consolidation, corresponds to the simple substitution of a target code sequence for each of one or more projected terms. The second subclass, called complex consolidation, corresponds to situations where conditional and/or loop structures are added in the final refinement. These additional structures are often necessary to support the implementation of an abstract data object. For example, once a decision is made to implement a list in terms of a fixed array structure in a target language, then additional control structures may be needed to explicitly initialize and/or process individual elements. Since the implementation of data structures is often left until the end of the refinement sequence, one might expect the addition of control structures to occur in situations where complex data objects are implemented.
It is apparent that partial metrics can provide a basis for the classification of several refinement categories. Each refinement type reflects a general class of design decisions and is applied at predictable points in the refinement sequence. It is suggested that these quantitative differences in refinement types correspond to the underlying differences in the types of decisions made by the designer in different phases of the design process. Based on these differences, a model of the decision-making activities that underlie the stepwise refinement process will be given.
A METRIC-BASED MODEL OF THE
STEPWISE REFINEMENT PROCESS
Many approaches to the development of automated programming environments have attempted to simulate the stepwise refinement process as a sequence of transformations to a pseudocode program [4, 5, 13, 17, 21]. Despite this, no standard quantitative model of the informational structure for such transformations has emerged. In this section a prototype for such a model is proposed, based on the refinement classes already discussed. The stepwise refinement process, as modeled here, consists of three phases:
(1) the conceptual design phase,
(2) the pseudocode design phase, and
(3) the consolidation phase.
Each of these phases derives from one of the three major classes of refinements as previously described.
The conceptual design phase corresponds to the group of "initial" refinements. These refinements embody the results of initial reasoning about the problem specification. This phase represents the decisions that lead to these initial refinements. Given an initial description of the module, the goal of this phase is to generate a structured set of concepts that are then embedded into pseudocode.
The pseudocode design phase describes the activity performed by the class of intermediate refinements. In this phase a projected term associated with a syntactic class in the target language is replaced by a set of new projected terms from the same or a more specific syntactic class. Associated with each new term is the prescribed code needed to embed that term into the program.
The consolidation phase describes the decisions that are associated with the class of terminal refinements. As such, this phase is distinguished by the need to resolve any inconsistencies or deficiencies in the design prior to completing the implementation. For simple modules this may not be necessary, and the final refinements may resemble those from the pseudocode design phase.
The majority of refinements occur in the pseudocode design phase. A pseudocode refinement is characterized by a set of elaborations. The general format for an elaboration is as follows:

    <projected term> --> (<projected term>, <syntactic class>, <embedding>)*
That is, a projected term is replaced by a list of projected term tuples. Each element in the list is represented as an ordered triple: the name of the projected term, the inferred syntactic class, and the embedding for that term.
The embedding for a projected term corresponds to the prescribed code added to the pseudocode in association with that term. This prescribed code can perform two functions. The first function is to supply the prescribed code necessary to position or "embed" the projected term in the pseudocode program. For example, if a projected term from the statement syntactic class is added to an Ada pseudocode program, the prescribed code required to embed it correctly in the program is a semicolon (;). If the prescribed code associated with each tuple in an elaboration is only there to provide this type of function, the elaboration is classified as minimally embedded. When more prescribed code is present than is necessary to embed the particular projected term for any tuple in the elaboration, the embedding is said to be overloaded. The presence of an overloaded embedding reflects a specific implementation decision made in conjunction with the establishment of the new pseudocode term.
It is also possible that a projected term may be replaced entirely by prescribed code. This is represented by the presence of the generic projected term, nil, as the only new term in the elaboration. The embedding in this situation is described as maximally overloaded.
Every elaboration can be classified based on the nature of the embeddings that it contains. Similarly, refinements can be classified based on the nature of their constituent elaborations. A refinement is classified as minimally embedded if all of its constituent elaborations are minimally embedded. If all of its elaborations are maximally overloaded, then it is a maximally overloaded refinement. Otherwise, the refinement is said to be overloaded.
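The classification rules above can be sketched as follows. The tuple encoding of an elaboration (its new terms, the minimal embedding length, and the actual embedded-code length) is a hypothetical representation chosen for illustration; only the classification logic comes from the text.

```python
def classify_elaboration(new_terms, minimal_len, actual_len):
    """Classify one elaboration by its embedding, per the rules in the text."""
    if new_terms == ["nil"]:        # projected term replaced entirely by code
        return "maximally overloaded"
    if actual_len > minimal_len:    # more code than needed to embed the terms
        return "overloaded"
    return "minimally embedded"

def classify_refinement(elaborations):
    """A refinement takes the class shared by all of its elaborations;
    otherwise it is overloaded."""
    kinds = {classify_elaboration(*e) for e in elaborations}
    if kinds == {"minimally embedded"}:
        return "minimally embedded"
    if kinds == {"maximally overloaded"}:
        return "maximally overloaded"
    return "overloaded"
```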
Each of the three classes of refinements defined corresponds to one of the three subclasses of the intermediate refinement class given earlier. For example, the subclass that expands projected code can be expressed in the model as minimal embeddings. Similarly, the refinements from the mixed subclass can be expressed as various types of overloaded refinements. The subclass of refinements that expand on prescribed code can be expressed as various forms of maximally overloaded refinements. As a result, the partial metric values that describe each subclass can be used to assess the complexity of refinements from each model class. When the ECHO subsystem enters the pseudocode design phase, refinements that are applied to pseudocode are first classified into one of the three model types. Next, the changes in partial metric values are calculated. These values are then used as the basis for inferring the acceptability of the refinement in metric terms. Each hypothesis of acceptability is expressed as a Prolog clause, where the arguments at the head of the clause correspond to each of the seven partial metrics. The conditions in the body of the clause must be satisfied in order for the refinement to be accepted.
There are two basic types of conditions in the body of the clause. The first type checks to see if each value is acceptably close to the average value for the metric in a class. The second type checks to see that certain relationships between variables are within an acceptable range. As a result, the standard format for a hypothesis is as follows:

    accept(LITTLE N, BIG N, PRES VOL, PROJ VOL, PROG VOL, MCABE, MCCLURE) if
      within range(LITTLE N, BIG N, PRES VOL, PROJ VOL, PROG VOL, MCABE, MCCLURE) and
      ratio within range(LITTLE N, BIG N, PRES VOL, PROJ VOL, PROG VOL, MCABE, MCCLURE)
In order to accept the refinement, two subgoals must be satisfied. The first, within range, checks to see whether each change is within a standard deviation of the expected average value. The second, ratio within range, checks to see whether certain relationships between variables are within a standard deviation of the expected range. The standard deviations used by the within range subgoal to check refinements in the Ada target language are shown in Table III. Such tables can be developed for other languages as well.
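The within range subgoal can be sketched in this way. The mean and standard-deviation values below are placeholder assumptions, not the entries of Table III; only the one-standard-deviation acceptance rule comes from the text.

```python
# Placeholder (mean change, standard deviation) per partial metric for one
# refinement class; the real values for Ada are given in Table III.
CLASS_STATS = {
    "little_n": (2.0, 1.5),
    "big_n": (6.0, 3.0),
    "pres_vol": (30.0, 15.0),
    "proj_vol": (10.0, 8.0),
    "prog_vol": (40.0, 20.0),
    "mcabe": (1.0, 1.0),
    "mcclure": (1.0, 1.0),
}

def within_range(changes, stats=CLASS_STATS):
    """Accept only if every metric change is within one standard deviation
    of the expected average change for the refinement's class."""
    return all(abs(changes[m] - mean) <= sd for m, (mean, sd) in stats.items())
```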
In the ECHO subsystem, two types of refinements can occur: system supplied and user supplied. Refinements are not stored explicitly in ECHO. Instead, elaborations are stored. The system puts together a set of elaborations based on the current status of the design. Prior to applying the refinement, the system checks the partial metrics derived from it. If the values for a refinement are too large, it is reformatted by removing one or more of the elaborations. If the values are too low, system-supplied elaborations are added to the refinement. On the other hand, if no system-supplied refinements can be produced that are compatible with the current design state, a request is made to the user for a refinement. The user-submitted refinement is also checked using partial metrics.
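The trimming-and-extending behavior just described can be sketched as a simple loop. The `score` function and the low/high thresholds stand in for the full partial-metric checks and are assumptions for this illustration; which elaboration to drop or add is also left deliberately naive here.

```python
def adjust_refinement(elaborations, score, low, high, spare_elaborations):
    """Trim or extend a candidate refinement until its score lies in [low, high]:
    drop elaborations while the score is too large, then add system-supplied
    spares while it is too small."""
    elaborations = list(elaborations)
    spares = list(spare_elaborations)
    while elaborations and score(elaborations) > high:
        elaborations.pop()                 # values too large: remove an elaboration
    while spares and score(elaborations) < low:
        elaborations.append(spares.pop())  # values too low: add a supplied elaboration
    return elaborations
```

Using the number of elaborations as a stand-in score, a five-elaboration candidate with a high bound of 3 is trimmed to three, and a one-elaboration candidate below a low bound of 2 gains a spare.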
A refinement for a pseudocode program called MATCHES was described earlier. Assuming the previous refinement consisted of only the projected term PERFORM MATCHES, then this refinement is classified by the system as overloaded since it contains more prescribed code than is necessary to simply embed each of the new projected terms. As such it will be rejected for a variety of reasons. For instance, from Table III, the change in McClure's control variable complexity exceeds the average change by over three standard deviations. Similarly, the volume of the added projected component is over one standard deviation away from the average for the class. Rejection is based on the premise that, if the partial metric values for a refinement change at a high rate, comprehension of the refinement is reduced. In addition, it suggests that more decisions are made than would be expected at the current stage in the design. If the system were to allow this refinement, other problems in the module's structural complexity probably would surface later. The reader is referred to , where the effects of not isolating this refinement are illustrated.
In an automatic programming environment, it is not enough to just support the refinement of existing code. It is important that the contribution of each new refinement be easy to understand in both a syntactic and semantic sense. A system that supports automatic programming must therefore ensure that information is added to a design in a systematic and organized fashion.
Our study of different examples of the refinement process demonstrated certain consistencies in how designers added information to a design. This led to a hierarchical classification of refinement types based on the partial metric values. From this classification it was apparent that the quantitative nature of the information added to a design changed in a predictable way from the onset of the refinement sequence to its termination. As a result, a model of the refinement process was developed where each phase can be characterized by refinements that affect the partial metric description of a pseudocode module in a specified fashion. These phases are the conceptual design phase, the pseudocode design phase, and the consolidation phase. It was also shown that, within the pseudocode design phase, each refinement must fall into a subcategory, which places additional restrictions on its content.
This model is used by the ECHO inference subsystem to assess the complexity of any intended addition to the design. Since, in ECHO, a system-generated refinement is constructed from a set of basic elaborations, this assessment can be used to modify the refinement by adding or deleting elaborations.
This study has shown that it is possible to establish guidelines to monitor the pseudocode refinement process in partial metric terms. Future work will address the issues of extending the model to deal with the activities in the conceptual design and consolidation phases of the refinement process, and extending the model in order to use it to isolate different strategies for module refinement in partial metric terms.
Author: Reynolds, Robert G.
Publication: Communications of the ACM
Date: Nov 1, 1987