Abstracts from other ACM publications.
ACM Transactions on Database Systems December 1987
IFO: A Formal Semantic Database Model
Serge Abiteboul and Richard Hull
A new, formally defined database model is introduced that combines fundamental principles of "semantic" database modeling in a coherent fashion. Using a graph-based formalism, the IFO model provides mechanisms for representing structured objects, and functional and ISA relationships between them. A number of fundamental results concerning semantic data modeling are obtained in the context of the IFO model. Notably, the types of object structure that can arise as a result of multiple uses of ISA relationships and object construction are described. Also, a natural, formal definition of update propagation is given, and it is shown that (under certain conditions) a correct update always exists.
For Correspondence: S. Abiteboul, Institut National de Recherche en Informatique et en Automatique, Rocquencourt 78153, France (part of this work was performed while visiting the University of Southern California); R. Hull, Computer Science Department, University of Southern California, Los Angeles, CA 90089-0782 (part of this work was performed while visiting I.N.R.I.A.).
Extending Relational Algebra and Relational Calculus with
Set-Valued Attributes and Aggregate Functions
G. Ozsoyoglu, Z. M. Ozsoyoglu, and V. Matos
In commercial network database management systems, set-valued fields and aggregate functions are commonly supported. However, the relational database model, as defined by Codd, does not include set-valued attributes or aggregate functions. Recently, Klug extended the relational model by incorporating aggregate functions and by defining relational algebra and calculus languages.
In this paper, relational algebra and relational calculus database query languages (as defined by Klug) are extended to manipulate set-valued attributes and to utilize aggregate functions. The expressive power of the extended languages is shown to be equivalent. We extend the relational algebra with three new operators, namely, pack, unpack, and aggregation-by-template. The extended languages form a theoretical framework for statistical database query languages.
For Correspondence: G. Ozsoyoglu and Z. M. Ozsoyoglu, Department of Computer Engineering and Science, and Center for Automation and Intelligent Systems, Case Institute of Technology, Case Western Reserve University, Cleveland, OH 44106. V. Matos, Department of Computer and Information Science, Cleveland State University, Cleveland, OH 44115.
The Use of Regression Methodology for the Compromise of
Confidential Information in Statistical Databases
Michael A. Palley and Jeffrey S. Simonoff
A regression methodology based technique can be used to compromise confidentiality in a statistical database. This holds true even when the DBMS prevents application of regression methodology to the database. Existing inference controls, including cell restriction, perturbation, and table restriction approaches, are shown to be generally ineffective against this compromise technique. The effect of incomplete supplemental knowledge on the regression methodology based compromise technique is examined. Finally, some potential complicators of this disclosure scheme are introduced.
For Correspondence: Michael A. Palley, Baruch College, The City University of New York, 17 Lexington Avenue, Box 513, New York, N.Y. 10010; Jeffrey S. Simonoff, Graduate School of Business Administration, New York University, 100 Trinity Place, New York, N.Y. 10006.
Concurrency Control Performance Modeling: Alternatives and
Rakesh Agrawal, Michael J. Carey and Miron Livny
A number of recent studies have examined the performance of concurrency control algorithms for database management systems. The results reported to date, rather than being definitive, have tended to be contradictory. In this paper, rather than presenting "yet another algorithm performance study," we critically investigate the assumptions made in the models used in past studies and their implications. We employ a fairly complete model of a database environment for studying the relative performance of three different approaches to the concurrency control problem under a variety of modeling assumptions. The three approaches studied represent different extremes in how transaction conflicts are dealt with, and the assumptions addressed pertain to the nature of the database system's resources, how transaction restarts are modeled, and the amount of information available to the concurrency control algorithm about transactions' reference strings. We show that differences in the underlying assumptions explain the seemingly contradictory performance results. We also address the question of how realistic the various assumptions are for actual database systems.
For Correspondence: R. Agrawal, AT&T Bell Laboratories, Murray Hill, NJ 07974; M. J. Carey and M. Livny, Computer Sciences Department, University of Wisconsin, Madison, WI 53706.
Multikey Access Methods based on Superimposed Coding
R. Sacks-Davis and A. Kent and K. Ramamohanarao
Both single-level and two-level indexed descriptor schemes for multikey retrieval are presented and compared. The descriptors are formed using superimposed coding techniques and stored using a bit-inversion technique. A fast-batch insertion algorithm for which the cost of forming the bit-inverted file is less than one disk access per record is presented. For large data files, it is shown that the two-level implementation is generally more efficient for queries with a small number of matching records. For queries that specify two or more values, there is a potential problem with the two-level implementation in that costs may accrue when blocks of records match the query but individual records within these blocks do not. One approach to overcoming this problem is to set bits in the descriptors based on pairs of indexed terms. This approach is presented and analyzed.
For Correspondence: R. Sacks-Davis and A. Kent, Department of Computer Science, Royal Melbourne Institute of Technology, 124 La Trobe St., Melbourne, Victoria, Australia 3000; K. Ramamohanarao, Department of Computer Science, University of Melbourne, Parkville, Victoria, Australia 3052.
ACM Transactions on Graphics October 1987
Digital Halftones by Dot Diffusion
Donald E. Knuth
This paper describes a technique for approximating real-valued pixels by two-valued pixels. The new method, called dot diffusion, appears to avoid some deficiencies of other commonly used techniques. It requires approximately the same total number of arithmetic operations as the Floyd-Steinberg method of adaptive grayscale, and it is well suited to parallel computation; but it requires more buffers and more complex program logic than other methods when implemented sequentially. A "smooth" variant of the method may prove to be useful in high-resolution printing.
For Correspondence: Computer Science Department, Stanford University, Stanford, CA 94305-3068.
Geometric Approaches to Nonplanar Quadric Surface Intersection
James R. Miller.
Quadric surfaces occur frequently in the design of discrete piece parts in mechanical CAD/CAM. Solid modeling systems based on quadric surfaces must be able to represent intersection curves parametrically and in a fashion that allows the underlying surfaces to be partitioned. An algebraic approach originally developed by Levin meets these needs but is numerically sensitive and based on solutions to fourth-degree polynomial equations. In this paper we develop geometric approaches that are robust and efficient, and do not require solutions to polynomials of degree higher than 2.
For Correspondence: Computer Science Department, University of Kansas, Lawrence, KS 66045.
An Enhanced Treatment of Hidden Lines
Tomihisa Kamada and Satoru Kawai
A new display scheme is presented in which three-dimensional objects are represented with a much higher degree of user control of the line-drawing format than those in conventional schemes. In this scheme, the attribute of a line is determined not at the time of definition but after the viewing transformation has been applied, taking the shielding environment of the line into consideration. The essential difference between our scheme and conventional hidden-line removal/indication lies in the treatment of the shielding environment. In conventional hidden-line removal/indication, only the visible/invisible information is available for the selection of the attribute of a line. In our scheme, the entire set of surfaces that hide a line can be used to determine the attribute of the line. A much higher degree of freedom in generating picture is achieved owing to the treatment of the shielding environment. We also describe a system called GRIP, which implements the new scheme.
For Correspondence: Department of Information Science, Faculty of Science, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113 Japan.
ACM Transactions on Office Information Systems October 1987
Cooperating Knowledge-Based Assistants for the Office
A. Roger Kaye and Gerald M. Karam
This paper presents an approach to high-level support of office workers by embedding office knowledge in a network of distributed cooperating knowledge-based or expert "assistants" and servers. These knowledge-based systems incorporate both factual and procedural knowledge and are capable of making use of existing conventional office technology. They constitute a form of computer-supported cooperative work. We describe a common architecture for our assistants and servers that incorporates several key features. Our systems are capable of supporting concurrent multiple consultations or tasks and have facilities for the interruption and resumption of consultations as appropriate. The various assistants and servers, which may reside on different machines, cooperate in solving problems or completing tasks by passing messages. We propose a taxonomy of the general office knowledge normally used by office workers, together with a frame and rule-based knowledge representation scheme. We also describe an experimental system, written in PROLOG, that incorporates the above design principles.
For Correspondence: Department of Systems and Computer Engineering, Carleton University, Ottawa, Ont., Canada K1S 5B6.
Evolution of an Organizational Interface: The New Business
Department at a Large Insurance Firm
Andrew Clement and C. C. Gotlieb
This paper describes how the work organization and computer system of the New Business Department at a large life insurance firm have interacted and evolved over time. The dynamics of interaction are explained largely in terms of the economic incentive to reduce the length of transaction-processing chains and the more political goal of extending managerial control. It is argued that examining the interaction of organizations and computer systems can contribute to a better theoretical understanding of the development of large computer systems and offer guidance to designers of user-computer interfaces. A graphical technique for depicting organizational interfaces is presented.
For Correspondence: A. Clement, Department of Computer Science and Mathematics, Atkinson College, Work University, North York, Ont., Canada M3J 1P3; C.C. Gotlieb, Department of Computer Science, University of Toronto, Toronto, Ont., Canada M5S 1A4.
Strategies for Encouraging Successful Adoption of Office
Susan F. Ehrlich
The adoption of new computer communication systems into organizations requires behavioral change. Planning for successful adoption requires knowledge of individual organizational communication patterns and the relationship between those patterns and particular communication system solutions. This paper documents a sequence of studies of organizational communication. Needs for office communication systems were identified, as were social and psychological factors temporarily inhibiting their use. Strategies for assuring smooth adoption of such systems are highlighted.
For Correspondence: S. F. Ehrlich, Wang Laboratories-M/S 019-490, 1 Industrial Avenue, Lowell, MA 01851.
Behavioral Experiments on Handmarkings
John D. Gould and Josiane Salaun
Handmarkings or handwritten editing marks can be used as direct editing commands to an interactive computer system. Five exploratory experiments studied the potential value of handmarkings for editing text and pictures, as well as for some specific results. Circle are the most frequently used scoping mark, and arrows are the most frequently used operator and target indicators. Experimental comparisons showed that handmarkings have the potential to be faster than keyboards and mice for editing tasks. Their ultimate value will, however, depend on the style and details of their user-interface implementation.
For Correspondence: IBM Thomas J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY 10598; firstname.lastname@example.org
Augmenting Thesauri for Information Systems
Roy Rada and Brian K. Martin
A thesaurus can be a critical component of an office information system. Access to various sets of documents can be facilitated by thesauri and by the connections that are made among thesauri. In the projects described in this paper, the thesauri are stored and manipulated through a relational database management system. The system detects inheritance properties in a thesaurus and uses them to guide a human expert in decisions about how to augment the thesaurus. New strategies will extend our ability to augment existing thesauri.
For Correspondence: R. Rada, National Library of Medicine, Bethesda, MD 20894; B. K. Martin, 3420 Hinahina St., Honolulu, HI 96816.
Office-by-Example: An Integrated Office System and Database
Kyu-Young Whang, Art Ammann, Anthony Bolmarcich, Maria Hanrahan,
Guy Hochgesang, Kuan-Tsae Huang, Al Khorasani, Ravi Krishnamurthy,
Gary Sockut, Paula Sweeney, Vance Waddle, and Moshe Zloff
Office-by-Example (OBE) is an integrated office information system that has been under development at IBM Research. OBE, an extension of Query-by-Example, supports various office features such as database tables, word processing, electronic mail, graphics, images, and so forth. These seemingly heterogeneous features are integrated through a language feature called example elements. Applications involving example are processed by the database manager, an integrated part of the OBE system. In this paper we describe the facilities and architecture of the OBE system and discuss the techniques for integrating heterogeneous objects.
For Correspondence: K.-Y. Whang, A. Ammann, A. Bolmarcich, M. Hanrahan, G. Hochgesang, K.-T. Huang, G. Sockut, P. Sweeney, and V. Waddle, IBM Thomas J. Watson Research Center, P.O. Box 704, York-town Heights, N.Y. 10598; A. Khorasani, IBM, P.O. Box 790, Poughkeepsie, NY 12602; R. Krishnamurthy, M.C.C., 9430 Research Blvd., Austin, TX 78759; M. Zloff, M. M. Zloof, Inc., 186 Clinton Ave., Dobbs Ferry, NY 10522.
Journal of the Association for Computing Machinery January 1988
A Vertex-Allocation Theorem for Resources in Queuing Networks
Satish K. Tripathi and C. Murray Woodside
Abstract. A product-form queueing network with multiple open and closed chains is considered. Some of the closed chains, which have a single customer each, require allocation of resources in the network so as to maximize a weighted throughput performance criterion. Chains with more than one customer can be decomposed into many chains of one customer each. It is proved that an optimal allocation of resources lies on a vertex (extreme points) of the set of feasible allocations. This considerably reduces the search space for an optimal allocation. Applications of this result in distributed computing are discussed.
For Correspondence: S. K. Tripathi, Systems Design and Analysis Group, Department of Computer Science, University of Maryland, College Park, MD 20742; C. M. Woodside, Department of Systems and Computer Engineering, Carleton University, Ottawa, Ont., Canada.
Greatest Common Divisors of Polynomials Given by Straight-Line
Abstract. Algorithms on multivariate polynomials represented by straight-line programs are developed. First, it is shown that most algebraic algorithms can be probabilistically applied to data that are given by a straight-line computation. Testing such rational numeric data for zero, for instance, is facilitated by random evaluations modulo random prime numbers. Then, auxiliary algorithms that determine the coefficients of a multivariate polynomial in a single variable are constructed. The first main result is an algorithm that produces the greatest common divisor of the input polynomials, all in straight-line representation. The second result shows how to find a straight-line program for the reduced numerator and denominator from one for the corresponding rational function. Both the algorithm for that construction and the greatest common divisor algorithm are in random polynomial time for the usual coefficient fields and output a straight-line program, which with controllably high probability correctly determines the requested answer. The running times are polynomial functions in the binary input size, the input degrees as unary numbers, and the logarithm of the inverse of the failure probability. The algorithm for straight-line programs for the numerators and denominators of rational functions implies that every degree-bounded rational function can be computed fast in parallel, that is, in polynomial size and polylogarithmic depth.
For Correspondence: Department of Computer Science, Rensselaer Polytechnic Institute, Troy, N.Y. 12180-3590; Kaltofen@cs.rpi.edu (arpanet)
External Hashing with Limited Internal Storage
Gaston H. Gonnet and Per-Ake Larson
Abstract. The following problem is studied: How, and to what extent, can the retrieval speed of external hashing be improved by storing a small amount of extra information in internal storage? Several algorithms that guarantee retrieval in one access are developed and analyzed. In the first part of the paper, a restricted class of algorithms is studied, and a lower bound on the amount of extra storage is derived. An algorithm that achieves this bound, up to a constant difference, is also given. In the second part of the paper a number of restrictions are relaxed and several more practical algorithms are developed and analyzed. The last one, in particular, is very simple and efficient, allowing retrieval in one access using only a fixed number of bits of extra internal storage per bucket. The amount of extra internal storage depends on several factors, but it is typically very small: only a fraction of a bit per record stored. The cost of inserting a record is also analyzed and found to be low. Taking all factors into account, this algorithm is highly competitive for applications requiring very fast retrieval.
For Correspondence: Department of Computer Science, University of Waterloo, Ont., Canada N2L 3G1.
Automating Program Analysis
Timothy Hickey and Jacques Cohen
Abstract. The first part of the paper shows that previous theoretical work on the semantics of probabilistic programs (Kozen) and on the correctness of performance annotated programs (Ramshaw) can be used to automate the average-case analysis of simple programs containing assignments, conditionals, and loops. A performance compiler has been developed using this theoretical foundation. The compiler is described, and it is shown that special cases of symbolic simplifications of formulas play a major role in rendering the system usable. The performance compiler generates a system of recurrence equations derived from a given program whose efficiency one wishes to analyze. This generation is always possible, but the problem of solving the resulting equations may be complex. The second part of the paper presents an original method that generalizes the previous approach and is applicable to functional programs that make use of recursion and complex data structures. Several examples are presented, including an analysis of binary tree sort. A key feature of the analysis of such programs is that distributions on complex data structures are represented using attributed probabilistic grammars.
For Correspondence: Computer Science Department, Ford Hall, Brandeis University, Waltham, MA 02254.
A Theory of Reliability in Database Systems
Abstract. Reliable concurrent processing of transactions in a database system is examined. Since serializability, the conventional concurrency control correctness criterion, is not adequate in the presence of common failures, another theory of correctness is proposed, involving the concepts of commit serializability, recoverability, and resiliency.
For Correspondence: Department of Computer Science, University of Toronto, Toronto, Ontario, Canada M5S 1A4.
On Conjunctive Queries Containing Inequalities
Abstract. Conjunctive queries are generalized so that inequality comparisons can be made between elements of the query. Algorithms for containment and equivalence of such "inequality queries" are given, under the assumption that the data domains are dense and totally ordered. In general, containment does not imply the existence of homomorphisms (containment mappings), but the homomorphism property does exist for subclasses of inequality queries. A minimization algorith is defined using the equivalence algorithm. It is first shown that the constants appearing in a query can be divided into "essential" and "nonessential" subgroups. The minimum query can be nondeterministically guessed using only the essential constants of the original query.
For Correspondence: This work was supported in part by National Science Foundation grant MCS 81-02864. It was being refereed at the time of Professor Klug's untimely death and has been revised for publication by D. S. Johnson and M. Yannakakis.
Optimal Design and Use of Retry in Fault-Tolerant Computer
Yann-Hang Lee and Kang G. Shin
Abstract. In this paper, a new method is presented for (i) determining an optimal retry policy and (ii) using retry for fault characterization, which is defined as classification of the fault type and determination of fault durations. First, an optimal retry policy is derived for a given fault characteristic, which determines the maximum allowable retry durations so as to minimize the total task completion time. Then, the combined fault characterization and retry decision, in which the characteristics of a fault is estimated simultaneously with the determination of the optimal retry policy, are carried out. Two solution approaches are developed: one is based on point estimation and the other on Bayes sequential decision analysis.
Numerical examples are presented in which all the durations associated with faults (i.e., active, benign, and interfailure durations) have monotone hazard rate functions (e.g., exponential Weibull and gamma distributions). These are sandard distributions commonly used for modeling and analyses of faults.
For Correspondece: Y.-H. Lee, IBM Thomas J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY 10598; K. G. Shin, Division of Computer Science and Engineering, Dpartment of Electrical Engineering and Computer Science, the University of Michigan, Ann Arbor, MI 48109. All corespondence should be addressed to Prof. Kang G. Shin at the above address.
Equivalence and Optimazation of Relational Transactions
Serge Abiteboul and Victor Vianu
Abstract. A large class of relational database update transactions is investigated with respect to equivalence and optimization. The transactions are straight-line programs with inserts, deletes, and modifications using simple selection conditions. Several basis results are obtained. It is show that transaction equivalence can be decided in polynomial time. A number of optimality criteria for transactions are then proposed, as well as two normal forms. Polynomial-time algorithms for transaction optimization and normalization are exhibited. Also, an intuitively appealing system of axioms for proving transaction equivalence is introduced. Finally, a simple, natural subclass of transactions, called strongly acyclic, is shown to have particularly desirable properties.
For Correspondence: S. Abiteboul, Institut National de Recherche en Informatique et an Automatique, 78153 Le Chesnay, France, and V. Vianu, Department of Computer Science and Engineering, Mail Code C-014, University of California, San Digeo, La Jolla, Calif., 92093.
Abstract. Many-sorted unification is considered; that is, unification in the many-sorted free algebras of terms, where variables, as well as the domains and ranges of functions, are restricted to certain subsets of the universe, given as a potentially infinite hierarchy of sorts. It is shown that complete and minimal sets of unifiers may not always exit for many-sorted unification. Conditions for sort hierarchies that are equivalent for the existence of these sets with one, finitely many, or infinitely many elements are presented. It is also proved that being a forest-structured sort hierarchy is a necessary and sufficient criterion for the Robinson Unification Theorem to hold for many-sorted unification. An algorithm for many-sorted unification is given.
For Correspondence: Universitat Karlsuhe, Institut fur Informatik I, Postfach 6980, D 7500 Karlsruhe 1, Federal Republic of Germany.
The Complexity of Searching a Graph
N. Megiddo, S. L. Hakimi, M. R. Garey, D. S. Johnson, and
C. H. Papadimitriou
Abstract. T. Parsons originally proposed and studied the following pursuit-evasion problem on graphs: Members of a team of searchers traverse the edges of a graph G in pursuit of a fugitive, who moves along the edges of the graph with complete knowledge of the locations of the pursuers. What is the smallest number s(G) of searchers that will suffice for guaranteeing capture of the fugitive? It is shown that determining whether s(G) [is less than or =] K, for a given integer K, is NP-complete for general graphs but can be solved in linear time for trees. We also provide a structural characterization of those graphs G with s(g) [is less than or =] for K = 1, 2, 3.
For Correspondence: N. Megiddo, Dept, K53, The IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120-6099; S. L. Hakimi, Department of Electrical and Computer Engineering, University of California at Davis, Davis, Ca 95616; M. R. Garey, Room 2-D152, AT&T Bell Laboraties, Murray Hill, NJ 07974; D. S. Johnson, Room 2D-150, AT&T Bell Laboratories, Murray Hill, NJ 07974; and C. H. Papadimitriou, Department of Computer Science, University of California, San Diego, La Jolla, Ca 92093.
ACM Transactions on Database Systems March 88
A Dynamic Framework for Object Projection Views
User views in a relational database obtained through a single projection ("projection views") are considered in a new framework. Specifically, such views, where each tuple in the view represents an object ("object-projection views"), are studied using the dynamic relational model, which captures the evolution of the database through consecutive updates. Attribute sets that yield object-projection views are characterized using the static and dynamic funcitonal dependencies satisfied by the database. Object-projection views are then described using the static and dynamic functional dependencies "inherited" from the original database. Finally, the impact of dynamic constraints on the view update problem is studied in a limited context. This paper demonstrates that new, useful information about views can be obtained by looking at the evolution of the database as captured by the dynamic relational model.
For Correspondence: Department of Comuter Science and Engineering, Mail Code C-014, University of California, San Diego, La Jolla, CA 92093.
Timos K. Sellis
Some recently proposed extensions to relational database systems, as well as to deductive database systems, require support for multiple-query processing. For example, in a database system enhanced with inference capabilities, a simple query involving a rule with multiple definitions may expand to more than one actual query that has to be run over the database. It is an interesting problem then to come up with algorithms that process these queries together instead of one query at a time. The main motivation for performing such an interquery optimization lies in the fact that queries may share common data. We examine the problem of multiple-query optimization in this paper. The first major contribution of the paper is a systematic look at the problem, along with the presentation and analysis of algorithms that can be used for multiple-query optimization. The second contribution lies in the presentation of experimental results. Our results show that using multiple-query processing algorithms may reduce execution cost considerably.
For Correspondence: Department of Computer Science, University of Maryland, College Park, MD, 20742.
Physical Database DEsign for Relational Databases
S. Finkelstein, M. Schkolnick, and P. Tiberio
This paper describes the concepts used in the implementation of DBDSGN, an experimental physical design tool for relational databases developed at the IBM San Jose Research Laboratory. Given a workload for System R (consisting of a set of SQL statements and their execution frequencies), DBDSGN suggest physical configurations for efficient performance. Each configuration consists of a set of indices and an ordering for each table. Workload statements are evaluated only for atomic configurations of indices, which have only one index per table. Costs for any configuration can be obtained from those of the atomic configurations. DBDSGN uses information supplied by the System R optimizer both to determine which columns might be worth indexing and to obtain estimates of the cost of executing statements in different configurations. The tool finds efficient solutions to the index-selection problem; if we assume the cost estimates supplied by the optimizer are the actual execution costs, it finds the optimal solution. Optionally, heuristics can be used to reduce execution time. The approach taken by DBDSGN in solving the index-selection problem for multiple-table statements significantly reduces the complexity of the problem. DBDSGN's principles were used in the Relational Design Tool (RDT), an IBM product based on DBDSGN, which performs design for SQL/DS, a relational system based on System R. System R actually uses DBDSGN's suggested solutions as the tool expects because cost estimates and other necessary information can be obtained from System R using a new SQL statement, the EXPLAIN statement. This illustrates how a system can export a model of its internal assumptions and behavior so that other systems (such as tools) can share this model.
For Correspondence: S. Finkelstein, Department K55/801, IBM Almaden Research Center, 650 Harry Road, San Jose, CA 95120-6099; M. Schkolnick, IBM Thomas J. Watson Research Center, P.O. Box 704, Yorktown Heights, NY 10598; P. Tiberio, Dipartimento di Elettronica, Informatica e Sistemistica, University of Bologna, Viale Risorgimento 2, Bologna 40100, Italy.
Concurrent Search Structure Algorithms
Dennis Shasha and Nathan Goodman
A dictionary is an abstract data type supportin the acitons member, insert, and delete. A search structure is a data structure used to implement a dictionary. Examples include B trees, hash structures, and unordered lists. Concurrent algorithms on search structures can achieve more parallelism than standard concurrency control methods would suggest, by exploiting the fact that many different search structure states represent one dictionary state. We present a framework for verifying such algorithms and for inventing new ones. We give several examples, one of which exploits the structure of Banyan family interconnection networks. We also discuss the interaction between concurrency control and recovery as applied to search structures.
For Correspondence: D. Shasha, courant Institute of Mathematical Sciences, New York University, New York, NY 10012; N. Goodman, Kendall Square Research Corp., 1 Kendall Square, Bldg. 300, Cambridge, Ma 02139.
ACM Transactions on Computer Systems February 1988
Managing Stored Voice in the Etherphone System
Douglas B. Terry and Daniel C. Swinehart
The voice manager in the Etherphone system provides facilities for recording, editing, and playing stored voice in a distributed personal-computing environment. It provides the basis for applications such as voice mail, annotation of multimedia documents, and voice editing using standard text-editing techniques. To facilitate sharing, the voice manager stores voice on a special voice file server that is accessible via the local internet. Operations for editing a passage of recorded voice simply build persistent data structures to represent the edited voice. These data structures, implementing an abstraction caled voice ropes, are stored in a server database and consist of lists of intervals within voice files. Clients refer to voice ropes solely by reference. Interests, additional persistent data structures maintained by the server, serve two purposes: Frist, they provide a sort of directory service for managing the voice ropes that have been created. More importantly, they provide a reliable reference-counting mechanism, permitting the garbage collection of voice ropes that are no longer needed. These interests are grouped into classes; for some important classes, obsolete interests can be detected and deleted by a class-specific algorithm that runs periodically.
For Correspondence: Computer Science Laboratory, Xerox Palo Alto Research Center, 3333 Coyote Hill Road, Palo Alto, CA 94304.
801 Storage: Architecture and Programming
Albert Chang and Mark F. Mergen
Based on novel architecture, the 801 minicomputer project has developed a low-level storage manager that can significantly simplify storage programming in subsystems and applications. The storage manager embodies three ideas: (1) large virtual storage, to contain all temporary data and permanent files for the active programs; (2) the innovation of database storage, which has implicit properties of access serializability and atomic update, similar to those of database transaction systems; and (3) access to all storage, including files, bu the usual operations and types of a high-level programming language.
The IBM RT PC implements the hardware architecture necessary for these storage facilities in its storage controller (MMU). The storage manager and language elements required, as well as subsystems and applications that use them, have been implemented and studied in a prototype operating system called CPR, that runs on the RT PC. Low cost and good performance are achieved in both hardware and software. The design is intended to be extensible across a wide performance/cost spectrum.
For Correspondence: IBM T. J. Watson Research Center, Yorktown Heights, NY 10598.
Scale and Performance in a Distributed File System
John H. Howard, Michael L. Kazar, Sherri G. Menees, David A. Nichols,
M. Satyanarayanan, Robert N. Sidebotham, and Michael J. West
The Andrew File System is a location-transparent distributed file system that will eventually span more than 5000 workstations at Carnegie Mellon University. Large scale affects performance and complicates systems operation. In this paper we present observations of a prototype implementation, motivate changes in the areas of cache validation, server process structure, name translation, and low-level storage representation, and quantitatively demonstrate Andrew's ability to scale gracefully. We establish the importance of whole-file transfer and caching in Andrew by comparing its performance with taht of Sun Microsystem's NFS file system. We also show how the aggregation of files into volumes improves the operability of the system.
For Correspondence: M. Satyanarayanan, Department of Computer Science, Carnegie Mellon University, Pittsburg, PA 15213.
Recovery Management in QuickSilver
Roger Haskin, Yoni Malachi, Wayne Sawdon, and Gregory Chan
This paper describes QuickSilver, developed at the IBM Almaden Research Center, which uses atomic transactions as a unified failure recovery mechanism for a client-server structure distributed system. Transactions allow failure atomicity for related activities at a single server or at a number of independent servers. Rather than bundling transaction management into a dedicated language or recoverable object manager, QuickSilver exposes the basic commit protocol and log recovery primitives, allowing clients and servers to tailor their recovery techniques to their specific needs. Servers can implement their own log recovery protocols rather than being required to use a system-defined protocol. These decisions allow servers to make their own choices to balance simplicity, efficiency, and recoverability.
For Correspondence: IBM Almaden Research Center, 650 Harry Road, San jose, CA 95120-6099.
Fine-Grained Mobility in the Emerald System
Eric Jul, Henry Levy, Norman Hutchinson, and Andrew Black
Emerald in an object-based language and system designed for the construction of distributed programs. An explicit goal of Emerald is support for object mobility; object in Emerald can freely move within the system to take advantage of distribution and dynamically changing environments. We say that Emerald has fine-grained mobility because Emerald objects can be small data objects as well as process objects. Fine-grained mobility allows us to apply mobility in new ways but presents implementation problems as well. This paper discusses the benefits of fine-grained mobility, the Emerald language and run-time mechanisms that support mobility, and techniques for implementing mobility that do not degrade the performance of local operations. Performance measurements of the current implementation are included.
For Correspondence: E. Jul, DIKU, Dept of Computer Science, University of Copenhagen, Universitetparke 1, DK-2100 Copenhagen, Denmark; H. Levy, University of Washington, Dept. of Computer Science, FR-35, Seattle, WA 98195; N. Hutchinson, Dept. of Computer Science, University of Arizona, Tucson, AZ 85721; A. Black, Digital Equipment Corporation, 550 King St., Littleton, MA 01460.
Caching in the Sprite Network File System
Michael N. Nelson, Brent B. Welch, and John K. Ousterhout
The Sprite network operating system uses large main-memory disk block caches to achieve high performance in its file system. It provides non-write-through file caching on both client and server machines. A simple cache consistency mechanism permits files to be share by multiple clients withou danger of stale data. In order to allow the file cache to occupy as much memory as possible, the filed system of each machine negotiates with the virtual memory system over physical memory usage and changes the size of the file cache dynamically. Benchmark programs indicate that client caches allow diskless Sprite workstations to perfrom within 0-12 percent of workstations with disks. In addition, client caching reduces server loading by 50 percent and network traffic by 90 percent.
For Correcpondence: Computer Science Division, University of California at Berkeley, 571 Evans Hall, Berkeley, CA 94720.
ACM Transactions on Programming Languages and Systems January 1988
Incremental Data-flow Analysis Algorithms
Barbara G. Ryder and Marvin C. Paull
An incremental update algorithm modifies the solution of a problem that has been changed, rather than re-solving the entire problem. ACINCF and ACINCB are incremental update algorithms for forward and backward data-flow analysis, respectively, based on our equations model of Allen-Cocke interval analysis. In addition, we have studied their performance on a "nontoy" structured programming language L. Given a set of localized program changes in a program written in L, we identify a priori the nodes in its flow graph whose corresponding data-flow equations may be affected by the changes. We characterize these possibly affected nodes by their corresponding program structures and their relation to the original change sites, and do so without actually performing the incremental updates. Our results can be refined to characterize the reduced equations possibly affected is structured loop exit mechanisms are used, either singly or together, thereby relating richness of programming-language usage to the ease of incremental updating.
For Correspondence: Department of computer Science, Rutgers University, New Brunswick, NJ 08903.
An Overview of the SR Language and Implementation
Gregory R. Andrews, Ronald A. Olsson, Michael Coffin, Irving Elshoff, Kelvin
Nilsen, Titus Purdin, and Gregg Townsend
SR is a language for programming distributed systems rainging from operating systems to application programs. On the basis of our experience with the initial version, the language has evolved considerably. In this paper we describe the current version of SR and give an overview of its implementation. The main language constructs are still resources and operations. Resources encapsulate processes and variables that they share; operations provide the primary mechanism for process interaction. One way in which SR has changed is that both resources and processes are now created dynamically. Anoither change is that inheritance is supported. A third change is that the mechanisms for operation invocation--call and send--and operation--proc and in--have been extended and integrated. Consequently, all of local and remote procedure call, rendezvous, dynamic process creation, asynchronous message passing, multicast, and semaphores are supported. We have found this flexibility to be very useful for distributed programming. Moreover, by basing SR on a small number of well-integrated concepts, the language has proved easy to learn and use, and it has a reasonably efficient implementation.
For Correspondence: Gregory R. Andrews, Michael Coffin, Irving Elshoff, Kevin Nilsen, and Gregg Townsend, Department of Computer Science, The University of Arizona, Tucson, AZ 85721; Ronald A. Olsson, Division of Computer Science, The University of California at Davis, CA 95616; Titus Purdin, Department of Computer Science, Colorado State University, Fort Collins, CO 80523.
A Mathematical Approach to Nondeterminism in Data Types
Wim H. Hesselink
The theory of abstract data types is generalized to the case of nondeterministic operations (set-valued functions). Since the nondeterminism of operations may be coupled, signatures are extended so that operations can have results in Cartesian products. Input/output behavior is used to characterize implementation of one model by another. It is described by means of accumulated arrows, which form a generalization of the term algebra. Morphsms of nondeterministic models are introduced. Both innovations prove to be powerful tools in the analysis of input/output behavior. Extraction equivalence and observable equivalence of values are investigated. Quotient models for such equivalence relations are constructed. The equivalence relations are compared with each other, with separation of values by means of experiments, and with the separation property that characterizes a terminal model. Examples are given to show that the four concepts are different. In deterministic models the concepts coinside.
For Correspondence: University of Groningen, Department of Methamatics and Computing Science, P.O. Box, 9700AV Groningen, The Netherlands.
Specification and Verification of Liveness Properties of Cyclic,
Joylyn Reed and Raymond T. Yeh
A technique is described for software specification and verification of concurrent, distributed systems. The complete specification of a program is given in terms of a hierarchical structure of module specifications. Module external specifications are abstract; module internal specifications are descriptions of internal implementations, either in terms of submodules or actual code. The verfication that an implementation satisfies its specification is language independent for the former and language dependent for the latter. Distinguishing the liveness properties provided by a module and the liveness properties required by a module (from its comodules) allows the specification and verification of a given module to be independent from the specification and verification of its comodules.
For Correspondences: J. Reed, Oxford University Computing Laboratory, Programming Research Group, 8-11 Keble Road, Oxford OX1 3QD, United Kingdom: R. T. Yeh, International Software Systems Inc., 9420 Research Blvd., Echelon III, Suite 110, Austin, TX 78579.
Systems Semantics: Principles, Applications, and Implementation
Raymond T. Boute
Systems semantics extends the denotational semantics of programming languages to a semantics for the description of arbitrary systems, including objects that are not computations in any sense. By defining different meaning functions, the same formal description may be used to denote different system properties, such as structure, behavior, component cost, and performance aspects (e.g., timing).
The definition of these semantic functions also provides guiidance in language design, in particular for the match between language constructs and the system concepts to be expressed. Aiming at compositionality ensures useful properties for formal manipulation. In this fashion, the meaning functions can be made sufficiently simple to serve not only as a direct implementation on a machine but also as rules for reasoning about systems in a transformational manner. As the applications show, however, compositionality can be ensured only through careful consideration of the characteristics of the flow of information inside the system.
Two classes of application are discussed:
--Unidirectional systems, in particular digital systems, without feedback (combinational) and with feedback (sequential), and a certain class of analog systems.
--Nounidirectional systems, in particular two-port analog networks. The emphasis will be on the functional style of description and on formal reasonong (theorem proving, derivation of properties).
Implementation and rapid prototyping strategies in various system description environments are also briefly discussed. These would permit the concepts of system semantics to be explored without the need for a complete implementation.
For Correspondence: Department of Computer Science, university of Nijmegen, Toernooveld 1, 6525 ED Nijmegen, The Netherlands.
Technical correspondence: A Note on the Drinking Philosphers
Sandra L. Murphy and A. Udaya Shankar
A nonprobabilistic solution with symmetric rules exists to the drinking philosophers problem. The solution is fair, given an implementation which eventually executes any guarded command that is infinitely often enabled. We present modified versions of the solution that are fair under a weaker implemenation: namely, one which eventually executes any guarded command that is continuously enabled. In order for the modified algorithms not to have increased delays in satisfying requests for resources, we fined it necessary to either violate a nice read-only property of the existing solution or restrict the right of a philosopher to request resources.
For Correspondence: Department of Computer Science, University of Maryland, College Park, MD 20742.