Printer Friendly

Extensions to Starburst: objects, types, functions, and rules.

Database management system (DBMS) technology has made enormous strides over the last three decades, not only in performance--facilitated by impressive advances in the processors and storage devices of computer systems--but also in the functionality of DBMS software.

Relational DBMS, prototyped in the 1970s and now common in the marketplace, gained popularity largely because of their simple underlying concepts and high-level set-oriented query languages that simplified application development. As the traditional recordkeeping applications of DBMSs (e.g., banking, insurance, personnel) have become more sophisticated, and as nontraditional applications (such as engineering databases) have attempted to adapt off-the-shelf DBMSs, users have been frustrated by the difficulty and poor performance of adding new features "on top of" commercial DBMSs, i.e., in the applications rather than the DBMS itself. Yet their demands for new and increasingly complex features to be built into the DBMS have overwhelmed the capacity of DBMS vendors to enhance their already complex products. Influenced by advances in modern programming languages, users increasingly expect database languages to offer richer data types, user-defined functions, recursion, etc. Prompted by the cost-effectiveness and performance of workstations and networks, distributed databases are becoming more common. And experimentation with Expert Systems has stimulated interest in making the database a "knowledge base," which can store rules that control the data, react to changes in the data, and derive new information from it. To complicate matters, many applications need only a few enhancements, yet share their data with other applications needing other features.

The Starburst project at the IBM Almaden Research Center in San Jose, California was initiated in 1985 to address these problems. Its goal is to build from scratch an extensible DBMS prototype that will both (1) facilitate enhancing DBMSs with the functionality to support new applications efficiently, especially those not currently served well by commercially available DBMSs; and (2) provide a flexible testbed for our own ongoing research and experiments in new DBMS technology. Starburst evolved from our own firsthand experience in extending a system not originally designed for extensibility: during the early 1980s, we successfully adapted a version of the original System R prototype [19] to create a distributed relational DBMS prototype, called R * [40, 44, 70]. The lesson was clear: extensibility cannot be retrofitted; it must be a fundamental goal and permeate every aspect of the design.

A basic tenet of the Starburst extensible DBMS design is that the conflicting goals of application-specific facilities and information integration can best be satisfied by DBMSs that support the addition of domain-specific extensions in the context of a common data model. This hypothesis distinguishes Starburst from other work in extensible DBMSs. Both the GENESIS [10-12] and the EXODUS [15, 16] projects have built automated DBMS toolkits or generators for building a DBMS with a customized data model, storage system, query processing algorithms, etc., that best suit a particular domain. However, this approach may make it difficult to share and integrate data from several diverse applications. For example, in computer-integrated manufacturing, the goal is to integrate design drawings, textual documentation, real-time manufacturing information, inventory and financial data, and many other diverse data sources. Other projects in DBMS extensibility have based their work on alternative data models such as nested relational [23, 58], object-oriented [6, 37, 46], or functional [26]. Basing Starburst on the relational model and on extensions of a "standard" database access language allows us to exploit much of the proven relational DBMS technology and facilitates porting existing applications to Starburst. In this respect, Starburst resembles the POSTGRES [61, 63, 64] and SABRE [2] systems. However, Starburst's goal of building a complete DBMS that is extensible at every level--data storage methods, access methods, execution strategies, query optimization, query analysis, query language, . . .--goes beyond the goals of these systems. Naturally, some of the more fundamental of these extensions would have to be made by Database Customizers (DBCs)--experts knowledgeable in both database technology and the application domain that needed th extension.

For portability, we wrote Starburst in C. We started development of Starburst on IBM PC/AT machines, but we soon proved Starburst's portability by migrating to the AIX operating system (IBM's version of Unix V (1)) on RT/PC computers, IBM's first RISC-technology workstations. Recently, we have ported Starburst to the new RISC System/6000 computers, and have begun using C++. While, our development environment is a local-area network of workstations, it has not been a primary goal of this work to investigate client-server architectures or heterogeneous distributed DBMS.

Currently, Starburst executes all the standard SQL data definition and data manipulation commands, including SELECT, INSERT, DELETE, UPDATE, UNION ALL, and INTERSECT (2), as well as several extensions we have made to SQL. A great deal of hidden prerequisite infrastructure in query processing, concurrency control (Starburst supports multiple concurrent users), standard transactions and recovery [48, 56] (3), system catalogs, etc., is implemented and operational. In addition, all of the extensions described in this article are working and demonstrable, except for our extensible type system and the application program interface for structured complex objects (XNF). User-defined functions must be linked (statically) with the rest of Starburst. In the near future we plan to define and dynamically link new functions without having to relink and reboot. We are in the process of selecting a few external users to help us define requirements for specific extensions needed by a range of real applications. A prerequisite to making Starburst available to potential extenders is better documentation, more system testing and 'bullet-proofing', and a much more sophisticated application program interface that will permit storage of compiled SQL queries, dependence of ose queries on base objects such as tables, and invalidation of the compilation when any of its dependent objects change.

This article gives an overview of some of the extensions we have developed for Starburst to support hierarchies of user-defined types and functions, large unstructured and structured complex objects, and user-defined rules to respond to changes in the database. A description of how we made Starburst extensible and the details of query processing are beyond the scope of this article and can be found elsewhere [30-32, 34, 38, 42, 43, 52, 60]. The next section explains why we close to extend the relational model rather than take an object-oriented approach, and gives a quick overview of Starburst's extensibility mechanisms used by the extensions summarized in this article. The section immediately following, "User-defined Types and Functions" describes the Starburst type system and user-defined functions. "Complex Objects" details our approach to storing large, complex objects, either as unstructured 'long fields,' or as structured views comprised of data stored as tables. A mechanism that facilitates the constructure of these objects by speeding up parent-child is joins is also included in that section. The section "Active Database--User-defined Rules" summarizes two systems implemented in Starburst that permit users to define rules to react to specified changes in data or other events. The section entitled "Performance" provides some initial measurements of Starburst's execution performance. The final section of this article presents our conclusions and future direction.

Philosophy and Background

To fully appreciate the new features of Starburst described in the remainder of this article, one must first understand why we chose to build an extended relational DBMS rather than an object-oriented DBMS, as well as some of the extensibility mechanisms of the Starburst kernel used to implement those features.

Evolution vs. Revolution

The central thesis of Starburst, and of this article is similar to that articulated by the "Third-Generation Data Base System Manifesto" [22]:

One can provide the desirable enhancements of Object-Oriented, Logic, Deductive, Active, and other DBMS technologies, while still retaining all the strengths of a relational DBMS.

When we began developing Starburst, we considered and consciously rejected a revolutionary departure from the relational model, once we convinced ourselves that we could define efficient extensions to a relational DBMS that would support the novel features that applications required. Users of relational DBMSs have increasingly recognized the benefits of concentrating the shared aspects of data in the DBMS, rather than redundantly (and possibly inconsistently) including those aspects in every application that accesses that data. They have demanded that the DBMS provide more of the application-specific semantics and behavior that is normally associated with that data, including richer types, functions on those types, stronger constraints on the values those types may assume, and actions that may be triggered when those types are modified in certain ways (active database) [41].

In Starburst, we have tried to incorporate the best features of many database technologies as extensions to a relational kernel. Hence all of the important advances of relational DBMSs are retained: a sold theoretical foundation, a declarative query language (an extended form of SQL) that manipulates sets of objects and can be optimized automatically by the system, views (virtual data defined by queries), and the derivation of new and unanticipated implicit relationships between objects--based only upon the values of their attributes--via the relational JOIN operation. By extending SQL, we gain compatibility with an international standard, existing tools, and existing databases. The extensibility of this kernel has allowed us to augment the relational model with enhancements from other database paradigms.

From the object-oriented world, we have drawn a richer type system, enhanced performance using (system-maintained) pointers to related objects, encapsulation of behavior with the data, and object identifiers for stored objects (to facilitate updates and deletes to views--see "Semantics--XNF"). As in object-oriented DBMSs, Starburst allows users to create new types, to relate these types in a hierarchy, to define new functions on these types or inherit them from another type in the hierarchy, and to encapsulate those functions with the type.

From Logic and Deductive DBMSs (e.g., NAIL! [65] and LDL [21]) and Expert Systems, we are adding general recursion to SQL (which, together with SQL's functions, makes our extended SQL a superset of Datalog) and have implemented user-defined production rules that can trigger actions in response to database changes (active database). Since these rules can themselves modify the database, thus triggering other rules, new information (called intensional data in Deductive databases) can be derived from the facts stored in the database (extensional data) in a form of reasoning called forward chaining in the Artificial Intelligence literature.

Many of these new extensions to Starburst benefit traditional commercial DBMS applications as well as new applications [22]. For example, recursive queries are needed for assembling a parts explosion from a 'is-a-component-of' table. All applications benefit from associating more semantic meaning and behavior with application-specific types. For example, distinguishing a 'money' type from other real types such as 'percentages' allows the detection of meaningless relational joins--and other operations--between semantically disjoint domains that a limited type system must lump together (e.g., as reals).

In summary, the extensible architecture of Starburst, which we review in the following subsection, has allowed us to incorporate the best features of existing DBMS technology, as well as many other useful extensions.

Overview of Starburst Architecture

In order to understand the various extensions we have made to Starburst, one must first be familiar with Starburst's overall architecture and some of its generic extensibility mechanisms. Our description is necessarily cursory. for more details, see [30-32, 60].

Extensibility at the lowest level of data storage ('Core') is provided by Starburst's Data Management Extension Architecture [42]. In this architecture, each table may be stored in any one of several storage methods. Example storage methods include heap files, nonrecoverable temporary files, and B+-tree files. Every storage method must provide a well-defined st of operations to create and destroy an instance of that storage method, to insert and delete a record, etc. Additionally, each table can have associated with it zero or more instances of each of the available attachment types, which are automatically notified whenever changes are made to that table. A simple example of an attachment type is an access path such as a B+-tree or signature index [20]. In addition to tracking changes to keep auxilliary data up-to-date, attachments are permitted to veto table modification operations. This makes it possible to use the attachment interface to support operations such as checking integrity constraints (e.g., the NOT NULL constraint in SQL). Both storage methods and attachments make use of certain common services, such as concurrency control, transaction management, recovery, predicate evaluation, and event queues. Core provides a number of standard event queues for events--such as preparing, committing, or rolling back a transaction; creating or dropping instances of tables or attachments, etc.--but users may add queues for any event and/or add push actions onto any queue.

In addition, we have made a number of extensions to the SQL language, to make it more generic and orthogonal, to remove a number of implementation-dependent restrictions, and to increase the power of the language [24, 25, 32]. As an example of increased orthogonality, we have implemented table expressions [24]. A table expression is an SQL query that creates a named table (like a view) that may be referenced only in the query that contains its definition (unlike a view, it is not stored in the system catalogs). Several of the extensions described in this article, such as the definition of complex objects and production rules, required other extensions to SQL.

Of course, extensions to Starburst at any level are only useful it the Starbust query processor ('Corona') is aware of them and is able to take advantage of them when appropriate. This challenging task is the subject of many earlier papers on Starburst's extensible query processing [30-32, 34, 38, 43, 52] and a description of it is omitted for brevity.

User-Defined Types and

Functions

One of the biggest limitations of traditional relational DBMS products is their fixed set of base data types and functions on those types. The designers of extended relational DBMSs [22, 54, 57] as well as object-oriented DBMSs have recognized the power of many concepts from modern programming languages, such as abstraction and encapsulation hierarchies of user-defined types and inheritance of functions on those types, and have endeavored to incorporate these concepts into database systems.

Starburst currently has five kinds of functions, which may be characterized by the metatypes of their input and output, as shown in Figure 1. The simplest, scalar functions, have one or more scalar arguments (e.g., a column of a rows in a table, or a constant) and produce a scalar. An example in existing SQL is the (infix) multiplication operator (*), and an example extension not in current SQL might be any of the trigonometric functions, square root, etc. An aggregate function operates on a table input (usually, an entire column over all rows in a table), and produces a scalar output. In SQL, there is an aggregate function for average (AVG), but not one (yet) for standard deviation. A set predicate function is a special case of an aggregate function that returns a Boolean value, and hence can be part of a predicate. SQL's nested subqueries are constructed of such set predicates, and we envision extensions that could generalize the current universal (ALL) and existential (ANY) quantifiers to include, say, MAJORITY. Any function producing a table as output is called a table function in Starburst. There are no examples in existing SQL of table functions having scalar inputs, but in Starburst, scalar inputs are permitted. The relational calculus operations of UNION and INTERSECT, and the relational algebra JOIN, provide examples of table functions having tables as inputs. Table functions provide a convenient mechanism for importing data from outside the database and presenting it as a virtual table (much like relational views), so that it can be subsetted, joined with other tables, and similarly manipulated by the full power of SQL, without actually having to store that data in the database. For example, we have implemented the 1s (list files) table function, whose parameter is the directory to be searched, and whose implementation executes the Unix 1s command on the directory and formats the output as a table. A useful example of a table-input table function is Sample, which returns a user-specified number of randomly selected rows of the input table.

Currently, adding a user-defined function is possible but a bit inconvenient: The code implementing any user-defined function must be linked with the rest of the Starburst system. We plan to exploit dynamic linking in AIX Version 3.1 so that users may dynamically register a function with Starburst, without having to relink or restart Starburst. We have yet to implement the data definition language for defining a new function to the system in this way.

Although the type system for Starburst is still under design, the goals and some preliminary ideas can be reported. The approach is a departure from an earlier paper on types in Starburst [71], and has been inspired by the extensible type systems of other systems, POSTGRES, for example [57]. Starburst will support an extensible, hierarchical type system in which all user-defined types can be encapsulated, that is, all access to the type's representation is restricted to 'owners' of the type. A type may inherit functions and components (columns) from other types by designating those types as its supertypes (called multiple inheritance). We also wish to check the type correctness of expressions when compiling the SQL statements in which they appear (called compile-time checking), since we feel that applications should not experience type errors during execution. Database type systems are distinguished from those of programming languages in two important aspects: (1) they are persistent, and hence must support in the incremental addition of new types and functions on new and existing types over time; and (2) because DBMSs are shared facilites, they must be able to interface their type system with that of multiple application programming languages, each having its own type system. The latter two goals significantly complicate a type system for a DBMS.

Our application interface includes a new DBMS command to evaluate expressions on Starburst types--including values that have been retrieved from Starburst tables by normal SQL queries and host language variables that have been converted to Starburst types. This approach keeps manipulation of Starburst types under Starburst control, and avoids having to convert Starburst types to application types and having to replicate the same functions on those types in every application environment(4). As with existing commercial DBMSs, a Starburst preprocessor recognizes each database command embedded in an application program and replaces it with calls to a compiled version in the database run-time library. SQL queries are compiled as before, but expression commands are also compiled by the Starburst expression evaluator into a program that is passed to--and executed by--a copy of the expression evaluator run-time system that runs within the environment of the application program. In this way, Starburst types retain their representation and behavior after being retrieved from the database, by restricting their manipulation to expressions that involve their database-registered functions and that are compiled and executed exclusively by Starburst components.

Functions can be overloaded, that is, the same function name is used for different implementations, depending upon the types of the function's arguments. Since subtype substitutability allows subtypes to be substituted for their supertypes at any point, the type system must select the correct implementation (body of code) to execute for any given set of actual arguments (called dispatch). Unlike programming languages employing the message-passing model, which dispatches based upon a single, distinguished argument, Starburst dispatches based upon the types of all arguments (this is sometimes called multimethod dispatch). A novel aspect of Starburst's multimethods is that they are typechecked at compile time [4].

Complex Objects

Many applications wishing to store large objects in relational DBMSs (e.g., CAD/CAM) have been frustrated by limitations and poor performance. Several extensions in Starburst obviate these problems, while retaining the benefits of a relational DBMS. We support two alternatives for storing complex objects: (1) The entire object can be stored in a 'long field', in which representation of the object is entirely under user control. This means that the database therefore has minimal ability to apply selection predicates based upon the contents of this 'long field' in a query. (2) Alternatively, the object's atomic components can be stored as rows in tables, and the object can be constructed by composing these rows using a new kind of relational view, an extension to Starburst called the eXtended Normal Form (XNF). We next describe these two extensions, and an attachment that expedites the creation of an object via XNF (or any join). All of these extensions are operational, except for the application interface to XNF.

UNStructured Objects

Most relational DBMSs allow the storage of fairly sizable atomic objects as long field columns. From the perpective of the DBMS, a long field contains an uninterpreted sequence of bytes (much like a file), i.e., the DBMS does not model the contents or structure of the long field, so queries cannot apply functions to its contents. IBM's DB2, SQL/DS, and OS/2 Extended Edition DBMS products can store objects up to 32K bytes in a long field (5). While adequate for many applications, these size limitations usually preclude the storage of moderately large objects such as images, voice, and video.

Starburst's Long Field Manager (LFM) stores objects up to 1.5 gigabytes long as though they are just another value in a column, from the user's perspective. This is preferable to simply storing such long byte sequences in the AIX file system, because the LFM: (1) participates in concurrency control, transactions, and recovery of changed data (6) (although at the granularity of the entire long field); (2) allows more of these objects (tens or hundreds of millions) than the number of files allowed in Unix; and (3) uses a number of Starburst extensions and implementation tricks to store and retrieve thes objects more efficiently than if they were stored via the file system. In reality, only a Long Field descriptor is stored in the column (see Figure 2). The descriptor contains a map of the object, a set of pointers to large, contiguous chunks of disk that are managed directly by LFM. By using the 'buddy system' and an ingenious growth algorithm to minimize the number of contiguous chunks needed to store an object of any size (from 1 page up to 1.5 gigabytes), and by bypassing the AIX file manager and its buffering, LFM avoids unnecessary copying of large objects, minimizes the number of head seeks, and maximizes the I/O rate [39].

The LFM is given a separate minidisk (i.e., a separate partition) secured from the AIX file system at Starburst installation time. Thereafter, Starburst will automatically create an attachment for any table having one or more long fields. The attachment is invoked whenever an SQLINSERT, DELETE, or UPDATE is made to that table. The attachment invokes LFM to allocate (or deallocate) the appropriate space from the LFM minidisk, to copy the given object into that space, and to construct the descriptor and store it as the "value" of the column. When a long field is the subject of a SELECT statement, a system-supplied function is called to fetch the object from disk and assemble it in a user-designated space, a buffer for display, etc. Long fields can be manipulated with other user-defined functions at about the same level as file system operations: extract or replace bytes selectively, copy a long field to a file, append bytes to a long field, etc.

Structured Objects

Often application need the efficiency of accessing an entire object through a single call to the database, but also want the database to model and manipulate its constituent subobjects as well. Typically an object has a hierarchical structure with repetition of components as well as sharing of components both within and between structured objects. This more common (and challenging!) definition of "complex objects" is addressed by Starburst's XNF, described next.

Semantics--XNF. A complex (or composite) object is defined to be a single object, possibly composed of many subobjects of different types, in some relationships to each other, with each subobject possibly having many instances within the object (repeating groups). For example, an automobile engine is composed of a carburetor, engine block, etc., and the engine block is further composed of several pistons, a couple of camshafts, and so on. There are various approaches to modeling complex objects. The navigational approach, followed by object-oriented systems, directly represents relationships between components with links or pointers that must be defined and managed by methods. Only relationships that are defined in the static schema can be exploited, and membership in a relationship must be explicitly specified. The nested relation approach (Non-First Normal Form of [NF.sup.2]) [1, 23, 58] places hierarchically related components with the parent component. Access to subcomponents must be via the parent. Like the navigational approach, only statically defined nestings are typically available. In addition, components can be shared between different parents only by listing foreign keys, which must be joined just as in normalized relational tables. In another somewhat more flexible relational approach [35, 45], the user must define special columns to assign an identifier to a row, to contain its parent's identifier, and to reference another object's identifier. Joins between parents and children are then aided by system-maintained maps stored at the object's root that related the identifier of each subobject to its location. A similar approach is taken by systems such as [O.sub.2] [6], ORION [8], and EXODUS [15], in which there are preferred, pointer-based paths, as well as the ability to define new relationships via intensional specifications. In Cactis [36], any attribute of an object can have attribute evaluation rules that described how that attribute may be derived from other objects, and which are (re-)evaluated when the object is referenced. POSTGRES embeds the QUEL query that defines an object in its parent record, possibly caching its result [64].

Our approach to supporting structured objects in Starburst responds to a number of very ambitious goals. We want to support sharing of objects and their components (i.e., any complex object may be a component shared by many complex objects, or it may be even be shared within the same complex object). We also wish to support recursive objects (i.e., an object may have subobjects that may have subbjects, arbitrarily deep, permitting modeling of part/subpart, cell/subcell, organizational hierarchies, and path expression queries). Finally, we seek to support complex object views [69], allowing many complex objects to be defined over the same shared data. Sharing is very important because it lets an object play many roles in relationships with other objects, without having to be replicated for each role. As well shall soon see, sharing subobjects is not precluded by value-based relational DBMSs, as claimed by the "Object-Oriented Database System Manifesto" [5] (7). Furthermore, we wanted to retain the benefits of the relational model: declarative and user-managed memberships in collections, associative queries over complex objects, and closure (so that queries on complex objects produce complex objects that can be nested in other queries or objects.

Support for object views leads us to deriving explicitly linked complex objects from flat tables using intensional (nonprocedural) specifications in the form of a specially structured query, an approach that is also compatible with the declarative spirit of the relational model. Starburst's eXtended Normal Form (XNF) is based upon the concept of a relational view. A view is a single virtual (i.e., intensional) table: its definition is stored in the database, but not the table that it produces. The definition is a query over stored (i.e., extensional) tables or other views. In XNF, an object is defined by an XNF view, which extends relational views from a single resulting table to multiple tables, organized as a structured, heterogeneous collection of interrelated rows. As with standard SQL views, XNF views are defined by a query, using an extended form of SQL to define the various row-valued elements of the complex object and to characterize the relationships between rows that give the result its structure. This intensional definition of objects is in stark contrast to the extensional (or stored) representations of the nevigational and nested relation approaches. The advantages of deriving complex structures lie in their ability to define multiple structures over the same stored information and their use of value-based specifications of intercomponent relationships. The creation of new logical structures does not depend on explicit relationships (i.e., pointers). Instead, value-based relationships are used to specify the structure of the (virtual) complex objects.

In order to define structured result queries, the Starburst query language must be extended. An XNF query is identified by the keywords OUT OF and consists of : 1) definitions of components tables (called 'nodes'), identified by the keyword SELECT; and 2) definitions of directed relationships (called 'edges') between component nodes (identified by the keywords RELATE <nodes> WHERE <predicates>. The nodes are defined using standard SQL (i.e., flat) queries, and can therefore consist of any sets of rows derivable from the database using relational expressions. Relationships are predicates which, given an element from the parent nokde, select zero or more elements from the child node. The resulting nodes and relationships form a directed graph. In the remainder of this article, we assume this is a connected graph having a distinguished root. The relationship predicates can make use of data from not only the parent and child node records but also from other tables. For example, many-to-many relationships are typically modeled in relational DBMSs using mapping tables that need not appear in the results of the query.

Consider the example object 'Dept [underscore] K55 [underscore] Obj' given in Figure 3. The corresponding syntax for defining this object is given in Figure 4. The nodes are formed by queries to the Departments, Employees, Projects, and Skills tables. The relationships, based upon standard SQL join predicates, establish for any given department the employees it 'Employs,' the projects it 'Has,' and the skills that either one of its employees 'Possesses' or one of its projects 'Needs,' or both. The graph of instances (rows) on the right illustrates how only those subobjects that are reachable from the root object via directed relationships will be included in the structured result, but may continue to exist quite independently in their home tables, even if they belong to no objects. For example, skill s2 remains a legitimate skill in the Skills table, even though it is not included in object d1 because no employee of d1 'Possesses' that skill and no project of d1 'Needs' it. Notice also how we can model that skill s3 is both 'Possessed' by employee e2 and 'Needed' by projects p1 and p2, but occurs only once in the database.

XNF objects may be combined, proejcted, and restricted. Two objects may be combined by defining relationships between any node of one object and the root of another. Projection of a structured object is defined by listing the nodes to be retained and combining relationships whenever internal nodes essential to connectivity of the relationship graph are excluded. Both nodes and relationships can be restricted by defining additional predicates on the existing nodes and relationships.

To process XNF queries, Starburst's extensible query processor also had to be extended. Its query rewriting mechanism [34] converts an XNF query to a standard internal representation (Query Graph Model) that can be optimized by Starburst's rule-based optimizerm using all of its known processing strategies [38, 43]. In particular, the optimizer will exploit any IMS attachments, which are discussed in the next subsection. Since XNF queries delimit declaratively the entirety of the defined object(s) to be retrieved, the Starburst optimizer has a more global purview and thus can optimize that query better than the instance-at-a-time dereferencing (following unidirectional links) of navigational systems [29]. For example, a selective predicate on the children might convince the optimizer to access the children first, reducing the number of parents that must be fetched.

For accessing these object structures from an application program, we envision two modes of access. Some applications require that complex objects be in a particular format (e.g., browsing through the complex object by pointer dereferencing in C++). In this case, a structure loader loads the data into the application buffer in the desired format. It is anticipated that multiple versions of the structure loader will be needed to support applications with requirements for specific formats.

Alternatively, a row-oriented mode of access allows the object's components to be accessed individually using dependent cursors. The root node of the object has an associated cursor, which can be OPENed and FETCHed in the usual way to produce successive instances of the root node. In addition, each child node will have a dependent cursor that is OPENed with each FETCH from its parent cursor. When FETCHed from, each dependent cursor provides the next instance of the child node that satisfies the internode relationship to the current row of the parent cursor. Using dependent cursors, the structured result can be traversed by an application which FETCHes all the entries from dependent cursors.

A further extension to the structure loader is to support coaching of the materialized result, to support fast access to objects that contain many components. Attachments or rules on the underlying tables could trigger invalidation of the cached results or propagate incremental changes to the cached result.

We feel quite confident that our intentional (nonprocedural) approach to the specification of complex, structured objects provides a useful mechanism for modeling constructs of a wide range of applications. However, updating of structured results is less well understood. Recently, we have implemented techniques for updating XNF views, including those containing joins [55]; similar techniques were independently developed in [9]. We had to add a rudimentary form of object identifiers to map updates and deletes of a view back to rows in its base tables. We are exploring this problem further in the context of CHECKIN/CHECKOUT support for complex objects.

Performance--the IMS attachment. The construction of complex objects, described in the previous subsection, as well as traditional queries, require joins that are very fast. We have implemented in Starburst an attachment, called Incrementally Matched Sets (IMS), that accelerates joins between parents and their children stored relationally in tables by exploiting system-maintained pointers and row clustering [17] (8). Note that an iMS attachment, like any index, is an auxiliary access mechanism that is not logically part of the data stored in tables. Its presecence or absence does not alter what or how queries can be specified, or the result to any query. Only the query execution performance is affected by the use of IMS's pointers to evaluate a join predicate.

The IMS attachment enforces referential integrity between two tables and can (optionally) cause rows of the child to be clustered together with their parent. To create a new IMS attachment instance, the user must specify the parent and child tables, the column names to be joined, the name of a unique index on the parent table's join column(s), and the clustering mode desired. The index on the parent table's join column(s) is required to locate a parent row efficently, during insert of a child row. IMS provides three clustering options: NONE, in which related rows are linked together but no clustering is enforced; CHILD clustering, in which IMS attempts to enforce clustering of all child rows having a common parent; and FULL clustering, in which IMS attempts to cluster a parent row and all of its associated child rows.

When a user creates a new instance of an IMS attachment, Starburst extends the schemas of the parent and child tables to include several columns that are hidden to the user and contain pointers to (row identifiers of) parent, child, or sibling rows. Figure 5 shows the pointer structure maintained by an example IMS attachment that would be useful for evaluating our XNF example, in which the Departments table is the parent table and the Employees table is the child table. IMS attachments can be defined between any two joinable tables, so a user could also define IMS attachments between Departments and Projects (although Departments can only be clustered one way!) and even, say, Departments and Divisions, creating a three-level hierarchy.

The job of an IMS attachment can be viewed as incrementally joining related records from a pair of tables when changes are made to either table. Because IMS enforces referential integrity, a new parent row cannot have any children in the database yet; therefore inserting a parent row is easy. Deletions of parents having any children, or updates of the parent's join column, are vetoed by the IMS attachment, because it would violate referential integrity. IMS attachment routines are called both before and after a new child is inserted. The preinsert routine uses the index of the parent's join column to find the row identifier for that row's parent, and from that parent its pointer to its first child. Depending upon the clustering mode, IMS may provide either of these as hints to the storage method as to where it should inser the new row. The postinsert routine then updates the pointers in the hidden columns, once the actual row identifier for the inserted row is returned by the storage method. Updating and deleting children proceed similarly.

Experiments on the performance of IMS attachments were conducted on the Wisconsin benchmark database, varying the number of children per parent, the clustering mode, and the execution plan (including a variation to sort accesses to children into physical order). The results are detailed in [17]. In general, the best plan exploiting the IMS attachment was faster than the best non-IMS plan by an average factor of two, because the direct pointers save the cost of traversing an index to find matching rows in the other table. However, results were less clear-cut when other parameters were varied. The plans sorting row identifiers were almost always better than comparable plans without sorting, in some cases by an order of magnitude (especially with no IMS-enforced clustering), but were worse when the sorting was redundant of IMS-enforced clustering. IMS-enforced clustering often improved performance, by up to six times over unclustered access, but actually degraded performance when the column on which IMS-enforced clustering excluded clustering on a column on which a selective predicate was given.

Active Database--User-Defined

Rules

Another example of how Starburst concentrates and encapsulates behavior with the idea in the database is the concept of an 'active database', which allow users to specify rules that invoke processing in response to specific changes to the data, regardless of what application makes those changes. An active database is very useful for automatically enforcing integrity constraints [18], maintaining derived data, monitoring trends, and building efficient knowledge-base systems.

The idea was originally suggested by [27], but did not gain a great deal of attention until recently. Other projects that have explored ways to implement active data include HiPAC [47], POSTGRES [62], and Ariel [33]. These facilities all differ from the Starburst approach in that their rule language and semantics is based substantially on OPS5 [14]; hence, their rules respond to operations on a single row. Unlike other rule systems, Starburst's user-defined rules respond to aggregate or cumulative changes to the database. This matches more closely the set-oriented paradigm of relational systems, and leads to cleaner and more natural semantics, because typically many rules may be triggered at any given point.

We have implemented two complementary rule systems that make a Starburst database 'active': a relationally oriented production rule system that efficiently monitors sets of changes to base tables, and an object-oriented system, called Alert, that monitors objects and the invocation of applications ('methods', in object-oriented terminology). We will describe both in turn.

Production Rules

Starburst's user-defined production rules [67, 68] may be defined on a stored table using an extended form of SQL. As with other rule systems, these rules have a trigger clause, a condition clause, and an action clause. The trigger clause specifies one or more of the SQL operations INSERT, DELETE, or UPDATE on the trigger table (identified by the keyword ON). If, at the end of a transaction, (9) any of these operations have occurred on the rule's trigger table, evaluation of this rule's condition clause is triggered. The rule's condition clause, preceded by the keyword IF, is any SQL query on the current database state and the transition tables, described below. If this query is non-empty, the condition of the rule is satisfied, and Starburst executes the rule's action clause, which is any sequence of database commands (10) preceded by the keyword THEN. Actions can suppress changes to the database (by aborting the transaction), or perform further modifications of the database, which may trigger the same or other rules. This allows rules to implement a forward-chaining behavior that permits the modifications made by rules to be processed by other rules. Once rules are defined, a user may temporarily DEACTIVATE them and re-ACTIVATE them at any time.

The condition and action clauses may optionally reference transition (pseudo-) tables, which contain changes to the rule's table that were made since the beginning of the transaction or the last time that rule was processed--whichever happened more recently. The transition table INSERTED (DELETED) contains records inserted into (deleted from) the trigger table. Transition tables NEW UPDATED and OLD UPDATED contain new and old values of updated rows, respectively. Transition tables are implemented by a combination of: (a) rule attachments that append to a log those changes relevant to the rules on a table, and (b) table functions that generate the appropriate transition tables from this log [66].

As an example, suppose we wish to enforce referential integrity on our Departments and Employees database. Then a rule to list and rollback the deletion of any departments still having any employees in them would be defined as shown in Figure 6.

The rule processor is invoked at transaction completion, and it triggers rules based upon the net effect of all changes made within that transaction. For example, a transaction that inserts row R into table T and then deletes R before committing will trigger neither the INSERT nor the DELETE rules on T. This unique feature of Starburst's rules allows transactions to temporarily violate database integrity constraints enforced by rules, e.g., dropping below a minimum balance, detecting the violation, and transferring funds to exceed the minimum force committing the transaction.

Rules may also contain RECEDES and FOLLOWS clauses to specify a partial order for rule processing, which the system extends to a total order [3]. If the condition of the highest priority rule is satisfied, its action is executed. Changes to the database made by statements of the action may trigger other (or the same) rules, which Starburst carefully accumulates for the transition tables. In particular, changes may be relevant to several rules, some of which have already been processed, while others have yet to be considered. For example, if rule R1 has been processed, and then the processing of rule R2 (re)-triggers rule R1, the transition tables for the subsequent processing of R1 should contain only the new changes, while the transition tables for rules not yet processed must contain all changes since the last transaction commit.

One important feature of the implementation of the Starburst production rules is that they are fully integrated with Starburst [66]. Rules are stored in system catalogs, and require the usual authorizations through Starburst's extensible authorization scheme [31]. The Starburst locking mechanism is used for concurrency control as rules are created, dropped, and altered. Any data modification made by rule actions are part of a transaction and may be rolled back. The implementation also made heavy use of Starburst's extensibility mechanisms: Whenever the first rule is defined on a table, the system automatically creates a rule attachment to maintain the transition log. A call to the rule system is pushed onto the 'prepare-to-commit' event queue when rules may be triggered. The transition tables are implemented as table functions that access the transition log. Using the Starburst extensibility mechanisms both facilitated the implementation and demonstrated the usefulness and flexibility of these mechanism [66].

Alert

Starburst's Alert trigger system [59] allow users to define rules at a higher level of abstraction, and with different semantics. An application-specific operation, or method, typically updates many tables in the database, possibly in a sequence of transactions. For example, suppose a user has defined a method for filing an expense claim for a trip, which might increment the number of trips an employee has made in the Employees table, decrease the remaining travel budget of his or her department by the expense amount, etc. Thus it is not immediately obvious on what table or tables a Starburst production rule should be defined.

Alert solves this dilemma by allowing a method on any application-specific object (including a view or XNF view (see the section "Complex Objects") to specify any event(s) that will trigger an Alert rule (e.g. invocation of the expense claim method), rather than being limited to the invocation of lowerlevel, built-in operations on a stored table.

To capture the method invocation events, authors of methods must create a special kind of append-only table, called an active table, which may be shared by multiple methods. The methods append rows to this table containing any information about their invocations that might be used by Alert rules.

An Alert rule is created by defining an SQL view, i.e., a named SQL query, in which the FROM and WHERE clause express the rule's condition, and the SELECT clause contains the rule's actions as a list of either user-defined function to invoke (thereby causing the desired side effects) or data to be returned. The latter allows an Alert rule to use the output of one rule as input to another, by nesting their views.

For example, our expense claim method might append to its active table, called Expense[underscore]Journal, the employee's name, the trip's purpose, and the expense amount. Suppose Personnel has created a view of all supplemental employees in Almaden departures, including the manager of each, as shown in Figure 7.

Then a rule to notify the manager of any supplemented employee at Almaden who claims more than $1000 for a trip could be defined on view Alm[underscore]Supp using a user-defined function called Notify, as shown in Figure 8.

Unlike Starburst's production rules, Alert rules must be explicitly activated. When an Alert rule is activated, Alert is invoked to trigger that rule. Rules that reference only data existing in the database are passive queries and can be satisfied immediately when the rule is triggered in the usual fashion. Because the special, append-only nature of active tables allows us to keep a cursor open on them as rows are appended, queries on active tables (such as Profligate[underscore]Supp in our example) can also reference future data, by creating a perpetually open cursor on that active table. These queries, called active queries, require more to be done upon their activation. Alert automatically creates an attachment on the active table (if one does not already exist) that invokes the Alert rule processor whenever that table is modified. Alert will advance the cursor and evaluate the rule using the new rows from the active table. In this way, Alert can handle both passive and active queries with a single syntax. Optionally, Alert rules can be activated in DEFERRED mode, which defers evaluating the rule's condition until the user actively ASSERTs the rule. Note that any query that includes an active query also becomes an Alert rule.

Performance

A thorough analysis of Starburst's execution performance has not yet been made. However, some preliminary measurements of execution performance, for which no 'tuning' of the database was done, were reported in [17] and are extracted here. In addition, we have also evaluated and improved the time to optimize SQL queries, a major constituent of the time to compile a query. Both tests were run on an IBM RT/PC with a 10MHz processor and 8MB of physical memory, running the ALX version 2.2 operating system with only a single user. Starburst was allocated 25 4KB pages of I/O buffer, and a similar amount for the separate sort buffer. Note that since queries to XNF objects are ultimately reduced to sets of normal queries joining several tables, our performance measurements on standard relational join queries represent the cost of retrieving a large, structured object.

Execution (run-time) measurements were made on a synthetic database--the well-known Wisconsin benchmark [13]--on queries joining a 1,000-row table with either another 1,000-row table (occupying 67 4K byte pages of disk) or a 10,000-row table (occupying 667 pages). Joins between these tables were on columns having distinct values and an IMS attachment relating them, using CHILD clustering as described in the subsection "Performance--the IMS Attachment." Starburst executed a query joining the entire two 1,000-row tables in 7.8 seconds (requiring 140 I/Os), and the same query joining the 1,000-row table with the 10,000-row table in 55 seconds (785 I/Os). Much lower times and I/Os result when predicates limit the parent and/or child tables [17].

The optimization portion by compile-time costs was measured by porting to Starburst the schema, indexes, and statistics of a real customer's database, as well as 30 queries representative of the customer's workload. The queries ranged from simple single-table queries of complex and highly interconnected 8-table joins. Since the initial goals of Starburst had focused on the functionality and extensibility of query optimization [38, 43, 52], it did not surprise us that the initial optimization times for the more complex queries were as much as two orders of magnitude greater than those of a commercial database. We determined that the already significant processing requirements of the join enumeration algorithm, particularly for highly interconnected queries [53], were greatly increased by the wealth of partial plans that were being retained and the algorithms used to manipulate them. After improving the pruning of redundant partial plans and their storage structures, optimization times were reduced to within a factor of five of the commercial optimizer. Specifically, two-thirds of the queries had optimization times of 2 seconds or less, and the remaining third between 2 and 42 seconds. By comparison, one of the most complex queries with 8 tables and 13 predicates originally had required 503 seconds to optimize! We feel these latest optimization times represent acceptable performance, given the extensibility of the Starburst optimizer compared to that of other database systems, and that with further enhancements query compilation can be made even more efficient.

Conclusions and Future Work

The extensions described in this article for structured and unstructured complex objects, as well as user-defined types, functions, and rules, show that we can extend relational DBMSs to increase the system's functionality significantly. We have described how a relational DBMS, with the appropriate extensibility features, can support the features of object-oriented systems and much more. In fact, we are confident that any system benefits from extensibility, because it is often impossible to foresee the applications for which a system might be used.

The extensions presented in this article also show that Starburst is extensible, at least by extenders knowledgeable in database systems. The IMS attachment, described in the subsection "Performance--the IMS Attachment," was the first extension made by extenders not among the original developers of Starburst. While that experiment found and corrected a number of minor deficiencies in the attachment interfaces, it also confirmed Starburst's extensibility: IMS was implemented in just one three-month summer by only two (quite knowledgeable) people (Mike Carey and Eugene Shekita of the University of Wisconsin) [17]. We have learned other limitations from our experience extending Starburst, but overall we are quite satisfied with the mechanisms that Starburst provides [30].

In the near future, we plan to implement the Starburst-type system and a very sophisticated application program interface. We will continue to implement these and other missing pieces, and to "harden" Starburst with better documentation, thorough testing, and more error checking. We also intend to use Starburst a lot! By examining the requirements of several applications that normally do not use DBMSs, we hope to augment Starburst with the extensions needed to support these applications and eventually develop new applications using Starburst, with the assistance of selected joint study partners. Already, we are using Starburst to test new technologies. Studies of optimization heuristics, parallel query processing, main-memory database, "magic sets" [49-51], and many other DBMS research topics are already underway using Starburst as a testbed, and we expect its usefulness will continue to grow as we gain experience and add more extensions to it.

(1) Unix is a trademark of AT & T Corp.

(2) As of June 1991, we were still implementing nested subqueries, and some of the relational set operations (e.g., UNION DISTINCT, DIFFERENCE).

(3) Although the ARIES NT methodology implemented in Starburst allows nested transactions, we are not currently exploiting that capability.

(4) Although invoking user-defined methods written in a programmaing language can make SQL computationally complete, the goal here is efficient support of multiple existing application environments, not the replacement of--or isolation from--these environments.

(5) Since DB2 stores the long field in the row with other columns and a row cannot exceed the maximum page size of 32K bytes, the aggregate size of all columns for a given row cannot exceed 32K bytes.

(6) Unlike Unix, some operating systems, such as Locus, Transarc, and Tandem's Guardian, do this for files.

(7) That paper uses a simple nested relational representation to illustrate the multiple parents cannot share the same child, and does not consider representing the child with a foreign key (values uniquely identifying the child).

(8) This IMS attachment should not be confused with IBM's Information Management System product, a navigational system whose pointers must be managed by the user.

(9) We intend to generalize this to more flexible, user-specified assertion points.

(10) Eventually we will allow the action to be any user-specified procedure.

References

[1] Abiteboul, S. and Bidoit, N. Non first normal form relations to represent hierarchically organized data. In Proceedings of the ACM PODS Conference (Waterloo, Ontario, Canada, 1984).

[2] Abiteboul, S., Scholl, M., Gardarin, G. and Simon, E. Towards DBMSs for supporting new applications. In Proceedings of the Twelfth International Conference on Very Large Data Bases (Kyoto, Japan, Aug. 1986), pp. 423-435.

[3] Agrawal, R., Cochrane, R.J. and Lindsay, B. G. On maintaining priorities in a production rule system. In Proceedings of the 17th International Conference on Very Large Data Bases (Barcelona, Sept. 1991).

[4] Agrawal, R., DeMichiel, L.G. and Lindsay, B.G. Static typechecking of multimethods. In Proceedings of OOPSLA 91 (Oct. 1991).

[5] Atkinson, M. et al. The object-oriented database system manifesto, ALTAIR Tech. Rep. 30-89, LeChesnay, France, Sept. 1989. Also in Deductive and Object-oriented Databases, Elsevere Science Publishers, Amsterdam, Netherlands, 1990.

[6] Bancilhon, F. et al. The design and implementation of 02, an object-oriented database system. 00DBS2 Workshop Proceedings (Badmunster, RFA, 1988).

[7] Banerjee, J., Chou, H.T., Garza, J.F., Kim, W., Woelk, D. and Ballou, N. Data model issues for object-oriented applications. ACM Trans. Off. Inf. Syst. 5, 1 (Jan. 1987), 3-26.

[8] Banerjee, J., Kim, W., Kim, H.J. and Korth, H. Semantics and implementation of schema evolution in object-oriented databases. In Proceedings of ACM SIGMOD 87 (San Francisco, Calif., May 1987), pp. 311-322.

[9] Barsalou, T., Keller, A.M., Siambela, N. and Wiederhold, G. Updating relational databases through object-based views. In Proceedings of ACM SIGMOD (Denver, May 1991), pp. 248-257.

[10] Batory, D. GENESIS: A project to develop an extensible database management system. In Proceedings of 1986 International Workshop on Object-Oriented Database Systems (Asilomar, Sept. 1986).

[11] Batory, D. Extensible cost models and query optimization in GENESIS. IEEE Database Eng. 10, 4 (Dec. 1986).

[12] Batory, D. A molecular database systems technology. Tech. Rep. TR-87-23 Dept. of Comp. Sci., Univ. of Texas at Austin, June 1987.

[13] Bitton, D., DeWitt, D. and Turbyfill, C. Benchmarking database systems: A systematic approach. In Proceedings of 9th International Conference on Very Large Data Bases (VLDB) (Florence, Nov. 1983), pp. 8-19.

[14] Brownston, L., Rarrel, R., Kant, E. and Martin, N. Programming Expert Systems in OPS5: An Introduction to Rule-Based Programming. Addison-Wesley (Reading, Mass., 1985).

[15] Carey, M.J., DeWitt, D.J. and Vandenberg, S.L. A data model and query language for EXODUS. In Proceedings of ACM SIGMOD 88 (Chicago, Ill., June 1988), pp. 413-423.

[16] Carey, M., DeWitt, D., Graefe, G., Haight, D., Richardson, J. Schuh, D., Shekita, E. and Vandenberg, S. The EXODUS extensible DMBS project: An overview. Readings in Object-Oriented Databases, S. Zdonik and D. Maier, Eds., Morgan-Kaufman, 1989.

[17] Carey, M., Shekita, E., Lapis, G., Lindsay, B. and McPherson, J. An incremental join attachment for Starburst. In Proceedings of 16th International Conference on Very Large Data Bases (VLDB) (Brisbane, Australia, Aug. 1990) pp. 662-673. Also available as IBM Res. Rep. RJ7544, San Jose, Calif., June 1990.

[18] Ceri, S. and Wisdom, J. Deriving production rules for constraint maintenance. In Proceedings of 16th International Conference on Very Large Data Bases (VLDB) (Brisbane, Australia, Aug. 1990), pp. 566-577. Extended version available as IBM Res. Rep. RJ7348, San Jose, Calif., Mar. 1990.

[19] Chamberlin, D.D. et al. A history and evaluation of system R. Commun. ACM 24, 10 (Oct. 1981), pp. 632-646.

[20] Chang, W. and Schek, H-J. A signature access method for the Starburst database system. In Proceedings of 15th International Conference on Very Large Data BAses (VLDB) (Amsterdam, Aug. 1989), pp. 145-153. Also available as IBM Res. Rep. RJ6670, San Jose, Calif., Feb. 1989.

[21] Chimenti, D., Gamboa, R., Krishnamurthy, R., Naqvi, S., Tsur, S. and Zaniolo, C. The LDL system prototype. IEEE Trans. Knowl. Data Eng. (Mar. 1990), pp. 76-90.

[22] Committee for Advanced DBMS Function. Third-generation data base system manifesto. ACM SIGMOD Record 19, 3 (Sept. 1990), pp. 31-44. Also available as Mem. UCB/ERL M90/28, Berkeley, Calif., Apr. 9 1990.

[23] Dadam, P. et al. A DBMS prototype to support extended [NF.sub.2] relations: An integrated view on flat tables and hierarchies. In Proceedings of 1986 ACM SIGMOD Conference (Washington, DC, May 1986).

[24] Date, C.J. A critique of the SQL database language. ACM SIGMOD Record 14, 3 (Nov. 1984), pp. 8-54.

[25] Date, C.J. Some principles of good language design. ACM SIGMOD Record 14, 3 (Nov. 1984), pp. 1-17.

[26] Dayal, U. and Smith, J.M. PROBE: A knowledge-oriented database management system, On Knowledge Base Management Systems: Integrating Artificial Intelligence and Database Technologies, Brodie and Mylopoulos, Eds., Springer-Verlag, 1986.

[27] Eswaran, J.P. Specifications, implementations and interactions of a trigger subsystem in an integrated database system. IBM Res. Rep. RJ1820 (San Jose, Calif., Aug. 1976).

[28] Fishman, D.H. et al. Irish: An object-oriented database system. ACM Trans. Off. Inf. Syst. 5, 1 (Jan. 1987), pp. 48-69).

[29] Graefe, G. and Maier, D. Query optimization in object-oriented database systems: The REVELATION project. Tech. Rep. CS/E 88-025 (Oregon Graduate Center, Beaverton, Ore., July 1988).

[30] Haas, L.M., Chang, W., Lohman, G.M., McPherson, J., Wilms, P.F., Lapis, G., Lindsay, B., Pirahesh, H., Carey, M. and Shekita, E. Starburst mid-flight: As the dust Clears. IEEE Trans. Knowl. Data Eng. (Mar. 1990), pp. 143-160. Also available as IBM Res. Rep. RJ7278, San Jose, Calif., Jan. 1990.

[31] Haas, L.M., Cody, W.F., Freytag, J.C., Lapis, G., Lindsay, B.G., Lohman, G.M., Ono, K. and Pirahesh, H. An extensible processor for an extended relational query language. IBM Res. Rep. RJ6182 San Jose, Calif., Apr. 1988.

[32] Haas, L.M., Freytag, J.C., Lohman, G.M. and Pirahesh, H. Extensible query processing in Starburst. In Proceedings of ACM SIGMOD (Port-land, Ore., May 1989), pp. 377-388. Also available as IBM Res. Rep. RJ6610, San Jose, Calif., Dec. 1988.

[33] Hanson, E.N., An initial report on the design of Ariel: A DBMS with an integrated production rule system. ACM SIGMOD Record 18, 3 (Sept. 1989), pp. 12-19.

[34] Hasan, W. and Pirahesh, H. Query rewrite optimization in Starburst. IBM Res. Rep. RJ6367 (San Jose, Calif., Aug. 1988).

[35] Haskin, R.L. and Lorie, R.A. On extending the functions of a relational database system. In Proceedings of ACM SIGMOD 82 (June 1982), pp. 207-212.

[36] Hudson, S.E. and King, R. Cactis: A self-adaptive, concurrent implementation of an object-oriented database management system. ACM Trans. Database Syst. 14, 3 (Sept. 1989), pp. 291-321.

[37] Kim, W., Banerjee, J., Chou, H.-T., Garza, J. and Woelk, D. Composite object support in an object-oriented database system. In Proceedings of ACM Conference on Object Oriented Programming Systems, Languages, and Applications (Orlando, Fla., Oct. 1987).

[38] Lee, M., Freytag, J.C. and Lohman, G.M. Implementing an interpreter for functional rules in a query optimizer. In Proceedings of 14th International Conference on Very Large Data Bases (VLDB) (Long Beach, Calif., Aug. 1988), pp. 218-229. Also available as IBM Res. Rep. RJ6125, San Jose, Calif., Mar. 1988.

[39] Lehman, T.J. and Lindsay, B.G. The Starburst long field manager. In Proceedings of 15th International Conference on Very Large Data Bases (VLDB) (Amsterdam, Aug. 1989), pp. 375-383.

[40] Lindsay, B. G. A retrospective of R*: A distributed database management system. IBM Res. Rep. RJ4859 (San Jose, Calif., Sept. 1985).

[41] Lindsay, B. and Haas, L. Extensibility in the Starburst experimental database system, Database Systems of the 90s, A. Blaser, Ed., Springer-Verlag, 1990, pp. 217-248. Also available as IBM Res. Rep. RJ 7570, San Jose, Calif., July 1990.

[42] Lindsay, B., McPherson, J. and Pirahesh, H. A data management extension architecture. In Proceedings of ACM SIGMOD (San Francisco, Calif., May 1987), pp. 220-226. Also available as IBM Res. Rep. RJ5436, San Jose, Calif., Dec. 1986.

[43] Lohman, G.M. Grammar-like functional rules for representing query optimization alternatives. In Proceedings of ACM SIGMOD (Chicago, May 1988), pp. 18-27. Also available as IBM Res. Rep. RJ5992, San Jose, Calif., Dec. 1987.

[44] Lohman, G.M., Mohan, C., Haas, L.M., Daniels, D., Lindsay, B.G., Selinger, P.G. and Wilms, P. Query processing in R*. Query Processing in Database Systems. W. Kim, D. Reiner, and D. Batory, Eds., Springer-Verlag, 1985, pp. 18-27. Also available as IBM Res. Rep. RJ4272, San Jose, Calif., Apr. 1984.

[45] Lorie, R. and Plouffe, W. Complex objects and their use in design transactions. In Proceeding of 1983 ACM SIGMOD Conference, Engineering Design Applications (May 1983), pp. 115-121.

[46] Maier, D. Stein, J., Otis, A. and Purdy, A. Development of an object-oriented DBMS. In Proceedings of ACM Conference on Object Oriented Programming Systems, Languages and Applications (Portland, Ore., Sept. 1986).

[47] McCarthy, D.R. and Dayal, U. The architecture of an active database management system. In Proceedings of ACM SIGMOD 89 (Portland, Ore., May 1989), pp. 215-224.

[48] Mohan, C., Haderle, D., Lindsay, B., Pirahesh, H. and Schwarz, P. ARIES: A transaction recovery method supporting fine-granularity locking and partial rollbacks using write-ahead logging. IBM Res. Rep. RJ6649 (San Jose, Calif., Jan. 1989). To be published in ACM Trans. Database Syst.

[49] Mumick, I.S., Finkelstein, S.J., Pirahesh, H. and Ramakrishnan, R. Magic conditions. Ninth ACM Symposium on Principles of Database Systems (Nashville, Tenn., Apr. 1990), pp. 314-330.

[50] Mumick, I.S., Finkelstein, S.J., Pirahesh, H. and Ramakrishnan, R. Magic is relevant. In Proceedings of ACM SIGMOD (Atlantic City, N.J., May 1990), pp. 247-258.

[51] Mumick, I.S., Pirahesh, H. and Ramakrishnan, R. Magic of duplicates and aggregates. In Proceedings of 16th International Conference on Very Large Data Bases (VLDB) (Brisbane, Australia, Aug. 1990), pp. 264-277.

[52] Ono, K. and Lohman, G.M. Extensible enumeration of feasible joins for relational query optimization. IBM Res. REp RJ6625 (San Jose, Calif., Dec. 1988).

[53] Ono, K. and Lohman, G.M. Measuring the complexity of join enumeration in query optimization. In Proceedings of 16th International Conference on Very Large Data Bases (VLDB) (Brisbane, Australia, Aug. 1990), pp. 314-325.

[54] Osborne, S. and Heaven, T. The design of a relational system with abstract data types as domains. ACM Trans. Database Syst. (Sept. 1986), pp. 357-373.

[55] Pirahesh, H. and Mohan, C. Evolution of relational DBMSs toward object support: A practical viewpoint. In Proceedings of Datenbanksysteme in Buro, Technik und Wissenchaft (BTW) (Univ. Kaiserslautern, Germany, Mar. 1991).

[56] Rothermel, K. and Mohan, C. ARIES/NT: A recovery method based on write-ahead logging for nested transactions. In Proceedings of 15th International Conference on Very Large Data Bases (VLDB) (Amsterdam, Aug. 1989), pp. 337-346. A longer version of this paper is available as IBM Res. Rep. RJ6650, San Jose, Calif., Jan. 1989.

[57] Rowe, L.A. and Stonebraker, M.R. The POSTGRES data model. In Proceedings of 13th International Conference on Very Large Data Bases (Brighton, England, Sept. 1987), pp. 83-96.

[58] Schek, H.-J. and Scholl, M.H. The relational model with relation-valued attributes. Inf. Syst. 11, 2 (1986). Also available as Tech. Univ. of Darmstadt. Rep. DVSI-1984-TI, Darmstadt, Germany, 1984.

[59] Schreier, U., Pirahesh, H., Agrawal, R. and Mohan, C. Alert: An architecture for transforming a passive DMBS into an active DBMS. In Proceedings of 17th VLDB (Barcelona, Sept. 1991).

[60} Schwarz, P.M., Chang, W., Freytag, J.C., Lohman, G.M., McPherson, J., Mohan, C. and Pirahesh, H. Extensibility in the Starburst database system. In Proceedings of the International Workshop on Object-Oriented Database Systems (Asilomar, Calif., Sept. 1986). Also available as IBM Res. Rep. RJ5311, San Jose, Calif., Sept. 1986.

[61] Stonebraker, M. and Rowe, L.A. The design of POSTGRES. In Proceedings of ACM SIGMOD 86 (Washington, D.C., May 1986), pp. 340-355.

[62] Stonebraker, M., Jhingran, A., Goh, J. and Potamianos, S. On rules, procedures, caching and views in data base systems. In Proceedings of ACM SIGMOD 90 (Atlantic City, N.J., May 1990), pp. 281-290.

[63] Stonebraker, M., Rowe, L.A. and Hirohama, M. The implementation of POSTGRES. IEEE-Trans. Knowl. Data Eng. (Mar. 1990), pp. 125-142.

[64] Stonebraker, M. and Kemnitz, G. The POSTGRES Next-Generation DBMS. Commun. ACM, Oct. 1991.

[65] Ullman, J.D. Implementation of logical query languages for databases. ACM Trans. Database Syst. (Sept. 1985), pp. 289-321.

[66] Widom, J., Cochrane, R.J. and Lindsay, B.G. Implementing set-oriented production rules as an extension to Starburst. In Proceedings of 17th VLDB (Barcelona, Sept. 1991).

[67] Widom, J. and Finkelstein, S.J. A syntax and semantics for set-oriented production rules in relational database systems (Extended Abstract). SIGMOD Record, Special Issue on Rule Management and Processing in Expert Database Systems 18, 3 (Sept. 1989), pp. 26-45.

[68] Widom, J. and Finkelstein, S.J. Set-oriented production rules in relational database systems. In Proceedings of ACM SIGMOD (Atlantic City, N.J., May 1990), pp. 259-270. Extended version available as IBM Res. Rep. RJ6880, San Jose, Calif., June 1989, revised Mar. 1990.

[69] Wiederhold, G. Views, objects, and databases. IEEE Comput. 19, 12 1986), pp. 37-44.

[70] Williams, R. et al. R*: An overview of the architecture. In Proceedings of 2d International Conference on Databases: Improving Usability and Responsiveness (Jerusalem, June 1982), pp. 180-192. Published in Improving Usability and Responsiveness, P. Scheuermann, Ed., Academic Press, N.Y., pp. 1-27. Also available as IBM Res. Rep. RJ3325, Dec. 1981.

[71] Wilms, P., Schwarz, P.M., Schek, H.-J. and Haas, L.M. Incorporating data types in an extensible database architecture. In Proceedings of 3d International Conference on Data and Knowledge Bases (Jerusalem, June 1988), pp. 180-192. Also available as IBM Res. Rep. RJ6405, San Jose, Calif., Aug. 1988.

GUY M. LOHMAN is research staff member and manager of Relational Support for Advanced Applications at IBM's Alamaden Research Center. His current research interests include query optimization, extensible DBMSs, performance evaluation, novel applications of DBMSs, and distributed and parallel databases.

BRUCE LINDSAY is a research staff member at IBM's Alamaden Research Center. His current research interests include all aspects of extensible DBMSs (including type systems, application interfaces, complex objects, and production rule systems) and distributed DBMSs.

HAMID PIRAHESH is a research staff member at IBM's Alamaden Research Center. His main activities have been in various areas of database management systems, including object-oriented support, query semantic modeling and optimization, DBMS extensibility, parallelism, and concurrency control and recovery.
COPYRIGHT 1991 Association for Computing Machinery, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 1991 Gale, Cengage Learning. All rights reserved.

Article Details
Printer friendly Cite/link Email Feedback
Title Annotation:IBM Research's Starburst relational database management system; one of six articles on next-generation database management systems
Author:Lohman, Guy M.; Lindsay, Bruce; Pirahesh, Hamid; Schiefer, K. Bernhard
Publication:Communications of the ACM
Date:Oct 1, 1991
Words:11372
Previous Article:The POSTGRES next-generation database management system.
Next Article:Database systems: achievements and opportunities.
Topics:

Terms of use | Privacy policy | Copyright © 2022 Farlex, Inc. | Feedback | For webmasters |