Impedance mismatch in databases
Mary Finn - InterSystems Corp. (Database Systems)

With the maturation and wide acceptance of Java, object-oriented programming has moved to the foreground of the application development landscape. Because of their rich data models and support for productivity-enhancing concepts such as encapsulation, inheritance, and polymorphism, object technologies like Java, C++, and COM are favoured by today's application developers. However, much of the world's data still resides in relational databases. Developers of database applications (that is, any application that accesses stored data) often find themselves fighting impedance mismatch: the inherent disconnect between the object and relational data models. Efforts to "map" relational data into a usable object format are often detrimental to both programmer productivity and application performance. However, impedance mismatch can be mitigated by the proper choice of database technology.

This paper defines impedance mismatch and gives two simple examples of how it affects application development. It then discusses, with regard to impedance mismatch, the pros and cons of three kinds of database: relational, object, and Cache, the multidimensional database from InterSystems.

Understanding Impedance Mismatch

Impedance mismatch is a term borrowed from electrical engineering, but in the software world it refers to the inherent difference between the relational and object data models. Very simply put, the relational model organizes all data into rows and columns. Every row represents a record, and the columns represent the various data items in a record. If the data is too complex to be described by a two-dimensional grid, additional tables are created to hold "related" information. Thus, every table in a relational schema will hold some, but not all, of the data items for a great many records.

The object data model is not constrained to keeping data in rows and columns. Instead, the developer creates a definition (a template) that completely describes a certain class of information. Every record (object) is a specific instance of that class. Thus each object contains all the data items for one, and only one, record. But that's not all.
Class definitions may also include pieces of code, called methods, which act upon the data described by the class. There is no analogous construct in the relational model.

A simple example

To illustrate the difference between the two data models, assume that you are developing an accounts receivable application. Your application will undoubtedly need to keep track of a number of invoices, each of which will have some header information (such as the invoice date), an invoice number, and one or more line items. Every line item will include, among other things, information about the product ordered, and the quantity of product ordered.

One way of modeling the invoice in a relational database is to create two tables. One table, called Invoice, includes the header information that only appears once on each invoice. Another table, LineItems, contains columns for the Invoice_Parent, Line_Item_Product_Code, and Line_Item_Quantity. The first of these is especially important because it is this value that "relates" the line items to information in the Invoice table.

Note that neither table contains all the information about any given invoice. Instead, each contains some of the information about many invoices. If your application is designed, for example, to print an invoice, it must access both the Invoice and LineItems tables for the header and detail information respectively. Also note that the tables do not contain any instructions about how to format the data for printing. Those instructions exist outside the database itself.
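The two-table layout described above can be sketched in a few lines of Python using the standard sqlite3 module. This is only an illustration under stated assumptions: the article does not name a database product, so SQLite, the column types, the sample values, and the fetch_invoice helper are all hypothetical. The point it shows is that assembling one invoice takes a read from each table.

```python
import sqlite3

# Assumption: SQLite stands in for "a relational database"; column types are guesses.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invoice (
        Invoice_Number INTEGER PRIMARY KEY,
        Invoice_Date   TEXT NOT NULL
    );
    CREATE TABLE LineItems (
        Invoice_Parent         INTEGER NOT NULL REFERENCES Invoice(Invoice_Number),
        Line_Item_Product_Code TEXT NOT NULL,
        Line_Item_Quantity     INTEGER NOT NULL
    );
""")

# Sample data, invented for this sketch.
conn.execute("INSERT INTO Invoice VALUES (?, ?)", (1001, "2024-01-15"))
conn.executemany(
    "INSERT INTO LineItems VALUES (?, ?, ?)",
    [(1001, "WIDGET", 3), (1001, "GADGET", 1)],
)


def fetch_invoice(conn, invoice_number):
    """Hypothetical helper: gather header and detail rows from both tables."""
    header = conn.execute(
        "SELECT Invoice_Number, Invoice_Date FROM Invoice "
        "WHERE Invoice_Number = ?",
        (invoice_number,),
    ).fetchone()
    lines = conn.execute(
        "SELECT Line_Item_Product_Code, Line_Item_Quantity FROM LineItems "
        "WHERE Invoice_Parent = ? ORDER BY rowid",
        (invoice_number,),
    ).fetchall()
    return header, lines


header, lines = fetch_invoice(conn, 1001)
```

Neither query alone returns a complete invoice, and nothing in either table says how the result should be formatted for printing; that logic would live in the calling application.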
In the object model, data need not fit into rows and columns, so the Invoice class definition will look like a list of all the data items that make up an invoice. There will be properties containing the header information such as InvoiceDate, InvoiceNumber, etc., and a collection of one or more instances of the LineItem class. The LineItem class includes the properties ProductCode and LineItemQuantity. Class definitions are merely blueprints of the data format. Each individual invoice is one specific instance of the Invoice class, and contains the specific instances of the LineItem class which belong to it. Thus, every Invoice object contains all the information for a given invoice, and only information for that invoice.

But class definitions may also contain methods that act upon the data described by the class. For example, your Invoice class may include a Print() method that dictates how to format invoice information for printing. Persistent objects will include some sort of Save() method that specifies how objects are stored in the database. The default implementation of the Save() method will be determined by the structure of the database engine, and provided by the database vendor.

Impedance mismatch when manipulating the database

Consider the case of creating a new invoice with one line item in your accounts receivable application. If you were programming against a relational database, your code would look something like that shown in Example #1. It would include two Insert statements: one to add the header information to the Invoice table, and another to add the detail information to the LineItems table. Insert is a standard SQL command, and the relational database vendor will provide for its implementation.
Example #1:
Creating a new Invoice using the relational model
If (flag="New") {
    Insert Into Invoice
        (Invoice_Date, Invoice_Number)
        Values (Today, :NewInvoiceNumber)
    Insert Into LineItems
        (Invoice_Parent, Line_Item_Quantity, Line_Item_Product_Code)
        Values (:NewInvoiceNumber, :Quantity, :ProdOrdered)
}
The code for saving an invoice with one line item using an object model is shown in Example #2. Except for syntax details, it looks quite similar to the relational example. The main difference is that the Save() method is only called once.
Example #2:
Creating a new invoice using the object model
If (flag="New") {
    objInv=new Invoice()
    objInv.InvoiceDate=Today
    objInv.InvoiceNumber=NewInvoiceNumber
    objLI=new objInv.LineItem()
    objLI.LineItemQuantity=Quantity
    objLI.ProductCode=ProdOrdered
    objInv.Save()
}
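Example #2 leaves open what Save() does internally. If objects like these had to be stored in the two tables of Example #1, the method would have to carry the translation itself. The following is a minimal sketch of that situation in Python with the standard sqlite3 and dataclasses modules; SQLite, the column types, the sample values, and the snake_case names are assumptions made for illustration, not anything prescribed by the article.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class LineItem:
    product_code: str
    quantity: int


@dataclass
class Invoice:
    invoice_number: int
    invoice_date: str
    line_items: list = field(default_factory=list)

    def save(self, conn):
        """Hand-written mapping: one object becomes rows in two tables."""
        with conn:  # commit both inserts together, or roll both back
            conn.execute(
                "INSERT INTO Invoice (Invoice_Number, Invoice_Date) "
                "VALUES (?, ?)",
                (self.invoice_number, self.invoice_date),
            )
            conn.executemany(
                "INSERT INTO LineItems "
                "(Invoice_Parent, Line_Item_Product_Code, Line_Item_Quantity) "
                "VALUES (?, ?, ?)",
                [
                    (self.invoice_number, item.product_code, item.quantity)
                    for item in self.line_items
                ],
            )


# Assumed schema, mirroring the table and column names used in Example #1.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Invoice (
        Invoice_Number INTEGER PRIMARY KEY,
        Invoice_Date   TEXT NOT NULL
    );
    CREATE TABLE LineItems (
        Invoice_Parent         INTEGER NOT NULL,
        Line_Item_Product_Code TEXT NOT NULL,
        Line_Item_Quantity     INTEGER NOT NULL
    );
""")

inv = Invoice(1001, "2024-01-15")           # sample values, invented
inv.line_items.append(LineItem("WIDGET", 3))
inv.save(conn)                               # called once, issues two kinds of INSERT
```

The caller still invokes save() only once, as in Example #2, but every property-to-column correspondence is spelled out by hand inside the method, and it must be revisited whenever the class or the tables change.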
Now imagine that you want to write the business logic for your application in an object-oriented language such as Java or C++, but you need to store your data in a relational database. To accomplish this for your invoice, the SQL Insert statements must be programmed within the Save() method of your Invoice class definition. Here is one manifestation of impedance mismatch: an object class with a collection having to be translated to the disparate tables of a relational database engine.

Impedance Mismatch in Design

Another form of impedance mismatch can crop up during the application design process. In addition to enabling a richer, more intuitive way of modeling data, object technology encompasses several concepts that significantly enhance programmer productivity. In particular, object technology supports the concepts of inheritance and polymorphism.

Inheritance refers to the fact that one class definition can be derived from another. For example, in your accounts receivable application you might create a generic InvoiceTemplate class, and have the more specific SoftwareInvoice and HardwareInvoice classes inherit properties and methods from InvoiceTemplate. (They may also include non-inherited properties and methods that are specific to each class.) As the application evolves, if changes are made to InvoiceTemplate, inheritance dictates that those changes are automatically reflected in the SoftwareInvoice and HardwareInvoice class definitions.

Polymorphism refers to the fact that different implementations of a method can share a common interface. For example, the Print() method in SoftwareInvoice and HardwareInvoice may include different instructions for formatting, etc. However, to print an invoice, your application only needs to load an object into memory, and call its Print() method. Thanks to polymorphism, the object will "know" how to format itself for printing, depending on which class it belongs to.
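The inheritance and polymorphism just described can be sketched briefly in Python. The class names follow the article; the license_key and shipping_weight_kg properties, the header() helper, the print_invoice() name (standing in for Print()), and the sample values are hypothetical details added for illustration.

```python
class InvoiceTemplate:
    """Generic invoice; subclasses inherit its properties and methods."""

    def __init__(self, invoice_number, invoice_date):
        self.invoice_number = invoice_number
        self.invoice_date = invoice_date

    def header(self):
        # Inherited unchanged by every subclass.
        return f"Invoice {self.invoice_number} dated {self.invoice_date}"

    def print_invoice(self):
        # Common interface; each subclass supplies its own formatting.
        raise NotImplementedError


class SoftwareInvoice(InvoiceTemplate):
    def __init__(self, invoice_number, invoice_date, license_key):
        super().__init__(invoice_number, invoice_date)
        self.license_key = license_key  # non-inherited property (assumed)

    def print_invoice(self):
        return f"{self.header()} | license {self.license_key}"


class HardwareInvoice(InvoiceTemplate):
    def __init__(self, invoice_number, invoice_date, shipping_weight_kg):
        super().__init__(invoice_number, invoice_date)
        self.shipping_weight_kg = shipping_weight_kg  # non-inherited (assumed)

    def print_invoice(self):
        return f"{self.header()} | ships at {self.shipping_weight_kg} kg"


invoices = [
    SoftwareInvoice(1, "2024-01-15", "ABC-123"),
    HardwareInvoice(2, "2024-01-16", 4.5),
]

# The caller never checks which class it holds; each object formats itself.
printed = [inv.print_invoice() for inv in invoices]
```

A change to header() in InvoiceTemplate would show up in both subclasses without touching them, which is the productivity gain the article attributes to inheritance.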
Neither inheritance nor polymorphism exists in the relational model. Some large relational database vendors such as Oracle, Microsoft, and IBM have attempted to implement object-oriented design concepts, but the results generally fall short of the capabilities expected by object programmers.

Approaches to Mitigating Impedance Mismatch

The two examples of impedance mismatch given above are very simplistic, but they serve to demonstrate the problem. The work required to "normalize" impedance mismatch can be significant, and grows dramatically as application complexity increases. However, the effects of impedance mismatch can be substantially reduced by the proper choice of database technology. Let's consider three options for data storage: a relational database, a "pure" object database, and the Cache multidimensional database.

Using a relational database

This paper has already discussed how trying to use a relational database with an application grounded in object technology poses serious impedance mismatch problems. But sometimes developers don't have a choice.
They may need to access existing data that resides in a relational database. In that case, one option is to use an "object-relational mapping" tool, whether it be a stand-alone tool, or the mapping capabilities built in to some so-called "object-relational" databases. In essence, mapping tools create a file (a map) that contains the code for translating between objects and relational tables. Developers must specify exactly how that translation is to be done, that is, which object properties correspond to which data columns in which tables, and vice versa. Once created, the map is saved, and invoked every time the application moves data to or from the database. Some object-relational mapping tools provide a runtime caching component to help counteract the performance penalty introduced by translating data between objects and relational forms.

Aside from any runtime performance penalty, object-relational mapping can significantly slow down application development. Most mapping tools do not implement, or only partially implement, object modeling concepts such as inheritance, polymorphism, etc. Therefore, as an application is adapted and modified, new, updated object-relational maps must be created.

Developers battling the impedance mismatch between object-oriented applications and relational databases might want to consider migrating the data into a more object-friendly data store. They must weigh the one-time effort required to reformat and transfer the data against the ongoing work and performance losses of using an object-relational map.

Using an object database

At first glance, it would appear that impedance mismatch can be totally eliminated by storing data in a "pure" object database. That is (partly) true. In general, it is easy for an object-oriented application to interact with an object database.
However, in this scenario, impedance mismatch occurs when you want to run an SQL query against the database. SQL is by far the world's most widely used query language, and it assumes that data is stored in relational tables. Some object database vendors provide data access via an object query language (OQL), but these languages do not enjoy widespread acceptance. In order to be compatible with common data analysis and reporting applications, an object database must support ODBC and JDBC, and must therefore provide some mechanism for projecting data as relational tables. The typical solution is, once again, mapping. The drawbacks of mapping (performance losses and the lack of support for data model evolution) still apply. The upside is that the map only need be invoked when an SQL query is run against the database.

Using a multidimensional Cache with Unified Data Architecture

There is a third option for data storage: Cache, the multidimensional database from InterSystems. Although multidimensional databases are often thought of as playing in the data warehousing arena, Cache is designed to be part of transaction processing applications. And it implements a unique approach to reducing impedance mismatch: the Unified Data Architecture.

Thanks to the Unified Data Architecture, the object and relational data models "share" Cache's multidimensional data. Multidimensional arrays are easily projected as tables because tables are nothing more than two-dimensional arrays. Similarly, there is easy correlation between objects and multidimensional arrays because neither is constrained to the rows-and-columns format of relational technology. The translation between data forms is automated, and becomes part of the compiled data definition. To the developer, every table is effectively an object, and every object is one or more tables.

Some other attributes of Cache's Unified Data Architecture:

* Full concurrency: Updates to the data made through the relational interface are instantly accessible via the object interface and vice versa.
* Support for data model evolution: Changes to the data structure definition are automatically reflected in both the object and relational representations.
* Full SQL support: SQL DDL, DML, and DCL commands are all supported.
* Full object support: Object modeling concepts such as simple and multiple inheritance, polymorphism, advanced data types, and method generators are all supported.
* Object serving: Objects defined in the unified data architecture can be served up as Java, C++, or COM objects, providing compatibility with a variety of object-oriented technologies.

www.intersystems.com