Comparing two data warehouse methodologies.


It's important to recognize that the Corporate Information Factory (CIF) is not the only business intelligence (BI) architecture. Another architecture worth noting is Dr. Ralph Kimball's multidimensional (MD) architecture. This feature offers a brief description of the MD and CIF architectures and then highlights the significant similarities and differences between the two by using the criteria of scope, perspective, data flow, implementation speed and cost, volatility, complexity, and functionality.

We believe that a combination of the data-modeling techniques found in the two architectural approaches works best: ERD or normalization techniques for the data warehouse, and the star schema data model for multidimensional data marts. However, it is important that BI architects study their situation, politics, and culture to determine what works best in their environment.

The Multidimensional Architecture

The MD architecture (Figure 1) is based on the premise that all BI analyses have at their foundation a multidimensional design. The star schema is an elegant data model that layers multidimensional meta data over what is basically a two-dimensional data store (columns and rows), making it act to the user as if it were multidimensional. The star schema gave BI a solid and much needed push into the mainstream when it first appeared. It is still one of the most popular and useful designs for usage in strategic decision-making environments.

[FIGURE 1 OMITTED]

One of the more significant differences between the MD and CIF architectures is in the definition of the data mart. For the MD architecture, the aggregated data mart star schema is approximately the same as the data mart in the CIF architecture. The atomic-level data mart star schema contains the detailed data roughly equivalent to the content in the CIF's data warehouse. However, the design of the atomic-level data marts (star schemas) is significantly different from the design of the CIF data warehouse (denormalised ERD schema). These data-modeling differences constitute the main design differences in these two architectures.

All star schema-based data marts may or may not reside within the same database instance. A collection of these schemas in a single database instance is called the Data Warehouse Bus Architecture. Unlike the CIF, a separate and physically distinct data warehouse does not exist. The MD architecture is divided into two groups of components and processes: the back room and the front room.

The back room is where the data-staging and data acquisition processes take place. Mapping to the operational systems and the technical meta data surrounding these maps is also part of the back room. It is roughly equivalent to the CIF's 'Getting Data In' components with some notable exceptions. One is the lack of an ERD-based data warehouse, as mentioned, and the other is the presence of atomic and aggregated star schema data marts. The latter appears in both the back and front rooms.

The data-staging area contains the conformed dimensions, but it is also the place where surrogate keys are generated, maps to the operational systems are kept, current loads of operational data are stored, and any atomic data not currently used in the data marts is stored. Most of the heavy lifting performed by the ETL (extract, transform, load) tools occurs here as well.
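
Surrogate key generation in the staging area can be illustrated with a minimal sketch. The class name, the source system names, and the natural key values below are hypothetical and chosen only for illustration; a real staging area would normally persist this mapping in a table rather than in memory.

```python
from itertools import count


class SurrogateKeyMap:
    """Assigns a sequential integer key to each distinct natural key."""

    def __init__(self):
        self._next = count(1)   # surrogate keys start at 1
        self._keys = {}         # (source_system, natural_key) -> surrogate key

    def key_for(self, source_system, natural_key):
        """Return the existing surrogate key, or generate a new one."""
        lookup = (source_system, natural_key)
        if lookup not in self._keys:
            self._keys[lookup] = next(self._next)
        return self._keys[lookup]


customer_keys = SurrogateKeyMap()
a = customer_keys.key_for("billing", "C-1001")
b = customer_keys.key_for("crm", "98-XY")
c = customer_keys.key_for("billing", "C-1001")  # same record seen again
```

The point of the sketch is that the generated key carries no business meaning: the same source record always maps to the same key, and the key itself is independent of whatever identifier the operational system uses.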

The Data Warehouse Bus Architecture consists of two types of data marts:

Atomic Data Marts. These data marts hold multidimensional data at the lowest common denominator level (lowest level of detail available throughout the environment). They may contain some aggregated data as well to improve query performance. The data is stored in a star schema data model.

Aggregated Data Marts. These data marts contain data related to a core business process such as marketing, sales, or finance. Generally, the atomic data marts supply the data to be aggregated for these data marts but that is not mandatory. It is possible to create an aggregated data mart directly from the data-staging area. As with the atomic data marts, data is stored in the aggregated data marts in star schema designs.
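
The relationship between the two types of data marts can be sketched with Python's built-in sqlite3 module. The table names, column names, and sample rows are assumptions made for this example and do not come from either architecture's documentation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Atomic data mart: one row per sale (lowest level of detail).
    CREATE TABLE dim_store (store_key INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE fact_sales_atomic (
        store_key INTEGER REFERENCES dim_store(store_key),
        sale_date TEXT,
        amount    REAL
    );
    INSERT INTO dim_store VALUES (1, 'Austin'), (2, 'Boston');
    INSERT INTO fact_sales_atomic VALUES
        (1, '2024-01-05', 10.0),
        (1, '2024-01-20', 15.0),
        (2, '2024-01-07', 20.0),
        (1, '2024-02-02', 5.0);

    -- Aggregated data mart: monthly totals per store, derived from the atomic facts.
    CREATE TABLE fact_sales_monthly AS
    SELECT store_key,
           substr(sale_date, 1, 7) AS sale_month,
           SUM(amount)             AS total_amount
    FROM fact_sales_atomic
    GROUP BY store_key, substr(sale_date, 1, 7);
""")

monthly = conn.execute(
    "SELECT store_key, sale_month, total_amount "
    "FROM fact_sales_monthly ORDER BY store_key, sale_month"
).fetchall()
```

Here the aggregated table is fed from the atomic one, which is the general case described above; the same GROUP BY could equally be run against staging data when an aggregated data mart is built directly from the data-staging area.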

Your need for both types of data marts depends on your business requirements and the performance of each of these structures in your environment. However, it is important to understand that the MD architecture starts and ends with its focus primarily on the individual business unit(s) or group of business users with a specific BI requirement. This singular focus is reflected in the structure of the data, which is optimized to accommodate that unit or group of users perfectly. No two star schemas are exactly alike; each provides an optimal way of accessing data for a specific set of requirements. As unit after unit or group after group is added to the list of BI recipients, either new star schemas must be built to accommodate them specifically or the existing design must be reconstructed to expand its functionality.

The front room is the interface for the business community. We see it as roughly equivalent to the CIF's "Getting Information Out" components. It is clear that the decision support interfaces (called Access Services) and their corresponding end user access tools belong in this part of the architecture. The two types of data marts also appear in the front room as the source of data for these interfaces and tools. The basic tenet of the front room is to mask or hide the complexity going on in the back room from the business community, since it is believed by these authors that users of these components neither know nor care about the significant amount of energy, time, and resources poured into creating the back room.

It is in the front room that we begin to see personal data marts (also called "spreadmarts") popping up, as well as disposable data marts (data marts created for a specific short-lived business requirement). Care should be taken in both cases to ensure that these do not supplant or replace the real data marts; otherwise, you end up with chaos again.

The end user access tools consist of OLAP (online analytical processing) engines, reporting and querying tools, and maybe even some data-mining tools. We caution the reader here that the process of building a star schema limits the usefulness of these data marts for complete and unbiased data mining and statistical analyses, as well as for exploration analyses. If the data is stored only in star schemas, then it becomes impossible to find unrelated patterns or correlations in the raw data. Because the star contains only known relationships, patterns or correlations between unrelated data sets cannot be discovered.

The front room also contains the query management and activity-monitoring services. These are very useful in maintaining the appropriate performance for each data mart installation. Query management involves services such as query retargeting, aggregate awareness, and query governing. Activity monitoring captures information about the usage of these databases to determine if performance and user support are optimal.

There are many other services embedded in the front room that we do not list here. For the full set, please refer to the books by Ralph Kimball et al. Suffice it to say that much of what is captured in the CIF Operations and Administration Service Management function is also captured in parts of this architecture.

Because the approach is predominantly a bottom-up one, it is easy to violate the corporate or enterprise business rules when constructing the star schema. If there is no insistence that top-down design work be performed, the star schemas can easily become stovepipe implementations, lacking the ability to link together and producing inconsistent and, perhaps worse, conflicting intelligence across the enterprise. Strong and experienced multidimensional modelers, just like experienced ERD modelers, overcome this because their experience allows them to recognize the need to do so. In addition, over the years, the MD approach has been modified in attempts to overcome the shortcoming of the lack of an enterprise view, by ensuring that the various data mart star schemas 'conform' to some enterprise standards.

Conformed dimensions are one way to overcome this shortcoming. According to Kimball et al., a conformed dimension is one that means the same thing to every possible fact table to which it can be joined. Ideally, this means that a conformed dimension is identical in every star schema that uses it. Examples of these are Customer, Product, Time, and Locations dimensions.
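
A small sketch may help show why a conformed dimension matters. In the example below, two fact tables share a single Product dimension, so results from each can be lined up side by side. All table names, column names, and sample values are hypothetical; the sketch uses Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One conformed Product dimension shared by both star schemas.
    CREATE TABLE dim_product (product_key INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE fact_sales     (product_key INTEGER, units_sold    INTEGER);
    CREATE TABLE fact_inventory (product_key INTEGER, units_on_hand INTEGER);

    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO fact_sales VALUES (1, 3), (1, 4), (2, 5);
    INSERT INTO fact_inventory VALUES (1, 50), (2, 20);
""")

# Summarize each fact table separately, then combine on the shared dimension key.
rows = conn.execute("""
    SELECT p.product_name, s.units_sold, i.units_on_hand
    FROM dim_product AS p
    JOIN (SELECT product_key, SUM(units_sold) AS units_sold
          FROM fact_sales GROUP BY product_key) AS s
      ON s.product_key = p.product_key
    JOIN (SELECT product_key, SUM(units_on_hand) AS units_on_hand
          FROM fact_inventory GROUP BY product_key) AS i
      ON i.product_key = p.product_key
    ORDER BY p.product_name
""").fetchall()
```

If each star schema kept its own, differently keyed Product table, this combination would require a mapping step between the two, which is the kind of stovepipe inconsistency the conformed dimension is meant to avoid.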

Another workaround for the shortcoming was the creation of a data-staging area (not shown in Figure 1.0). In this data store, the designer consolidates all of a dimension's attributes into a single conformed dimension to be replicated to all the requesting star schemas. It is the responsibility of the design team to create, publish, maintain, and enforce the usage of these conformed dimensions throughout all data marts. Once consolidated, the conformed dimensions are permanently stored in the data-staging area. This retrofit of an enterprise standard mitigates the possible inconsistencies and discrepancies that occur in dimensions with no enterprise consideration. The data warehouse bus design concept was developed for this purpose.

The Corporate Information Factory Architecture

Figure 2.0 is a simplified version of the CIF, showing these two functions and the components and processes involved in each.

[FIGURE 2 OMITTED]

The staging area (not shown in Figure 2.0) in the CIF includes persistent tables for storing the key conversion information and other reference tables used in the data acquisition process. Replicated operational data not yet used in the warehouse may also be stored there, waiting for integration and loading into the warehouse. The staging area may or may not be separate from the data warehouse, but if it is on the same platform as the warehouse, it should be in its own database instance.

In the MD architecture, the back room is completely off-limits to the business community. Unlike the data-staging area in the back room of the MD architecture, business community access to the CIF data warehouse is discouraged, but exceptions for special exploration or one-time extraction needs are permitted. Other than security restrictions that you may want to implement, there is nothing to prevent its usage since the data is completely documented, integrated, and validated. However, the data model is complicated, and the business user must understand an ERD model and how to 'walk a relational database' in order to use it.

Comparison of the CIF and MD Architectures

Figure 3.0 is an adaptation of a slide from Laura Reeves of StarSoft (www.starsoftinc.com) comparing the CIF and MD architectures. The significant points in this figure are that access is generally not allowed above the diagonal line in both architectures, and there is no physical repository equivalent to the data warehouse in the MD architecture. The "data warehouse bus" shown for the MD architecture is the collection of the atomic and aggregated data marts. Both the CIF and MD architectures have a staging area, meta data management, and sophisticated data acquisition processing. The designs of the data marts are predominantly multidimensional for both architectures, though the CIF is not limited to just this design and can support a much broader set of data mart design techniques.

[FIGURE 3 OMITTED]

What's missing in the MD architecture is a separate physical data warehouse. The "data warehouse" in this architecture, as mentioned earlier, is virtual and consists of the collection of all the individual data marts and their corresponding data (both atomic level and aggregated levels). The closest thing to the CIF data warehouse seems to be the "data-staging area" in the MD architecture, which, in his August 1997 DBMS Magazine article "A Dimensional Modeling Manifesto," Ralph Kimball states is often designed using ERD or third normal form data models.

Now, let's look more closely at the major comparison topics for the MD and CIF architectures: scope, perspective, data flow, implementation speed and cost, volatility, flexibility, functionality, and ongoing maintenance.

Scope

BI is about discovery. CIF and MD architectures both help an enterprise satisfy its basic need for more information about both itself and the environment in which it exists. CIF and MD both assume that BI requirements will emerge from business units of an organization, as well as from the organization as a whole. To illustrate how enterprise data can differ from business unit data, consider that, for a bank, "customer" might mean an individual account holder to Finance, a household of account holders to Marketing, and a non-account-holder to Customer Service. To the enterprise, "customer" means all of these and more, and distinct terms and definitions for each type of customer may be needed. Such differences in meaning are synonymous with differences in scope. While neither of the architectures ignores enterprise scope or business unit scope, each favors one over the other. CIF places a higher priority on enterprise scope, and MD places a higher priority on business unit scope. Hence, the scope of the first few projects under the CIF architecture may be a bit larger than the scope for an MD architectural project.

Perspective

CIF proponents frequently say that the historic problem with BI implementations is that the BI source data is difficult to locate, gather, integrate, understand, and deliver. Given an enterprise scope, they emphasize the perspective of supplying enterprise data. IT is often centralized and experienced at maintaining data at the enterprise level, so IT tackles the problems of supplying BI source data from an enterprise point of view. CIF proponents favor the needs of the enterprise and advocate getting the BI source data modeled for the enterprise as a prerequisite for any BI implementation. Note, though, that this does not mean that all of the enterprise's data must be dealt with during the first project. On the contrary, a subset of the overall enterprise's data is selected, predominantly from a subject area like Customer or Product, and the data warehouse data model and resulting database are implemented for just this small part of the overall set of enterprise data.

MD proponents frequently say the same thing about the historic problem with BI implementations. Using the same words, and given their business unit scope, they emphasize the perspective of consuming business unit data. Business units that consume BI data, such as Sales or Finance, are experienced with their individual needs and views. If another business unit has different needs and views, that's okay. They just don't value other business unit needs and views as much as they do their own. MD proponents favor the needs of the business unit and advocate getting the BI source data modeled for the business unit as a prerequisite for any BI implementation. It is important to note, however, that the multidimensional modeler must strive to achieve consensus on the definition of the conformed dimensions across the enterprise. He or she concentrates only on those dimensions pertinent to the facts being loaded. Where a new fact is introduced that requires new dimensions not previously defined, the multidimensional modeler must again take an enterprise view and gain a consensus definition among those business areas that have some stake in that dimension.

Data Flow

To create a sustainable BI environment, one must understand the iterative nature of the projects and the relationship the ultimate environment has with the sources of data supplied to the enterprise. Like the chicken and egg paradox, BI questions create answers that create more BI questions (Figure 4). Even though BI source data starts and ends at the same places for CIF and MD, given these two architectures' unique scopes and perspectives, they view BI data flow differently. It's a matter of push versus pull. In general, the CIF approach is top-down. CIF suppliers of enterprise BI data use the business requirements to push the data from the operational systems to where it's needed. The focus is on integrating the enterprise data for usage in any data mart from the very first project.

[FIGURE 4 OMITTED]

By contrast, the MD approach is bottom-up. MD consumers of business unit BI data use the business requirements to pull the data from the operational systems to where it's needed. The focus is on getting business-unit-specific data quickly into the hands of the users with minimal regard for the overall enterprise usage until such a need is demonstrated.

Implementation Speed and Cost

CIF and MD both seek to minimize BI implementation time and cost. Both benefit greatly from a prototype of decision support interface functionality. The difference between the two in terms of implementation speed and cost involves long-term and short-term trade-offs.

Because of CIF's enterprise scope, the first CIF project will likely require more time and cost than the first MD project, due to increased overhead for making parts of the subject area and business data models as compatible across the enterprise as practically possible. CIF developers should be cautioned against both losing sight of the business unit requirements and trying to perfect the enterprise data model. In contrast, subsequent CIF projects tend to require less time and cost than subsequent MD projects, especially for business units that utilize existing, robust subject areas. MD developers should be reminded that each subsequent MD project might include nontrivial changes to the already implemented conformed dimensions. Expediting the requirements-gathering and implementation processes may complicate the task of providing consistent and reliable data throughout the BI environment.

The detailed data generally appears once in the CIF (though some denormalisation may occur for loading and data delivery performance reasons) and is readily available for any and all data marts, thus minimizing storage space requirements. This nonredundancy precludes storing data (except foreign keys) in multiple places. This feature of the data model also minimizes or may eliminate update or delete anomalies that could occur during cascading processes with redundant data content. These benefits are compromised in the MD architecture.

Volatility

The multidimensional model, especially for the aggregated data marts, is dependent on a determination of the possible questions that might be asked in order to eliminate or reduce the need to reconstruct the fact tables should new or changed dimensions be needed. If a change occurs in a business process (that is, the queries change), then the multidimensional model must be reshuffled or reconstructed. The multidimensional model can certainly be extended to accommodate some unexpected new data elements such as new facts (as long as they are at the same level of granularity as the rest of the fact table) and new dimensional attributes. However, at the atomic level, this can be a severe penalty. The fact tables can contain many hundreds of millions or even billions of rows, so a rebuild is not advised. Generally, a new (and mostly redundant) star schema is created when this happens.

For the CIF approach, the data warehouse data model is process-free, which removes any biases or hard-coded relationships due to process influences. The data model is dependent on the enterprise's business rules (not on what queries will be run against it) for its design. The data model is also far more forgiving of processing changes in the business environment due to a lack of processing bias. Because the model is not designed with any questions in mind, it can supply information for the ultimate data marts through the relatively trivial process of data delivery. If an established data mart requires changes or enhancements, it can be reasonably and quickly rebuilt from the detailed data stored in the data warehouse.

Flexibility

The MD architecture puts a stake in the ground in terms of the design of the entire BI environment. That stake is that all components (except the data-staging area) must be multidimensional in design. This might make sense from an academic standpoint; however, we find in practice that significant and useful technologies can be deployed without this stringent restriction. This is analogous to someone saying that all they have is a hammer and therefore everything must be a nail. If you design your environment using multidimensional designs, then all you will ever do is multidimensional analysis, and nothing more sophisticated or advanced.

The CIF architecture makes no such claim and, in fact, goes to extremes to include the possibility of many different forms of BI analyses. The data warehouse as we have described in this book can support technologies that are not multidimensional in nature. Technologies like memory-resident BI tools are certainly not multidimensional. In fact, they require no data model whatsoever. Bitmapped indexes and token databases have no need for multidimensional designs. Finally, true statistical analytical tools require flat files or data sets that are not dependent upon multidimensional designs. All are supported with no caveats, biases, or false preconditioning by the CIF data warehouse.

Complexity

Complexities tend to cause fewer problems for CIF than for MD, because the architecture starts with an enterprise-focused, complex data model and then uses it in multiple situations that are usually simpler in design. In the case of creating the multidimensional data marts from the CIF data warehouse, you pull data from a more-complex, multipurpose model into a less-complex one. The data model for the CIF data warehouse minimizes the risk of data inconsistencies because the detailed data in the data warehouse is process-free. In other words, it has not been set up for a specific set of questions, functions, or processes; rather, it is able to supply data for any question or query. For the MD approach, the multidimensional or star schema data model is easy for the business community to understand. The data model is generally less complex and resembles the way many business community members think about their data; that is, they think in terms of multiple dimensions, for example, "Give me all the sales revenues for each store, in each city and state, by market segment over the last two months." Thus, it is also easier for the IT data modelers to construct. However, given the complexity of an enterprise view of the data as you go from data mart implementation to data mart implementation, retrofitting is significantly harder to accomplish for this architecture. That is why the CIF architecture places the star schema designs in the data marts only, never in the data warehouse itself.
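
The sample business question above maps naturally onto a star schema query: one fact table joined to each dimension, filtered on time, and grouped by the requested attributes. The sketch below uses Python's built-in sqlite3 module; every table name, column name, and data value is an assumption made for illustration, and "the last two months" is represented by two fixed month values so that the example does not depend on the current date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dim_store   (store_key INTEGER PRIMARY KEY, store_name TEXT, city TEXT, state TEXT);
    CREATE TABLE dim_segment (segment_key INTEGER PRIMARY KEY, segment_name TEXT);
    CREATE TABLE dim_date    (date_key INTEGER PRIMARY KEY, year_month TEXT);
    CREATE TABLE fact_sales  (store_key INTEGER, segment_key INTEGER, date_key INTEGER, revenue REAL);

    INSERT INTO dim_store VALUES (1, 'Store A', 'Austin', 'TX'), (2, 'Store B', 'Boston', 'MA');
    INSERT INTO dim_segment VALUES (1, 'Consumer'), (2, 'Business');
    INSERT INTO dim_date VALUES (20240115, '2024-01'), (20240210, '2024-02'), (20240305, '2024-03');
    INSERT INTO fact_sales VALUES
        (1, 1, 20240115, 100.0),
        (1, 1, 20240210, 40.0),
        (1, 2, 20240305, 60.0),
        (2, 1, 20240305, 30.0);
""")

# Revenue per store (with city and state) by market segment for two chosen months.
revenue = conn.execute("""
    SELECT s.store_name, s.city, s.state, g.segment_name, SUM(f.revenue) AS revenue
    FROM fact_sales AS f
    JOIN dim_store   AS s ON s.store_key   = f.store_key
    JOIN dim_segment AS g ON g.segment_key = f.segment_key
    JOIN dim_date    AS d ON d.date_key    = f.date_key
    WHERE d.year_month IN (?, ?)
    GROUP BY s.store_name, s.city, s.state, g.segment_name
    ORDER BY s.store_name, g.segment_name
""", ("2024-02", "2024-03")).fetchall()
```

Each phrase of the question ("for each store", "by market segment", "over the last two months") corresponds to one dimension join, which is a large part of why business users find the model easy to follow.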

Functionality

The multidimensional architecture provides an ideal environment for relationally oriented multidimensional processing, ensuring good performance for complex 'slice and dice,' drill-up, drill-down, and drill-around queries. All dimensions are equivalent to each other, meaning that all queries within the bounds of the star schema are processed with roughly the same symmetry. We recommend that it be used for the majority of CIF data mart implementations. But do remember that multidimensional modeling does not easily accommodate alternate methods of analysis such as data mining and statistical analysis.
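The symmetry of dimensions can be illustrated with a small sketch in plain Python. It assumes a hypothetical, already-joined view of a star schema (the SALES rows and the aggregate helper are invented for this example, not taken from any product): drilling up or down only changes which dimension attributes are grouped on, and slicing only adds a filter, so every dimension is handled by the same code path.

```python
from collections import defaultdict

# Hypothetical fact rows with their dimension attributes already joined in.
SALES = [
    {"state": "TX", "city": "Austin", "store": "Store A", "segment": "Consumer",  "revenue": 250.0},
    {"state": "TX", "city": "Austin", "store": "Store A", "segment": "Corporate", "revenue": 200.0},
    {"state": "TX", "city": "Dallas", "store": "Store C", "segment": "Consumer",  "revenue": 75.0},
    {"state": "CO", "city": "Denver", "store": "Store B", "segment": "Consumer",  "revenue": 50.0},
]


def aggregate(rows, group_by, where=None):
    """Total revenue per combination of the group_by attributes.

    where is an optional dict of attribute -> required value (a slice or dice).
    """
    totals = defaultdict(float)
    for row in rows:
        if where and any(row[attr] != value for attr, value in where.items()):
            continue  # row is outside the requested slice
        key = tuple(row[attr] for attr in group_by)
        totals[key] += row["revenue"]
    return dict(totals)


if __name__ == "__main__":
    print(aggregate(SALES, ("state",)))                                  # drill-up
    print(aggregate(SALES, ("state", "city", "store")))                  # drill-down
    print(aggregate(SALES, ("state",), where={"segment": "Consumer"}))  # slice
```

Nothing in the helper favors one dimension over another, which is the property the star schema is designed to give the query engine. It also hints at the limitation noted above: the structure is built for grouping and filtering, not for analyses that need the data in some other shape.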

The CIF uses a data model that is based on an ERD methodology that supports the business rules of the enterprise. This type of model is also easily enhanced or appended if need be. Attributes are placed in the data model based on their inherent properties rather than specific application requirements. This is an important differentiator in the BI world because it means that the data warehouse is positioned to support any and all forms of strategic data analyses, not just multidimensional ones. Data mining, statistical analysis, and ad hoc or exploration functionalities are supported as well as the multidimensional ones.
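As a rough illustration of this positioning, the sketch below builds a tiny normalized warehouse in SQLite and then feeds two different consumers from it: a star-schema data mart and a detail-level extract for statistical work. All table and column names (state, city, store, sale, dim_store, fact_sales) and the sample data are assumptions made for the example; it is not a prescribed CIF design.

```python
import sqlite3
import statistics


def build_warehouse(conn):
    """Normalized, hypothetical warehouse tables: each attribute lives with its own entity."""
    conn.executescript("""
        CREATE TABLE state (state_id INTEGER PRIMARY KEY, state_code TEXT);
        CREATE TABLE city  (city_id INTEGER PRIMARY KEY, city_name TEXT,
                            state_id INTEGER REFERENCES state(state_id));
        CREATE TABLE store (store_id INTEGER PRIMARY KEY, store_name TEXT,
                            city_id INTEGER REFERENCES city(city_id));
        CREATE TABLE sale  (sale_id INTEGER PRIMARY KEY,
                            store_id INTEGER REFERENCES store(store_id),
                            sale_date TEXT, amount REAL);
    """)
    conn.executemany("INSERT INTO state VALUES (?, ?)", [(1, "TX"), (2, "CO")])
    conn.executemany("INSERT INTO city VALUES (?, ?, ?)",
                     [(1, "Austin", 1), (2, "Denver", 2)])
    conn.executemany("INSERT INTO store VALUES (?, ?, ?)",
                     [(1, "Store A", 1), (2, "Store B", 2)])
    conn.executemany("INSERT INTO sale VALUES (?, ?, ?, ?)", [
        (1, 1, "2004-04-03", 100.0),
        (2, 1, "2004-04-20", 150.0),
        (3, 1, "2004-05-02", 200.0),
        (4, 2, "2004-05-10", 50.0),
    ])


def build_sales_mart(conn):
    """Derive a simpler star-schema mart from the normalized tables."""
    conn.executescript("""
        -- Flatten the store -> city -> state chain into one dimension table.
        CREATE TABLE dim_store AS
            SELECT st.store_id AS store_key, st.store_name,
                   c.city_name AS city, s.state_code AS state
            FROM store st
            JOIN city  c ON c.city_id  = st.city_id
            JOIN state s ON s.state_id = c.state_id;
        -- Summarize detailed sales to one row per store per month.
        CREATE TABLE fact_sales AS
            SELECT store_id AS store_key,
                   substr(sale_date, 1, 7) AS month,
                   SUM(amount) AS revenue
            FROM sale
            GROUP BY store_id, substr(sale_date, 1, 7);
    """)


def sale_amounts(conn):
    """Detail-level extract, e.g. as input to a statistical analysis."""
    return [row[0] for row in conn.execute("SELECT amount FROM sale ORDER BY sale_id")]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_warehouse(conn)
    build_sales_mart(conn)
    print(conn.execute("SELECT * FROM fact_sales ORDER BY store_key, month").fetchall())
    print(statistics.mean(sale_amounts(conn)))
```

The mart is a summarized, question-specific derivative, while the warehouse keeps the transaction-level detail that the statistical extract needs. Going in the other direction, recovering individual sales from the monthly fact table, would not be possible.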

Ongoing Maintenance

There is an old adage: "Pay me now or pay me later." For this final discussion, that adage should be expanded to include: "But it will cost you a lot more if you pay me later." By now, you realize that the whole purpose behind the CIF is to stop the high costs of later constructions, adjustments, retrofits, and sub-optimal accommodations to your BI environment. It may cost you a bit more up front, in terms of making the effort to capture an enterprise view of your company's data for your first or second BI implementation. However, BI environments build upon past iterations and will take years to complete, if they are ever finished. Just as a sound foundation for a house takes forethought and is absolutely necessary for the longevity of the structure, regardless of the changes that occur to it over the years, a well-designed data warehouse data model will serve your enterprise for the long haul. With each iteration, the CIF as your foundation will yield tremendous paybacks in terms of:

* The enhancement of existing marts

* The maintenance and sustenance of the data warehouse and related data marts

* The overall satisfaction for all your business community members, including those focused on multidimensional analyses

Summary

In this feature we described the Multidimensional (MD) and the Corporate Information Factory (CIF) architectures in terms of their approach to the construction of the BI environment. The MD architectural approach subordinates data management to business requirements because its reason for being is to satisfy a business unit within the enterprise. On the other hand, the CIF architectural approach subordinates individual business requirements to data management because its reason for being is to serve the entire enterprise. The similarities and differences between these two approaches stem from these fundamental differences.

As stated earlier, we find that a combination of the data-modeling techniques found in the two architectural approaches works best: ERD or normalization techniques for the data warehouse and the star schema data model for multidimensional data marts. This is the ultimate goal of the CIF: it takes the strengths of one form of data modeling and combines them seamlessly with the strengths of the other. In other words, a CIF with only a data warehouse and no multidimensional marts is fairly useless, and a multidimensional data-mart-only environment risks lacking enterprise integration and support for other forms of BI analyses.

From: Mastering Data Warehouse Design Wiley
COPYRIGHT 2004 A.P. Publications Ltd.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2004, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Database And Network Intelligence
Author:Geiger, Jonathan G
Publication:Database and Network Journal
Geographic Code:1USA
Date:Jun 1, 2004
Words:4253