
Business intelligence integration: extending the information net. (Storage Networking).

Business intelligence may mean different things to different people. Consequently, it is useful to have a common working definition of this sometimes-misunderstood concept. Business intelligence (BI) includes software applications, technologies and analytical methodologies that perform data analysis. BI (also known as decision support systems) includes data mining, Web mining, text mining, reporting and querying, OLAP, and data visualization.

Taking BI to the next level is the new concept of business intelligence integration (BII). Perhaps the best way to illustrate BII is to compare it to commercial fishing. Think of a huge fishing net as your company's data warehouse. It holds a rich catch of information related to your business. But other fish escape the net's grasp: bits and pieces of valuable information residing outside of the warehouse that are critical to your company's success. These ad-hoc data sources include information contained in ASCII files, Excel spreadsheets, Access databases, legacy systems (that didn't quite make the data warehouse list), departmental SQL servers and dozens of other disparate data sources, including "finished" reports.

Five Classes of Ad-Hoc Data Sources

One-Time Data: Information that is too small or "out of place" for the data warehouse -- it is a minute amount of data, such as a list, that is used for a specific application, typically an Excel spreadsheet.

Report Results: Quite often the result of an application--a report, for instance--is an ad-hoc data source. The report is interesting, but there is a need to incorporate more information into it. With these reports it is not always possible or desirable to change the sourcing application. The report itself then becomes an ad-hoc data source. We sometimes refer to this as the "morphing report," and it typically occurs when the analysis is iterative, dynamic and complicated, or when passing intermediate results from one analyst to another.

Pending Warehouse Data: Information that will be part of the data warehouse, but the planning for its inclusion has not been completed, or the financial resources are not immediately available.

Legacy System Data: Information that resides in a legacy system and is serving a useful purpose but will never be part of the data warehouse, because the cost or effort to put the information into the warehouse is simply not justifiable based on its perceived value.

External Data: The tremendous amount of data that we exchange with our customers and vendors is representative of the data explosion that we have heard so much about. These ad-hoc data sources rarely get inserted into the data warehouse.

Certainly, there are business intelligence tools that--with varying degrees of difficulty, time, expense and resources--are capable of accessing these ad-hoc data sources and potentially integrating them into the information residing in the data warehouse. The question is how quickly, how effectively and at what cost these tools can perform this function. The answer is, frankly, not very quickly, not very effectively and often at significant cost.

This problem becomes magnified when you consider the rate at which these ad-hoc data sources are appearing, an outgrowth of the data explosion. When we think of the data explosion, we typically think of our databases growing as rows or tables are added. But in reality the data explosion means the ad-hoc data explosion. We need to pay more attention to this "new" kind of information age and to have tools available to deal with it.

As data warehouses were created, companies integrated their pertinent data into these warehouses. However, the number of data sources began to increase, and it became impractical, as well as expensive, to integrate all of the data from these ad-hoc sources into the data warehouse. Perhaps the information from disparate sources will eventually reach the point where it is practical to enter it into the data warehouse. The problem is that the cost and effort of entering every piece of information into the warehouse yield diminishing returns. It's clear that companies should, and will, take the time to enter the majority of their data into their warehouse. It's another matter to expect that every data source will be incorporated in a timely manner as new data emerges.

Enter BII

When you have access to and can incorporate and combine all of your data, then you have truly arrived at a comprehensive solution--BII. The theory behind BII is to recognize that there is a problem and address it with a new class of tools, focusing the technology of business intelligence on incorporating ad-hoc data sources as seamlessly and effectively as possible.

If we look historically at the development of computer systems, we can draw correlations between the past and BII. We see that at first a concept or methodology was developed. That was shortly followed by a complicated coding mechanism that only engineers could understand or use. From there we generated tools to help with the coding and finally we developed visual technologies to solve the problem.

People have been involved in BII for a number of years now, but it was always performed by technicians. BII has now evolved to a point where everyday users can integrate this data using visual objects.

The ability of BII to incorporate data from ad-hoc data sources is nowhere more evident than in mergers and acquisitions. A company that acquires another organization may want to analyze a particular product line from the acquired company, integrating that company's legacy data with its own data warehouse to determine marketability and potential market share. Obviously, the information would reside in a separate data source, yet it must be integrated with the acquiring company's market information to produce a meaningful report. BII can accomplish this task almost effortlessly.

Up until now, the focus has always been on producing the quickest query response time. While this is important, what we really need to focus on is how to produce the "quickest time to answer" for the specific business problem. That is, can you take the information from disparate ad-hoc data sources, query it, cleanse it, manipulate it, and produce a report that addresses the specific question being asked? In other words, how fast is your quickest time to answer (QTA) for these specific business problems?

That's not to say that building a data warehouse is a wasted exercise: quite the contrary. A data warehouse is necessary for corporate survival in today's competitive environment. The problem is, it's just not practical or cost-effective to incorporate every single piece of data into the data warehouse. The key is to have both a data warehouse and a solid BII strategy to deliver all of the information in your company to the people who need it the most.

The truth is that every company has ad-hoc data sources. The question is how to integrate these sources with the information in your warehouse. Many of us will not even admit that this data integration problem exists, although it is our job to provide access to these data nuggets for strategic business decisions.

So the question is, how do you deliver a comprehensive ad-hoc data solution for your end users knowing that some of the data will always be missing from the data warehouse? Your BII solution must be able to do the following:

* Query and access data from the warehouse.

* Tie in or define ad-hoc data sources quickly and easily: once again, things like spreadsheet data, ASCII files, Web data, external databases or even results from your existing BI tools.

* Access these data sources with very little configuration or effort, with little or no code to write, preferably in a visual or graphical representation.

* Have the ability to clean, transform and sort data. Let's face it: one of the reasons why these data nuggets are not in the warehouse is because the data is dirty; therefore, cleansing and sorting capabilities must exist.

* Create local joins at the application level. If you're able to query these disparate data sources, you then must be able to combine them. And you must be able to combine them without writing complicated code.

* Create consolidated reports against these multiple data sources.

* Deliver the reports in multiple ways to multiple audiences, through the pervasive network of the future.
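As a rough illustration of the requirements above, the following Python sketch queries a warehouse table, cleans a small ad-hoc source, joins the two at the application level, and builds one consolidated report. Everything named here is hypothetical and not from any particular BII product: the in-memory SQLite database stands in for the warehouse, and the `sales` table, the CSV contents, and the functions `load_adhoc_prices` and `consolidated_report` are assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical stand-in for the data warehouse: an in-memory SQLite table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (product_id TEXT, units INTEGER)")
warehouse.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("P1", 120), ("P2", 75), ("P3", 40)],
)

# Hypothetical ad-hoc source: a small, slightly dirty CSV export.
ADHOC_CSV = """product_id,list_price
 p1 ,9.99
P2,14.50
P4,
"""


def load_adhoc_prices(text):
    """Clean the ad-hoc rows: trim and upper-case keys, skip rows with no price."""
    prices = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = row["product_id"].strip().upper()
        price = row["list_price"].strip()
        if key and price:
            prices[key] = float(price)
    return prices


def consolidated_report(conn, prices):
    """Join warehouse rows with ad-hoc prices locally, in application code."""
    report = []
    query = "SELECT product_id, units FROM sales ORDER BY product_id"
    for product_id, units in conn.execute(query):
        price = prices.get(product_id)  # None when the ad-hoc source has no match
        revenue = round(units * price, 2) if price is not None else None
        report.append((product_id, units, price, revenue))
    return report


report = consolidated_report(warehouse, load_adhoc_prices(ADHOC_CSV))
for line in report:
    print(line)
```

The join happens in application code rather than in the warehouse, so the ad-hoc source never has to be loaded into the warehouse. A visual BII tool would presumably hide these steps behind graphical objects, but the underlying operations are the same.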

Future of BII

Where is BII going in the future? While the formal concept is still relatively new, the future direction of BII is towards automation and proactivity. The ideal scenario is that the information a person wants is delivered automatically, before the person even asks for it. What's more, it can be delivered through the device of choice, such as a pager, email, cell phone, or customized Web portal. The idea is that these BII applications will proactively investigate business anomalies, flag them, and report them in much the same way a person is automatically paged when their stock price increases.
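One way to picture this kind of proactive flagging is the minimal Python sketch below. The article does not say how anomalies would be detected, so the rule used here is purely an assumption for illustration: a metric is flagged when its latest value lies more than a chosen number of standard deviations from its historical mean. The metric names, the data, and the function `flag_anomalies` are all hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(history, latest, threshold=2.0):
    """Return the names of metrics whose latest value deviates from the
    historical mean by more than `threshold` sample standard deviations.

    The deviation rule is an assumed illustration, not a method from the article.
    """
    flagged = []
    for name, values in history.items():
        if name not in latest or len(values) < 2:
            continue  # not enough information to judge this metric
        spread = stdev(values)
        if spread == 0:
            continue  # a constant history gives no scale for comparison
        deviation = abs(latest[name] - mean(values)) / spread
        if deviation > threshold:
            flagged.append(name)
    return flagged


# Hypothetical business metrics and their most recent readings.
history = {
    "daily_orders": [100, 104, 98, 101, 97],
    "returns": [5, 6, 4, 5, 5],
}
latest = {"daily_orders": 160, "returns": 5}

alerts = flag_anomalies(history, latest)
for metric in alerts:
    # A real system would hand this off to email, a pager, or a Web portal.
    print(f"ALERT: {metric} is outside its usual range")
```

Delivery to the device of choice is left as a print statement here, since the notification channel depends entirely on the environment.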

It's clear that the concept of BII, when applied correctly, can provide answers to business problems that can have an extremely positive impact on a company's bottom line today. The question is which companies will be among the first to embrace it and realize the benefits before the others?

Jim Kanzler is president and CEO of MetaS (Morgan Hill, Calif.)
COPYRIGHT 2002 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2002, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Author:Kanzler, Jim
Publication:Computer Technology Review
Geographic Code:1USA
Date:Dec 1, 2002