The distinguished businessman Jack Welch once said: 'An organisation's ability to learn, and translate that learning into action rapidly, is the ultimate competitive business advantage.' This is probably why most businesses today spend time and money researching their customers, replacing the 'average white male, employed, between the ages of 20 and 25' model they have targeted until now with a much more detailed profile. They have gathered an extensive amount of information on their customers by storing data related to purchases, e.g. the time, location and nature of a sale, or the response to a particular promotion, so extracting value from this information should be a fairly straightforward and fast process. Well, it might be straightforward (in certain cases), but it is not necessarily fast.
Despite obvious differences in size, weight and look, today's computers still rely on what is fundamentally the same technology their predecessors used 30 years ago, namely the same mathematical processes, albeit with much better performance, portability, and disk and memory capacities. So while today's machines are lighter, faster and smaller, they need to be 'reinvented' if they are to be truly different. This is because major breakthroughs in technology tend to happen when a radically different approach is taken. For example, it is widely expected that quantum computers, which tackle information from an entirely new angle, i.e. at the atomic level, will make a massive difference to the way we analyse data, because they should be able to perform certain tasks that are impractical for today's computers.
Database technology has recently undergone a similarly radical change: while for many years it relied on indexes to categorise and then find information, today a new breed of solutions is taking hold. These are Relational Database Management Systems (RDBMS) that offer very high levels of performance without the need for indexing or pre-partitioning of the data. The advantage? Users enjoy flexible and unconstrained access to the data, as well as significantly lower system set-up and maintenance costs.
Let us take the following example: you have an extensive clothing catalogue with 2,000 pages of items, including men's trousers, ladies' handbags and children's shoes among others, and you want to find all men's red shirts with white buttons and contrasting cuffs. With an index-based database, the system would look through the data first to find men's red shirts, then again to find those with white buttons, and finally to select those with contrasting cuffs. If, on the other hand, you have an RDBMS that does away with indexes, the system will read the entire catalogue once and provide you with the answer. You might think that reading the entire catalogue is a waste of time, but this is where the biggest change was made: these systems are so fast that they can scan every item in less time than the first database takes to sort through the information using indexes. Sophisticated algorithms and extensive use of memory-based processing allow these databases to scan each individual entry (or row) for a match to the query being processed so quickly that there is no need to build an index. In addition, the elimination of indexes dramatically reduces overall storage requirements; typically 60%-80% of a traditional database implementation will be index storage, none of which is required in the new systems. And finally, the absence of indexes also reduces overall load times, as the index-build phase is eliminated.
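The single-pass idea in the catalogue example can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of how any particular product works internally: the `Item` fields, the `full_scan` helper and the five sample rows are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    # Hypothetical catalogue row; the field names are assumptions for this sketch.
    category: str
    colour: str
    buttons: str
    contrasting_cuffs: bool


def full_scan(catalogue, predicate):
    """Read every row exactly once, keeping the rows that match the predicate."""
    return [item for item in catalogue if predicate(item)]


def wanted(item):
    # All three conditions are tested together on each row, in a single pass.
    return (
        item.category == "mens shirt"
        and item.colour == "red"
        and item.buttons == "white"
        and item.contrasting_cuffs
    )


# Made-up sample data standing in for the 2,000-page catalogue.
catalogue = [
    Item("mens shirt", "red", "white", True),
    Item("mens shirt", "red", "white", False),
    Item("mens shirt", "blue", "white", True),
    Item("ladies handbag", "red", "none", False),
    Item("childrens shoes", "black", "none", False),
]

matches = full_scan(catalogue, wanted)
print(len(matches), "matching item(s)")
```

No auxiliary structure is built or stored here, which is the point of the example: the cost is one read of every row, and any combination of conditions can be asked without preparing for it in advance.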
But data is not static; it tends to grow: more products are introduced, seasonal promotions are kicked off, multi-vendor bundles are launched. So what happens when the volume of data to sieve through grows? Scanning capacity can grow with it. This is because this type of RDBMS relies on massively parallel processing (MPP), which allows the system to be scaled in direct proportion to the amount of information stored. So, for example, a one-blade system will scan 100 million rows per second, a ten-blade system will scan one billion rows per second, and so on. This high degree of query scalability means that, as data volumes increase, query performance can remain constant provided blades are added in proportion. This is another similarity between this new breed of databases and quantum computers. As Professor Artur Ekert of the University of Oxford said: "It [quantum computing] is like massively parallel processing but in one piece of hardware." And the data analysis challenge is so great that we find another connection between these high-performing databases and quantum computers, namely that one of the key uses for the latter is going to be the search of vast databases.
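The blade figures quoted above lend themselves to a small worked calculation. The sketch below assumes perfectly linear scaling at the quoted 100 million rows per second per blade; the function name `scan_time_seconds` and the table sizes are hypothetical, and real systems will not scale perfectly.

```python
ROWS_PER_BLADE_PER_SECOND = 100_000_000  # figure quoted in the article


def scan_time_seconds(rows, blades, rate=ROWS_PER_BLADE_PER_SECOND):
    """Idealised time to scan `rows` rows, assuming each blade scans `rate` rows/s."""
    if blades < 1:
        raise ValueError("at least one blade is required")
    return rows / (blades * rate)


# Hypothetical table sizes, chosen only to make the arithmetic easy to follow.
small_table = 1_000_000_000    # one billion rows
large_table = 10_000_000_000   # ten billion rows

print(scan_time_seconds(small_table, blades=1))   # one blade, one billion rows
print(scan_time_seconds(large_table, blades=1))   # same blade, ten times the data
print(scan_time_seconds(large_table, blades=10))  # ten times the data, ten blades
```

Under these assumptions, ten times the data on ten times the blades takes the same time as the original table on one blade, which is the sense in which performance remains constant as volumes grow.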
So if 1,600 internet users took eight months and an astonishing amount of computing power to factor RSA-129 (a 129-digit number) into two primes, while a quantum computer could crack it in a few seconds, how much faster is the new variety of databases compared to the traditional index-based ones? Recent comparisons of our WX2 analytical database solution against Oracle database implementations, carried out by end users, produced query execution times between 10 and 60 times faster. And there is a bonus: these benchmarks were carried out using hardware platforms costing 50%-70% less than the original Oracle platform. WX2 was able to satisfy 2,000 queries per day as opposed to 60 or 70.
The results clearly speak for themselves. And while the elimination of indexes cannot yet compete with the astonishing gap in performance we could one day have between a classic supercomputer (billions of years) and a quantum computer (one year), the feedback is certainly very positive: "I got my answer back in six seconds. My jaw hit the ground!" said Jim Lewis, senior research associate at the Cambridge Astronomical Survey Unit, Cambridge University Institute of Astronomy.
Roger Llewellyn, Kognitio.
Title Annotation: DATABASE AND NETWORK INTELLIGENCE: OPINION PIECE
Publication: Database and Network Journal
Date: Apr 1, 2008