The Software RAID Revolution Has Begun!
The irony is that in today's world of super-fast CPUs and bus architectures and ever-falling memory prices, the dominance of hardware RAID makes virtually no sense. Today's CPUs are literally thousands of times faster than those of 15-20 years ago; memory is cheap and abundant; new, fast communication protocols like Fibre Channel are maturing; and, by today's standards, the processing overhead incurred by RAID tasks is negligible. Examined in this light, hardware-based RAID becomes redundant and altogether unnecessary. Not only is host-based RAID cheaper, faster, more flexible, and easier to implement and maintain, it is the only paradigm able to fully leverage newer technologies like Fibre Channel and Storage Area Networks (SANs)--technologies which imply an openness and connectivity that even the best hardware RAID solutions cannot provide. Once the facts and figures are considered, it becomes obvious that high-performance, host-based RAID is the best and most logical technology to usher in the new age of enterprise-level storage systems.
Host-Based RAID: Not What It Used To Be
Due to its initial shortcomings and the ensuing dominance of hardware RAID in the marketplace, host-based RAID has long been viewed as an inferior, low-end solution, unsuitable for enterprise-level storage demands. While this stereotype may have held some truth in the past, it is simply not valid today. Unlike the feeble SCSI-based RAID software of yesteryear, the new generation of software RAID applications is being specifically designed and developed to take advantage of Fibre Channel and today's multi-processor 64-bit computing architectures. Accordingly, these applications offer a level of performance unrivaled by even the most elaborate hardware RAID implementations. Key advantages include:
* Cost and Longevity - On average, hardware-based RAID solutions cost four to seven times as much as host-based systems of similar capability. The reason for this huge difference is simple: RAID controllers are expensive pieces of hardware and, like all other computing devices, they are vulnerable to depreciation, obsolescence, and failure. As the march of technology gradually renders them obsolete, old RAID controllers (those purchased more than, say, three years ago) are either kept stubbornly in place or thrown away and replaced. Neither choice is cost-effective. In stark contrast, software RAID systems can never be rendered obsolete by advances in hardware technology; in fact, the performance of software systems only increases as the underlying hardware improves.
* Speed - In a typical hardware RAID configuration, each controller is assigned to a finite array of disk drives and is responsible for handling all I/O traffic between those drives and the host machine. These controllers use intermediate memory for caching purposes, which means that data going to or from the disks has to be copied twice--once from the source device to the controller's memory and again from the controller's memory to the target device. Host-based RAID applications interface directly with the disks and, thus, enjoy a significant built-in speed advantage (See Fig).
If the RAID software is optimized for Fibre Channel, this advantage becomes even more pronounced. While a single fibre-connected, hardware-controlled RAID subsystem is limited to a maximum transfer rate of 100MB/s (200MB/s if using the latest 2Gbps fibre), host-based RAID software is not handicapped by such limitations. Its bandwidth is limited only by the number of disk drives and the number of Host Bus Adapters (HBAs) that are plugged into the host. For a single high-performance server like a Silicon Graphics Origin2000 (which can accommodate dozens of HBAs), this can equate to sustained transfer rates of gigabytes per second and tens of thousands of I/Os per second. If one continues adding hosts, HBAs, and disk drives into the mix, these performance figures can, in theory, be increased without limit. Traditional hardware RAID systems cannot--and will not--ever approach the speed or throughput of an optimized host-based RAID system.
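The scaling argument above can be made concrete with a rough back-of-the-envelope sketch. It assumes the roughly 100MB/s per-link figure quoted in the text for 1Gbps Fibre Channel; the per-disk rate of 20MB/s and the function name are purely illustrative assumptions, not measurements or part of any product.

```python
# Approximate payload rate of one 1Gbps Fibre Channel link, in MB/s (figure quoted in the text).
FC_LINK_MB_PER_S = 100

def aggregate_bandwidth_mb_s(num_hbas, num_disks, disk_mb_s, link_mb_s=FC_LINK_MB_PER_S):
    """Upper bound on host throughput: the smaller of what the HBAs
    can carry and what the disks can deliver (hypothetical helper)."""
    hba_limit = num_hbas * link_mb_s
    disk_limit = num_disks * disk_mb_s
    return min(hba_limit, disk_limit)

# disk_mb_s=20 is an assumed sustained per-disk rate, chosen only for illustration.
single_link = aggregate_bandwidth_mb_s(num_hbas=1, num_disks=8, disk_mb_s=20)
many_links = aggregate_bandwidth_mb_s(num_hbas=24, num_disks=240, disk_mb_s=20)

print(single_link)  # 100  -> one link caps throughput regardless of disk count
print(many_links)   # 2400 -> two dozen HBAs reach the gigabytes-per-second range
```

Under these assumptions a single link is the bottleneck, while adding HBAs and disks together raises the ceiling roughly linearly. Real systems would also be bounded by bus and memory bandwidth, which this sketch ignores.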
* Dynamic Growth/Shrinkage of the File System - Individual disks can't easily be added or removed as storage needs change because each controller in a hardware RAID system is "assigned" to a finite array of disks. Expanding a hardware RAID system today usually involves taking the system offline, backing the RAID's data onto tape, adding the new drives to the volume's configuration files, then copying the RAID's data back again. With the cutting-edge software RAID applications that are now coming to market, that same process can be accomplished on the fly, without ever shutting down the system or interrupting a single byte of traffic. The steps would literally consist of: 1) put the additional drives in place, and 2) click the Add button on the software's interface. If available drives already existed somewhere else on the Fibre Channel network (and the host can "see" and target those drives through a switch), the process would be even easier.
* Dynamic/Automatic Adjustment of Stripe Depth - The performance of a given RAID depends very much on how well it is "tuned" to the type of I/O traffic being passed to it. For example, a RAID designed to service requests for very large blocks of data (such as those in streaming video applications) should be configured differently than a RAID designed to service many simultaneous requests for small blocks of data (such as those generated during transaction processing).
This tuning is accomplished by adjusting the RAID's "stripe depth," which is simply how much data gets written to each individual disk member during a typical write operation. As a simple example, suppose there is a 6MB file stored on a RAID with a stripe depth of 2MB. If a host were to request that file, it would be returned in 2MB chunks, therefore invoking only three of the RAID's disk members. If that same file were stored on a RAID with a much smaller stripe depth, it could be returned using all of the RAID's disk members simultaneously, thereby increasing the transfer rate substantially. The opposite holds for very small files. In that case, it would be desirable to have a relatively large stripe depth, so that each request falls on a single disk member and many incoming requests stand a good chance of being satisfied simultaneously by different members.
In traditional hardware RAID subsystems, the stripe depth is configured initially and cannot be modified to accommodate future changes in I/O traffic characteristics. Fortunately, the new generation of software RAID applications does accommodate such changes, by allowing administrators to adjust the stripe depth of any RAID volume at will, right on their screen. In fact, the newest and very best RAID applications even employ "smart" algorithms, which intelligently analyze I/O traffic and automatically adjust stripe depth (and many other important parameters) to the optimum level.
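The 6MB stripe-depth example above can be checked with a few lines of arithmetic. This is only a minimal sketch: it assumes simple striping with no parity, requests that start on a stripe boundary, and an array of eight disk members (the article does not give a member count). The function name is hypothetical.

```python
def disks_touched(request_kb, stripe_depth_kb, num_disks):
    """Number of disk members involved in one aligned request
    under simple striping (hypothetical helper, no parity modeled)."""
    chunks = -(-request_kb // stripe_depth_kb)  # ceiling division
    return min(chunks, num_disks)

NUM_DISKS = 8  # assumed array size, for illustration only

# Large file: 6MB request.
large_deep = disks_touched(6 * 1024, 2 * 1024, NUM_DISKS)   # 2MB stripe depth
large_shallow = disks_touched(6 * 1024, 768, NUM_DISKS)     # 768KB stripe depth

# Small file: 64KB request, as in transaction processing.
small_deep = disks_touched(64, 2 * 1024, NUM_DISKS)         # 2MB stripe depth
small_shallow = disks_touched(64, 16, NUM_DISKS)            # 16KB stripe depth

print(large_deep, large_shallow)  # 3 8
print(small_deep, small_shallow)  # 1 4
```

With the deep stripe, the large file uses only three members, while the shallow stripe spreads it across all eight. For the small request the preference reverses: a deep stripe ties up one member per request, leaving the others free to serve other requests at the same time.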
* Comprehensive Remote Management - Implementing and maintaining hardware-controlled RAIDs is a very cumbersome, labor-intensive process. Installation, monitoring, and configuration are usually done either from a host to which the RAID is directly attached, through a Java or HTML browser-based menu system, or (as is the case most of the time) on the front panel of the RAID enclosure itself. In a large, sprawling enterprise environment (where it is not feasible for a single person to handle such a task), this translates into additional personnel and, therefore, additional operating costs. It also means that it is very difficult to ever get a meaningful, top-down view of the enterprise's resources, especially in heterogeneous, mixed-vendor storage environments.
Thanks to Fibre Channel, today's high-performance RAID management applications make it possible for a single administrator on a single workstation to view, monitor, change and configure any storage resource on the network--all through a comfortable graphical user interface. This management ability extends all the way to the individual disk drives and disk drive enclosure systems themselves, allowing administrators to collect detailed information on array performance that is normally not even available within a hardware RAID system (all hardware RAID controllers intentionally use proprietary methods for communicating with their drives, which makes it impossible for an outside host to query or even "see" an individual drive within the array; host-based RAIDs are obviously not restricted in this way).
The Irrefutable Logic Of A Host-Based RAID System
There is no question that Fibre Channel technology represents the inevitable future of high-demand, enterprise-level storage environments. It offers an echelon of speed, connectivity, and flexibility with which SCSI-based architectures simply cannot compete. What has been absent from the landscape thus far are comprehensive storage management applications that allow the true potential of the medium to be realized.
Fortunately, as software developers have begun to revisit host-based RAID within the context of new technology, such applications are finally being made available. With their superior flexibility, superior speed, and superior features--not to mention a price tag that is one-fourth to one-seventh the size of comparable hardware RAID systems--today's host-based RAID systems are poised to radically alter the way we think about and work with large-scale storage applications. Hardware RAID manufacturers and loyalists take note: the software revolution has begun.
Bret Cox is the president of Radiant Software, Inc. (Santa Monica, CA). Radiant is a wholly owned subsidiary of AT&T Corporation.
Title Annotation: Technology Information
Publication: Computer Technology Review
Date: Apr 1, 2000