Interface considerations for tiered storage.
Tiered Storage Fundamentals
The traditional storage model acknowledges only two types of data storage: online and offline. Online storage requires high availability and usually holds transactional data that demands the performance and reliability of enterprise-class disc drives; offline storage holds archival data that is infrequently accessed and kept in libraries of high-capacity tapes.
But what about the vast quantity of data that falls between these two categories? The following applications all involve data that does not necessarily need highly available, mission-critical storage, but that must still be readily accessible to multiple users:
* File serving
* Fixed-content data
* Disc-to-disc backup
* Bulk storage
* Short-term archiving
In addition, regulations such as Sarbanes-Oxley and HIPAA require rapid retrieval of enormous quantities of financial and medical data. An Enterprise Strategy Group study concluded that such "in-between" information will soon make up the majority of enterprise data.
The previous options for this plethora of data were inefficient. Online storage (optimized for performance, not capacity) carries a relatively high cost-per-GB and wastes costly hardware on such data, while offline storage (optimized for capacity, not accessibility) entails slow, labor-intensive data retrievals that overtax IT staff resources.
Tiered storage eliminates such inefficiency by introducing a third tier of nearline storage. Melding high capacity, low cost-per-GB and easy integration into existing enterprise infrastructures, nearline storage fills the gap between performance and archival solutions, enabling comprehensive tiered storage strategies that ensure optimal cost/performance for every type of data.
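To make the three-tier idea concrete, here is a minimal Python sketch of a placement rule. The `Dataset` fields, the `choose_tier` function and the access threshold are all hypothetical illustrations, not part of any product or standard; a real policy would weigh many more factors.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    name: str
    transactional: bool       # needs online-class performance and availability
    accesses_per_month: int   # rough access frequency (hypothetical metric)


def choose_tier(ds: Dataset, nearline_min_accesses: int = 1) -> str:
    """Pick a storage tier for a dataset.

    Assumed rule of thumb: transactional data goes online, data that is
    still accessed at least `nearline_min_accesses` times a month goes
    nearline, and everything else goes to offline (tape) archive.
    """
    if ds.transactional:
        return "online"
    if ds.accesses_per_month >= nearline_min_accesses:
        return "nearline"
    return "offline"


if __name__ == "__main__":
    samples = [
        Dataset("order-processing", transactional=True, accesses_per_month=100000),
        Dataset("disc-to-disc backup", transactional=False, accesses_per_month=30),
        Dataset("ten-year-old records", transactional=False, accesses_per_month=0),
    ]
    for ds in samples:
        print(f"{ds.name}: {choose_tier(ds)}")
```

Under these assumptions the backup set lands on the nearline tier rather than consuming online capacity or waiting on tape retrieval.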
Today, offline storage is predominantly served by tape-based solutions. The balance of this article focuses on interface considerations in online and nearline storage tiers; the offline storage tier is more appropriately examined in a separate analysis.
Fibre Channel (FC)
With the advent of the Fibre Channel storage area network (SAN), online data accessibility skyrocketed--and along with it network efficiency. To be sure, such efficiency requires significant investment in FC infrastructure itself, as well as sizable quantities of money and man-hours to optimize SCSI-based storage management and application software solutions.
To capitalize on such investments, FC tiered storage solutions must efficiently leverage the FC interface. Native FC nearline storage devices make full use of the FC protocol, enabling tighter integration of online, nearline and offline storage tiers. Such interface rationalization streamlines FC SAN infrastructures, reducing costs, easing management and improving performance.
However, some enterprises have turned to nearline-ready SATA disc drives for nearline duty in their FC SANs. Nearline SATA disc drives cost slightly less than nearline FC disc drives, but are more costly to integrate and may compromise system performance and reliability. There are three fundamental ways to add SATA-based nearline storage to an FC SAN:
* Deploy a separate SATA infrastructure
* Install FC-to-SATA conversion bridges
* Utilize the upcoming FC-SATA tunneling protocol
None of these approaches exploits the advanced capabilities the FC protocol provides. For example, the bridged system loses Fibre Channel's error recovery capabilities, advanced multi-host command queuing features and the ability for a single host to simultaneously communicate with multiple drives. (To enable one host to concurrently talk with multiple SATA drives, the bridged system would require the additional expense of a multichannel HBA.)
Native FC nearline storage extracts maximum value from a Fibre Channel SAN by taking complete advantage of the FC protocol. This in turn enables FC tiered storage solutions to deliver optimal efficiency and cost-effectiveness. That said, SATA-based storage solutions still have a useful role to play in applications where minimal cost-per-GB is paramount.
Serial ATA (SATA)
In understanding SATA's role in tiered storage, it's important to differentiate between SATA as a system interface and SATA as a disc drive interface.
Offering modern serial architecture, impressive throughput, and elimination of master/slave headaches, the SATA interface at a system level has quickly superseded its parallel predecessor both on the desktop and in some low-end server applications. In these environments SATA infrastructure provides superior performance.
Complementing that success is the explosive popularity of SATA disc drives, which combine enormous capacity and respectable speed at a remarkably low cost-per-GB. Recently, a new class of nearline SATA disc drives, stronger and smarter than their desktop counterparts, has arisen to address the more demanding workloads found in nearline applications: around-the-clock data accessibility coupled with highly random data activity.
To deliver nearline-class reliability, nearline SATA drives are purpose-built to withstand the rigors of random reads/writes and 24/7 operation. In contrast, typical desktop SATA drives are designed for the less strenuous environment of sequential reads/writes and 8/5 power-on hours.
Nearline SATA drives also incorporate firmware features that simplify integration into multi-disc (such as RAID) nearline applications, including backup/restore and archiving. These tools boost drive performance, ease integration and enhance data integrity in nearline storage environments.
There are several caveats to utilizing nearline SATA devices. As noted earlier, they require either costly bridges or deployment of their own infrastructure, separate from an enterprise's current Fibre Channel or parallel SCSI environment. This entails the redundancy (and corresponding expense) of qualifying, purchasing, inventorying and maintaining a range of SATA parts/components. Existing FC or parallel SCSI shelves, cabinets, and cabling cannot be reused. Enterprise infrastructure hardware is specifically designed to meet the density requirements (for example, vibration and temperature management) of enterprise-class drives. In contrast, SATA cabinets/enclosures are not suitable for such demanding use.
Finally, SATA scalability is severely limited. SATA 1.0 allows only one drive per host controller port (requiring additional ports to handle extra drives). SATA II's hub-like Port Multipliers enable each PM-equipped port on the host controller to connect up to 15 drives--but impose the performance bottleneck typical of hubs.
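The scaling limits above can be worked through as simple arithmetic. The Python sketch below uses the figures from the text (one drive per port, or up to 15 drives behind a Port Multiplier); the function names are hypothetical, and the even split of a 3.0 Gbits/sec link across drives is a simplifying assumption used only to illustrate the shared-link bottleneck.

```python
import math

PM_MAX_DRIVES = 15  # drives per Port Multiplier-equipped port, per the text


def host_ports_needed(drives: int, port_multipliers: bool) -> int:
    """Host controller ports required to attach `drives` SATA drives."""
    per_port = PM_MAX_DRIVES if port_multipliers else 1
    return math.ceil(drives / per_port)


def per_drive_share_gbps(link_gbps: float, drives_on_port: int) -> float:
    """Bandwidth each drive gets if all drives on one port are busy at once.

    Assumes the host link is split evenly, which is a simplification.
    """
    return link_gbps / drives_on_port


if __name__ == "__main__":
    drives = 60
    print(host_ports_needed(drives, port_multipliers=False))  # 60 ports
    print(host_ports_needed(drives, port_multipliers=True))   # 4 ports
    # 15 drives sharing one assumed 3.0 Gbits/sec link
    print(per_drive_share_gbps(3.0, PM_MAX_DRIVES))
```

Port Multipliers cut the port count sharply, but every drive behind one still contends for the same host link.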
While SATA infrastructure is clearly problematic in the context of tiered storage environments, nearline-ready SATA drives have a significant advantage--thanks to the new SAS interface.
Serial Attached SCSI (SAS)
Simply put, serial architecture liberates SCSI from the constraints of its parallel past. Serial Attached SCSI takes a decidedly straightforward and direct approach to achieving outstanding performance, scalability and flexibility. Its point-to-point, serial architecture is far simpler and more robust than that of its parallel predecessor, yet offers significantly higher throughput (3.0 Gbits/sec, with a clear roadmap to 12.0 Gbits/sec) as well as vastly superior scalability.
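For a sense of what those line rates mean in usable terms, the short Python sketch below converts a serial line rate to payload bandwidth. It assumes 8b/10b line encoding (10 bits on the wire for every 8 data bits), a detail that is not stated in this article, and it ignores protocol overhead; the function name is hypothetical.

```python
def payload_mb_per_s(line_rate_gbps: float,
                     data_bits: int = 8,
                     coded_bits: int = 10) -> float:
    """Approximate payload bandwidth in MB/s for a serial link.

    Assumes 8b/10b encoding by default and decimal megabytes
    (1 MB = 1,000,000 bytes); framing overhead is ignored.
    """
    payload_bits_per_s = line_rate_gbps * 1e9 * data_bits / coded_bits
    return payload_bits_per_s / 8 / 1e6


if __name__ == "__main__":
    for rate in (3.0, 12.0):
        print(f"{rate} Gbits/sec -> about {payload_mb_per_s(rate):.0f} MB/s")
```

Under that assumption, 3.0 Gbits/sec works out to roughly 300 MB/s per link and 12.0 Gbits/sec to roughly 1,200 MB/s.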
While SAS overcomes many of parallel SCSI's interface limitations through the use of modern serial-based technologies, it also maintains the core strengths of its forerunner by integrating existing SCSI commands. SCSI's robust, mature command set employs sophisticated features such as Cyclic Redundancy Check (CRC) to ensure data integrity in the most rigorous online applications.
But perhaps SAS's most compelling feature is its compatibility with SATA, ensuring unprecedented freedom to specify and consolidate the optimal storage solutions for a broad range of applications. The Serial Attached SCSI standards committee well understood the synergies (both fiscal and physical) that would result if SAS and SATA drives could share a common storage infrastructure.
By offering compatibility between these two complementary storage interfaces, SAS enables IT managers to deploy both online (SAS) and nearline (SATA) storage tiers in a common SAS enclosure. This approach maximizes cost-effectiveness by eliminating the need for redundant SAS and SATA infrastructures (and the management expenses they entail).
The SAS infrastructure also boosts SATA scalability far beyond the limits imposed by SATA infrastructures. High-speed switches known as expanders enable quick aggregation of many drives, allowing a single SAS domain to contain over 16,000 drives (SAS and/or SATA) without performance degradation. And multiple SAS domains can easily be interconnected for remarkable levels of data availability.
Tiered storage is built on a fundamental concept: Use the right tool for the job. By employing storage devices optimized for their specific applications, tiered storage delivers greater efficiency, performance and reliability. In the same manner, choosing the most appropriate interface for those storage devices is key to achieving maximum value from any tiered storage solution.
When referring to hard drive capacity, one gigabyte (GB) equals one billion bytes and one terabyte (TB) equals one trillion bytes. Accessible capacity may vary depending on operating environment and formatting.
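One common reason reported capacity differs is that some operating environments count in binary units (1 GiB = 2^30 bytes) rather than the decimal units defined above. The Python sketch below shows the conversion; the function name and the 500 GB figure are illustrative assumptions, and formatting overhead is not modeled.

```python
def decimal_gb_to_gib(gigabytes: float) -> float:
    """Convert decimal gigabytes (10**9 bytes) to binary gibibytes (2**30 bytes)."""
    return gigabytes * 10**9 / 2**30


if __name__ == "__main__":
    # A hypothetical 500 GB drive as seen by a system that counts in binary units
    print(f"{decimal_gb_to_gib(500):.2f} GiB")  # 465.66 GiB
```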
Pete Steege is a Global Marketing Manager for Enterprise Storage at Seagate (Scotts Valley, CA).
Table 1. Native FC Nearline Storage Benefits

| Benefit | Native FC Nearline Storage | Bridged/Discrete SATA Nearline Storage |
| --- | --- | --- |
| Integration | No hardware or software changes needed | Requires significant expenditure on bridges or separate SATA infrastructure |
| Scalability | Throughput increases as drive count increases | Throughput unchanged as drive count increases |
| Performance | Simultaneous communication with multiple drives maximizes throughput | One-at-a-time communication with single drives limits throughput |
| Availability | Dual ports enable seamless failover protection | Single port lacks failover protection (unless Port Multiplier added) |
| Simplified Management | Leverages full capability of FC management systems | Limited to management capabilities supported by SATA |
Publication: Computer Technology Review
Date: Oct 1, 2005