Interface considerations for tiered storage.


Never before has the enterprise enjoyed such a comprehensive range of storage solutions from which to choose. The Fibre Channel, Serial ATA and Serial Attached SCSI interfaces all play vital roles in tiered storage infrastructures, and often the perception of these roles can overlap. Striking the optimal balance of interfaces in tiered storage solutions is key to achieving maximum performance, reliability and cost-effectiveness.

Tiered Storage Fundamentals

The traditional storage model acknowledges only two types of data storage: online and offline. Online storage requires high availability and usually holds transactional data that demands the performance and reliability of enterprise-class disc drives; offline storage holds archival data that is infrequently accessed and stored in libraries of high-capacity tapes.

But what about the vast quantity of data that falls between these two categories? The following applications all involve data that does not necessarily need highly available, mission-critical storage, but which must still be readily accessible to multiple users:

* File serving

* Fixed-content data

* Disc-to-disc backup

* Bulk storage

* Short-term archiving

In addition, compliance with regulations such as Sarbanes-Oxley and HIPAA requires rapid retrieval of enormous quantities of financial and medical data. An Enterprise Strategy Group study concluded that such "in-between" information will soon comprise the majority of enterprise data.

The previous options for this plethora of data were inefficient. Online storage (optimized for performance, not capacity) carries a relatively high cost-per-Gbyte, while offline storage (optimized for capacity, not accessibility) entails slow, labor-intensive data retrievals. The result was either wasted costly hardware or overtaxed IT staff resources.

Tiered storage eliminates such inefficiency by introducing a third tier: nearline storage. Melding high capacity, low cost-per-Gbyte and easy integration into existing enterprise infrastructures, nearline storage fills the gap between performance and archival solutions, enabling comprehensive tiered storage strategies that ensure optimal cost/performance for every type of data.
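The three-tier model described above can be sketched as a simple placement rule. This is only an illustration: the DataSet fields, the choose_tier function and the one-access-per-month cutoff are hypothetical names and values chosen for the example, not part of any product or standard.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    ONLINE = "online"      # enterprise-class disc for transactional data
    NEARLINE = "nearline"  # high-capacity disc that stays readily accessible
    OFFLINE = "offline"    # tape libraries for archival data


@dataclass
class DataSet:
    name: str
    mission_critical: bool   # needs high availability and transactional performance
    accesses_per_month: int  # rough measure of how often the data is touched


# Hypothetical cutoff: data touched less often than this is treated as archival.
ARCHIVE_THRESHOLD = 1


def choose_tier(ds: DataSet) -> Tier:
    """Place a data set on the cheapest tier that still meets its access needs."""
    if ds.mission_critical:
        return Tier.ONLINE
    if ds.accesses_per_month >= ARCHIVE_THRESHOLD:
        return Tier.NEARLINE
    return Tier.OFFLINE


if __name__ == "__main__":
    for ds in (
        DataSet("order database", True, 100_000),
        DataSet("disc-to-disc backup", False, 30),
        DataSet("seven-year-old records", False, 0),
    ):
        print(ds.name, "->", choose_tier(ds).value)
```

A real policy would weigh more factors (retention rules, retrieval deadlines, cost-per-Gbyte), but the shape of the decision is the same: only mission-critical data earns online storage, and only rarely touched data goes to tape.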

Today, offline storage is predominantly served by tape-based solutions. The balance of this article focuses on interface considerations in online and nearline storage tiers; the offline storage tier is more appropriately examined in a separate analysis.

Fibre Channel (FC)

With the advent of the Fibre Channel storage area network (SAN), online data accessibility skyrocketed--and along with it network efficiency. To be sure, such efficiency requires significant investment in FC infrastructure itself, as well as sizable quantities of money and man-hours to optimize SCSI-based storage management and application software solutions.

To capitalize on such investments, FC tiered storage solutions must efficiently leverage the FC interface. Native FC nearline storage devices make full use of the FC protocol, enabling tighter integration of online, nearline and offline storage tiers. Such interface rationalization streamlines FC SAN infrastructures, reducing costs, easing management and improving performance.

However, some enterprises have turned to nearline-ready SATA disc drives for nearline duty in their FC SANs. Nearline SATA disc drives cost slightly less than nearline FC disc drives, but are more costly to integrate and may compromise system performance and reliability. There are three fundamental ways to add SATA-based nearline storage to an FC SAN:

* Deploy a separate SATA infrastructure

* Install FC-to-SATA conversion bridges

* Utilize the upcoming FC-SATA tunneling protocol

None of these approaches exploits the advanced capabilities the FC protocol provides. For example, the bridged system loses Fibre Channel's error recovery capabilities, advanced multi-host command queuing features and the ability for a single host to simultaneously communicate with multiple drives. (To enable one host to concurrently talk with multiple SATA drives, the bridged system would require the additional expense of a multichannel HBA.)

Native FC nearline storage extracts maximum value from a Fibre Channel SAN by taking complete advantage of the FC protocol. This in turn enables FC tiered storage solutions to deliver optimal efficiency and cost-effectiveness. That said, SATA-based storage solutions still have a useful role to play in applications where minimal cost-per-GB is paramount.

Serial ATA (SATA)

In understanding SATA's role in tiered storage, it's important to differentiate between SATA as a system interface and SATA as a disc drive interface.

Offering modern serial architecture, impressive throughput, and elimination of master/slave headaches, the SATA interface at a system level has quickly superseded its parallel predecessor both on the desktop and in some low-end server applications. In these environments SATA infrastructure provides superior performance.

Complementing that success is the explosive popularity of SATA disc drives, which combine enormous capacity and respectable speed at a remarkably low cost-per-GB. Recently, a new class of nearline SATA disc drives, stronger and smarter than their desktop counterparts, has arisen to address the more demanding workloads found in nearline applications: around-the-clock data accessibility coupled with highly random data activity.

To deliver nearline-class reliability, nearline SATA drives are purpose-built to withstand the rigors of random reads/writes and 24/7 operation. In contrast, typical desktop SATA drives are designed for the less strenuous environment of sequential reads/writes and 8/5 power-on hours.

Nearline SATA drives also incorporate firmware features that simplify integration into multi-disc (such as RAID) nearline applications, including backup/restore and archiving. These tools boost drive performance, ease integration and enhance data integrity in nearline storage environments.

There are several caveats to utilizing nearline SATA devices. As noted earlier, they require either costly bridges or deployment of their own infrastructure, separate from an enterprise's current Fibre Channel or parallel SCSI environment. This entails the redundancy (and corresponding expense) of qualifying, purchasing, inventorying and maintaining a range of SATA parts/components. Existing FC or parallel SCSI shelves, cabinets, and cabling cannot be reused. Enterprise infrastructure hardware is specifically designed to meet the density requirements (for example, vibration and temperature management) of enterprise-class drives. In contrast, SATA cabinets/enclosures are not suitable for such demanding use.

Finally, SATA scalability is severely limited. SATA 1.0 allows only one drive per host controller port (requiring additional ports to handle extra drives). SATA II's hub-like Port Multipliers enable each PM-equipped port on the host controller to connect up to 15 drives--but impose the performance bottleneck typical of hubs.
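The scaling arithmetic behind these limits is easy to work through. The sketch below uses the per-port figures quoted above (1 drive per port for SATA 1.0, up to 15 with a Port Multiplier); the function names are hypothetical, and the even split of link bandwidth among active drives is a simplifying assumption rather than a description of how any particular Port Multiplier schedules traffic.

```python
import math

SATA1_DRIVES_PER_PORT = 1             # SATA 1.0: one drive per host controller port
PORT_MULTIPLIER_DRIVES_PER_PORT = 15  # SATA II Port Multiplier: up to 15 drives per port


def ports_needed(drive_count: int, drives_per_port: int) -> int:
    """Host controller ports required to attach drive_count drives."""
    if drive_count < 0 or drives_per_port < 1:
        raise ValueError("drive_count must be >= 0 and drives_per_port >= 1")
    return math.ceil(drive_count / drives_per_port)


def shared_link_rate(link_rate: float, active_drives: int) -> float:
    """Bandwidth available per drive if active drives split one link evenly.

    Assumption: an even split, which only approximates hub-style sharing.
    The result is in the same unit as link_rate.
    """
    if active_drives < 1:
        raise ValueError("active_drives must be >= 1")
    return link_rate / active_drives


if __name__ == "__main__":
    drives = 60
    print("SATA 1.0 ports:", ports_needed(drives, SATA1_DRIVES_PER_PORT))
    print("Port Multiplier ports:", ports_needed(drives, PORT_MULTIPLIER_DRIVES_PER_PORT))
    print("Per-drive share of one link, 15 drives busy:", shared_link_rate(1.0, 15))
```

For 60 drives, the port count falls from 60 to 4 with Port Multipliers, but each group of 15 drives then contends for a single host link, which is the bottleneck the text refers to.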

While SATA infrastructure is clearly problematic in the context of tiered storage environments, nearline-ready SATA drives have a significant advantage--thanks to the new SAS interface.

Serial Attached SCSI (SAS)

Simply put, serial architecture liberates SCSI from the constraints of its parallel past. Serial Attached SCSI takes a decidedly straightforward and direct approach to achieving outstanding performance, scalability and flexibility. Its point-to-point, serial architecture is far simpler and more robust than that of its parallel predecessor, yet offers significantly higher throughput (3.0 Gbits/sec, with a clear roadmap to 12.0 Gbits/sec) as well as vastly superior scalability.
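To put the quoted line rates in more familiar terms, the sketch below converts a serial line rate into usable payload bandwidth. It assumes 8b/10b encoding, in which every 8 data bits travel as 10 bits on the wire; the article does not state the encoding, so treat that as an assumption of this example, and the function name is hypothetical. Protocol overhead above the encoding layer is ignored.

```python
def payload_mb_per_s(line_rate_gbps: float, data_bits: int = 8, coded_bits: int = 10) -> float:
    """Convert a serial line rate (Gbits/sec) to payload Mbytes/sec.

    Assumes data_bits of payload are carried in every coded_bits on the wire
    (8b/10b by default) and ignores framing and protocol overhead.
    """
    wire_bits_per_s = line_rate_gbps * 1e9
    payload_bits_per_s = wire_bits_per_s * data_bits / coded_bits
    return payload_bits_per_s / 8 / 1e6


if __name__ == "__main__":
    for rate in (3.0, 12.0):
        print(f"{rate} Gbits/sec -> about {payload_mb_per_s(rate):.0f} Mbytes/sec of payload")
```

Under that assumption, a 3.0 Gbits/sec link carries roughly 300 Mbytes/sec of payload, and the 12.0 Gbits/sec roadmap figure corresponds to roughly 1,200 Mbytes/sec per link.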

While SAS overcomes many of parallel SCSI's interface limitations through the use of modern serial-based technologies, it also maintains the core strengths of its forerunner by integrating existing SCSI commands. SCSI's robust, mature command set employs sophisticated features such as Cyclic Redundancy Check (CRC) to ensure data integrity in the most rigorous online applications.
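The idea behind a cyclic redundancy check can be shown in a few lines. The sketch below uses the general-purpose CRC-32 from Python's standard zlib module purely as an illustration; it is not the SCSI or SAS implementation, and the helper names attach_crc and is_intact are hypothetical.

```python
import zlib
from typing import Tuple


def attach_crc(block: bytes) -> Tuple[bytes, int]:
    """Sender side: compute a CRC-32 to travel alongside the data block."""
    return block, zlib.crc32(block)


def is_intact(block: bytes, crc: int) -> bool:
    """Receiver side: recompute the CRC and compare it with the one received."""
    return zlib.crc32(block) == crc


if __name__ == "__main__":
    block, crc = attach_crc(b"ledger entry 0001")
    print("clean block accepted:", is_intact(block, crc))

    # Flip a single bit to simulate corruption in transit.
    damaged = bytes([block[0] ^ 0x01]) + block[1:]
    print("damaged block accepted:", is_intact(damaged, crc))
```

The receiver never needs the original data to detect the problem: a mismatch between the recomputed and transmitted check values is enough to reject the block and request it again.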

But perhaps SAS's most compelling feature is its compatibility with SATA, ensuring unprecedented freedom to specify and consolidate the optimal storage solutions for a broad range of applications. The Serial Attached SCSI standards committee well understood the synergies (both fiscal and physical) that would result if SAS and SATA drives could share a common storage infrastructure.

By offering compatibility between these two complementary storage interfaces, SAS enables IT managers to deploy both online (SAS) and nearline (SATA) storage tiers in a common SAS enclosure. This approach maximizes cost-effectiveness by eliminating the need for redundant SAS and SATA infrastructures (and the management expenses they entail).

The SAS infrastructure also boosts SATA scalability far beyond the limits imposed by SATA infrastructures. High-speed switches known as expanders enable quick aggregation of many drives, allowing a single SAS domain to contain over 16,000 drives (SAS and/or SATA) without performance degradation. And multiple SAS domains can easily be interconnected for remarkable levels of data availability.

Conclusion

Tiered storage is built on a fundamental concept: Use the right tool for the job. By employing storage devices optimized for their specific applications, tiered storage delivers greater efficiency, performance and reliability. In the same manner, choosing the most appropriate interface for those storage devices is key to achieving maximum value from any tiered storage solution.

When referring to hard drive capacity, one gigabyte, or GB, equals one billion bytes and one terabyte, or TB, refers to one trillion bytes. Accessible capacity may vary depending on operating environment and formatting.
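One common reason reported capacity differs from the label is that some operating environments count in binary units (2^30 bytes) rather than the decimal gigabytes defined above. The sketch below shows the conversion; the function name and the 500 GB drive size are hypothetical, and formatting overhead, which varies by file system, is not modeled.

```python
GB = 10**9    # decimal gigabyte, as defined above
TB = 10**12   # decimal terabyte, as defined above
GIB = 2**30   # binary gigabyte (gibibyte) used by some operating environments


def decimal_gb_to_binary(capacity_gb: float) -> float:
    """Express a capacity given in decimal GB in binary (2**30-byte) units."""
    return capacity_gb * GB / GIB


if __name__ == "__main__":
    labeled = 500  # hypothetical drive labeled 500 GB
    print(f"{labeled} GB is about {decimal_gb_to_binary(labeled):.2f} binary gigabytes")
```

A drive labeled 500 GB therefore appears as roughly 465.66 binary gigabytes before any formatting overhead is subtracted.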

Pete Steege is a Global Marketing Manager for Enterprise Storage at Seagate (Scotts Valley, CA).

www.seagate.com
Table 1. Native FC Nearline Storage Benefits

               Integration           Scalability     Performance

Native FC      No hardware           Throughput      Simultaneous
Nearline       or software           increases as    communication
Storage        changes needed        drive count     w/multiple drives
                                     increases       maximizes
                                                     throughput

Bridged/       Requires significant  Throughput      One-at-a-time
Discrete       expenditure on        unchanged       communication
SATA Nearline  bridges or separate   as drive count  w/single drives
Storage        SATA infrastructure   increases       limits throughput

               Availability       Simplified Management

Native FC      Dual ports enable  Leverages full
Nearline       seamless failover  capability of FC
Storage        protection         management systems

Bridged/       Single port        Limited to
Discrete       lacks failover     management
SATA Nearline  protection         capabilities
Storage        (unless Port       supported by SATA
               Multiplier added)
COPYRIGHT 2005 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2005, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Author:Steege, Pete
Publication:Computer Technology Review
Geographic Code:1USA
Date:Oct 1, 2005
Words:1504