
Virtual tape: a solid citizen in an ILM world.


Enterprise storage has rapidly evolved in recent years from a "two-size-fits-all" (high-performance disk and low-cost tape) world into a complex mix of discrete storage elements with a wide variety of performance, cost and capacity attributes. This alphabet soup of technologies (SATA, iSCSI, etc.) presents an unprecedented opportunity for storage managers to locate the right data on the right device at the right time, while satisfying another alphabet soup of business and regulatory requirements (HIPAA, Sarbanes-Oxley, etc.). Moreover, software tools have emerged under the SRM (Storage Resource Management) banner to more efficiently manage allocation of storage.

However, these technologies also present a mind-boggling, complex set of management challenges which, if not addressed properly, could negate the potential business benefits associated with what has come to be known as Information Lifecycle Management (ILM).

One common assumption underlying the implementation of ILM is that organizations can dramatically reduce cost while improving access to critical data and compliance with stringent regulatory requirements. At the hardware level, this essentially comes down to introducing new layers of disk technology into the storage hierarchy. ILM is, after all, at least partly a variation on the old theme of hierarchical storage management (HSM).

Frequently discussed in this context is some form of ultra low-cost disk, such as Serial ATA (SATA). This disk is commonly associated with backup data that is staged for fast recovery or reference data that is required on a "reasonably" timely basis. Each of these uses of SATA disk (backup and reference data) requires an application to access data written to this class of storage device, adding some degree of complexity associated with creating and managing yet another layer in a storage hierarchy while also managing the data itself.

In no context has the notion of backup and archive been eliminated from the ILM discussion. In fact, they remain ever more important in the hierarchy, as organizations strive to better manage their growing stores of data and preserve that data in perpetuity.

What if you were able to achieve the desired cost benefits of relatively fast, low-cost intermediate storage without adding the complexity associated with managing it? That is where virtual tape comes in. CentricStor is a heterogeneous virtual tape system designed and manufactured by Fujitsu Siemens Computers and now available in the U.S. through PeakData, Inc. and its reseller partners.

CentricStor maintains the traditional two levels of the hierarchy (primary storage and archive) from a management perspective, but delivers "virtually" two to four performance levels all controlled by management policy within the archive. In addition, it supports duplexing or Export/Import for business continuation. The levels within the archive could be fast cache, slow cache (with binding capabilities), fast tape, capacity tape, WORM tape separate from re-usable tape, and Iron Mountain or the equivalent.

Our analysis of CentricStor within an ILM strategy reveals the potential for significant cost savings when compared with using a three-tiered storage approach. ILM represents the intelligent, automated migration of data from higher performing storage subsystems (expensive) to lower performing storage subsystems (lower cost) during the useful lifecycle of data, with the explicit goal of reducing the cost of storage for all of the data within an enterprise.

A goal of a cohesive ILM strategy is to eliminate server workloads by migrating data through each level of storage (e.g., from primary to secondary storage and from secondary storage to long-term retention). A design requirement for any long-term storage decision (and ILM solution) includes non-disruptive introduction of newer data storage technologies into the migration levels.

Another important benefit of an effective ILM strategy is to reduce storage complexity, thus reducing storage administration costs. ILM solutions will require separately defined migration and recall requirements for each of the various user-defined classes of data (e.g., mission-critical, vital, sensitive, non-critical).

A complete virtual tape system such as CentricStor within ILM can provide the following benefits:

* Simplifying the administration of data once written into archive with assurances that data is always recoverable

* Satisfying various levels of redundancy to assure service-level commitments to Lines-of-Business for data availability

* Leveraging magnetic tape-based data storage solutions to provide the lowest average costs per MB possible

With the addition of an integrated virtual tape 'front end,' a comprehensive tape automation subsystem becomes a viable, compelling participant in the ILM architecture. A comprehensive virtual tape system can bring the acquisition cost of storage below $0.001/MB while also delivering policy-based management and media assurance for long-term data availability.

Integrating a complete virtual tape solution into an ILM-based operation enables backups and ILM migrations (archiving) to use common resources, providing further operational savings in administration. Retention periods are governed by several factors such as regulatory compliance, operational efficiencies, corporate policies, and user requests. Each of these retention periods is controlled by the archiving application that enables access to migrated data (see Figure 1).

If we assume that: (a) data growth is linear, that (b) data actively accessed within 30 days of creation is one half of the data accessed within 30 to 90 days, and that (c) there is a 5-year data retention policy in force--we would calculate that the long-term storage would represent 57 of 60 total months of data created, or 95% of the total storage.
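The 95% figure follows directly from the linear-growth assumption. A minimal sketch of that arithmetic, with the 60-month window split into one month of primary data, two months of secondary data and 57 months of long-term data (the names `TIER_MONTHS` and `tier_shares` are illustrative, not from the article):

```python
# Share of all retained data held in each tier, assuming linear growth
# (every month of the retention window contributes the same amount of data).
RETENTION_MONTHS = 60  # 5-year retention policy from the example

# Months of data held in each tier: 0-30 days, 30-90 days, and the remainder.
TIER_MONTHS = {"primary": 1, "secondary": 2, "long_term": 57}


def tier_shares(tier_months, retention_months):
    """Return each tier's fraction of the total retained data."""
    if sum(tier_months.values()) != retention_months:
        raise ValueError("tier windows must cover the whole retention period")
    return {tier: months / retention_months for tier, months in tier_months.items()}


shares = tier_shares(TIER_MONTHS, RETENTION_MONTHS)
for tier, share in shares.items():
    print(f"{tier:>9}: {share:.2%}")
```

Under these assumptions the three shares come out at 1.67%, 3.33% and 95%, the same percentages used in the capacity table at the end of the article.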

For the sake of simplicity, let's assume that there is one TB of mission-critical data in primary storage. Mission-critical data is assumed to be 15% of the online data in primary storage; so, by calculation, the total data within primary storage is 6.7TB. Forty percent of the data is non-critical, leaving 60% to be placed on replicated disk systems, mirrored or perhaps hot failover capable.

In a mirroring disk example there may be two instances of data on each of the failover systems, with each instance having additional capacity for growth and management--in effect doubling the required data storage space. One TB of mission-critical storage could require 32TB of primary storage (60% x 6.7TB = 4TB; two instances on each of two mirrored subsystems = 16TB; doubled to allow free space for growth and management = 32TB). If the same disk subsystems were used for secondary storage, an additional 64TB of on-line disk capacity would be required.
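The sizing chain above can be traced step by step. The following is a sketch only: the function name, the parameter names and the reading of the 64TB secondary figure as twice the primary figure (secondary holds two months of data against primary's one) are assumptions made for illustration.

```python
def primary_sizing_tb(mission_critical_tb=1.0,
                      mission_critical_share=0.15,
                      protected_share=0.60,
                      instances_per_subsystem=2,
                      subsystems=2,
                      headroom_factor=2.0,
                      secondary_to_primary_ratio=2.0):
    """Follow the article's arithmetic from 1TB of mission-critical data."""
    # Mission-critical data is 15% of online data, so online data is ~6.7TB.
    online_tb = mission_critical_tb / mission_critical_share
    # 60% of online data goes on replicated or mirrored disk (~4TB).
    protected_tb = online_tb * protected_share
    # Two instances on each of two failover subsystems (~16TB).
    mirrored_tb = protected_tb * instances_per_subsystem * subsystems
    # Free space for growth and management doubles the requirement (~32TB).
    primary_disk_tb = mirrored_tb * headroom_factor
    # Assumed: secondary covers twice as much data as primary (~64TB).
    secondary_disk_tb = primary_disk_tb * secondary_to_primary_ratio
    return {
        "online": online_tb,
        "protected": protected_tb,
        "mirrored": mirrored_tb,
        "primary_disk": primary_disk_tb,
        "secondary_disk": secondary_disk_tb,
    }


sizing = primary_sizing_tb()
for name, tb in sizing.items():
    print(f"{name}: {tb:.1f} TB")
```

Changing `headroom_factor` or the number of mirrored instances shows how quickly protected disk capacity multiplies relative to the data it holds.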

The one TB of mission-critical data residing on primary storage (assuming linear growth and that mission-critical data is 15% of the total online) is a small portion of the total enterprise requirement of 400TB. This example requires a long-term storage capacity equal to 95% of the total 60-month data capacity, equating to 380TB. As an archive built upon LTO-2, it would require 950 cartridges. In the archive, to ensure long-term availability, there would be at least four copies of the media (two local and two remote), requiring at least 3800 tapes.
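The cartridge count can be reproduced with integer arithmetic. This sketch assumes 400GB per LTO-2 cartridge (the format's compressed capacity at 2:1; native capacity is 200GB), which is the per-cartridge figure implied by 950 cartridges for 380TB. The constant and function names are illustrative.

```python
TOTAL_GB = 400_000        # total enterprise requirement, 400TB in decimal GB
RETENTION_MONTHS = 60     # 5-year retention policy
ARCHIVE_MONTHS = 57       # months of data older than 90 days
LTO2_CARTRIDGE_GB = 400   # assumption: compressed LTO-2 capacity per cartridge
COPIES = 4                # two local and two remote copies


def ceil_div(numerator, denominator):
    """Integer division rounded up, so a partial cartridge counts as one."""
    return -(-numerator // denominator)


archive_gb = TOTAL_GB * ARCHIVE_MONTHS // RETENTION_MONTHS
cartridges_per_copy = ceil_div(archive_gb, LTO2_CARTRIDGE_GB)
total_tapes = cartridges_per_copy * COPIES

print(f"archive: {archive_gb // 1000} TB")
print(f"cartridges per copy: {cartridges_per_copy}")
print(f"tapes for {COPIES} copies: {total_tapes}")
```

At native capacity (200GB per cartridge) the same archive would need twice as many cartridges, so the compression assumption matters when budgeting media.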

Total storage hardware costs for this example would amount to $6.7M. The distribution of costs across the three tiers of the ILM model is as follows:

* Primary storage: 70%

* Secondary storage: 15%

* Long-term archive: 15%
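Applying those percentages to the $6.7M total gives the dollar figure per tier. A minimal sketch (amounts in thousands of dollars; the names are illustrative):

```python
TOTAL_HARDWARE_COST_K = 6_700  # $6.7M, expressed in $K

TIER_COST_PERCENT = {
    "primary": 70,
    "secondary": 15,
    "long_term_archive": 15,
}


def tier_costs_k(total_k, percent_by_tier):
    """Split a total cost across tiers according to percentage shares."""
    if sum(percent_by_tier.values()) != 100:
        raise ValueError("tier percentages must add up to 100")
    return {tier: total_k * pct / 100 for tier, pct in percent_by_tier.items()}


costs = tier_costs_k(TOTAL_HARDWARE_COST_K, TIER_COST_PERCENT)
for tier, cost_k in costs.items():
    print(f"{tier}: ${cost_k:,.0f}K")
```

The secondary tier works out to roughly $1M, which is in line with the hardware savings the article attributes to replacing secondary storage.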

If all primary and secondary storage were deployed on high-performance, redundant, mirrored disk systems, total systems costs would increase significantly.

Secondary storage is the issue--and where the complexity comes in. No one disagrees that mission-critical data needs to be on the very best disk systems available. The question is about the "aging of data" and getting old data out of the way so that expensive resources are available for priority computing tasks.

In our example, we see 13.4TB of secondary data. The likelihood of reference to that data is less than 5% at 30 days. After 90 days, it is perhaps less than 1%. Meanwhile, customers are keeping this data on their most expensive, highest availability disk systems with 2 to 5 mirrored copies. I say: archive aggressively, use CentricStor cache to replace secondary storage and leverage the fact that all data will reside on tape. Using CentricStor in your ILM process means that a single copy of the data in CentricStor cache has the redundancy of multiple copies of data on tape to begin with, thus mirrored vertically, rather than within the same technology.

Savings generated by deploying CentricStor into the ILM architecture are several. Specific hardware savings amount to nearly $1M in the example described above. In addition, there is the elimination of an entire level of data migration resources from server workloads and elimination of storage administration for the 13.4TB of secondary storage equipment.

Providing data backup requires resources to move 6.7TB of primary data and, by simple calculation, 13.4TB of secondary data, or 20TB in total. Fully two-thirds of the on-line storage being backed up does not change, but it still consumes expensive backup and administrative resources. A backup solution for this amount of data would require an investment of $700K for drives, media and libraries.

Aggressive use of CentricStor as the secondary storage resource will save substantially on the backup window. The secondary storage amounts to 13.4TB which would be included in the migration into archive at the end of 30 days, eliminating the backup requirement. The additional cost for backing up primary storage into an established CentricStor archive solution is minimal--less than $200K--saving more than $500K on hardware without consideration for backup administration costs.
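The backup figures can be checked the same way. In this sketch the "two-thirds" figure is read as the secondary share of the backed-up data, and the article's "less than $200K" is treated as an upper bound; both readings are assumptions.

```python
PRIMARY_TB = 6.7
SECONDARY_TB = 13.4

FULL_BACKUP_COST_K = 700         # drives, media and libraries for ~20TB
CENTRICSTOR_BACKUP_COST_K = 200  # upper bound: article says "less than $200K"

total_backup_tb = PRIMARY_TB + SECONDARY_TB
# Share of the backup set that sits in secondary storage.
secondary_share = SECONDARY_TB / total_backup_tb
# Because the CentricStor cost is an upper bound, this saving is a lower bound.
min_saving_k = FULL_BACKUP_COST_K - CENTRICSTOR_BACKUP_COST_K

print(f"total backed up: {total_backup_tb:.1f} TB")
print(f"secondary share: {secondary_share:.0%}")
print(f"hardware saving: at least ${min_saving_k}K")
```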

In closing, CentricStor as a component in ILM makes sense. The archiving process manages access to data, and the ability of CentricStor to provide a large, segmented data cache with separate migration policies enables CentricStor to perform as the secondary storage layer.

It is assumed that long-term data should go to CentricStor because cost-per-copy of data is approaching $0.0005/MB including management software today. CentricStor supports the non-disruptive addition of newer recording technologies and automated conversions, without impact to servers. CentricStor also has a feature for automatic media recycling to assure that all data is recoverable during the retention period. If CentricStor is in place for long-term data, then expanding its role to encompass secondary storage will eliminate $1M of hardware costs for ILM storage. Also, removing the backup requirement for the 13.4TB in secondary storage saves an additional $500,000. CentricStor saves about 20% of the total hardware infrastructure, simplifies administration, eases backups, and prepares the enterprise to adopt future technologies.
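The article does not spell out how the "about 20%" figure is derived. One reading that reproduces it, offered here as an assumption, compares the two savings against the combined cost of the ILM storage hardware and the backup hardware:

```python
ILM_HARDWARE_K = 6_700      # three-tier ILM storage hardware ($K)
BACKUP_HARDWARE_K = 700     # drives, media and libraries for backup ($K)

SECONDARY_SAVING_K = 1_000  # secondary storage replaced by CentricStor cache
BACKUP_SAVING_K = 500       # secondary data no longer needs backup

# Assumed baseline: ILM hardware plus backup hardware.
baseline_k = ILM_HARDWARE_K + BACKUP_HARDWARE_K
total_saving_k = SECONDARY_SAVING_K + BACKUP_SAVING_K
saving_share = total_saving_k / baseline_k

print(f"saving: ${total_saving_k}K of ${baseline_k}K ({saving_share:.1%})")
```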
Classes of Data

                     Mission
                     Critical     Vital       Sensitive   Non-Critical

Primary Storage      Synchronous  Replicated  Replicated  Capacity
(0 to 30 Days)       Mirror Disk  Disk        Disk        Disk
Secondary Storage    Synchronous  Replicated  Replicated  Capacity
(30 to 90 Days)      Mirror Disk  Disk        Disk        Disk
Long-Term Retention  Nearline     Nearline    Nearline    Nearline
(>90 Days)

Classes of Data

Capacities with 5-yr   Critical     Vital       Sensitive   Non-Critical
Data Retention Policy  15%          20%         25%         40%

Primary Storage
0 to 30 Days           Synchronous  Replicated  Replicated  Capacity
(1.67% of total data)  Mirror Disk  Disk        Disk        Disk
Secondary Storage      Virtual Tape with
30 to 90 Days          Segmented Data Cache Controlled
(3.33% of total data)  By Separate Management Policies
Long-Term Retention    Virtual Tape
>90 Days               Managed
(95% of total data)    Nearline Archive
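The second table amounts to a placement policy keyed on data class and age. A minimal sketch of such a lookup follows; the function name, the class labels as dictionary keys and the handling of the exact 30-day and 90-day boundaries are assumptions made for illustration, not CentricStor configuration.

```python
# Primary placement depends on the data class (first 30 days).
PRIMARY_PLACEMENT = {
    "mission-critical": "synchronous mirror disk",
    "vital": "replicated disk",
    "sensitive": "replicated disk",
    "non-critical": "capacity disk",
}

# After 30 days every class moves to virtual tape.
SECONDARY_PLACEMENT = "virtual tape, segmented data cache"
LONG_TERM_PLACEMENT = "virtual tape, managed nearline archive"


def placement(data_class, age_days):
    """Return the storage placement for a data class at a given age in days."""
    if data_class not in PRIMARY_PLACEMENT:
        raise ValueError(f"unknown data class: {data_class!r}")
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    if age_days < 30:    # assumed boundary: day 30 belongs to secondary
        return PRIMARY_PLACEMENT[data_class]
    if age_days <= 90:   # long-term retention applies only after day 90
        return SECONDARY_PLACEMENT
    return LONG_TERM_PLACEMENT


print(placement("vital", 10))
print(placement("non-critical", 45))
print(placement("mission-critical", 400))
```

In the article's design the cache segments carry separate management policies, so a fuller version would likely return a policy per class in the secondary tier rather than a single placement string.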


Mike Holland is vice president and general manager at CentricStor Business Unit, PeakData, Inc. (Niwot, CO)

www.centricstorusa.com
COPYRIGHT 2004 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2004, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.


Article Details
Title Annotation:Storage Management; Information Life-Cycle Management
Author:Holland, Mike
Publication:Computer Technology Review
Date:Sep 1, 2004