Pervasive data volume reduction: a high-value alternative to aging tape-based backup technologies; A paradigm shift in backup technology can cut lifecycle costs by 50% or more.


Open systems tape backup processes, originally developed in the late 1980s, were designed to make optimum use of low-cost serial media; data and process efficiency took second priority to the goal of keeping steady, continuous streams of data flowing to the tape. As a result, the backup process, even with modern enhancements, stores and manages ten times the amount of primary data being protected just in active (non-shelved) backup data volumes. While this data-intensive process worked well when data volumes were relatively small, it simply cannot keep up as enterprise data volumes continue to increase.

The problem has become particularly acute now that collaboration has become a favored business practice. With a large and growing number of staff members sharing information by copying it, communicating it, and storing their own versions of it, data volumes have truly exploded. The result: tape backup systems become increasingly overloaded, even as the number of tapes required to house archives soars along with associated backup costs. At the same time, backup windows need to be extended, forcing companies to choose between less comprehensive data protection and decreased access to network resources while backups run during business hours.

And the situation is likely to get worse as new applications that increase data volumes become increasingly popular. As an example, consider that content management applications, fast becoming a business-critical need for companies striving to deal with exploding content volumes, are often based on XML technology--a technology whose metadata typically expands data by a factor of two. In fact, an Avamar survey of 100 companies found that, on average, at least two copies of every file are stored on disk, and when data replication associated with backup processes is factored in, this number is multiplied by anywhere from 10 to 100. In other words, every single file created in the enterprise can result in hundreds of copies of that file stored on backup tape.

While many companies have worked to control the magnitude of their tape storage issue through judicious retention policy management, the value of these programs is fast becoming moot because of regulatory requirements that virtually force companies to save all data forever, just to protect against possible downstream litigation or compliance issues. Against this backdrop, tape backup vendors are working furiously to come up with fixes. But all suffer from the same basic flaw: they're based on outdated serial access technology and rely on high-volume data management tapes that are thinner than a human hair and prone to distortion and, therefore, data corruption. And they're expensive.

Instead of band-aids, what the backup industry really needs is a totally different approach, a re-engineering similar in scope to the great ERP transformation. Instead of tape, the industry cries out for a disk-based alternative. Instead of complex and time-consuming retention management practices and data compression algorithms, companies crave cost-effective, automated techniques for reducing data volumes. What they don't need are systems that increase overhead, require large investments in new software applications and massive amounts of disk space, or demand extensive manual intervention to laboriously identify and mark every piece of data to indicate whether it should be archived forever or not. A massive content management system for all structured and unstructured data across the entire enterprise is simply not a cost-effective or practical solution--and is not likely to be so for the foreseeable future.

Ironically, re-engineering storage actually requires far fewer infrastructure changes, costs far less, and creates far less network impact than other options currently available and proposed--if the re-engineering effort is based on a pervasive data volume reduction (PDVR) paradigm that dramatically reduces data volumes. Equally critical is that the data reduction is applied at the data source, not downstream on volumes of aggregated data. After all, destroying a snowball is far easier at the top of a mountain than it is after it rolls down to the base.

With PDVR at the source, each piece of new data is automatically tagged with an identifier, these tags are efficiently indexed, and the data is backed up. Then, whenever data with these same tags is seen in the future, the backup system immediately knows not to back this data up again. The result: the volume of data that has to be backed up in order to completely protect resources and ensure full recovery and business survivability can be reduced by a factor of 10 to 100.
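
The tag-index-skip cycle described above can be sketched in a few lines of Python. This is only an illustration, not Avamar's implementation: the article does not say how the identifiers are computed, so the sketch assumes a SHA-256 hash over fixed-size 4 KB chunks, and the `DedupBackupStore` class and its method names are hypothetical.

```python
import hashlib


class DedupBackupStore:
    """Toy backup target that stores each unique chunk of data only once.

    Assumption: the "identifier" is a SHA-256 hash of the chunk contents.
    """

    def __init__(self):
        self.chunks = {}       # tag -> chunk bytes (index and stored data in one dict)
        self.bytes_seen = 0    # total bytes presented for backup
        self.bytes_stored = 0  # bytes actually written to the store

    def backup(self, data, chunk_size=4096):
        """Back up data; return the ordered list of tags needed to restore it."""
        recipe = []
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            tag = hashlib.sha256(chunk).hexdigest()
            self.bytes_seen += len(chunk)
            if tag not in self.chunks:
                # New data: store it once and remember its tag.
                self.chunks[tag] = chunk
                self.bytes_stored += len(chunk)
            # Known tag: nothing is stored again, only the reference is kept.
            recipe.append(tag)
        return recipe

    def restore(self, recipe):
        """Rebuild the original bytes from a list of tags."""
        return b"".join(self.chunks[tag] for tag in recipe)


store = DedupBackupStore()
report = b"quarterly report " * 1000      # 17,000 bytes of sample data
recipe_a = store.backup(report)           # first copy: every chunk is new
recipe_b = store.backup(report)           # a colleague's copy: no new chunks stored

print(store.bytes_seen, store.bytes_stored)
```

In this toy run the second copy adds nothing to the store, yet both copies remain fully restorable from their tag lists. Because the lookup depends only on content, the same index could in principle be shared across files, servers, and sites, which is the "pervasive" part of the idea.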

While some hardware-centric data storage systems already rely on incremental change tracking techniques that reduce data volumes, these systems only work on single files, or file systems, as they change over time--and only if the system itself contains unique knowledge of how those files and file systems were initially constructed. What's different about pervasive data volume reduction is that it is pervasive: the technique can be applied heterogeneously, across all files, file systems, data types, and servers--even across a global enterprise.

Also, because data volumes are cut so dramatically when PDVR is applied at the data source, the technology enables enterprises to put all their backup data on disk; companies can even leverage existing disk resources, adding capacity only when it is needed. At the same time, because the technology is software-based, it can be seamlessly integrated with existing storage infrastructures to protect legacy technologies and investments.

Another benefit of this approach is that storage resources can now be distributed wherever convenient. Access is simple, across the LAN or WAN, just as it is for any other application. As a result, the system offers built-in, no-cost, off-site storage options. Creating backup sets of tapes or disks is no longer required, and gone are the high costs of trucking these media to offsite locations. Easy network accessibility to online disk-based storage from offsite locations also enables companies to respond to any regulatory or legal requirements for historical data with unprecedented speed. With fingertip access to any data, any time, time-consuming and expensive tape retrieval and restore processes from multiple incremental backup tapes are as outdated as eight-track audio tapes.

PDVR also enables enterprises (for the first time) to cost-effectively deliver the full features of a robust corporate backup system to widely distributed branch offices, with no implementation costs or hassles, no additional media investments, and no additional management overhead. Corporate data is simply corporate data, wherever it's created, however it's used, and whoever shares it. So why have multiple systems to back it up?

Backups that rely on PDVR are also fast. So fast, in fact, that 1.1 terabytes (the typical amount of data backed up during an average eight-hour backup window) can now be completely backed up in one hour. Smaller servers or datasets can even be backed up during the business day. In other words, backup windows, and all the limitations and havoc they wreak, are now a thing of the past. There never was a business reason for backup windows; they were simply an artifact of the way tape works. And now, with PDVR, daily network operations are never impacted by backup operations. As a result, system availability is optimized, enabling many enterprises to defray additional capacity investments.
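
A rough back-of-the-envelope calculation shows why a one-hour window is plausible. The sketch below is not from the article's data: it assumes decimal terabytes, and it assumes that the 10x to 100x reduction quoted earlier applies directly to the bytes that must cross the network, which the article does not state explicitly. The function name `required_mb_per_s` is hypothetical.

```python
TB = 10**12  # decimal terabyte in bytes (assumption; the article does not specify)


def required_mb_per_s(bytes_to_move, hours):
    """Sustained throughput in MB/s needed to move the data within the window."""
    return bytes_to_move / (hours * 3600) / 10**6


protected = 1.1 * TB  # data volume quoted in the article

full_copy_8h = required_mb_per_s(protected, 8)   # conventional eight-hour window
full_copy_1h = required_mb_per_s(protected, 1)   # same bytes squeezed into one hour
# One-hour window if only 1/10 or 1/100 of the bytes actually has to be sent.
reduced_1h = {factor: required_mb_per_s(protected / factor, 1) for factor in (10, 100)}

print(f"full copy, 8 h: {full_copy_8h:.1f} MB/s")
print(f"full copy, 1 h: {full_copy_1h:.1f} MB/s")
for factor, rate in reduced_1h.items():
    print(f"reduced {factor}x, 1 h: {rate:.1f} MB/s")
```

Under these assumptions, moving all 1.1 terabytes takes roughly 38 MB/s sustained over eight hours, or roughly 306 MB/s over one hour. With a tenfold reduction at the source, the one-hour job needs only about 31 MB/s, less than the original eight-hour stream, and with a hundredfold reduction about 3 MB/s.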

Not to be overlooked are the sizable cost benefits that PDVR brings in the area of staffing. It is a well-known fact that the number of backup administrators required is directly proportional to the amount of data in storage, with one full-time administrator required for every five terabytes of data stored on tape. By comparison, for disk-based storage the same single administrator can manage more than 15 terabytes of data. The bottom line: PDVR-optimized storage not only cuts the volume of data requiring management by as much as 90%, but also cuts the administrator time needed per terabyte by about two-thirds, cutting overall lifecycle backup costs by 50% or more.
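
The staffing ratios above can be turned into a small worked example. This is a sketch using only the figures quoted in the paragraph; the 150-terabyte backup volume is a made-up input, the 90% figure is the article's upper bound ("as much as"), and the result covers administrator headcount only, not the full lifecycle cost.

```python
TAPE_TB_PER_ADMIN = 5     # one administrator per 5 TB on tape (from the article)
DISK_TB_PER_ADMIN = 15    # one administrator per 15 TB on disk (from the article)
VOLUME_REDUCTION = 0.90   # best case quoted in the article


def admins_needed(managed_tb, tb_per_admin):
    """Full-time administrators needed, assuming headcount scales linearly with data."""
    return managed_tb / tb_per_admin


def compare(backup_tb):
    """Return (tape_admins, pdvr_disk_admins) for a given tape backup volume in TB."""
    tape = admins_needed(backup_tb, TAPE_TB_PER_ADMIN)
    reduced_tb = backup_tb * (1 - VOLUME_REDUCTION)
    disk = admins_needed(reduced_tb, DISK_TB_PER_ADMIN)
    return tape, disk


tape_admins, disk_admins = compare(150)   # hypothetical 150 TB of backup data
print(round(tape_admins, 2), round(disk_admins, 2))
```

For the hypothetical 150 terabytes, the two effects multiply: a tenfold smaller volume managed at three times the terabytes per administrator takes the requirement from 30 administrators to about one. Real deployments would see less than this best case wherever the reduction falls short of 90%.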

So, what it comes down to is this: PDVR eliminates tape purchases, the costs of continually upgrading tape libraries and drives, and offsite storage and retrieval costs. At the same time it boosts productivity of backup administrators to unprecedented levels, optimizes network utilization, and facilitates rapid compliance with regulatory requirements while cost-effectively protecting against future legal liabilities. Furthermore, because it requires no proprietary hardware investments, it costs substantially less than any other long-term backup solutions being offered in the marketplace, even while it seamlessly integrates with those and all other existing backup technologies. And finally, the technology is both proven and available. In sum, PDVR is a backup technology whose time has come.

Janae Lee is vice president of marketing at Avamar (Irvine, CA)

www.avamar.com
COPYRIGHT 2003 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2003, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Storage Networking
Author:Lee, Janae
Publication:Computer Technology Review
Geographic Code:1USA
Date:Dec 1, 2003
Words:1420