Archival data has a new mission: Critical; it's not what it used to be.


Having chaired two panels at storage conferences in January, I found it clear in both that the category of archival data is quickly demanding a new wave of focus. For years, the term archival data described data in the long-term, decreasing-value stage. As customers and vendors now identify their more pressing storage needs going forward, the amount of data in the category of "long-term retention" is being viewed differently than in the past. Historically, when data reached archival status, it had reached its final state before being deleted, ending the data lifecycle. Archiving almost always assumed that the value of data decreased as it aged. This is no longer the case. Recently, data lifecycle management has taken on renewed emphasis, and at times it seems like all data is critical. The second wave of archival storage management is underway.

Many organizations are now facing increasing regulatory pressure to comply with federal mandates for email, medical, insurance, legal, financial and government classified data. In addition, over half of the digital data being generated annually today (this is approximately one exabyte, or 1 x 10^18 bytes) now falls into the category of "fixed content," meaning that the data doesn't change after it is initially created. Fixed content is sometimes referred to as "reference data," "rich media," or archival data. Fixed content includes storage intensive applications such as critical business applications data, complex legal and reference documents, medical data, email attachments, blueprints, satellite imagery, security surveillance, check images, and broadcast content, among others, all of which is content that is seldom if ever altered.

The assumption that older or aged data has lost its value no longer holds for several specific vertical markets. New applications and a variety of legal and business requirements are driving the need for many businesses to re-examine their archival policies. One of the most visible examples of the emphasis on the increasingly critical value of archival data lies with the HIPAA (Health Insurance Portability and Accountability Act) requirements. Not only does HIPAA require health providers to preserve data for a yet-to-be-determined time period, but the failure to protect critical patient data presently carries with it penalties ranging up to $25,000 per violation. Just the threat of fines and other consequences of noncompliance is encouraging storage administrators to make sure that an increasing number of archival data applications will be kept indefinitely for future reference. The PACS (Picture Archiving and Communications System) application that captures and stores radiology information and other types of medical images is a primary component of the HIPAA requirement. Email archives also fall into this category and face increasing pressure to be retained indefinitely for legal reasons. As a general rule used for common email retention policies, 80 percent of email can be immediately archived. Email will soon require HSM (Hierarchical Storage Management) on steroids to meet the archival demands! Given today's legal, economic and political climate, the value of archival data has never been higher.

The increased emphasis for preserving critical archive data requires a different set of storage attributes than did previous archival management schemes.

Archival Storage and Data Characteristics

* Large-scale storage capacity needed, scalable to petabytes (1 x 10^15 bytes)

* Infinite data retention periods required (measured in years) as the data must be preserved, but not necessarily the media it resides on

* Archive data normally has low access and reference requirements but relatively high data transfer rate (bandwidth) requirements

* Much of archival data is static in nature or "fixed content," unstructured and is stored using a variety of formats

* WORM (Write-Once-Read-Many) capability is increasingly desirable for legal reasons

* Random and sequential access required based on the application

* Delayed initial access time is acceptable (from seconds up to a few minutes)

* Archive data can involve local and remote access (location independent) with many users in many locations

* Needs a data classification taxonomy to enable unique content search and access as some archival searches can cost six figures

* Multiple copies of archival data are needed given the criticality and increasing value of data

* Device security and data security (intrusion protection, authenticity) are required for archival data management

* Archive data requires its own policies consistent with regulatory practices for each industry category

The data lifecycle is traditionally described as having four distinct categories. In each case, we continue to observe that the probability of reuse of data decreases as data ages. In the past, the value of data most often decreased as data aged.

1) The active cycle -- this period often lasts for 30 days, typically disk storage (P>=.5)

2) The reference cycle -- this typically lasts for 60 days, typically disk and automated tape storage (P>=.1)

3) The archive cycle -- this period often lasts up to seven years, typically automated tape, though the new class of archival disks, such as the 160GB and 320GB ATA disks for fixed content storage, is gaining momentum (P<=.01)

4) Destroy/delete cycle -- historically at the end of seven or more years (P<=.001)

Note: P is the probability that the data, file or object will be accessed during the various lifecycle stages.
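
The traditional four-stage lifecycle above can be sketched as a simple age-based classifier. This is only an illustration: the 30-day, 60-day and seven-year durations come from the list, but treating the stages as consecutive (so the reference cycle covers days 30 to 90) and measuring the seven years from the creation date are assumptions, and the function name `lifecycle_stage` is hypothetical.

```python
from datetime import date

# Stage boundaries in days since creation. The durations (30 days, 60 days,
# seven years) come from the list above; treating them as consecutive and
# measuring the seven years from creation are assumptions of this sketch.
ACTIVE_DAYS = 30
REFERENCE_DAYS = ACTIVE_DAYS + 60
ARCHIVE_DAYS = 7 * 365


def lifecycle_stage(created: date, today: date) -> str:
    """Return the traditional lifecycle stage for data created on `created`."""
    age_days = (today - created).days
    if age_days < 0:
        raise ValueError("creation date is in the future")
    if age_days < ACTIVE_DAYS:
        return "active"
    if age_days < REFERENCE_DAYS:
        return "reference"
    if age_days < ARCHIVE_DAYS:
        return "archive"
    return "destroy/delete"
```

A policy built on fixed age thresholds like these is exactly what is being called into question: once the archive cycle extends indefinitely, the final boundary would have to be lengthened or removed for the affected classes of data.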

Though the first two categories of the data lifecycle remain similar to the past, the last two components are changing. The third component, the archive cycle, is now extending indefinitely and often well past the traditional seven-year window. Less data is being deleted and more data is being kept for longer periods of time.

What does this mean to the storage industry? Digital archives are quickly defining new requirements for storage and its management.

Key data requirements for digital archive management:

* Retention/destruction management

* Audit provisions for tracking and reporting

* Long-term data preservation

* Compliance management for legal issues

* Authentication

* High availability for data and devices

* Advanced search and access capability with unique naming conventions (taxonomy)

* An industrial class HSM (Hierarchical Storage Management) for tiered SLAs

* Renewed use of WORM (Write-Once-Read-Many) functionality as certain data must never be changed

* The large-scale storage requirements mandate low-cost storage; TCO (Total Cost of Ownership) becomes a key consideration as data lifecycles increase
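
Two of the requirements above, retention/destruction management and write-once behavior, can be illustrated with a small sketch. Everything here is hypothetical: the `ArchiveRecord` type, its fields and the `may_destroy` function are invented for illustration, and a frozen dataclass only mimics WORM at the application level. Real WORM protection is enforced by the storage media or storage system, not by application code.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class ArchiveRecord:
    """Hypothetical archive entry with a write-once payload."""
    name: str
    content: bytes
    stored_on: date
    retention_days: Optional[int]  # None means "retain indefinitely"


def may_destroy(record: ArchiveRecord, today: date) -> bool:
    """Return True only when the record's retention period has fully elapsed."""
    if record.retention_days is None:
        return False  # indefinite retention: never eligible for destruction
    return today >= record.stored_on + timedelta(days=record.retention_days)
```

Representing "retain indefinitely" as an explicit value, rather than as a very large number of days, keeps the destruction check honest for data that must never be deleted.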

The time of viewing archival storage as the final stage of existence for data is passing. In some cases the value and utility of data is actually increasing as data ages, even if the accesses to that data decline. Archive data is possibly becoming the fastest growing segment of the storage industry in terms of storage demand. What a surprise! Many of today's storage intensive applications are instantly creating fixed content and archive data. Applications including voice, text, graphic images, audio, HDTV, 3-D graphics, and movies all create the demand for archival data preservation. New and emerging digital applications will continue to fuel many years of explosive growth for storage, as terabyte-plus data warehouses, VCR- to HDTV-quality movies, the possible digital cinema, electronic voice and video-mail, digital security systems, and digital photography all will drive major changes in the way we view archival storage. Approximately 10 percent of the digital data produced in the world resides on magnetic disk storage, and an estimated 90 percent of digital storage resides on removable storage media such as tape, optical (CD, DVD) or small-diameter removable disks.

Given this, the storage industry is beginning to view archival data as a much more meaningful class of storage. Though low cost is important, the new requirements for preserving data and making it accessible on a broad scale are real. Again we visit the need and value of a tiered storage hierarchy (and HSM functionality) that differentiates between performance, capacity, retrieval capability and now data protection and security. Rigid disk drives, magnetic tape drives, optical disks, flexible drives and flash memory will all play bigger roles for storing fixed content for a wide class of users. The unique combination of complex objects, along with different availability and bandwidth requirements for archival data, poses several new challenges for the storage management industry. The sheer size of fixed content and archive files changes the rules for moving data from place to place, as transmission times are quickly surpassing current architectural limits.
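
The point about transmission times comes down to simple arithmetic: transfer time is data size divided by link speed. The sketch below works one illustrative case. The one-petabyte archive and the one-gigabit-per-second link are assumed figures chosen for illustration, not numbers from any particular installation, and the calculation ignores protocol overhead and assumes the link is fully utilized.

```python
def transfer_time_seconds(size_bytes: float, link_bits_per_second: float) -> float:
    """Ideal time to move `size_bytes` over a link, ignoring all overhead."""
    if link_bits_per_second <= 0:
        raise ValueError("link speed must be positive")
    return size_bytes * 8 / link_bits_per_second


PETABYTE = 1e15            # bytes (1 x 10^15)
GIGABIT_PER_SECOND = 1e9   # bits per second (assumed link speed)

seconds = transfer_time_seconds(PETABYTE, GIGABIT_PER_SECOND)
days = seconds / 86_400    # 86,400 seconds in a day
print(f"{days:.1f} days")  # prints "92.6 days"
```

Even under these ideal assumptions, moving a petabyte-scale archive takes about three months, which is why archive size changes how data relocation has to be planned.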

How critical is archive data? Can we actually begin to call certain types of archival data mission-critical data? If mission critical describes data that is mandatory to instantly resume business operations in case of any type of disaster, archival data is probably not classified as mission critical. In terms of the value of archival data to businesses, it is clearly becoming increasingly critical.
COPYRIGHT 2003 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2003, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Author:Moore, Fred
Publication:Computer Technology Review
Date:Feb 1, 2003
Words:1398