PROFILING THE STORAGE HIERARCHY.


Where should your data reside?

Estimates now indicate that half to two-thirds of the world's data is being "born digital," meaning that its original occurrence is in a digital format. By the year 2004, it is projected that as much as 14% of the known data in the world will have been captured in a machine-readable digital format. Nearly 86% of the world's data will remain on paper, microfiche, charts, graphs, pictures, various films, or other non-machine-readable formats. Best estimates indicate that 10% of all digital data resides on disk or online storage, while roughly 90% resides on removable and mass storage technologies. The age-old rule that 80% of the activity goes to 20% of the data still holds.

Today's digital data storage technologies map nicely into a hierarchy consisting of both fixed and removable media products. The removable products are making mass storage, long-term data archiving, and electronic data vaulting affordable realities. The notion of a hierarchy has been used for nearly twenty-five years by the storage industry to relate the various tradeoffs of storage products and subsystems. Faster and more expensive products occupy the high end of the hierarchy and slower access, higher capacity, and less expensive products occupy the lower levels.

Though many have predicted the hierarchy would ultimately evolve to become a seamless single level of storage, this goal has seldom been achieved. Three key parameters have normally been used for selecting the optimal level for data placement in the storage hierarchy. They include 1) the size of the file or application, 2) the performance requirements, and 3) the price of the subsystem. Now, a fourth area has joined the primary selection criteria: availability coupled with quality of service. Broad ranges continue to exist within all four parameters and some obvious access time and cost gaps remain within the hierarchy. Anyone can decide to put all data on disk storage and just "buy more storage" when needed. This is the easiest and possibly the more cost-effective methodology up to about the 200-250GB range at today's storage and personnel costs. Beyond that capacity range, however, choosing and managing the best combination of storage devices and technologies becomes a more cost-effective strategy than not managing storage and just adding more hardware. Remember, the cost of people increases each year and the price of storage falls each year.

Inside The Hierarchy

Memory products based on DRAM (solid-state) technology occupy the highest levels of the hierarchy and are the fastest and most expensive. Large OS/390 class enterprise servers also use the less expensive expanded storage level to hold ultra-high access data tables and virtual pages, but expanded storage cannot execute programs. The comparable trend in other servers is simply to add more main memory, which can execute programs. Solid State Disks first appeared in 1978 for the mainframe market and provide the highest I/O performance of any storage device. The use of DRAM technology removes the latency and seek components from an I/O operation. This class of device resides on an I/O channel rather than on an internal memory bus, where expanded storage resides. It appears to the operating system (Unix, Linux, NT, etc.) as one or more ultra-fast magnetic disk drives. Today, these devices approach 20GB capacities and offer fault-tolerant designs that effectively address availability concerns.

Magnetic disk storage holds an estimated 95% of all of the world's mission critical data and these disks have clearly defined both a high-performance and high-capacity level in the hierarchy with price per megabyte and access density being the most apparent differences between these levels. Technological progress in magnetic storage on all fronts has been phenomenal by any measure and there appears to be no near-term end in sight. Coupled with widespread caching capability, disk is typically the most versatile level of the hierarchy. Backup and, in particular, recovery of critical disk storage in a timely manner leaves many challenges and presents developers with many opportunities for improvement.

As data becomes less active or moves to an archival state, it becomes too costly to leave the data on spinning magnetic disks 24x365, consuming electricity, generating heat, and occupying increasingly costly real estate. Migrating the less frequently used data to a lower cost and lower performance level of storage becomes more attractive. In just a few years, we will regard the Intelligent SAN (ISAN), a storage network with embedded hierarchical storage capabilities using intelligent metadata, as optimal. This will enable transparent data movement between the appropriate levels of the hierarchy independent of any server. We will move a step closer to a single-level storage concept that lets us get the right data to the right place at the right time.

Once a favorite technology for the future, optical disk storage now is centered on CD-ROM and DVD technology. Optical disk has been squeezed from the general storage hierarchy, as it has not kept pace with magnetic disk and tape developments in areal density, price, access time, and performance. WORM optical product shipments are now in significant decline. Future optical devices remain under development, as breakthroughs are usually "just around the corner." DVD offers some promise, though standards issues and low performance and transfer rates limit much of its potential to that of being a replacement for CD-ROMs.

Nearline defines the level of storage between disk and far-line, or shelved, storage. It uses robotics to store and retrieve media automatically, which is typically quicker than human retrieval. Over thirty companies supply various forms of Nearline storage. The volumetric efficiency of Nearline exceeds that of all other technologies and its price per megabyte purchased remains lower than that of disk by a factor of ten or more. Today, Nearline storage consists of robotics that access either magnetic tape cartridges or optical disks. Other media types are under consideration, including small form-factor magnetic disks. This level will contain 85% or more of the world's digitally stored data for the foreseeable future.

Nearline has some remaining limitations, however. Because robotics must mount the media on a read-write mechanism, reaching the first file or byte of data typically takes five to ten seconds. This leaves a major access time gap between online disk, with access times in the range of ten to twenty milliseconds, and Nearline, with initial access times measured in seconds.
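To put the size of that gap in perspective, the short sketch below divides the quoted Nearline mount times by the quoted disk access times. The figures are the ones given in the paragraph above; the constant and function names are illustrative only, and the result is a rough order-of-magnitude comparison, not a benchmark.

```python
# Figures quoted in the article (circa 2000); all names here are illustrative.
DISK_ACCESS_MS = (10, 20)     # online disk access time range, in milliseconds
NEARLINE_MOUNT_S = (5, 10)    # robotic mount time to first byte, in seconds


def access_gap(disk_ms, nearline_s):
    """Return the (smallest, largest) ratio of Nearline to disk first-access time."""
    disk_fast, disk_slow = disk_ms
    mount_fast, mount_slow = nearline_s
    smallest = (mount_fast * 1000) / disk_slow   # fastest mount vs. slowest disk
    largest = (mount_slow * 1000) / disk_fast    # slowest mount vs. fastest disk
    return smallest, largest


low, high = access_gap(DISK_ACCESS_MS, NEARLINE_MOUNT_S)
print(f"Nearline first access is roughly {low:.0f}x to {high:.0f}x slower than disk")
```

With the quoted ranges, the gap works out to between two and three orders of magnitude.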

Secondly, Nearline tape storage is best suited for sequentially accessed data. It was originally hoped that optical disk would fill the removable storage, random access requirement, but its slow progress keeps it from addressing this requirement on any broad scale. Backups, recovery, batch processing, and archiving applications are primarily created and accessed sequentially. Virtual tape, using a disk buffer that appears as a tape drive to the attached server, is beginning to help address this issue, though its real promise is far from fulfilled.

Finally, the overall cost of ownership to manage tape-based storage is often viewed as higher than that of disk and more labor-intensive. These three parameters, initial access time, sequential access only, and a perceived higher management cost, represent the next frontier for the Nearline and mass storage providers to address.

Far-line, or manually retrieved shelf storage, still represents the vast majority of the world's non-digital data. The path to digitization is only slowing, not halting, its rate of growth. Managing and exploiting the hierarchy to its maximum benefit remains a challenge on all computing platforms, but as storage grows, the payoff for implementing an effective storage hierarchy is enormous. Automated libraries using magnetic tape, possibly small form-factor magnetic disks, the DVD, and possibly other emerging storage media will become the foundation containing most of this mass-storage growth. New and emerging digital applications will continue to fuel a period of explosive growth for storage well into the next century as terabyte-plus databases, data warehouses, electronic voice, and video mail systems all drive up requirements.

We have now made the creation and manipulation of data, our most valuable resource, the primary focus of the new millennium. Computer power and storage capacity have been the most potent technologies driving the Internet Age and their role for the foreseeable future is in hand. This is about to change. Information is power, but if it is immobile and thus can't be readily moved to the right place, we have created "the worldwide gridlock." Our future vision begins to shift our attention toward communications capability as the next new driving force following the digital era we are well into. When this happens, the longstanding storage hierarchy should become the concern of the few, not the many.
COPYRIGHT 2000 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2000, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Technology Information
Author:Moore, Fred
Publication:Computer Technology Review
Geographic Code:1USA
Date:Mar 1, 2000
Words:1425