The digital tsunami: a perspective on data storage: to meet demands, organizations will need to increase today's storage offerings 10 times. But how will such dramatic increases be addressed by technology and systems in the next few years?
At the Core
* examines current issues and future possibilities in data storage
* compares alternative technologies for data storage
* identifies existing and emerging technologies for data storage
The rapid emergence of digital video is just one indicator of the growing markets for storage systems with much lower costs and much larger capacities. Other major demands come from the standard business model for corporate data storage and the growth of storage service providers (SSPs) that are accumulating data from small business and consumer markets.
According to "How Much Information," a study by the University of California Berkeley School of Information Management and Systems, each of these markets will need to improve today's storage offerings by 10 times to meet escalating requirements. By 2010, a 100-fold increase will likely be necessary, the study shows. As records and information managers must deal with increasingly larger volumes of records, keeping up with storage media trends is becoming more critical.
According to International Data Corp., in fiscal year 2003, Fortune 500 companies spent $7 billion on approximately 1,200 terabytes (1 terabyte = 1,000 gigabytes) of data stored on magnetic tape. Additionally, Jim Porter of Disk/Trend Inc. reports that companies have spent $40 billion to store 1,400 million terabytes of data on magnetic disk. Even though storage requirements have now started to increase at more than 100 percent annually, technology is improving in cost/performance at only 35 percent each year.
This creates a significant gap in terms of the cost/performance of storage systems required to meet the growing need. To allow that data to be stored without causing large cost increases, data storage must move from relatively expensive disks to lower-cost media. In addition, the cost/performance of these lower-cost media must continue to improve at an annual rate of nearly 100 percent to meet the continuing data growth.
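The size of this gap can be illustrated with a little arithmetic. The sketch below is only an illustration, not a model from the cited studies: it assumes that spending scales as the capacity demanded divided by the capacity a dollar buys, and it uses the 100 percent demand growth and 35 percent cost/performance improvement quoted above. The function name `relative_spend` is hypothetical.

```python
def relative_spend(years, demand_growth=1.00, cost_perf_gain=0.35):
    """Relative storage spend after `years`, with year 0 normalized to 1.0.

    Assumption (not from the article): spend is proportional to the
    capacity demanded divided by the capacity one dollar buys.
    """
    yearly_ratio = (1 + demand_growth) / (1 + cost_perf_gain)
    return yearly_ratio ** years


if __name__ == "__main__":
    for year in range(6):
        print(f"year {year}: {relative_spend(year):.2f}x today's spend")
```

Under these assumptions, spending grows by roughly half again each year if data stays on the same class of media, which is why the article argues for moving data to lower-cost media.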
This growth in data can be felt in personal terms also. In the early 1990s, analysts were foretelling that each Fortune 500 company would have a terabyte under management in a few years. In 2000, experts such as Patti Tobin predicted that in 2003, thanks to the advent of digital movies on DVD, commercial music collections on CD-ROMs, and personal music or digital photographs stored on CD-Rs or CD-RWs, each computer-savvy home would have the equivalent of a terabyte of data under its roof. By 2006, as wireless and video technologies become ubiquitous, we should reasonably expect that individuals will move from having hundreds of gigabytes to a terabyte of personal storage on their desktop or laptop. The U.S. government--often a "canary in the mine" for new technology--has several data storage systems under development that will provide about 80 petabytes (80,000 terabytes) each.
This explosive growth of data, called the "Digital Tsunami" by Gary Ashton in his 1996 National Media Laboratory Report "Future Trends in Storage, Interconnectivity, and Data Transfer," is fundamentally driven by the conversion of analog systems to fully digital systems (e.g., film cameras to digital cameras) and a leveling of computer resources available (e.g., each home computer is now equivalent in performance to early supercomputers).
But how will these dramatic increases in data storage be addressed by technology and systems over the next several years?
Comparing the Media
The means of storing computer data has changed little over the past few decades and still consists of only about half a dozen different types of media and formats. These include silicon-based dynamic and static random access memory (RAM), magnetic hard disk drives, optical disks (including write once read many [WORM], magneto/optic, CD-ROM, and DVD disk), and magnetic tape.
In any specific storage application, the preferred medium is selected because of a particular characteristic that is superior to that offered by the other options. RAM offers high data-transfer rates and high-speed access even though it is the most expensive. Hard-disk drives provide high capacity (greater than 100 gigabytes) and moderate access times in an affordable package but, like RAM, are neither removable nor archival, even though in recent years the swap-out capability of drives and media has led to a change in thinking about the removability of hard disk systems. CD-ROM provides removable disk media of limited capacity (650 megabytes), and the DVD provides 5 gigabytes. Emerging formats of the WORM optical disk from Sony and Plasmon offer 30 gigabytes of archival storage in a 5.25-inch format.
The data-sharing function of the floppy is now facing obsolescence due to its slow data rate and low capacity. It is being replaced in sophisticated installations by portable universal serial bus (USB)-oriented drives and media. The most affordable means of storing digital data is provided by tape, and tape is usually selected for archiving large databases. The result is a mix of storage products from which the designer must choose to optimize system performance at the lowest cost. This results in a large variety of physical form factors and a need to interface among the various types of media.
An ideal solution would provide data storage with the following characteristics:
* Rapid access
* Fast read/write data rates
* Low cost per data byte
* Archivability (as an option)
* The option to be managed by an autoloader or jukebox
A storage medium that could offer the lowest cost and best performance--plus the specific set of the ideal characteristics--would replace most other types of storage in use today. A drive with removable, high-capacity, magnetic disk media potentially meets these criteria. For several reasons, however, it is not practical using today's data storage technologies. Simply put, magnetic storage is inherently temporary. Tapes, the highest-capacity removable media, are serial in nature and, therefore, slower to access. The areal density (amount of data per square inch) of optical storage is limited by the wavelength of light it uses.
Tape (1) is a more mature and stable technology (the first magnetic tape drives were introduced in the early 1950s) and (2) is slower due to the serial nature of search and retrieval. These two features make it a less attractive product for many industry insiders. Over the past 20 years, various experts have stated that "tape is dead." While there is a kernel of truth in this statement, it deflects the larger issue that remains for the foreseeable future: There is no lower-cost alternative than tape for the safekeeping of large amounts of data.
Magnetic tape has been fighting several battles over the past 12 years. It has succeeded in offering the lowest cost/storage alternative for a variety of computer users in a wide range of markets, from workstations, to workgroups, to department-sized systems, to enterprise (campus/nationwide) systems. This advantage varies as new disk and tape products are introduced, but, in general, tape remains about five to 10 times less expensive than disk.
Tape suffers from several features that are endemic to its character. First, it is a removable medium, meaning that to access any file on a dismounted tape, extra time is required to find and load the tape in the system. Second, unlike disk or RAM, tape is inherently sequential and imposes delays on retrieving any file housed on a physical data set. Third, tape has a transfer rate about four times slower than disk and, therefore, must be buffered. In the past, these deficiencies were largely inconsequential compared to the more important issue of lower cost. Magnetic disk systems have continued to improve, and their cost has continued to fall.
Smaller individual users demand more convenience, and the lower cost of tape for smaller systems has become unimportant relative to other system factors such as software and media handling features. This loss of net benefit has caused a significant sales decrease in low-end tape systems. For big systems (e.g., department and enterprise), however, the cost differential remains important because the data volumes are so much larger. The average price for a multi-terabyte-disk storage system is currently about $40 per gigabyte, while the average cost for large-system tape is about $7 per gigabyte.
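To make the large-system differential concrete, the following sketch compares total media cost for a data set of a given size. It is a rough illustration that uses only the two average prices quoted above ($40 per gigabyte for multi-terabyte disk systems, $7 per gigabyte for large-system tape) and the 1 terabyte = 1,000 gigabytes convention used earlier in this article. The names `system_cost` and `compare` are hypothetical, and software, handling, and operating costs are ignored.

```python
DISK_USD_PER_GB = 40   # average multi-terabyte disk system price quoted above
TAPE_USD_PER_GB = 7    # average large-system tape price quoted above
GB_PER_TB = 1_000      # convention used in this article


def system_cost(terabytes, usd_per_gb):
    """Storage cost in dollars for `terabytes` of data at a flat per-gigabyte price."""
    return terabytes * GB_PER_TB * usd_per_gb


def compare(terabytes):
    """Return disk cost, tape cost, the savings from tape, and the disk-to-tape ratio."""
    disk = system_cost(terabytes, DISK_USD_PER_GB)
    tape = system_cost(terabytes, TAPE_USD_PER_GB)
    return {
        "disk_usd": disk,
        "tape_usd": tape,
        "savings_usd": disk - tape,
        "disk_to_tape_ratio": disk / tape,
    }


if __name__ == "__main__":
    for size_tb in (1, 10, 100):
        print(size_tb, "TB:", compare(size_tb))
```

The ratio is the same at every size, but the absolute savings scale with the data volume, which is the reason the differential matters for department and enterprise systems and much less for small ones.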
Although tape has some disadvantages when compared to disk media, some of them are actually advantages in many applications. For example, the fact that tape is dismounted after use becomes an advantage to users who want to hold their data for long-term retention. For archival requirements, users do not want to keep their valuable data on a spinning platform. The National Storage Industry Consortium (NSIC) is an industry-wide group that projects technology developments and user requirements into the future (usually about five to six years out). NSIC comprises such companies as IBM, StorageTek, Seagate, Sony, Hewlett-Packard, and Quantum and has generated comparative information between tape and disk. Based on that information, NSIC expects tape to maintain better than a 3-to-1 advantage in the capacity/cost ratio when compared with hard-disk systems. However, currently there are several vertical markets--legal, financial, entertainment, government, geophysical, and medical imaging--that are already demanding WORM media without regard to the size of the data set. Each of these vertical markets, coincidentally, is generating large amounts of data and, in most cases, is already using extended data set management systems such as hierarchical storage management (HSM) and file storage management systems (FSMS). Many of these applications use WORM optical disks to meet archival requirements. This trend will likely continue as broadband communication encourages more data traffic deemed archivable by users.
WORM media is currently the preferred approach for those large data centers that do not erase data on tape. It is expected that the percentage of backup operations will continue to shift in favor of the large data set approach during the next few years, putting WORM systems in an even stronger position. Erasable media will continue to offer cost benefits in selected back-up operations in the future.
Current Data Storage Technology Limits: A Brief Comparison
The "Storage Technology" graphic below provides a brief survey of existing and possible emerging technologies for data storage. This chart shows storage technologies in terms of cost of storage vs. the time needed to access the stored data. Flash memory is the most expensive and fastest storage while tape offers the lowest cost and the slowest access. Holographic storage, two-photon storage, and micro-mechanical storage devices are included, but the market availability of these is too far into the future to be considered for any near-term records management solutions.
The data storage density of devices based on traditional optical recording is limited by their physical ability to write data marks one wavelength in size on a recording material. However, a major benefit of laser recording is its inherent ability to project a multiplicity of beams in close proximity, thus enabling high-capacity, multichannel recording.
Magnetic recording has two inherent disadvantages. One is that magnetic fields cannot be focused at much distance, so the read/write heads must be in close contact with the recording media. This leads to fly heights of about 1/1000 of the width of a human hair, requiring smooth media surfaces and limiting operation to a rigorously controlled media environment. This prohibits use of removable media in practical disk systems and limits data mark size in magnetic tape systems. The other is that, due to the small "fly height" and inherently reversible media, high data density magnetic disk storage is inherently incompatible with the need for both removable media and archival storage. The highest data density currently obtained by a system using removable, archival media is by green-laser-based optical tape systems, which are not yet available commercially. Their projected specifications offer a remarkable advantage over current storage solutions.
A Glimpse of the Future
Many articles about holographic storage have discussed its performance characteristics and limitations. Fundamentally, the technology is a page-oriented storage in which frames of perhaps 1,000 by 1,000 data points are imaged into a crystalline material and stored there by light patterns. Changing the input angle of one of the beams permits a new hologram to be stored in the same volume of material. This technology, then, could offer terabit storage capacities in small spaces. Steering beams to the various desired angles can be achieved mechanically in relatively low-cost systems. Terabit Storage Corp. expects large data capacities if adequate signal-to-noise ratio (SNR) can be obtained in a single module. Overlaying successive pages of information into the same physical media volume inherently reduces the SNR of previously stored data.
There are technological problems that limit the useful application of holographic storage. One of these issues is the lack of a suitable material for read/write applications. Holographic storage is essentially analog rather than digital, and data longevity and the desire for writing and then totally erasing the data at low input powers are opposing requirements in high-volume analog media.
Holographic storage could be preferred over disk if not for the prohibitive cost. Holographic page composer systems could be in production by 2005 with the performance characteristics mentioned, but the cost per module could be in the $2,000 range, or about $20 per gigabyte. This can be compared to the cost of $2 per gigabyte for 100-gigabyte magnetic disk drives expected in the same time frame (about $20 per gigabyte for full system cost).
Two-photon storage is a technology whereby the intersection of two optical beams in a volume storage media locates the data of interest. Specific characteristics are required in each beam. At a user price of several thousand dollars, a device of this performance is not expected to be competitive with hard-disk technology.
A limiting factor in two-photon systems is the need for short-pulse lasers in the picosecond range (i.e., 1 trillionth of a second). This requirement probably places the cost of such devices beyond commercial availability for the foreseeable future. A price of perhaps $2,000 at the optimistic 100-gigabyte capacity produces a cost of $20 per gigabyte. The technology is only competitive if ultra-short pulse lasers become available for less than $100, and that seems an unlikely prospect.
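The cost-per-gigabyte figures used in this comparison follow from a simple division, sketched below as a sanity check. The only inputs are numbers quoted in this article: a module price of about $2,000, an optimistic capacity of 100 gigabytes, and the $2 per gigabyte expected for 100-gigabyte magnetic disk drives. The function name `cost_per_gigabyte` is hypothetical.

```python
def cost_per_gigabyte(module_price_usd, capacity_gb):
    """Dollars per gigabyte for a storage module of the given price and capacity."""
    if capacity_gb <= 0:
        raise ValueError("capacity_gb must be positive")
    return module_price_usd / capacity_gb


# Figures quoted in the article.
TWO_PHOTON_PRICE_USD = 2_000
TWO_PHOTON_CAPACITY_GB = 100
DISK_DRIVE_USD_PER_GB = 2

two_photon_usd_per_gb = cost_per_gigabyte(TWO_PHOTON_PRICE_USD, TWO_PHOTON_CAPACITY_GB)
premium_over_disk = two_photon_usd_per_gb / DISK_DRIVE_USD_PER_GB

if __name__ == "__main__":
    print(f"two-photon module: ${two_photon_usd_per_gb:.2f} per gigabyte")
    print(f"premium over disk drives: {premium_over_disk:.0f}x")
```

On these figures the device costs about 10 times as much per gigabyte as the disk drives it would compete with, which is the basis for the pessimistic assessment above.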
Though similar in form (e.g., cartridge, enclosure), optical recording on phase-change, write-once tape media differs somewhat from magnetic tape recording due to differences in the write/read optical head and its media. Like magnetic tape, the optical tape substrate consists of a thin Mylar® base. Marks made on the specially formulated, optically active media layer are permanent and unalterable, so the data on them cannot be replaced with new information.
Optical tape promises significant advances over magnetic tape in several dimensions. Based on technology demonstrations, data rates from 100 megabytes per second to 300 megabytes per second appear possible by 2005. Access times below 15 seconds have been shown. A capacity of 1 terabyte on a single 4-inch-by-4-inch cartridge has also been demonstrated, and 5 terabytes are projected by switching to the thinner base material now in use for magnetic tape. A total system-storage cost of less than $1 per gigabyte is projected for the first-generation products.
A major limiting factor for most optical memory systems is that achieving sufficient storage capacity requires moving the media relative to the optical head (such as in disks and tape). This results in slow access or requires a large, instantaneous recording field, which is impractical. A potential solution to this problem is the use of "spectral-hole burning." Here, multiple bits of information are written/read at the same location in the media by varying the write/read wavelength. Media capable of writing a thousand or more bits in the same location have already been demonstrated. Such a system potentially provides 1-gigabyte capacity and microsecond access times and is not unduly expensive.
However, a limitation is that the stored data is slowly erased by thermal agitation and has to be refreshed every half an hour or less. The technology will probably not be cost competitive in modularities under several gigabytes and, although potentially useful, the technology is several years from any practical application.
Micro-Electro-Mechanical (MEMS)/Scanning Probe
Several data storage concepts based on a new micro-mechanical technology are now being researched. The development of atomic-force microscopes and nanometer-sized (1 nanometer = 1 billionth of a meter in width) probes has led to consideration of scanning probe-based data storage, which might be available by 2004 at the earliest. The technology is based on the premise that arrays of thousands of micron-sized probes can be fabricated like printed circuitry. In the 2004 time frame, costs must be judged relative to anticipated dynamic random access memory (DRAM) at $500 per gigabyte and hard disks at $2 per gigabyte, and must be no greater than $50 per gigabyte to be competitive. This means $50 for a 1-gigabyte chip, a price point that seems possible. In large-scale production, price equality with hard disks is plausible. The 1-gigabyte modularity, printed circuit board mounting, sub-millisecond access, and low power consumption make MEMS an attractive future technology.
Nano-scale recording is enabled by directing a precision electron beam onto a disk, recording at a density far beyond that achievable magnetically. This storage technology will easily outperform magnetic hard drives, rapidly surpassing the data-density limit of magnetic storage and, theoretically, will provide the ability to store many thousands of gigabytes on a CD-ROM-sized disk. In addition to offering far higher capacity and data rates, the new technology does not require a flying head--the mechanism that reads data from or writes data to a magnetic disk or tape--and, therefore, eliminates the possibility of "head crashes," the bane of existing hard drives. According to Disk/Trend Inc.'s Porter, an initial data density of 160 gigabytes per inch is expected by 2006, compared to the anticipated magnetic barrier of about 100 gigabytes per inch. The technology limit appears to be at least 500 times greater at 80 terabytes per inch. Data transfer rates well over a gigabyte per second should be achievable.
Other Revolutionary Storage Technology
In the next 15 years, scientists will have narrowed the width of lines etched into semiconductors to less than one-tenth of a micron, meaning that electrical signals running through those circuits will contain so few electrons that adding or subtracting a single one could make a difference in the computer's functions. To control the movements of small groups of electrons, researchers are developing quantum dots that can corral rambunctious electrons, allowing them to escape only when zapped by a precisely sized boost of energy from outside. Some of the possible technological developments that could benefit from a new generation of scientific advances over the next two decades include:
* Electron Trapping Optical Memory (ETOM)--A new kind of optical disk in development uses the principle of electron trapping to store data. Data is stored by changing the state of electrons in the media. First-generation ETOM devices were unveiled in 1992. There is potential to store 14 gigabytes of uncompressed data on a single double-sided 5.25-inch disk with a transfer rate of 120 megabytes per second. ETOM media can store both analog and digital data. The technology remains in the lab as the developer works on a cheaper laser and improved disk manufacturing techniques.
* Liquid Crystal Technology--Data is written into specialized crystal material at 793 nanometers. The technology has shown an ability to create optical data storage densities at 8 gigabytes per square inch--a world record--and has the potential of reaching 100 gigabytes per square inch. The application is for optical switching and routing and will enable sub-gigabit-per-second to near-terabit-per-second data stream bandwidth. Devices developed will provide high speed and capacity and will work for archival storage in conjunction with magnetic disk or tape.
* Surface-Enhanced Raman Optical Data Storage (SERODS)--This highly experimental technique uses a method of scattering laser light on a molecular surface and offers storage densities as much as 100 times greater than the conventional optical disk.
If the existing "digital tsunami" continues, as most believe it will, over the next several years, the size of a greater portion of data sets will expand beyond the ability of the older back-up model to maintain them. In these cases, a more affordable technology will be needed than can be provided by projected magnetic disk and tape offerings. New technologies that take advantage of optical or nanoscale features are most likely to offer reasonable alternatives in the next three or four years. These technologies will present some operational and obsolescence risks to users, but such risks will be more than offset by the greatly improved price and performance features.
Ashton, Gary. "Future Trends in Storage, Interconnectivity, and Data Transfer." National Media Laboratory Report. 1996.
International Data Corp. "Worldwide Tape Drive Market 2000-2004." Available at www.idc.com (accessed 11 November 2003).
Porter, Jim. "A Fast 15 Years." Insight. July/August 2001. Available at www.datareader.org/0103/idema.html (accessed 12 November 2003).
Tobin, Paul. "The Coming Storage Explosion: A Terabyte in Your Neighborhood." THIC Conference Presentation. 2000.
UC Berkeley School of Information Management and Systems. "How Much Information?" Berkeley, CA: SIMS, 2000. Available at http://sims.berkeley.edu/edu/research/projects/how-much-info (accessed 11 November 2003).
Joe Straub has worked in the high-performance computing and data storage industries for the past 20 years. He is the co-founder of SRA Corp., SaxpySappy, and LOTS. He is active in advanced research and development and consults for Imation, e-Phocus, and Lockheed/Martin. He may be contacted at firstname.lastname@example.org.