Long Term Data Preservation

The past 40 years have seen the information revolution accelerate, and the new millennium clearly promises to be labeled the Information Age. Estimates now indicate that half to two-thirds of the world's data is being "born digital," meaning that its original occurrence was in a digital format. By the year 2004, it is projected that as much as 14% of the known data in the world will be captured in machine-readable digital format. Nearly 86% of the world's data will remain on paper, microfiche, various films, or other non-machine-readable formats.

The digital data storage technologies map into a hierarchy of fixed and removable media products that are making mass storage, data archiving, and electronic data vaulting affordable realities. Automated libraries using magnetic tape, possibly small form-factor magnetic disks, the DVD, and possibly other emerging storage media will become the foundation containing most of this mass-storage growth. New and emerging digital applications will continue to fuel a period of explosive growth for storage well into the next century as terabyte-plus databases for a variety of new applications, data warehouses, and electronic voice and video mail systems all drive up requirements.
The storage demand created by the public Internet has yet to be determined and will generate countless new application and e-business developments along with many storage management challenges. Today's assessment of high-capacity data storage systems identifies data preservation as a looming crisis of dramatic proportion. With so much activity centered on data collection, storage, retrieval, and transmission, we are faced with trying to understand how to preserve and archive this data on a long-term basis. Of the world's digital data, approximately 90% resides on mass storage or removable media technology such as magnetic tape and optical disks. The other 10% resides at the higher end of the storage hierarchy on fixed magnetic disk subsystems and solid-state memory devices.

The rate of change and subsequent obsolescence of high-capacity storage systems is now a strategic area of focus for suppliers and customers alike. Storage media typically will last longer than the storage hardware components and can leave large volumes of data in legacy formats when hardware upgrades occur. A report from the National Media Laboratory in St. Paul, MN, published in 1998, indicated that magnetic tape formats would last between ten and twenty years if kept within environmental guidelines for temperature and humidity. The report also noted that human-readable media could remain readable for much longer periods of time than computer-based storage media. Newspapers can be read clearly for ten to thirty years, depending on environmental conditions, and microfilm and microfiche can enjoy a usable life of up to one hundred years or more.

The industry's optimal data storage offerings readily provide for backward compatibility in reading older formats on the newer storage devices. At some point, however, the data will have to migrate to a newer technology, as software cannot continue to support older formats forever. What is the value of having media that is readable twenty-five years or more from now if no software, replacement parts, diagnostics, or maintenance services will exist for the storage devices that understand the data format or recognize the media type?

With digital content increasing on the order of 60% annually or more across all computing platforms, successful long-term data survivability relies on scalable storage architectures with high-bandwidth parallel data paths so that data migration can occur in a reasonable time. A 20MB/sec SCSI channel moving data at its maximum rated speed could move or, in this case, migrate 72GB of data per hour. A terabyte of data would take 13 hours and 53 minutes to migrate (at rated speeds) to a newer technology. It would take a week to move just over twelve TB at 20MB/sec. Using multiple data paths in parallel can bring a server and its corresponding application to a halt for a considerable amount of time. Is this acceptable? Wouldn't a SAN be an ideal solution for this ongoing long-term storage activity?

The need for both high bandwidth and parallel transfer capability for storage subsystems becomes obvious, particularly with the amount of digital content growing at 60% or more annually. Some scientific systems are now acquiring as much as five terabytes of digital data per day. By comparison, data transfer speeds have only increased on average by about 15 to 20% annually over the past ten years, and progress has been boosted mainly by the more recent jump from 20MB/sec SCSI to 100MB/sec Fibre Channel. Storage capacities will continue to grow much faster than the corresponding transfer capability, and the need for an effective data migration strategy is quickly becoming more widespread.
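The migration arithmetic above can be reproduced with a short sketch. It assumes decimal units (1 GB = 1,000 MB, 1 TB = 1,000 GB) and sustained transfer at the rated speed, which real migrations rarely achieve. The function names, the `paths` parameter, and the 20% bandwidth growth default (the upper end of the quoted range) are illustrative assumptions, not part of any product or standard.

```python
def migration_seconds(data_gb, rate_mb_per_s, paths=1):
    """Seconds needed to move data_gb gigabytes over `paths` parallel
    channels, each sustaining rate_mb_per_s (decimal units assumed)."""
    return (data_gb * 1000) / (rate_mb_per_s * paths)


def as_hours_minutes(seconds):
    """Split a duration in seconds into whole hours and whole minutes."""
    hours, remainder = divmod(int(seconds), 3600)
    return hours, remainder // 60


def gb_per_hour(rate_mb_per_s, paths=1):
    """Gigabytes moved per hour at the rated speed."""
    return rate_mb_per_s * paths * 3600 / 1000


def migration_time_ratio(years, capacity_growth=0.60, bandwidth_growth=0.20):
    """Factor by which the time to migrate everything grows after `years`
    if capacity and bandwidth compound at the given annual rates
    (both rates are assumptions taken from the figures in the text)."""
    return ((1 + capacity_growth) / (1 + bandwidth_growth)) ** years


if __name__ == "__main__":
    print(gb_per_hour(20))                                  # one 20MB/sec SCSI channel
    print(as_hours_minutes(migration_seconds(1000, 20)))    # one terabyte
    print(gb_per_hour(20) * 24 * 7 / 1000)                  # TB moved in one week
    print(as_hours_minutes(migration_seconds(1000, 100)))   # 100MB/sec Fibre Channel
    print(round(migration_time_ratio(5), 2))                # growth gap after five years
```

Under these assumptions, a 60% annual growth in content against a 20% annual growth in transfer speed means that a full migration takes roughly four times as long after five years, which is why parallel data paths matter more each year.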
It is clear that a "capacity is everything" focus, without an understanding of performance and throughput capability, is not a strategic view. Highly effective storage strategies are beginning to address the issue of data preservation by selecting storage solutions that are scalable in capacity and bandwidth, while offering backward read capability with a new media format. Nonetheless, at some point a new media and hardware technology will need to be implemented and the data migration process must begin. To stay ahead of the potentially severe impact of this forward migration process, the most advanced storage strategies are developing specific plans to migrate between 15 and 25% of their digital content to new technologies annually, most often on an application-by-application basis. Given the tremendous size of overall storage pools, the focus is quickly moving to managing storage, performance, and availability by application. This implies that storage hardware environments will completely turn over every four to five years and that removable media will completely migrate to a new media in ten years or less, before facing the obsolescence factor. Along with data migration, the issues of performance, security, and 7x24 availability must all be factored into an effective data preservation strategy.

Can users successfully stay on this path of digital progression in the foreseeable future? We now see the emergence of a number of new companies that offer to become the information utility or data bank for the information community. Their strategy is based on the premise that it will simply be too costly and labor intensive, even with the present progress in storage management capabilities, for many users to continue to manage their own applications and storage explosion. Keep in mind that we are not yet able to assess the impact of the public Internet on storage requirements or of other imminent applications such as video mail, digital security, or electronic medicine.

The growth of mass storage requirements has presented the storage industry with still another set of challenges in the name of data preservation. It is now possible to use intelligent data filters at data collection points to reduce the amount of clearly obsolete data that is stored, never accessed again, and of no value. Made more feasible by low-cost, high-performance microprocessors, these smart data-agents can help us become more selective about what data we actually store. Unlike cleaning out the attic or basement once a decade, these filters work continuously to ensure that we store meaningful data while sending digital junk to the waste can.

This still leaves a significant place in the world of information for non-digital content. If we want to read data well into the future, we will need a device or technology that we know will be here twenty-five to fifty years from now and one that won't likely face obsolescence every ten years. The human eye is the only safe bet at this point. The past forty years of data storage history have demonstrated that, for every limit approached, there is another technology in the labs that extends our vision even further. Our challenge is to identify and deliver them correctly.