How will you manage the infinite archive? It may include a return to old-fashioned film.
Can you remember when most data was born non-digital? In the early 1990s, just over 90% of the world's data was born non-digital in analog format. Today, the reverse is true as more than 90% of newly created data is born in a digital format. Whatever happened to all the non-digital archives created from years of generating analog data? Did those archives get converted to digital storage along the way or do they remain potentially deteriorating in archives around the world? Is it even possible to convert old non-digital data to digital data storage, or is our historical data destined to stay in a non-digital state forever? Should some data stay in non-digital form?
Businesses are now more clearly focused on data and its value than ever before. However, most of this focus is on digital data. Numerous legal issues, the impact of compliance, and the fact that data may someday have significant value which can't presently be seen, make nearly all data a candidate for archival status. (Spam is a notable exception.) What is meant by the term archival? Archival storage presents several agendas for storage managers. Some think archival refers to media that needs to be read 10, 20, even 50 or more years from the present time. Others see preserving digital data for infinite periods of time, realizing that the physical media will change many times during the lifetime of data. Some healthcare providers indicate that they will archive medical records for a person's lifetime plus seven years.
The question of how much digital data exists in the world is best addressed by the second University of California, Berkeley, study on digital data creation. This comprehensive study defines magnetic, optical, print and film as the four types of physical media where data is stored. Per the study, magnetic disk and tape (digital) accounted for 92% of the total amount of data stored; film (non-digital) represented 7% of the total; with paper (non-digital) and optical (digital) media storing the remainder (www.sims.berkeley.edu/research/projects/how-much-info-2003/). Archival data is normally referred to as fixed content, meaning that it is rarely modified. Archival data represents a much lower I/O activity level than most other applications. These properties create many unique opportunities and challenges for archival storage device suppliers.
Magnetic tape is the most commonly used data center archival technology. The most commonly quoted figure for the archival life of current generation magnetic tape cartridges ranges from 15 to 30 years, in ideal environmental conditions. Even in an era of significant emphasis on compliance and records retention, that is long enough to make storage administrators comfortable that the media will last. Even if the digital media can be read many years from now, the rate of change for new storage technologies makes the media obsolete in less than 10 years. The difficulty of finding replacement parts, trained maintenance personnel, diagnostics, and operating systems that support old devices now mandates conversion to a new archival technology well before the media's rated useful life is over. The significant improvements in magnetic tape media life since 2000 now enable tape media to exceed the practical limits of the tape drives themselves. Remote electronic tape libraries used as vaults and true offline tape storage remain useful for archival of records that are seldom accessed, offering additional geographic protection against disasters.
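The point that the drive, not the media, sets the practical limit can be put as simple arithmetic. The sketch below is illustrative only: the function name is hypothetical, and the figures are the ones quoted above (media rated for 15 to 30 years, technology obsolete in under 10).

```python
def practical_archive_life(media_life_years: float, device_support_years: float) -> float:
    """Years before data must be migrated: whichever limit is reached first."""
    return min(media_life_years, device_support_years)


# Figures from the text: media rated 15-30 years, drives obsolete in under 10.
for media_life in (15, 30):
    years = practical_archive_life(media_life, device_support_years=10)
    print(f"media rated {media_life} years -> migrate within {years} years")
```

Doubling the rated media life changes nothing here, because the shorter device-support window is what forces the migration.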
Optical Media
Organizations have traditionally relied upon optical media and Write-Once, Read-Many (WORM) media to comply with regulatory requirements for "non-erasable" and "non-rewriteable" storage media. Optical media thrives in the entertainment storage business, but interchange and standards issues can make media management more time consuming for business applications. In addition, optical disks have failed to keep pace with magnetic technologies in either capacity or data rate. Optical DVDs offer 4.7 gigabytes of capacity when writing a single layer of data on a single side of the media, and the latest Blu-ray technologies (providing up to 30 gigabytes) pale compared to the half-terabyte capacities of current tape cartridges and disk drives. The explosion in regulated data storage is pushing the limits of storage capacity well into the terabyte range, pushing past the performance and capacity limits of optical media.
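To make the capacity gap concrete, the following sketch counts how many units of each medium a one-terabyte archive would need. The per-unit capacities are the ones quoted above; the helper name is hypothetical, and the terabyte is taken as 1000 gigabytes for simplicity.

```python
import math

# Per-unit capacities in gigabytes, as quoted in the text.
CAPACITY_GB = {"DVD (single layer)": 4.7, "Blu-ray": 30, "tape cartridge": 500}


def units_needed(archive_gb: float, unit_gb: float) -> int:
    """Number of media units required to hold archive_gb, rounding up."""
    return math.ceil(archive_gb / unit_gb)


archive_gb = 1000  # assumption: one terabyte treated as 1000 GB
for name, capacity in CAPACITY_GB.items():
    print(f"{name}: {units_needed(archive_gb, capacity)}")
```

Under these assumptions a single terabyte takes 213 single-layer DVDs or 34 Blu-ray discs, against 2 tape cartridges.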
WORM Disk Arrives
To solve this dilemma, popular economy magnetic disk arrays combined with user-selectable WORM functionality deliver a non-alterable storage solution that is ideal for archival and regulated data storage. WORM disks use disk arrays (typically built from economical SATA drives) to create large second-tier online storage arrays. SATA-based storage arrays deliver terabytes of online capacity at prices that bring disk storage closer to automated tape than ever before. Prices for SATA-based storage arrays are commonly in the $3-$15 per gigabyte range, compared to automated tape libraries that range from $3 to less than $0.25 per gigabyte. The anticipated progress of tape cartridge capacity with data compression implies that automated tape will maintain its price differential over disk for the foreseeable future.
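The "non-erasable, non-rewriteable" behavior that WORM disk provides can be illustrated with a small in-memory model. This is only a sketch of the semantics: the class and method names are hypothetical, and it does not represent any vendor's interface or how an array enforces the rule internally.

```python
class WormStore:
    """In-memory stand-in for a write-once, read-many volume (hypothetical)."""

    def __init__(self):
        self._records = {}

    def write(self, key: str, data: bytes) -> None:
        """Store a record once; a second write to the same key is refused."""
        if key in self._records:
            raise PermissionError(f"record {key!r} already written; rewrite refused")
        self._records[key] = bytes(data)  # keep an immutable copy

    def read(self, key: str) -> bytes:
        """Records can be read any number of times."""
        return self._records[key]

    def delete(self, key: str) -> None:
        """Erasure is never allowed on a WORM volume."""
        raise PermissionError("WORM records cannot be erased")


store = WormStore()
store.write("invoice-001", b"original contents")
print(store.read("invoice-001"))
```

Any later attempt to call `write` with the same key, or `delete` with any key, raises `PermissionError`, which is the property regulators are asking for.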
Evolving in parallel within the SATA movement is the new concept of MAID (Massive Arrays of Idle Disks) storage. MAID is similar to the RAID concept except that in a MAID storage array, not all disks (currently SATA disks) are spinning all the time. With a MAID subsystem, disks remain dormant (powered off) until requested. Power-up for SATA disks takes about 10 seconds. MAID is aimed at enabling SATA storage to handle an additional level of storage requirements that are partially being addressed by automated tape libraries or are not being cost-effectively addressed by disk or optical storage. This concept is somewhat analogous to an automated tape library with the exception that disks are substituted for tape cartridges. MAID can be viewed as a library of disks.
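The "library of disks" idea can be modeled in a few lines. The sketch below is a toy, not a description of any MAID product: the class name, the `max_active` power budget, and the least-recently-used power-down policy are all assumptions made for illustration. Only the roughly 10-second spin-up figure comes from the text.

```python
class MaidArray:
    """Toy model of a MAID shelf: disks stay powered off until requested."""

    SPIN_UP_SECONDS = 10  # approximate SATA power-up time quoted in the text

    def __init__(self, disk_count: int, max_active: int):
        if max_active < 1:
            raise ValueError("max_active must be at least 1")
        self.disk_count = disk_count
        self.max_active = max_active  # assumed power budget (hypothetical)
        self.active = []  # powered-on disks, least recently used first

    def read(self, disk: int) -> int:
        """Return the simulated wait in seconds before the disk can serve a read."""
        if not 0 <= disk < self.disk_count:
            raise IndexError("no such disk")
        if disk in self.active:
            self.active.remove(disk)
            self.active.append(disk)  # mark as most recently used
            return 0
        if len(self.active) >= self.max_active:
            self.active.pop(0)  # power down the least recently used disk
        self.active.append(disk)
        return self.SPIN_UP_SECONDS


shelf = MaidArray(disk_count=100, max_active=4)
print(shelf.read(42))  # first access pays the spin-up delay
print(shelf.read(42))  # disk is already spinning
```

The first read of a dormant disk pays the spin-up delay and later reads of the same disk do not, which is why the approach suits data that is accessed rarely.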
By reducing the number of disks that are concurrently active, the overall storage subsystem costs can be significantly lowered by simplifying controller complexity. The financial savings increase as storage environments get larger. MAID provides traditional levels of RAID data protection capability, important for SATA drives, to enable higher availability similar to current disk arrays. The most effective usage of MAID storage will be application driven. MAID isn't suitable for all applications, but it is poised to address mid-term archival, lower activity fixed-content data, and backup and recovery more cost effectively than existing disk solutions. MAID products are just now appearing in the market.
Last Ditch Data Recovery: The Return to Analog?
Digital storage has an additional requirement for access compared to analog data--electricity. What if there is an electrical outage for several days? Unfortunately, events of the recent past now indicate this has become a possibility, though still a remote one. Hurricanes, fires, floods, and terrorists raise the odds that we may one day have to cope without electricity for several days. None of the technology selections above will help in a blackout or sustained electrical failure.
In a prolonged blackout, the only reliable read capability remaining is the human eye, which can read analog data. In this absolute "last ditch data recovery" scenario, moving the most critical, potentially life-saving data back to analog film has value. In a life or death situation, film can provide the information needed for survival when electricity-based storage has become non-functional. Paper would be another option, except that the physical space required (compared to film) is most likely prohibitive. Film is immune to viruses and is non-alterable, making it the last stop in archival storage management. As an informational note, Microbox in Germany (www.microbox.de/english/frameset_longterm.htm) has developed an advanced, state-of-the-art laser film recording technique--currently writing data at 12,000 dots per inch with a rated media life of 500 years. Soon CAD-CAM drawings and x-rays with no artifacts can be written on this film and read with magnification by the human eye! Device obsolescence, the lack of parts, operating systems support, and worms and viruses are issues that don't exist in the case of film.
A final consideration for archival storage is the requirement to periodically move the data from older technologies to newer ones. The difficulty of these conversions, and the time they require, increase every day as data steadily accumulates in each business. Moving terabytes (and soon petabytes) of data through servers becomes impractical and degrades the overall system. Device-to-device data transfer offers the most promise, though a workable implementation remains distant. This issue has no clear-cut solution today. Most progressive data centers continuously move 15-25% of their data to newer technologies each year to minimize the disruption, making this possibly the only viable option.
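A quick calculation shows what a 15-25% annual migration rate implies for a full technology refresh. The function name is hypothetical, and the sketch assumes an archive of fixed size; since archives actually keep growing, real cycles would run longer.

```python
import math


def years_to_refresh(fraction_per_year: float) -> int:
    """Whole years needed to move an entire (non-growing) archive once."""
    if not 0 < fraction_per_year <= 1:
        raise ValueError("fraction_per_year must be in (0, 1]")
    return math.ceil(1 / fraction_per_year)


for rate in (0.15, 0.25):
    print(f"{rate:.0%} per year -> {years_to_refresh(rate)} years")
```

At those rates a complete pass takes between 4 and 7 years, which fits inside the under-10-year window before storage technology becomes obsolete.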
Though the complete journey from analog and digital data creation to long-term archival storage could involve a return to analog data, in either case the issue of managing archival storage is without a doubt a pressing one. The archival choices we make today will make possible archival data retrieval tomorrow.