Distributed backup is the key to ILM.
Information Lifecycle Management (ILM) solutions can significantly reduce the cost and complexity of data storage, but to reap the greatest rewards, ILM relies on a backup system that is ILM-aware. ILM has two goals. One is to minimize administration costs. The other is to make the most efficient use of storage hardware. Without a backup architecture that maximizes or even enables ILM, these goals cannot be realized effectively.
The Case for ILM
Since enterprises are so dependent on information about their processes, products, customers and suppliers, data storage is a challenge for IT executives and storage administrators everywhere. Reliable and secure data storage is crucial to business continuity plans. Many industries, such as finance and health care, face new regulatory policies that mandate ever-increasing durations of data retention.
Because of the combination of more data and longer retention times, the cost of managing information throughout its lifecycle grows as much as 20% to 30% per year, according to some estimates.
Though opinions vary, for the purposes of this article ILM will be defined as a data archiving process that automatically moves data to the most cost-effective storage media, based on predetermined policies of accessibility, security and long-term storage. Data is transferred automatically, with no manual intervention required, reducing hardware and real estate costs. As a result, ILM vendors promise a significant Return on Investment (ROI).
Archiving Versus Backup
All of an enterprise's data can be placed into one of two categories. Critical information is that which is needed for day-to-day operations and resides in the system's primary storage for fast access. Important information is the historical, legal and regulatory information that can safely be archived to secondary storage--lower cost disk or tapes stored offsite.
Critical data is typically accessed often. However, as a given file is accessed less and less frequently, over time this data eventually changes from critical to important. If, as a matter of policy, a file ceases to be critical and becomes important after ninety days of inactivity, an ILM solution automatically archives this data after ninety days to secondary storage, without any intervention by IT personnel. ILM solutions create a pointer or placeholder for every file moved to secondary storage. Should a user ask for a file after ninety days (if the important information becomes critical) this placeholder points to the new location and the system can retrieve it and move it back to primary storage.
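The archive-and-recall cycle described above can be sketched in a few lines of Python. This is only an illustrative model, not any vendor's implementation: the names `TieredStore`, `Placeholder`, `apply_policy` and the `archive/` key prefix are hypothetical, and in-memory dictionaries stand in for primary and secondary storage. Only the ninety-day threshold comes from the article.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Policy threshold taken from the article's example: ninety days of inactivity.
ARCHIVE_AFTER = timedelta(days=90)


@dataclass
class Placeholder:
    """Pointer left on primary storage; records where the file now lives."""
    archive_key: str


class TieredStore:
    """Hypothetical two-tier store; dictionaries stand in for real storage."""

    def __init__(self):
        self.primary = {}      # file name -> bytes or Placeholder
        self.secondary = {}    # archive key -> bytes
        self.last_access = {}  # file name -> datetime of last read or write

    def write(self, name, data, now):
        self.primary[name] = data
        self.last_access[name] = now

    def apply_policy(self, now):
        """Move every file inactive for ARCHIVE_AFTER or longer to secondary."""
        archived = []
        for name, entry in list(self.primary.items()):
            if isinstance(entry, Placeholder):
                continue  # already archived
            if now - self.last_access[name] >= ARCHIVE_AFTER:
                key = f"archive/{name}"  # assumed naming scheme
                self.secondary[key] = entry
                self.primary[name] = Placeholder(key)
                archived.append(name)
        return archived

    def read(self, name, now):
        """Return file contents, recalling the file to primary if archived."""
        entry = self.primary[name]
        if isinstance(entry, Placeholder):
            entry = self.secondary.pop(entry.archive_key)
            self.primary[name] = entry  # important data becomes critical again
        self.last_access[name] = now
        return entry
```

In this sketch the policy run needs no operator input, and a read of an archived file follows the placeholder and restores the file to primary storage, which mirrors the behavior the article attributes to ILM solutions.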
Archiving moves data that is no longer needed for day-to-day operations to long-term storage. This is functionally distinct from backup, which protects operational, critical data before it can be archived.
One key failing of backup systems that are not ILM-aware is that they will continue to store backup files on tape or secondary disk, even though this data has been archived elsewhere. Since this secondary storage must still be managed, the overall return on the ILM investment will be considerably less than anticipated.
The Figure illustrates this process in a typical e-mail setup. This architecture includes a backup system that protects critical data on primary storage before it is archived to lower-cost disks or tape by an ILM solution. This traditional tape-based backup is the ILM solution's Achilles' heel when it comes to ROI.
The Problem With Backup
Typically, the backup saves files from primary storage to secondary storage on a daily basis. As long as a file remains critical (on primary storage) it will be backed up routinely--daily, in most enterprises. This means that the same file, often in multiple versions, is saved and stored many times, resulting in excessive hardware or media costs, administration time, and storage real estate, both onsite and offsite. A backup approach that is ILM-aware, and overcomes this problem, is Distributed Backup.
One advantage of using Distributed Backup in the ILM environment is that it eliminates the need for daily backups to tape, and the subsequent rotation, retrieval and storage of these tapes.
A Distributed Backup system collects the data to be backed up from LAN clients and sends it to offsite disk storage in a compressed and encrypted format. It also retrieves this data from offsite when it is needed for a restore. Because the process is fast and fully automated, backups can take place as often as desired.
ILM-aware Distributed Backup or, more simply, Backup Lifecycle Management (BLM), takes advantage of the ILM archive's placeholders to keep only one copy of the file on either backup or secondary storage--but not both. These placeholders help the backup determine which files have already been archived. This allows it to automatically remove them from the backup disks, freeing up storage space and eliminating file duplication.
When BLM recognizes a placeholder in the backup data received from the client, it knows that the associated file has been transferred to secondary storage. It therefore searches the backup disk for the original file, deletes it, and saves only the placeholder.
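As a rough illustration of that rule, the following Python sketch applies one incoming backup set to a backup disk. It is a simplified model under stated assumptions, not Asigra's or any other product's actual logic: `ingest_backup`, `Placeholder` and the dictionary-based "disk" are hypothetical names introduced here, and files are matched by path only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Placeholder:
    """Stub the ILM archive leaves behind; points at the archived copy."""
    archive_key: str


def ingest_backup(backup_disk, client_files):
    """Apply one backup run to the backup disk and return the bytes freed.

    backup_disk  -- dict mapping path -> bytes or Placeholder (modified in place)
    client_files -- dict mapping path -> bytes or Placeholder sent by the client
    """
    freed = 0
    for path, entry in client_files.items():
        if isinstance(entry, Placeholder):
            # The file has been archived to secondary storage: delete the
            # full copy held on the backup disk, if there is one.
            old = backup_disk.pop(path, None)
            if isinstance(old, bytes):
                freed += len(old)
        # Store the incoming entry: current data for files still on primary
        # storage, or just the placeholder for archived files.
        backup_disk[path] = entry
    return freed
```

For example, if the backup disk holds a five-byte `a.doc` and the next client run reports a placeholder for `a.doc`, the function frees five bytes and keeps only the placeholder, so the file exists once on secondary storage rather than on both tiers.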
Thus, only current data in primary storage is backed up to the disk, keeping disk size and cost to a minimum. Compared to tape backup, hardware and storage costs are lowered dramatically, and day-to-day backup administration is virtually eliminated. Distributed Backup also results in faster, more frequent backups and simpler restore operations.
While there are many backup solutions on the market, not all are ILM-ready, even among those that back up to disk. It is important to note that simply replacing tape with low-cost disk will not provide the technological advantages of a tested, technologically distinct BLM architecture.
Information Lifecycle Management is a growing trend that promises substantial savings in hardware and administration, but not if the existing backup system is ILM-unfriendly. To achieve the expected ROI, most enterprises will find it well worth choosing Distributed Backup that replaces traditional tape backup and integrates with ILM's unique technology for the greatest reduction in cost and complexity.
Eran Farajun is senior executive vice president of Asigra, Inc. (Toronto, Canada)