Backup and disaster recovery techniques: a look back and forward.

If you used backup and disaster recovery techniques from just five years ago in today's datacenter, you definitely wouldn't get the job done. If you tried techniques from ten years ago, you couldn't even get started. To begin, the average size of a large datacenter has grown from hundreds of gigabytes to hundreds of terabytes, and some have grown to over a petabyte. In addition, the number of applications has grown as well. Finally, recovery expectations have also increased. The storage industry and storage managers have done their best to keep up with the demand, but not every advancement has helped.

The basic defense ten or so years ago was locally attached tape drives and native backup and recovery tools. This proved quite difficult to manage, which led to the advent of network-based backup programs. These have dominated the backup and recovery market for several years now. The basic design of such a system is one or more tape drives behind a backup server
that backs up all other clients across the network. While the network backup server solved a lot of problems, the growth in the average size of a datacenter created new ones.

The first line of defense was the use of faster and larger tape drives. Tape drives ten years ago had native speeds of 2-5 MB/s and native capacities of about 10 GB. The latest tape drives have native speeds of 80+ MB/s and native capacities of 200+ GB. This is both good news and bad news. While these tape drives can handle much more data, a single tape drive is faster than what a single backup server can handle. With an average compression ratio of 1.5:1, an 80 MB/s tape drive becomes a 120 MB/s tape drive, and that is well beyond what a Gigabit Ethernet connection delivers in practice.

The first challenge with tape drives faster than a network connection is that bigger and faster tape drives won't allow us to back up larger systems across the network. This brings us to our next advancement: LAN-free backups. It's essentially a return to the locally attached tape drives of ten years ago, with the addition of centralized scheduling, reporting, and indexing, making it much easier to manage.
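To make the arithmetic above concrete, here is a minimal Python sketch. The 80 MB/s native speed, the 1.5:1 compression ratio, and the 50-75 MB/s practical Ethernet range are the figures used in this article. The 125 MB/s figure is simply the theoretical line rate of Gigabit Ethernet (1,000 Mbit/s divided by 8), and the function names are made up for illustration.

```python
"""Back-of-the-envelope check: can a network link keep a tape drive streaming?"""

# Theoretical Gigabit Ethernet line rate: 1,000 Mbit/s / 8 bits per byte.
GIGABIT_LINE_RATE_MB_S = 1000 / 8  # 125.0 MB/s, before any protocol overhead

# Practical Ethernet throughput range quoted in the article (MB/s).
PRACTICAL_ETHERNET_MB_S = (50, 75)


def effective_tape_rate(native_mb_s: float, compression_ratio: float) -> float:
    """Rate at which the drive consumes uncompressed data from the host."""
    return native_mb_s * compression_ratio


def can_stream(tape_mb_s: float, feed_mb_s: float) -> bool:
    """A drive keeps streaming only if it is fed at least as fast as it writes."""
    return feed_mb_s >= tape_mb_s


if __name__ == "__main__":
    tape = effective_tape_rate(native_mb_s=80, compression_ratio=1.5)
    print(f"Effective tape rate: {tape:.0f} MB/s")
    for feed in PRACTICAL_ETHERNET_MB_S:
        verdict = "streams" if can_stream(tape, feed) else "cannot stream"
        share = feed / tape  # fraction of the drive's appetite the link can supply
        print(f"{feed} MB/s feed: {verdict} ({share:.0%} of what the drive needs)")
```

Even the theoretical line rate leaves almost no headroom over a 120 MB/s drive, and the practical range falls well short of it. That gap is the reason the data has to reach the drive over some path other than the LAN.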
Unlike the locally attached tape drives of ten years ago, the tape drives used for LAN-free backups are shared via the SAN.

The second challenge with such fast tape drives is that you cannot stream a modern tape drive that's placed behind a network backup server. It's been difficult to stream tape drives across the network for a while, but now it's impossible. You simply cannot stream a 120 MB/s tape drive with a 50-75 MB/s Ethernet connection. The answer to this has been disk-to-disk-to-tape (D2D2T) backups, where backups are first sent to disk, and then sent to tape for offsite storage. Several systems can be backed up simultaneously, resulting in several serialized images residing on a disk locally attached to the backup server. These backups can then be easily copied, cloned, migrated or duplicated to tape for offsite storage. (The virtual tape cartridge [VTC] will soon allow even the offsite "tape" to actually be disk.)

To deal with the ever-increasing complexity of today's datacenters, backup software products and application vendors have also created a bevy of special-purpose agents to handle different scenarios. There are database agents, image-based agents to back up millions of files as one large image, block-level incremental backup
agents to increase the speed of the image-based agents, open-file agents, and a host of other similar agents. While each agent solves a particular problem, it also adds to the complexity of the backup system.

Backup systems have gotten faster and more reliable over the years, as the commercial backup hardware and software market has matured. However, there are some problems that traditional backup software simply cannot solve, starting with backups of remote sites. Remote backup systems are hard to manage, and proper off-site practices require a contract with a vaulting vendor for every remote site, which is a costly proposition. The second challenge is that some recovery time objectives (RTOs) and recovery point objectives (RPOs) are impossible to meet with traditional backup. For example, how would you use a traditional backup system to recover a 1 TB system in fifteen minutes, without losing more than five minutes' worth of data? Good luck. The final challenge with traditional backup systems is their complete inability to create consistency groups. That is, they cannot restore multiple systems to the same point in time, which is a basic requirement in all disaster recovery systems.

These challenges are why most DR planners have switched from tape or virtual tape backups
to replication for DR purposes. Historically, only replication could meet the challenges of today's DR systems. That is no longer the case. There are actually three advanced backup and recovery methods that can be used to perform operational recovery and disaster recovery with a single system.

Replication coupled with snapshots.

The first advanced method is replication coupled with snapshots. Snapshots provide the historical aspect needed for operational recovery, and replication provides the ability to get data offsite and make it available for immediate use without a restore. This is the most common of the three advanced methods, with hundreds of customers using it to provide on-site and off-site backups without moving tape anywhere. (Sometimes tapes are created off-site for longer-term storage, but these tapes can stay where they are, since they're already off-site.)

Object-based backup and delta-block incremental backup.

The second advanced method uses object-based backup and delta-block incremental backup. When a traditional backup system backs up a changed file, it backs up the entire file. Both of these types of systems back up only the blocks that have changed in that file. An object-based backup system saves even more space and bandwidth by backing up only files it has never seen before. If a file (e.g. COMMAND.COM) has already been backed up from another system, it just stores a pointer to that backup. It's important to understand that both systems store their backups in such a way that they can restore data as fast as (if not faster than) a traditional backup system. Some can even present a mountable image that can be used for business continuity while you're restoring the production system.

Continuous Data Protection.

Finally, there are continuous data protection (CDP)
systems that act like replication with a backup button. Like replication, a CDP system copies blocks to the backup system as soon as they're changed on the client. Where replication systems overwrite blocks on the destination device when they're changed on the source device, CDP systems store the data in a log that allows them to present any point in time for recovery purposes. CDP systems can perform fast recoveries by restoring just the blocks that have changed, and instant recoveries by presenting a mountable volume for business continuity (BC) purposes, just like snapshots and some object-based backup systems.

Over time, these new backup methods should be adopted as specialized agents for more traditional backup systems. This will bring the much-needed benefits of centralized scheduling, reporting, and management to these wonderful new technologies.

In summary, traditional backup systems are being enhanced with D2D2T systems, LAN-free backups, and specialized agents. However, even these enhancements cannot meet some recovery requirements such as remote sites, aggressive RTOs and RPOs, and some DR requirements. Therefore, some customers are now meeting these requirements with snapshot/replication-based backup, object-based backup, delta-block backups, and continuous data protection systems. Hopefully these advanced systems will be more readily available as the need for them becomes even more widespread.

W. Curtis Preston is vice president of Data Protection at GlassHouse Technologies, Inc. (Framingham, MA).
www.glasshouse.com