Building practical data protection strategies.

In 1952, the world's first successful tape drive was delivered, the IBM 726 with 12,500 bytes of capacity per reel. In 1956 the world's first disk drive was delivered, the Ramac 350 with 5 megabytes of capacity. Though no one knew it at the time, two key events in the storage industry had occurred: 1) the storage hierarchy was created, with online and offline storage, and 2) the first storage management applications were born, namely backup and recovery. Backup and recovery would become the primary storage management application for the next 50 years as protecting data became increasingly important. Will this traditional application survive the increasing demands for ultra-high availability and the need for nearly instantaneous recovery?

Data protection has become the most critical piece of most IT strategies today. There are four fundamental stages in the lifecycle of digital data: 1) data creation, 2) data access, 3) data archive and 4) data deletion/destruction. The deletion/destruction phase no longer applies to all data types, as considerable amounts of data are being stored indefinitely, if not forever, for a variety of reasons. The overarching goal of data protection is to protect information that cannot be easily replaced, or replaced at all, throughout its meaningful lifecycle. Different levels of data protection exist and, as expected, higher levels of data protection cost more to implement.

If hardware devices had been 100% reliable during the early years of the IT industry, that is, if their MTBF (Mean Time Between Failures) had been effectively unlimited, businesses would have needed only straightforward backup and recovery processes. Software errors, human errors, natural disasters, the increasing number of power failures, building damage, and destructive intrusions such as worms and viruses have turned data protection into a complex process. Data protection and security have evolved over the years from simply improving the MTBF of devices to implementing a variety of local and remote strategies that address the numerous causes of downtime.

RPO (Recovery Point Objective) -- The desired amount of time between data protection events.

RTO (Recovery Time Objective) -- The time needed to recover from a data loss event and return to service. In other words, this requires classifying data or an application by its criticality or value to the business and determining how long the business can survive without this data. This area is under immense pressure to greatly reduce the amount of time it takes to recover an application.

DPW (Data Protection or Backup Window) -- This is the maximum amount of time available for an application to be interrupted while data is copied to another physical location for backup purposes. This component is seeing many new developments to reduce the amount of time needed for the traditional backup window.

Data Protection Options

Backup/restore is the most traditional disaster recovery method. It moves data, usually a complete file or full volume, from primary disk to either disk or tape for backup. The backup copy is not executable and must be restored to become accessible. In most cases, traditional backup causes the application being backed up to be impacted, or even stopped, while the operation runs, and the same is true for the duration of a restore. The larger the object being restored, the longer the application and its customers must wait. For mission critical or revenue generating applications, any amount of time spent waiting for a recovery operation to complete is unacceptable. Full file or full volume backups and restores are the most time consuming of all the data protection techniques, and they may be difficult to schedule. Incremental and differential backups reduce this burden, but tradeoffs exist when choosing between them for an effective backup strategy.

For incremental backups, only the data that has changed since the last incremental backup is backed up. This minimizes the amount of data backed up each time and therefore reduces the time needed for the backup window, which is what distinguishes it from differential backup. However, a full restore takes longer and is generally a more complex process, because each incremental backup has to be restored in turn to bring all files to their last known state. Often a full backup will be performed weekly while an incremental backup is performed daily.

For differential backup, the data that was backed up on the previous differential backup, plus any new changes, is backed up again on the next differential backup. That is why differentials typically grow in size each day between full backups. Daily backups get gradually larger and therefore take longer, but the restore is simpler and usually shorter than a restore from a chain of incremental backups: a full restore requires only the last full backup and the last differential copy.

Incremental backup minimizes the backup time and differential backup minimizes the restore time, and a specific application may require one or the other. These tradeoffs are often confusing and time consuming for storage administrators. Most businesses want to shrink both the backup time and the recovery time, not just one or the other.

Disk mirroring is implemented as a block-for-block replica of a file, a logical unit, or a physical disk volume, normally using local or remote disk drives for all copies. Once the mirror is established by copying the original data element, it is maintained by performing each write operation in two (or more) places, creating identical or nearly identical copies. Disk mirroring eliminates the backup window but doubles the amount of disk storage required, adding significant acquisition and operational expense.

Storage administrators must choose to implement either asynchronous or synchronous mirroring, and tradeoffs exist in each case. Synchronous mirroring is frequently used in z/OS (mainframe) environments given the critical nature of their applications. In synchronous mirroring, both the source and the target devices must acknowledge that a write is complete before the next write can occur. This degrades application performance but keeps the mirrored elements synchronized as true mirror images of each other. For asynchronous mirroring, the source and target devices do not have to synchronize their writes, and the second and subsequent writes occur independently.
Therefore, asynchronous mirroring is faster than synchronous mirroring, but the secondary copies are slightly out of sync (fuzzy) with the primary copy. Asynchronous mirroring is often used to replicate data to locations hundreds of miles away. In practice, the secondary data element is rarely more than one minute behind the primary copy, but even that lag can become a significant exposure for mission critical or write-intensive applications. Mirroring is used for many mission critical applications, and it is the fastest way to recover data from a hardware or subsystem failure, since restore operations take no more than a few seconds by switching to a mirrored copy. Note that mirroring does not protect against a data corruption problem (hacker, worm, virus, intrusion, human or software error), as it simply generates two or more copies of the corrupted data.
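
The difference between the two mirroring modes can be sketched with a toy in-memory model. This is only an illustration under stated assumptions: the class `MirroredVolume`, its methods, and the use of a simple queue to stand in for the replication link are all hypothetical and do not describe any particular product.

```python
from collections import deque


class MirroredVolume:
    """Toy model of a mirrored block device (all names are illustrative)."""

    def __init__(self, synchronous):
        self.synchronous = synchronous
        self.primary = {}        # block number -> data
        self.secondary = {}      # the mirror copy
        self.pending = deque()   # writes not yet applied to the mirror

    def write(self, block, data):
        self.primary[block] = data
        if self.synchronous:
            # Synchronous: the mirror is updated before the write returns.
            self.secondary[block] = data
        else:
            # Asynchronous: return at once; the mirror catches up later.
            self.pending.append((block, data))

    def drain(self):
        """Apply queued writes to the mirror in their original order."""
        while self.pending:
            block, data = self.pending.popleft()
            self.secondary[block] = data

    def in_sync(self):
        return self.primary == self.secondary


sync_vol = MirroredVolume(synchronous=True)
sync_vol.write(0, "ledger v1")
sync_ok = sync_vol.in_sync()             # mirror matches after every write

async_vol = MirroredVolume(synchronous=False)
async_vol.write(0, "ledger v1")
lagging = not async_vol.in_sync()        # mirror is behind until drained
async_vol.drain()

# A corrupting write is mirrored just as faithfully as a good one.
sync_vol.write(0, "garbage")
```

In this model the asynchronous volume is briefly out of step with its mirror, while the synchronous volume never is; in both cases a bad write ends up on every copy, which is why mirroring alone cannot undo corruption.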

As a best practice, mirroring should always be accompanied by other data protection schemes that permit a recovery or restore from clean data that existed before the corruption occurred. Disk mirroring is defined as, and commonly referred to as, RAID 1.

Given the many tradeoffs and weaknesses in these traditional data protection methods, several other techniques are gaining momentum to reduce some of the traditional tradeoffs.

Replication provides an executable image of data at a specific point in time. Like a series of still images, replicated or point-in-time copies are complete data images taken at specified points in time. Replicated copies enable an administrator to go back to a specific point in time and immediately execute or restore data from its most recent stable state, before the corruption or other damage occurred. This represents the most complete method of protecting against human errors, software problems, viruses, intrusions and data corruption, and it should accompany any mirroring implementation for a more complete strategy. A replicated image can improve the RTO significantly. Again, tradeoffs exist. The more frequently replicated copies are taken, the more physical storage is required, adding hardware expense, and the more time it takes to determine which copy is the correct one to restore from.

Snapshot copy presents a consistent point-in-time view of changing data. There are many variations of snapshot copy. When write operations occur while a snapshot is in use, the changed areas (writes) are saved in a separate area or partition of disk storage specifically reserved for snapshot activity.
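
One common snapshot variant is copy-on-write, in which the old contents of a block are copied to the reserved area the first time that block is overwritten after the snapshot is taken. The sketch below assumes that variant; the `Volume` class and its method names are hypothetical and chosen only for illustration.

```python
class Volume:
    """Toy block volume with copy-on-write snapshots (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        # One dict per snapshot: block index -> value at snapshot time.
        # Together these dicts play the role of the reserved snapshot area.
        self.snapshots = []

    def snapshot(self):
        """Take a snapshot; nothing is copied until a block changes."""
        self.snapshots.append({})
        return len(self.snapshots) - 1

    def write(self, index, data):
        # Preserve the old value for every snapshot that has not
        # already saved its own copy of this block.
        for saved in self.snapshots:
            if index not in saved:
                saved[index] = self.blocks[index]
        self.blocks[index] = data

    def read_snapshot(self, snap_id):
        """Rebuild the point-in-time view: saved blocks where they exist,
        live blocks everywhere else."""
        saved = self.snapshots[snap_id]
        return [saved.get(i, current) for i, current in enumerate(self.blocks)]


vol = Volume(["a", "b", "c"])
monday = vol.snapshot()
vol.write(1, "B")
tuesday = vol.snapshot()
vol.write(1, "corrupt")

monday_view = vol.read_snapshot(monday)
tuesday_view = vol.read_snapshot(tuesday)
```

Because unchanged blocks are still read from the live volume, a scheme like this costs little space but offers no protection if the underlying device itself fails.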
In this reserved area, the old value of the affected area or block can be saved, either in case the new block(s) are corrupted or to permit a fuzzy data image that can be used for a non-disruptive backup. Storage administrators must manage the number and currency of snapshots. Snapshots provide data protection from intrusion and data corruption but not from a device failure. Again, tradeoffs exist. The challenge for snapshot copy is that it is not easy to find the exact snapshot copy taken just before the corruption took place.

CDP (Continuous Data Protection) enables data recovery where every write and update operation is continuously written to another device that may or may not be the same as the primary device. If snapshot copy is a series of still images, then CDP is like a movie. Unlike mirroring, however, the secondary copy is a sequential history of write events. All write operations are queued to the secondary device, or journal device, which may be disk or tape. Journals are typically kept as a continuous history for 2-4 days, covering the period of maximum likelihood for a data recovery action to occur. Journals are especially good for protecting against intrusion and data corruption, enabling restores to go back in time to a point before the corruption occurred. Again, tradeoffs exist. CDP does not replace traditional backup or provide protection in the event of data center loss.

De-duplication Arrives

A new technique called de-duplication is quickly gaining interest and offers significant improvement for traditional data protection methods. De-duplication can significantly change the economics of storage, and it is beginning to do so in the disk-to-disk backup market. De-duplication goes by many different names, including commonality factoring, global compression, and capacity optimization. De-duplication segments the incoming data stream, uniquely identifies the data segments, and then compares them to segments previously stored. If an incoming data segment is a duplicate of what has already been stored, the segment is not stored again, but a pointer is created for it. If the segment is deemed to be unique, it is then further compressed with conventional algorithms, for an additional average 2:1 size reduction, and stored to disk.

A 50 gigabyte file or volume that is backed up seven times in a week, whether its data changes or not, creates a significant amount of duplicate data. This is costly in terms of storage and difficult to sort out. Is there any reason that duplicate data should be stored seven times? De-duplication algorithms analyze the data and can delete six of the seven copies of that file. More sophisticated approaches de-duplicate data at the block level, yielding even further reductions. De-duplication can significantly reduce the amount of storage capacity required, since it stores only unique data. Additionally, combining de-duplication with data compression makes the capacity savings even more compelling. Early examples indicate reductions of 80% for backup data storage requirements.
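
The segment, identify, compare and store-or-point steps can be sketched as follows. This is a minimal illustration under stated assumptions: fixed-size segments, SHA-256 fingerprints and zlib compression are choices made here for simplicity, since the article does not say how segments are cut or identified, and the `DedupStore` class is hypothetical.

```python
import hashlib
import zlib


class DedupStore:
    """Toy de-duplicating backup store (illustrative assumptions throughout)."""

    def __init__(self, segment_size=4096):
        self.segment_size = segment_size
        self.segments = {}   # fingerprint -> compressed unique segment
        self.backups = {}    # backup name -> ordered list of fingerprints

    def backup(self, name, data):
        pointers = []
        for start in range(0, len(data), self.segment_size):
            segment = data[start:start + self.segment_size]
            fingerprint = hashlib.sha256(segment).hexdigest()
            if fingerprint not in self.segments:
                # Unique segment: compress it and store it once.
                self.segments[fingerprint] = zlib.compress(segment)
            # Duplicate or not, the backup itself records only a pointer.
            pointers.append(fingerprint)
        self.backups[name] = pointers

    def restore(self, name):
        return b"".join(
            zlib.decompress(self.segments[fp]) for fp in self.backups[name]
        )


store = DedupStore()

# Eight distinct 4 KB segments, backed up unchanged seven days running.
base = b"".join(bytes([i]) * 4096 for i in range(8))
for day in range(7):
    store.backup(f"day{day}", base)
unique_after_week = len(store.segments)

# On the eighth day a single segment changes.
changed = base[:4096] + b"\xff" * 4096 + base[8192:]
store.backup("day7", changed)
unique_after_change = len(store.segments)
```

Seven identical backups add no new segments after the first, and a one-segment change adds exactly one, which is the effect behind the capacity savings described above.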
The biggest advantage of de-duplicated data is therefore that the amount of time spent in data recovery is significantly reduced. Again, tradeoffs exist. Performing data de-duplication in-band is processor intensive, can cause some performance degradation and, depending on the size of the storage pool, can require users to invest in multiple appliances. The overall benefits, space savings and improved RTO, should outweigh this tradeoff in most cases.

Conclusion

Data protection is a critical IT discipline, and businesses often choose the simplest approach after sustaining three years of downsizing and cutbacks. Historically this has been full backups to tape subsystems. However, the simplest approach may not provide the highest availability, and severe business losses often occur. Today's IT environments demand a more comprehensive strategy for data protection, security and high availability than ever before, based on the causes of data loss. Replication options must match the application's business requirements in order to yield the highest availability. Emerging data protection solutions such as de-duplication are gaining momentum, improving availability while increasing the probability that a business will survive most types of outages. This is critical, since most businesses in the modern world will not survive without IT. Making IT resilient to machine failures, intrusions, human mistakes, accidents and the digital crime wave makes the price of implementing data protection well worth it.

Fred Moore is president of Horison, Inc. (Boulder, CO). www.horizon.com

The Causes of Downtime and Data Loss

Type of disruption                     % of failures   Solutions
Software (bugs and corruption)*        18%             Snapshots, transaction logs
Hardware*                              38%
  Disk                                                 RAID 1-6, backups
  Tape                                                 Backups
  Server                                               Clusters, failover architecture
Network*                               23%             Redundancy
Natural disasters, power, flood, fire  7%              Off-site facilities
Theft                                  2%              Encryption
Intrusion/security                     8%              Firewalls, authentication, anti-virus, filtering
Other                                  4%

*Includes operational failures
Source: Estimates by Horison Information Strategies