
How to evaluate a recovery management solution.

Last year's hurricane season started a national discussion about how prepared the nation is to cope with a major disaster. Business in the Gulf Coast was devastated, and in boardrooms across America senior executives have tasked IT professionals with creating and implementing solutions that will ensure mission-critical data is continuously protected. IT professionals across all industries have come to realize that, even with thorough planning, restoring data and bringing systems back online quickly with zero loss of data can be an overwhelming task.

The complexity and cost of solving data protection and recovery issues today are rooted in the fact that it takes multiple tools to deliver a solution that still doesn't meet the new requirements of today's data center. This leaves IT professionals spending countless hours trying to integrate disparate tools and manually recovering data in an attempt to build a real-time infrastructure to support their enterprise. Because there are a variety of protection and recovery tools to choose from, it is crucial to arrive at core metrics that enable IT management to choose the best recovery management solution for their environment.

Recovery management is defined as the act, manner, or practice of managing a return to normal conditions. In the IT industry the definition is more specific: it describes how organizations return systems, applications, and data to "normal" conditions. When unexpected failures occur, the goal is to bring IT systems back to their most recent consistent state and to restore business operations within minutes, to reduce downtime and prevent significant financial loss.

The Evaluation Metrics

In order to evaluate a recovery management solution, one must have properly defined metrics. Data recovery service level agreements (SLAs) are traditionally measured by recovery time objectives (RTO) and recovery point objectives (RPO). RTO defines the time required to recover a set unit of missing data, and RPO defines the potential data loss, that is, the time gap between the most recent application-consistent recovery point and the physical failure point. RTO and RPO are good objectives for setting SLAs with regard to data recovery, but they are not sufficient for measuring a recovery management solution. For example, a snapshot tool may recover a server's data in minutes; however, a snapshot tool does not have the ability to recover a granular object. When one needs to locate a lost object from snapshots, the process is manual and the RTO could be many hours. In this case, RTO has nothing to do with the tool per se, inasmuch as it is entirely dependent on the manual process. Likewise, while a data replication tool is capable of delivering zero or near-zero RPO when a server fails, it is not capable of recovering business data if the data is corrupted and the corrupted data is replicated.
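
As a rough illustration of how RTO and RPO are measured against an SLA, the following Python sketch computes both from timestamps. The function names, the sample times, and the SLA targets are all hypothetical and are not taken from any particular product.

```python
from datetime import datetime, timedelta


def measured_rpo(last_consistent_point: datetime, failure_time: datetime) -> timedelta:
    """Data-loss window: gap between the most recent application-consistent
    recovery point and the physical failure point."""
    if last_consistent_point > failure_time:
        raise ValueError("recovery point must not be later than the failure")
    return failure_time - last_consistent_point


def measured_rto(recovery_started: datetime, service_restored: datetime) -> timedelta:
    """Elapsed time needed to recover the missing data."""
    if service_restored < recovery_started:
        raise ValueError("restore cannot finish before it starts")
    return service_restored - recovery_started


def meets_sla(measured: timedelta, objective: timedelta) -> bool:
    """An objective is met when the measured value does not exceed it."""
    return measured <= objective


# Hypothetical incident: nightly backup at 02:00, server fails at 14:30,
# and a manual restore finishes at 20:30.
failure = datetime(2006, 3, 1, 14, 30)
last_point = datetime(2006, 3, 1, 2, 0)
restored = datetime(2006, 3, 1, 20, 30)

rpo = measured_rpo(last_point, failure)   # 12 hours 30 minutes of data lost
rto = measured_rto(failure, restored)     # 6 hours of downtime

print("RPO:", rpo, "meets 24h objective:", meets_sla(rpo, timedelta(hours=24)))
print("RTO:", rto, "meets 4h objective:", meets_sla(rto, timedelta(hours=4)))
```

Note that the numbers alone say nothing about why the restore took six hours, which is the limitation the snapshot example above points out.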

As a result of examples like these, IT requires more comprehensive metrics to properly evaluate a recovery management solution. There are ten core metrics that fall into three categories--Recovery Time Characteristics, Recovered Data Characteristics, and Recovery Scalability Characteristics. The following chart explores these metrics in detail.


Recovery Time Characteristics

Recovery Time Objective (RTO). RTO defines how fast the solution is capable of recovering the data and application it is designed to protect. The RTO of most recovery solutions depends on whether or not a data verification process is needed during the recovery, and the size of the data set to be recovered. A solution that provides instant recovery regardless of data set size greatly reduces or eliminates business down time.

Recovery Time Granularity (RTG). RTG determines the time spacing for selecting a recovery point; this is an important parameter for recovering from logical failures. Unlike RPO, which determines the last recovery point prior to a physical failure, RTG defines recovery point selection options prior to the most recent recovery point.
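
To make the difference between RTG and RPO concrete, here is a minimal Python sketch, assuming recovery points are simply a list of timestamps. It reports the spacing between recovery points and picks the most recent point taken before a logical failure such as data corruption. The helper names and sample schedule are hypothetical.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def recovery_time_granularity(points):
    """Largest spacing between consecutive recovery points (worst case)."""
    ordered = sorted(points)
    if len(ordered) < 2:
        return None  # a single point offers no choice of recovery time
    return max(later - earlier for earlier, later in zip(ordered, ordered[1:]))


def last_point_before(points, corruption_time):
    """Most recent recovery point strictly earlier than the corruption."""
    ordered = sorted(points)
    idx = bisect_left(ordered, corruption_time)
    return ordered[idx - 1] if idx > 0 else None


# Hypothetical schedule: hourly recovery points from 08:00 to 12:00.
points = [datetime(2006, 3, 1, 8) + timedelta(hours=h) for h in range(5)]
corruption = datetime(2006, 3, 1, 10, 20)

rtg = recovery_time_granularity(points)
target = last_point_before(points, corruption)

print("RTG:", rtg)
print("Roll back to:", target, "losing", corruption - target, "of good work")
```

With hourly points, rolling back past a corruption at 10:20 discards 20 minutes of valid changes; a finer RTG shrinks that loss.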

Recovered Data Characteristics

Recovery Point Objective (RPO). RPO defines the minimum time gap between the last failure and the point in time to which data can be recovered. The smaller the gap, the less data is lost.

Recovery Object Granularity (ROG). ROG measures the level of objects that a solution is capable of recovering. For instance, object granularity may be a storage volume, a file system, a database table, a transaction, a mailbox, an email message, etc.

Recovery Event Granularity (REG). REG measures the capability of a recovery management solution to track events and to recover a failed application or missing data to a specific event.

Recovery Consistency Characteristics (RCC). RCC defines the usability of recovered data by the associated application. RCC of a recovery management solution depends not only on how data is captured and stored, but also on the data type being protected.

Recovery Scalability Characteristics

Recovery Location Scope (RLS). RLS defines where the protected data must be stored when recovery takes place. Most data protection solutions are designed such that the protected data is stored locally. Robust recovery management solutions can protect and recover data over LAN and WAN.

Recovery Service Scalability (RSS). RSS is measured by service (number of applications or data sets the solution is capable of protecting) and capacity (the maximum size of the data it can store).

Recovery Service Resiliency (RSR). RSR defines how well a recovery management solution tolerates failures. This includes system and data failures as well as data security authorization. For instance, if a system component fails, can the solution continue such that an application would be continuously protected? And can it also self-recover from any internal failures?

Recovery Management Cost (RMC). RMC defines the cost efficiency of a recovery management solution. Data services such as backup, snapshots, replication, policy management, and others are traditionally separate tools with very different architectures. For better RMC, find a consolidated recovery management platform which simplifies IT administration by reducing the number of tools necessary to manage data. For further efficiency, utilize a solution which reduces the storage and network resources necessary to protect and recover data.

Recovery Management Scorecard

As we discussed earlier, there are a myriad of protection and recovery tools to choose from, so it made sense to come up with the core metrics necessary to enable IT management to evaluate which solutions would best fit their environment. Now that we have an understanding of the "Top Ten" metrics necessary to evaluate a recovery management solution, let's apply these metrics to solutions that exist in the market today. Practical application of the metrics enables not only a solidified understanding of the metrics, but also a better comprehension of available solutions and how they compare.
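
One way to apply a scorecard like this is to filter solutions on hard requirements and then rank what remains. The Python sketch below does that using the RCC, RLS, RSS, and RSR ratings from Table 1. The numeric mapping of High/Medium/Low, the shortened solution names, and the `shortlist` helper are assumptions made for illustration; the first column's heading is truncated in Table 1, so it is labeled simply "Recovery Management" here.

```python
# Assumed numeric mapping for the qualitative ratings in Table 1.
LEVEL = {"Low": 1, "Medium": 2, "High": 3}

# Ratings transcribed from Table 1 ("Crash" = crash consistent only).
SCORECARD = {
    "Recovery Management":     {"RCC": "Strong", "RLS": {"LAN", "WAN"}, "RSS": "High",   "RSR": "High"},
    "Traditional Backup":      {"RCC": "Strong", "RLS": {"LAN"},        "RSS": "Medium", "RSR": "Medium"},
    "Block-level Replication": {"RCC": "Crash",  "RLS": {"LAN", "WAN"}, "RSS": "Low",    "RSR": "Low"},
    "Block-level CDP":         {"RCC": "Crash",  "RLS": {"LAN"},        "RSS": "Medium", "RSR": "Medium"},
    "File-level CDP":          {"RCC": "Crash",  "RLS": {"LAN"},        "RSS": "Low",    "RSR": "Medium"},
}


def shortlist(scorecard, need_wan, need_strong_consistency):
    """Drop solutions that miss a hard requirement, then rank the rest
    by combined scalability (RSS) and resiliency (RSR) rating."""
    candidates = []
    for name, row in scorecard.items():
        if need_wan and "WAN" not in row["RLS"]:
            continue
        if need_strong_consistency and row["RCC"] != "Strong":
            continue
        score = LEVEL[row["RSS"]] + LEVEL[row["RSR"]]
        candidates.append((name, score))
    return sorted(candidates, key=lambda item: item[1], reverse=True)


wan_only = shortlist(SCORECARD, need_wan=True, need_strong_consistency=False)
strict = shortlist(SCORECARD, need_wan=True, need_strong_consistency=True)
print(wan_only)
print(strict)
```

An organization would substitute its own requirements and weights; the point is only that the metrics turn a product comparison into an explicit, repeatable check.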


In most industries today, the service level agreements for data protection and recovery have moved to a point where there is no time for backup windows, no tolerance for data loss, and very little margin for recovery downtime. Add to that the increased business demands for disaster recovery of mission- and business-critical data, along with new compliance requirements, and you can quickly determine that the legacy tools of data protection and recovery are ill-equipped to handle today's requirements. The ten metrics of recovery management above enable IT management to weigh their own internal business requirements against the products they are evaluating.

Marty Ward is vice president of marketing and products at Asempra Technologies in Sunnyvale, Calif.
Recovery Management               level Recovery   Traditional     Block-level
Requirement                       Management       Backup          Replication

Recovery Time Objective      RTO  Sec to Minutes   Hours to Days   Min to Hours
Recovery Time Granularity    RTG  Seconds          24 hours        None
Recovery Point Objective     RPO  Near Zero        24 hours        Near Zero or
                                                                   Minutes
Recovery Object Granularity  ROG  Transaction      File            Storage blocks
Recovery Event Granularity   REG  Fine-grained-    Coarse-manual   Coarse-manual
                                  consistency      checkpoints     checkpoints
Recovery Consistency         RCC  Strong           Strong          Crash consistent
                                  Consistency      Consistency     only
Recovery Location Scope      RLS  LAN & WAN        LAN-only        LAN & WAN
Recovery Service Scalability RSS  High             Medium          Low
Recovery Service Resiliency  RSR  High             Medium          Low
Recovery Management TCO      RMC  $                $$$             $$$$$

Recovery Management               Block-level Continuous  File-level Continuous
Requirement                       Data Protection         Data Protection

Recovery Time Objective      RTO  Min to Hours            Min to Hours
Recovery Time Granularity    RTG  Sec to Hours            Sec to Hours
Recovery Point Objective     RPO  Near Zero               Near Zero
Recovery Object Granularity  ROG  Storage blocks          File
Recovery Event Granularity   REG  Coarse-manual           Coarse-manual
                                  checkpoints             checkpoints
Recovery Consistency         RCC  Crash consistent only   Crash consistent only
Recovery Location Scope      RLS  LAN-only                LAN-only
Recovery Service Scalability RSS  Medium                  Low
Recovery Service Resiliency  RSR  Medium                  Medium
Recovery Management TCO      RMC  $$$                     $$

Table 1
COPYRIGHT 2006 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2006, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Disaster Recovery & Backup/Restore
Author:Ward, Marty
Publication:Computer Technology Review
Geographic Code:1USA
Date:Mar 1, 2006
