Cost-effective disaster recovery with snapshot-enhanced, any-to-any data mirroring
For many years, mirroring of production data for disaster recovery (DR) purposes has been available for both mainframe and open systems computing environments. Unfortunately, due to the cost and complexity of these types of solutions, they have mostly been deployed in larger organizations with larger IT budgets.
However, the demand for cost-effective disaster recovery solutions has never been higher, as organizations realize the value of their stored data and the high costs associated with any type of downtime. In fact, many organizations today have established a DR strategy or requirement, but have not actually implemented a solution due to budgetary or other restrictions.
Fortunately, a new generation of affordable data mirroring solutions has emerged that brings sophisticated DR capabilities to virtually any size organization. Some of the key features include:
* Compatibility with a wide range of existing storage devices
* Ability to mirror data between different devices from different vendors
* Using snapshot-enhanced mirroring to ensure data integrity and rapid recovery after a disaster or other disruption
Synchronous vs. Asynchronous Mirroring
In a synchronous mirroring environment, each time an application attempts to write data to disk, the transaction is sent to both the local and remote storage devices in parallel. It is not until both devices have committed the write to disk that the system acknowledges that the transaction is complete. The application that initiated the write must wait until it receives the acknowledgement before it can continue on to the next task.
In an asynchronous environment, each write transaction is acknowledged as soon as the local storage device completes the request, even if the remote system has not yet received and/or processed the request.
From a performance standpoint, a synchronous approach will always incur some level of performance degradation--even when the two storage devices are nearby--simply because both systems must complete each transaction before the application can continue. On the other hand, since an asynchronous approach acknowledges the write request without waiting for confirmation from the remote storage device, the performance of the system is virtually identical to that of a non-mirrored system.
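The latency difference described above can be sketched in a few lines. This is an illustrative model only, not any vendor's implementation; the millisecond values are hypothetical placeholders.

```python
# Compares when a write is acknowledged under synchronous vs. asynchronous
# mirroring. All latency values are hypothetical, for illustration only.

LOCAL_WRITE_MS = 1       # time for the local array to commit a write
REMOTE_WRITE_MS = 1      # time for the remote array to commit a write
LINK_ROUND_TRIP_MS = 20  # WAN round-trip latency between the two sites

def sync_ack_time_ms():
    """Synchronous: the application is acknowledged only after BOTH sites
    commit, so every write waits out the link round trip."""
    return max(LOCAL_WRITE_MS, REMOTE_WRITE_MS + LINK_ROUND_TRIP_MS)

def async_ack_time_ms():
    """Asynchronous: acknowledged as soon as the local site commits; the
    remote copy catches up in the background."""
    return LOCAL_WRITE_MS

print("sync  ack:", sync_ack_time_ms(), "ms")   # 21 ms per write
print("async ack:", async_ack_time_ms(), "ms")  # 1 ms per write
```

Even with these modest numbers, the synchronous path is 21x slower per write; the gap widens as distance (and therefore round-trip latency) grows.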
From a cost standpoint, a synchronous approach usually requires higher bandwidth and more equipment in order to maintain acceptable performance for several reasons:
Bi-directional traffic: Since each write transaction must be transmitted to the remote system and an acknowledgement received back, the transmission infrastructure must have sufficient bandwidth and performance to avoid becoming a bottleneck in this process.
Latency during peak periods: A worst-case scenario should be factored into the design of the transmission network, since spikes in data activity could degrade overall performance, or cause application time-outs due to extended latencies.
Scalability: SANs are designed to support multiple host servers, but as the number of hosts in a SAN increases, the synchronous mirroring infrastructure may not easily or economically scale to accommodate the increased data traffic.
As a result, a synchronous solution usually requires some level of over-provisioning of both the bandwidth and the available switch ports, in order to ensure sufficient performance during peak periods.
On the other hand, an asynchronous solution usually requires minimal bandwidth, as bi-directional traffic is significantly lower and communication latencies do not affect application performance. In addition, asynchronous solutions are designed to flexibly adapt to spikes in activity by buffering transactions in a queue until sufficient bandwidth becomes available to complete each transaction.
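The buffering behavior just described can be modeled as a simple FIFO queue that absorbs write bursts and drains at whatever rate the link allows. This is a sketch under illustrative numbers, not a description of any particular product.

```python
from collections import deque

# Model of an asynchronous mirror absorbing activity spikes: writes are
# enqueued locally and transmitted oldest-first at the link's capacity.

LINK_CAPACITY_PER_TICK = 2   # writes the link can transmit per time unit

def mirror(write_bursts):
    """write_bursts[i] = number of writes arriving at tick i.
    Returns the queue depth observed at the end of each tick."""
    queue, depths = deque(), []
    for burst in write_bursts:
        queue.extend(range(burst))                  # enqueue new writes (FIFO)
        for _ in range(min(LINK_CAPACITY_PER_TICK, len(queue))):
            queue.popleft()                         # transmit oldest first
        depths.append(len(queue))
    return depths

# A spike of 5 writes is buffered, then drains over later, quieter ticks.
print(mirror([5, 0, 0, 1]))  # [3, 1, 0, 0]
```

The application never waits on the queue; only the remote copy's currency varies with link load.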
The optimal approach is to offer both synchronous and asynchronous mirroring solutions. In doing so, it is possible to impartially analyze each user's requirements before recommending an appropriate solution.
Data Integrity During Mirroring
One of the most critical factors in selecting a mirroring solution is the ability to ensure the integrity of the data being replicated between sites. Obviously, it makes little sense to mirror data unless you are confident that the data will be usable when needed. Mirroring must address two issues when it comes to data integrity:
1. The vast majority of disasters are not a single, instantaneous event. Instead, disasters usually unfold over a period of minutes or even hours (intermittent power outages, for example).

2. The total time needed to recover from a disruption. In a synchronous mirroring approach, all data (whether corrupt or not) is immediately replicated to the secondary storage device. In other words, a database or file system that is corrupted at one end will become corrupted at the other end as well. Recovering from this type of corruption typically takes hours or even days, and in some instances may be nearly impossible.
[FIGURE 1 OMITTED]
Snapshot-Enhanced Asynchronous Mirroring
One method to address the two data integrity issues discussed above is to use "snapshot-enhanced" mirroring. This technology combines platform-independent, any-to-any, asynchronous mirroring with low-capacity, instant point-in-time snapshots to ensure data integrity between sites while enabling rapid recovery after a disaster.
There are several important factors to be considered when looking at snapshot functionality. When evaluating a snapshot implementation, investigate the following issues:
* Does the snapshot feature require full-size copies of volumes, or can it create instant volume snapshots that begin at zero capacity?
* Does the disk space for snapshots need to be preallocated or reserved? Or can it enable more efficient use of existing capacity by allocating as needed?
* Can the consistency groups allow snapshots to be created of logical groupings of volumes, such as the data and log files in a database?
* Is there a scheduling feature allowing the user to specify how frequently snapshots are created (e.g., every few minutes)?
* Are application-aware data consistency capabilities available, allowing applications such as databases to be quiesced prior to creating snapshots, ensuring the data integrity of each snapshot's contents?
Figure 1 shows an example of "enhanced" snapshot features:
* The initial zero-capacity snapshot of production data is created (Snapshot 1).
* Snapshot 1 begins accumulating a copy of any production data that changes.
* On a user-defined schedule, Snapshot 1 is "frozen" and the next snapshot is automatically created (Snapshot 2).
* The contents of Snapshot 1 are mirrored from Site 1 to Site 2, and Snapshot 1 is then retained at both sites for a user-defined length of time.
Each site is now assured of having an identical copy of data as of a specific point-in-time. The above process is repeated for each subsequent snapshot.
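The rotation cycle above can be sketched as a small copy-on-write structure. The class and method names here are illustrative, not a vendor API: each snapshot starts at zero capacity and records only blocks that change while it is active.

```python
# Minimal sketch of snapshot rotation: accumulate changed blocks, freeze
# the snapshot on schedule, and hand its contents off for mirroring.

class SnapshotCycle:
    def __init__(self):
        self.active = {}   # block -> last written value (zero-capacity start)
        self.frozen = []   # completed snapshots retained for recovery/reuse

    def write(self, block, value):
        self.active[block] = value   # record the change in the open snapshot

    def freeze_and_rotate(self):
        """Freeze the current snapshot, start the next one, and return the
        frozen contents for transmission to the remote site."""
        frozen = self.active
        self.frozen.append(frozen)
        self.active = {}             # next snapshot begins at zero capacity
        return frozen

cycle = SnapshotCycle()
cycle.write("blk7", "A")
cycle.write("blk9", "B")
print(cycle.freeze_and_rotate())   # {'blk7': 'A', 'blk9': 'B'} -> mirrored
print(len(cycle.active))           # 0 -- new snapshot starts empty
```

Because only changed blocks are held, an idle period produces a near-empty snapshot and consumes almost no capacity or bandwidth.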
Zero-Downtime Backup, Non-Disruptive Application Testing, Decision Support and Other Critical Tasks
One of the biggest benefits of using snapshot-enhanced mirroring is the ability to utilize the same snapshots for other purposes as well. Since each snapshot is a separate read/write volume and is instantly available for use, these alternate uses of snapshots can include:
* Zero-downtime backup: Backups may be done "in the background" using snapshot copies rather than production data. Backups can be started anytime, and finish anytime, without impacting normal operations or applications.
* Application testing: Snapshots can be used in application testing without disrupting production data or applications.
* Decision support: Snapshots can be used in refreshing data warehouses and other decision support systems, once again without disrupting production data or applications.
* Instantly available, read/write snapshots are used to mirror data between sites.
* Any snapshot at any location may also be used for non-mirroring activities, such as zero-downtime backup, non-disruptive application testing, decision support system (DSS) updates and more.
* An unlimited number of snapshots may be retained for future use, and may also be deleted at any time when no longer needed.
Optimizing Performance Over Limited Bandwidth Connections
One of the major costs of any mirroring solution is the ongoing monthly fee for maintaining a communication link between data centers. Generally speaking, the lower the bandwidth of a connection, the lower the monthly cost. Therefore, mirroring solutions that can utilize a low bandwidth connection while still maintaining acceptable performance can significantly lower the total ownership costs.
However, a key issue that may limit the use of low bandwidth connections is something commonly known as "I/O ordering." This issue occurs when data is transmitted over any type of IP connection, since IP does not guarantee in-order delivery of each I/O. For example, here is a common "I/O ordering" situation that mirroring solutions must contend with:
[FIGURE 3 OMITTED]
* Two I/O's that change the same block of data occur within a few seconds of each other.
* The I/O's are mirrored to another location using IP.
* The remote location receives the I/O's out of order (i.e., the second I/O is received before the first I/O).
* Unless the remote site re-orders the I/O's, they will be applied "out of sequence" and the remote database will no longer be consistent with the source database.
In this example, the remote mirror is now "out of synch" with the source data, and if the primary site experiences a failure or other disruption, the remote database would be corrupted and not suitable for use.
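The hazard in the bullets above is easy to reproduce in a few lines. This is a toy demonstration (block and value names are made up), showing why a receiver that applies I/O's in arrival order ends up inconsistent with the source.

```python
# Two writes to the same block arrive at the remote site in reverse order,
# leaving the mirror inconsistent with the source unless it re-orders them.

def apply(io_stream):
    """Apply (sequence, block, value) records to an empty disk image in
    the order received, as a naive remote site would."""
    disk = {}
    for seq, block, value in io_stream:
        disk[block] = value
    return disk

writes = [(1, "blk5", "old"), (2, "blk5", "new")]   # order at the source
source = apply(writes)
remote = apply(reversed(writes))                    # IP delivered out of order

print(source)  # {'blk5': 'new'}
print(remote)  # {'blk5': 'old'}  -- mirror no longer matches the source
```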
To ensure that I/O's are correctly applied to remote data sets, and to also minimize the transmission bandwidth required between sites, mirroring technologies should incorporate a "Last Block Changed" algorithm that examines the transactions contained within each snapshot prior to transmission. If the same block of data within a snapshot has changed multiple times, the Last Block Changed algorithm only transmits the last known change for each block of data. This Last Block Changed technique not only reduces the amount of data that must be mirrored, but also ensures that both sites will have consistent copies of data between sites, even if the data arrives out of sequence.
In addition to eliminating the "I/O ordering" problems encountered in data mirroring, Last Block Changed technology also prevents the following problems typically associated with asynchronous mirroring:
* Since duplicate changes to the same data are discarded, the size of each snapshot is minimized.
* This reduces the need for large buffers, or the possibility of having a buffer overflow disrupt the mirroring process.
[FIGURE 2 OMITTED]
* Latencies resulting from long transmission distances do not affect data integrity.
* During any transmission outages, snapshots continue to accumulate data changes until transmission links are restored, and then synchronize changes with the remote site(s).
An illustration of the Last Block Changed technique is shown in Figure 3.
* Snapshot 1 accumulates changes to data over a period of time.
* During that time, Block 1 may change several times, while Block 2 and Block 3 may only change once.
* The snapshot is "frozen" and Last Block Changed technology only transmits the last change to each data block.
* This significantly reduces the amount of data that must be mirrored, minimizing the bandwidth needed between sites and eliminating the problems associated with out of sequence I/O delivery.
* Once the snapshot's contents are received and processed at the remote site, both sites now have copies of data that are consistent with each other at the point in time that the snapshot was originally "frozen."
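The steps above can be sketched as a simple reduction over the writes captured in one frozen snapshot. This is an illustrative rendering of the Last Block Changed idea, not the vendor's actual code; block names mirror Figure 3.

```python
# Last Block Changed: before a frozen snapshot is transmitted, keep only
# the final change to each block. Duplicate updates are dropped, and since
# one value per block is sent, arrival order of blocks no longer matters.

def last_block_changed(snapshot_writes):
    """snapshot_writes: ordered (block, value) pairs captured in one snapshot.
    Returns the last known change for each block."""
    latest = {}
    for block, value in snapshot_writes:
        latest[block] = value        # later writes overwrite earlier ones
    return latest

# Block 1 changed three times; Blocks 2 and 3 once each (as in Figure 3).
writes = [("blk1", "v1"), ("blk2", "x"), ("blk1", "v2"),
          ("blk3", "y"), ("blk1", "v3")]
to_transmit = last_block_changed(writes)
print(to_transmit)   # {'blk1': 'v3', 'blk2': 'x', 'blk3': 'y'}
print(len(writes), "writes reduced to", len(to_transmit), "transmissions")
```

Five writes become three transmissions here; on real workloads with hot blocks, the reduction can be far larger.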
Application-Aware Data Consistency
In addition to scheduling the frequency of snapshots, vendor solutions should enable application-aware snapshots to be created for each volume or group of volumes associated with an application. For example, when creating a snapshot for a database, both the data and log volumes will be included, and the application will be temporarily quiesced to make sure any "in flight" transactions are completed prior to the snapshot creation. Once the snapshot is complete (just a few seconds), the application is returned to normal operation.
Application-aware snapshots should be managed using a CLI and/or third-party management applications.
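The quiesce-snapshot-resume sequence can be sketched as follows. All names here are illustrative assumptions (real products expose this through their own CLI or management API); the point is the ordering: pause writes, snap the whole consistency group at once, then resume.

```python
from contextlib import contextmanager

# Sketch of application-aware snapshot creation: the database is quiesced
# so in-flight transactions complete, a consistency group covering both
# data and log volumes is snapped together, then normal operation resumes.

@contextmanager
def quiesced(database):
    database["accepting_writes"] = False   # let in-flight work complete
    try:
        yield
    finally:
        database["accepting_writes"] = True   # resume within seconds

def snapshot_consistency_group(volumes):
    """Snap every volume in the group at the same instant."""
    return {name: dict(contents) for name, contents in volumes.items()}

db = {"accepting_writes": True}
volumes = {"data_vol": {"blk1": "row"}, "log_vol": {"blk1": "txn"}}

with quiesced(db):
    snap = snapshot_consistency_group(volumes)

print(sorted(snap))            # ['data_vol', 'log_vol']
print(db["accepting_writes"])  # True -- application back to normal
```

Snapping data and log volumes in one group is what makes the snapshot restorable: a database recovered from it sees logs and data from the same instant.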
Affordable Any-to-Any Mirroring
Network-based mirroring solutions offer more flexibility, in that they provide a device-independent layer that resides within the Fibre Channel switched fabric that connects host servers to storage devices. This independence allows them to perform "any-to-any mirroring," where data can be mirrored from any device, to any device, at any location.
Any-to-any mirroring can be significantly less expensive than proprietary mirroring solutions that only mirror data between identical devices from the same vendor. With any-to-any mirroring, you have the freedom to select the most appropriate storage devices for each location without worrying about vendor-imposed restrictions on data movement and replication.
Offloading Mirroring From Servers and Storage
Traditional mirroring solutions often come with hidden performance penalties. For example:
* Server-based mirroring uses the host's CPU to manage data replication, which negatively impacts the performance of any application running on the server. In addition, each server's mirroring process must be managed individually, which increases the burden on storage administrators.
* Storage-based mirroring either burdens the storage controllers with the replication processes, or requires the use of dedicated mirroring controllers, which in turn limits the number of controllers that can be used for day-to-day processing tasks. In addition, data can typically only be mirrored to an identical storage device from the same vendor, limiting the ability to use less expensive devices at remote locations.
On the other hand, network-based, asynchronous mirroring solutions typically avoid these problems by using network-based appliances to handle data replication processes. Since these appliances work independently of the servers and storage devices in use, several key benefits are realized:
* Data mirroring occurs without involving or impacting the performance of the servers or storage devices.
* Data can be mirrored between storage devices from any vendor (any-to-any mirroring).
* All mirroring processes are centrally managed.
Other Benefits of Snapshot-Enhanced Asynchronous Mirroring
In addition to the advantages discussed above, here are some other reasons why snapshot-enhanced mirroring presents a highly reliable yet cost-effective method of disaster recovery:
* In the event of a temporary communication link disruption, a snapshot is added to the mirroring queue. Until the link is restored, additional snapshots will continue to be created and added to the queue. Once the link is restored, all snapshots in the queue are transmitted to the secondary sites.
* In the event of planned or unplanned downtime, the secondary sites can be quickly brought online using the last received snapshot. This ensures that the secondary site commences operations using a complete and up-to-date copy of the primary site's production data.
* In the event of data corruption, administrators can quickly roll back the secondary sites to the last known good point in time. Recovering a system using online snapshots can reduce recovery time to minutes, instead of hours or days.
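The rollback in the last bullet amounts to promoting a retained point-in-time copy, which is why it takes minutes rather than a full restore. A minimal sketch, with made-up snapshot contents and timestamps:

```python
# Roll a secondary site back to the last known good snapshot after
# corruption is detected. Since snapshots are retained online and in
# order, recovery is a pointer swap, not a lengthy tape restore.

def rollback(retained_snapshots, last_good_index):
    """Return the volume state as of the chosen point-in-time snapshot."""
    return retained_snapshots[last_good_index]

snapshots = [
    {"blk1": "v1"},          # 10:00 -- good
    {"blk1": "v2"},          # 10:05 -- good
    {"blk1": "corrupted"},   # 10:10 -- corruption was replicated
]
recovered = rollback(snapshots, last_good_index=1)
print(recovered)   # {'blk1': 'v2'}
```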
Snapshot-enhanced asynchronous mirroring combines the cost and scalability advantages of asynchronous mirroring with the data integrity and online recovery benefits of point-in-time snapshots.
Compared to other mirroring solutions, the key benefits of implementing mirroring with snapshot-enhancements include:
* Lower total cost of ownership
* Higher performance using less bandwidth
* Any-to-any mirroring between different storage devices from different vendors
* Higher levels of data integrity in the event of data corruption or "rolling disasters"
* Rapid online recovery to the last known good point in time
Nelson Nahum is chief technology officer at StoreAge (Irvine, CA).