Data replication.

In the past, only the largest companies could afford data replication technology to implement a disaster recovery solution for their e-mail systems. Most other companies relied on tape solutions that could lead to extended outages, or even data loss, in the case of a disaster. Over the years, mainframe-class replication solutions have appeared in the Open Systems environment and costs have dropped. Today's replication solutions make it affordable for most companies to at least consider replicating their e-mail data as part of their disaster recovery plans. The ultimate goal of data replication is to create a complete copy of the source data. There are a variety of ways to do this:

Snapshots

The simplest form of replication is the snapshot. Snapshot technology is considered replication because replication is, essentially, just a copy of data, which is exactly what a snapshot is. With snapshot technology, data can be replicated with minimal impact on the host application. Usually, it takes longer for the e-mail application to shut down or the e-mail database to enter some sort of "hot-backup mode" than it takes to perform the snapshot. There are a couple of ways to perform snapshots:

* Snap-copy (or Volume-copy) snapshots create a complete second copy of data that can be used at a later time to recover the data. The advantage of this method is that there is a full copy of the original data that may be stored on separate physical disk drives. The disadvantages are that creating the snapshot each time can be time-consuming and can affect performance while the copy is being created.
In addition, with a snap-copy snapshot there must be enough storage to accommodate not only the original copy but the snapshot as well (100% overhead per snapshot). When a snap-copy is created, the data is copied to another area of storage; availability of the snapshot depends on how long the copy takes to complete.

* Pointer-based snapshots are not exact copies of the data but a set of pointers that point to the original data. When a block of data is written to the snapshot source, the original block is first copied to the snapshot reserve area, the pointer for that block is changed to point to the copied block, and the new block is then written to the snapshot source. This process is called "copy on write." Subsequent writes to the same block are not copied to the snapshot reserve area because the original data has already been preserved there. Pointer-based snapshots are very attractive because the snapshot is available instantaneously and the snapshot reserve area needs just a fraction of the original disk space, since only the changed blocks are copied. Because pointer-based snapshots require such a small amount of additional space, they can be implemented cost effectively. The disadvantage of pointer-based snapshots is that if the source is write-intensive, maintaining the "copy on write" can affect the performance of the source.

Remote Replication

Snapshots are generally controlled by a single device (host, disk subsystem or SAN appliance).
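Within that single device, the copy-on-write bookkeeping behind a pointer-based snapshot can be sketched in a few lines of Python. This is a toy model under stated assumptions, not any vendor's implementation: the `Volume` and `Snapshot` classes, their method names and the four-block volume are all hypothetical and exist only to illustrate the mechanism.

```python
class Volume:
    """Toy block volume supporting pointer-based (copy-on-write) snapshots."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data on the snapshot source
        self.snapshots = []          # active snapshots of this volume

    def snapshot(self):
        # Creating the snapshot copies no data, so it is available at once.
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Copy on write: preserve the original block once, then overwrite it.
        for snap in self.snapshots:
            if index not in snap.reserve:
                snap.reserve[index] = self.blocks[index]
        self.blocks[index] = data


class Snapshot:
    """Hypothetical snapshot: pointers to the source plus a reserve area."""

    def __init__(self, source):
        self.source = source
        self.reserve = {}  # snapshot reserve area: block index -> original data

    def read(self, index):
        # Changed blocks come from the reserve; the rest still point at the source.
        if index in self.reserve:
            return self.reserve[index]
        return self.source.blocks[index]


vol = Volume(["a", "b", "c", "d"])
snap = vol.snapshot()
vol.write(1, "B1")   # first write to block 1: original "b" goes to the reserve
vol.write(1, "B2")   # second write to block 1: nothing more is copied
point_in_time = [snap.read(i) for i in range(4)]
print(point_in_time, vol.blocks, len(snap.reserve))
```

In this sketch the reserve holds a single block even though the source was written twice, which is why the reserve area needs only a fraction of the original space. The extra copy performed on the first write to each block is the overhead that a write-intensive source pays.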
If there is a need to have a copy of the data on another disk subsystem or in a remote location, then a remote replication solution should be considered. There are, essentially, two options when replicating data remotely.

* Synchronous replication has been around for a long time in the high-end, mainframe-class storage devices and more recently in the higher-end open-systems storage devices. Due to cost and complexity, synchronous replication has traditionally been implemented by larger enterprises. With synchronous replication, every write from the application is sent to the local disk system, which sends it to the remote storage system. The write is not acknowledged back to the application until the remote system has received the write and acknowledged that the write is complete. In other words, the application has to wait for the data to be written to both the local and remote storage before it can continue processing. This "double-write penalty" can be very significant to an application that performs a large number of writes. As the distance between the local and remote systems grows, the delay in each write of data will also increase. The networking between the two systems needs to be as fast as possible to keep the latency to a minimum.
The benefit of synchronous mirroring is that in the case of a disaster at the primary site, the remote site can be brought online and processing can continue from the exact point in time that the primary site died. Unfortunately, the networking between the sites that is needed to implement a synchronous solution has historically been very expensive. As a result, only the most critical applications could afford to take advantage of it (banking systems, brokerage houses, government, etc.). Today, synchronous replication has been ported from very high-end solutions to much more affordable solutions and has allowed many more companies to implement solutions they would not have been able to deploy a few years ago.

* Asynchronous replication, like synchronous replication, creates an exact copy of the local data on a remote system. Unlike synchronous replication, there is no practical distance limitation. Where synchronous replication waits for a response from the remote system before acknowledging the write back to the application, asynchronous replication acknowledges the write back to the application immediately and then transmits the data to the remote site. Since the application is not waiting for the remote system to respond, application performance is not impacted, no matter how long the response takes. With no requirement for a timely response, the remote system can now be hundreds or thousands of miles away. The obvious benefit of asynchronous replication is that there is no performance impact to the application. One drawback is that the remote data is not necessarily usable because writes may arrive out of order (many databases need to have the writes occur in order). Some asynchronous implementations provide a facility called "write order consistency groups". Write order consistency groups guarantee that all writes are delivered in the proper order. For example, all the volumes used by an e-mail system can be grouped in a write order consistency group, which will guarantee that the remote copy of the data is usable.

A third type of remote replication is to initially establish the remote mirror and then suspend replication. Periodically, replication can be resumed and, when the remote site is back in sync, replication can be suspended again. This methodology allows users to have stable, point-in-time copies of their data at a remote location.

Types (or Options)

As discussed, replication used to be very expensive, with limited options available to those trying to implement a solution. In the past, replication solutions resided in large, monolithic storage systems. Today, there are a variety of options available to users, and replication solutions are available for various points in the I/O path.

Host Based: Host-based replication resides on the application server whose data needs to be replicated.
Software that supports all forms of snapshots and remote replication is available for most of the popular operating systems. The major benefits of a host-based solution are that the cost can be very low and heterogeneous storage can be used. As more servers need to use replication, the cost goes up (initial cost of software, implementation, service and ongoing maintenance). Software may need to be purchased from different vendors to support the mix of servers, which means that there will be different management interfaces, and managing the environment becomes very complicated. Users of less mainstream operating systems may have a challenge locating products to implement host-based replication. Operating system upgrades or patches may cause the replication software to stop functioning. Another issue with host-based replication is that it takes processing cycles away from the applications running on the host (i.e., the operating system has to use resources such as CPU, memory and network to manage the replicated data).

Appliance Based: Appliance-based replication (also called SAN-based) technology, like host-based, supports all the types of replication. Unlike the host-based solutions, all intelligence needed to perform the replication is housed in an appliance that resides in the I/O path between the host and the storage, typically in a SAN. Appliance-based replication has many advantages over host-based.
To start, there is no overhead on the application server and the application has little or no knowledge that the appliance exists or that the replication is taking place. Management is centralized on the appliance and any operating system that the appliance supports can utilize the replication features. Like host-based solutions, a heterogeneous storage pool can be utilized. There are some major issues with an appliance-based solution. For a highly available solution, there should be at least two appliances in the local site, configured as failover for each other, and at least one appliance remotely. Since the appliance is involved with every I/O (not just the replicated data), each appliance should use at least four switch ports and typically more, which can add significant cost and complexity to the SAN infrastructure. Modern disk subsystems can deliver huge I/Os per second (IOPS) and megabytes per second (MB/s), which can overpower a pair of appliances. Some appliance-based solutions are limited to a pair of appliances while others can scale beyond two. As additional appliances are added, the cost of the solution rises (cost of appliance, SAN switch ports, support, etc.).

Storage Based: Storage-based replication combines the best aspects of host-based and appliance-based solutions. The application servers have little or no knowledge of the replication; there is no overhead on the application servers. Management is centralized and any host supported by the storage system can use the replication functions of the storage device. Unlike appliance-based solutions, there are no extra SAN switch ports needed to implement storage-based replication. Since the replication is native to the storage controllers, the impact is minimal to the application servers utilizing the storage. The only drawback to storage-based replication is that replication can only take place between homogeneous storage systems. In the past, this could prove to be costly; but today, many storage devices allow replication from more expensive Fibre-Channel drives to less expensive SATA drives and also support remote replication from a higher-end model to a lower-end model.
For example, a StorageTek FLX280 with a mix of FC and SATA drives can perform snapshots of the data on the FC drives and store them on the SATA drives, and also remotely replicate the data on the FC drives to a StorageTek FLX240 that may have only SATA drives.

Uses

Implementing replication can be a costly proposition, but it usually addresses one or more important business issues that make the purchase easy to justify.

Disaster Recovery: Replication technology has typically been used to address disaster recovery issues. Disaster recovery is still the driving business case behind replication. Remote replication can be implemented from the production site to one or more remote sites across a campus, across town, across a state or across the country. When a disaster strikes the primary location, the applications can be brought up at the remote site and continue processing against the replicated copies. When the primary site is back online, the replication can be reversed and, when the data is resynchronized, processing can be switched back to the primary site and business can continue. In the past, if an e-mail system experienced a disaster it was an "oh well" moment. The loss of a day or more of e-mail was not considered important. Today, e-mail is a critical component of many companies' business plans, and recovering e-mail quickly and completely after a disaster is required.

Maintenance: Once a disaster recovery solution is in place and fully tested with documented processes and procedures, the infrastructure can be used to solve other business needs. E-mail servers may need periodic maintenance that can take hours to complete.
With remote replication in place, the downtime can be minimal (only as long as it takes to bring the remote peer of the primary e-mail server online). The primary server can be worked on (patches, hardware upgrades, etc.) and then brought back online and into production. A whole datacenter can be failed over to a remote site on purpose to perform maintenance on generators, air conditioning, etc. Replication can also be used to perform a datacenter move with minimal downtime (fail everything over to the DR site, move the production datacenter to its new location, then fail the DR site back to the new datacenter).

Backup: Backing up data is frequently the biggest daily challenge for an IT manager. Backup windows have been shrinking while data has been growing. In the past, the only way to address the issue was to add larger and larger tape libraries. Today, by using snapshots and SAN technology, backup windows can shrink to virtually zero. For example, an e-mail server can be placed in "hot-backup mode" and its data can be snapped. The database can then be placed back into normal operation. The snapshot can be mounted onto a dedicated backup server, backed up directly to tape or SATA disk, and then archived to tape.
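The backup sequence just described (enter hot-backup mode, snap, resume, back up from the snapshot) can be modeled with a short Python sketch. Everything here is an illustrative assumption rather than a real product's interface: `MailStore`, `hot_backup_mode`, `take_snapshot` and `backup_to_tape` are hypothetical names, the snapshot is a plain dictionary copy standing in for an array or SAN snapshot, and holding writes during hot-backup mode is a simplification of what real databases do.

```python
from contextlib import contextmanager


class MailStore:
    """Hypothetical e-mail database; names and behavior are illustrative only."""

    def __init__(self):
        self.messages = {}
        self.pending = []        # writes held while in hot-backup mode
        self.hot_backup = False

    def deliver(self, msg_id, body):
        if self.hot_backup:
            self.pending.append((msg_id, body))  # hold the write, do not drop it
        else:
            self.messages[msg_id] = body

    @contextmanager
    def hot_backup_mode(self):
        self.hot_backup = True
        try:
            yield
        finally:
            # Back to normal operation: replay the writes held during the snap.
            self.hot_backup = False
            for msg_id, body in self.pending:
                self.messages[msg_id] = body
            self.pending.clear()


def take_snapshot(store):
    # Stand-in for an array or SAN snapshot: a frozen point-in-time view.
    return dict(store.messages)


def backup_to_tape(snapshot):
    # Stand-in for the dedicated backup server streaming the snapshot to tape.
    return sorted(snapshot.items())


store = MailStore()
store.deliver(1, "hello")

with store.hot_backup_mode():
    snap = take_snapshot(store)              # the only step the mail system waits for
    store.deliver(2, "arrived during the snap")

store.deliver(3, "arrived during the backup")
tape = backup_to_tape(snap)                  # runs against the snapshot, not the live store
print(tape, sorted(store.messages))
```

The point of the sketch is that the mail store is constrained only for the duration of `take_snapshot`; the slow tape step works from the frozen copy while new mail keeps arriving, which is how the backup window shrinks toward zero.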
Summary

The role of e-mail systems in the enterprise has gained in stature since their humble beginnings. Thankfully, the technologies used to support an e-mail system's infrastructure have matured at the same time. Replication is one of those technologies that has greatly matured since its inception. Today's entry-level replication solutions are as robust and reliable as enterprise-class solutions from 20 years ago and far more affordable. The attractive pricing allows companies of all sizes to protect themselves from disasters and implement business efficiencies that were previously available to only the largest organizations.

Jim McKinstry is senior systems engineer at Engenio Information Technologies, Inc. (Milpitas, CA) www.engenio.com