Impact of new regulations on the data protection requirements for the financial industry.
Data protection and disaster recovery have been important issues for financial institutions for many years. Recent events, however, have catapulted them into the forefront, causing financial and government organizations to re-evaluate business continuity policies, plans and regulations, with a focus on strengthening "the resilience of critical U.S. financial markets."
In September 2002, three government agencies--the Federal Reserve, the Securities and Exchange Commission (SEC), and the Office of the Comptroller of the Currency (OCC)--jointly published the "Draft Interagency White Paper on Sound Practices to Strengthen the Resilience of the U.S. Financial System" in direct response to the terrorist attacks of September 11, 2001 (www.sec.gov/rules/concept/34-46432.htm). The white paper outlined "preliminary conclusions with respect to the factors affecting the resilience of critical markets and activities in the U.S. financial system; sound practices to strengthen financial system resilience; and an appropriate timetable for implementing these sound practices." The agencies solicited comments on the draft white paper, and received about ninety letters from leaders of financial firms, industry associations, technology companies, and others (www.sec.gov/rules/concept/s73202.shtml).
[FIGURE 1 OMITTED]
Probably the most controversial aspect of this interagency white paper was the suggestion that financial institutions that "play significant roles in critical financial markets" must have fully operational recovery sites located at least 200-300 miles away from the primary data center site. Many organizations complained that, at best, this demand would be cost-prohibitive due to the very high cost of the bandwidth, intermediary storage, and related infrastructure required by traditional long-distance replication solutions (which typically depend on multi-hop configurations). At worst, they contended, it would be technologically impossible, due to the limitations of conventional technologies.
"The regulators are trying to do the right thing, but let's do it in a way where we don't impose excessively high costs or implement something that is technologically not possible to do," said John Carlson, senior director of BITS (the Banking Industry Technology Secretariat), a Washington-based organization made up of the 100 largest financial services firms in the U.S.
The federal agencies published their final regulations in April 2003 (www.sec.gov/news/studies/34-47638.htm). In an apparent response to the feedback to the paper, they do not immediately require a specific distance between primary and recovery data center sites; however, they still strongly recommend enough distance between these sites to ensure that they do not share the same infrastructure components (e.g., water, electricity, telecommunications). This revised approach was intended to give financial institutions the ability to focus on recovery objectives, and the flexibility to "manage costs effectively and allow for technological improvements."
In order to address the critical issues highlighted by federal regulators, it is important to understand the challenges faced by financial institutions as they consider data protection strategies and solutions.
The data that drives today's globally oriented financial businesses is stored on networks of interconnected computers and data storage devices. This data must be continuously available, accessible, and up-to-date, even in the face of local or regional disasters. These conditions must be met at an affordable cost and without hampering normal company operations in any way.
Distance is the key to reducing the risk. Power outages (like the blackout of August 2003), fires, floods, earthquakes, wars and malicious attacks can all temporarily or permanently halt normal business operations where they strike. Industry watchdogs such as the SEC state that "long-standing principles of business continuity planning suggest that back-up arrangements should be as far away from the primary site as necessary to avoid being subject to the same set of risks as the primary location." Back-up sites should not rely on the same infrastructure components used by the primary site.
To reduce the business risk of an unplanned event of this type, an enterprise must ensure that an up-to-date replica of its business-critical data is stored at a secondary location. In addition, any local condition that damages data at the primary site must not be able to harm the replica of that data. Synchronous replication, used so effectively to create perfect copies in local networks, performs poorly over longer distances, requires high bandwidth and can easily replicate corrupted data (undetected) to the secondary site.
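The distance penalty of synchronous replication comes down to simple physics: every write must wait for a round trip to the remote site before the application can proceed. The following back-of-the-envelope sketch uses illustrative assumptions (signal speed of roughly 200,000 km/s in optical fiber, plus an assumed fixed equipment delay); the figures are not measurements of any specific product or network.

```python
# Estimate the per-write latency penalty synchronous replication adds
# over distance. Constants are illustrative assumptions.

FIBER_KM_PER_MS = 200_000 / 1000   # light travels ~200,000 km/s in fiber -> ~200 km per ms
EQUIPMENT_DELAY_MS = 0.5           # assumed switching/protocol overhead per round trip

def sync_write_penalty_ms(distance_km: float) -> float:
    """Round-trip delay a host write must absorb before it can complete."""
    one_way_ms = distance_km / FIBER_KM_PER_MS
    return 2 * one_way_ms + EQUIPMENT_DELAY_MS

for miles in (30, 200, 300):
    km = miles * 1.609
    print(f"{miles:>4} miles: ~{sync_write_penalty_ms(km):.2f} ms added per synchronous write")
```

At 300 miles the added delay is on the order of several milliseconds per write, which compounds quickly for transaction-intensive applications; this is the latency wall the industry comment letters pointed to.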
A copy of critical company data at a secondary site is also a prerequisite for continuity in business operations. For an organization that already has multiple geographically dispersed installations, one of the existing locations is a logical first choice for the secondary site. A disaster recovery solution must be able to "bridge" the distance. The system must support rapid failover to the secondary site, and it must support transparent failback following correction of the problem at the primary site.
Current replication solutions have been unable to adequately satisfy all of these requirements (some of which are seemingly contradictory), especially over longer distances. Those that provide true up-to-date replication (through synchronous replication) cannot support the longer distances without severely degrading the performance of the host applications. Those that most effectively conserve bandwidth are not absolutely up-to-date. And none offers rapid failover and failback to an earlier point in time when corrupted data has damaged the replica, rendering it unusable for recovery.
Financial organizations need a cost-effective solution that provides synchronous levels of protection with no distance limitations and with no application degradation. These organizations also need a solution that is flexible, providing the ability to adapt to changing business requirements. Furthermore, they require a solution that will integrate with their current infrastructure, so as to minimize disruption, leverage existing investments, and minimize costs. In summary, financial organizations need a solution that breaks the limitations of current technologies, enabling enterprise data protection--with no limits.
One viable solution is an advanced enterprise-class disaster recovery subsystem. Developed from the ground up with a completely new approach, such a technology overcomes many of the technical limitations of existing data protection products. The architecture is based on an appliance that connects to the SAN and IP infrastructure and provides bi-directional data replication across any distance for heterogeneous storage and server environments. At its core are a number of advanced patent-pending technologies that unleash powerful data protection solutions at a fraction of the cost of existing products.
The advanced architecture includes:
* A cluster of independent, intelligent appliances at the primary and secondary sites, with excellent data-handling capability, enabling the solution to:
  * Guarantee a consistent replica of business-critical data in the event of any failure or disaster
  * Deploy enterprise-class out-of-band (i.e., outside the data path) replication, non-disruptively and non-invasively.
* Positioning at the junction between the SAN and the IP infrastructure, which enables such solutions to:
  * Deliver synchronous levels of data protection at any distance and without intermediate storage, satisfying the SEC recommendations at a very reasonable cost
  * Support heterogeneous server, SAN, and storage platforms
  * Eliminate the need for protocol converters, special networking equipment, and edge connect devices.
* Advanced algorithms that:
  * Dramatically reduce WAN bandwidth requirements, enabling much more up-to-date data protection at any distance--at a fraction of the cost of existing solutions
  * Automatically manage the replication process, with strict adherence to user-defined policies that are tied to user-specified business objectives
  * Use real-time information to enable the system to dynamically adapt its replication mode
  * Maintain a consistent image of data at the remote site at all times, from which core business applications can be brought online in just minutes.
Synchronous Levels of Protection Over Long Distances
Financial organizations are required to recover very quickly and with no loss of critical transactions following any disruption of operations. When federal regulators recommended a 200-300 mile separation between primary and secondary data centers, many organizations responded that it would be technically impossible (i.e., that synchronous replication over long distances would lead to unacceptable latency and that quick recovery would be compromised), and/or that it would be cost-prohibitive. Although this is true of traditional data protection technologies, newer solutions address those limitations.
Every time a host application writes a transaction to the local disk storage subsystem, a data protection appliance writes it in parallel to a compatible local appliance. This synchronous connection, together with the appliance's buffer, delivers synchronous, up-to-date levels of protection with no application degradation, no distance limitation, and no need for additional storage. In the event of a primary site failure, the system enables failover by flushing its buffer to the secondary site, with absolutely no loss of data. Because the buffer resides in the appliance, it can be located Fibre Channel distances away from the primary storage; keeping this copy of the data out of close proximity to the primary site improves protection against local disasters. This capability shatters today's distance/latency limitations, and enables completely up-to-date protection from regional disasters with no impact on application performance.
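The write path just described can be sketched in a few lines: each application write lands on local storage and, in parallel, in an appliance-side buffer; the buffer drains asynchronously over the WAN, and a failover flushes whatever remains so no acknowledged write is lost. This is a simplified illustration of the general technique, not the vendor's implementation; all class and method names are hypothetical.

```python
from collections import deque

class ReplicationAppliance:
    """Toy model of buffered, out-of-band replication."""

    def __init__(self):
        self.local_disk = []    # stands in for the primary storage subsystem
        self.buffer = deque()   # appliance-side journal of unshipped writes
        self.secondary = []     # stands in for the remote replica

    def host_write(self, data: bytes) -> None:
        # The host sees the write complete once local storage and the nearby
        # appliance buffer both have it -- synchronous-level protection
        # without waiting on the long-haul WAN round trip.
        self.local_disk.append(data)
        self.buffer.append(data)

    def drain(self, max_items: int = 10) -> None:
        # Background shipping of buffered writes to the secondary site.
        for _ in range(min(max_items, len(self.buffer))):
            self.secondary.append(self.buffer.popleft())

    def failover(self) -> list:
        # On a primary-site failure, flush the surviving buffer so the
        # secondary replica is fully up to date before applications restart.
        while self.buffer:
            self.secondary.append(self.buffer.popleft())
        return self.secondary

appliance = ReplicationAppliance()
for i in range(5):
    appliance.host_write(f"txn-{i}".encode())
appliance.drain(max_items=2)           # only part of the data has shipped...
replica = appliance.failover()         # ...but failover flushes the rest
assert replica == appliance.local_disk # no acknowledged write was lost
```

The essential design point is that the host's write latency is bounded by the short hop to the appliance, while the long-haul transfer happens off the application's critical path.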
Furthermore, the appliance enables rapid recovery by ensuring that applications can be brought on line in just minutes with the most up-to-date replica of the data located at the secondary site. The ability to recover quickly is maintained even in the case of data corruption or even a rolling disaster (see "Rapid Recovery From Data Corruption or Rolling Disasters" below).
As a result, financial organizations are able to establish secondary sites in geographically dispersed locations (with no distance limitation), with up-to-date replicas of their business critical data, without introducing latency, in a cost-effective manner (without requiring costly intermediary storage).
Intelligent Use of Bandwidth
Synchronous replication over long distances can require large amounts of expensive bandwidth, as well as costly networking equipment such as DWDM gear and protocol converters, often making it cost-prohibitive. The appliance, however, uses standard IP interconnect and intelligent bandwidth-reduction technologies that deliver an unprecedented reduction in bandwidth requirements. As a result, it delivers the best possible levels of protection from the available bandwidth, while using low-cost connectivity and achieving dramatically reduced WAN costs, particularly over long distances.
Data reduction is achieved through a number of application-aware and storage-aware algorithmic techniques. These advanced algorithms, running on an optimized appliance, conserve bandwidth to an extent not possible with traditional compression technologies.
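One way to see why storage-aware reduction can beat generic compression: if the replication engine remembers the previous version of each block, it can ship only the byte ranges that actually changed, and then compress those deltas. The sketch below is a simplified stand-in for such techniques (the article does not disclose the product's actual algorithms); the chunk size and wire format are illustrative assumptions.

```python
import os
import zlib

def changed_ranges(old: bytes, new: bytes, chunk: int = 64):
    """Yield (offset, data) pairs for chunks of `new` that differ from `old`."""
    for off in range(0, len(new), chunk):
        if new[off:off + chunk] != old[off:off + chunk]:
            yield off, new[off:off + chunk]

def encode_update(old: bytes, new: bytes) -> bytes:
    # Serialize only the changed chunks (offset + length + data), then
    # compress the resulting delta stream for the WAN.
    payload = b"".join(off.to_bytes(4, "big") + len(d).to_bytes(2, "big") + d
                       for off, d in changed_ranges(old, new))
    return zlib.compress(payload)

old_block = os.urandom(4096)               # incompressible "before" image
new_block = bytearray(old_block)
new_block[1000:1064] = os.urandom(64)      # a 64-byte region was rewritten
wire = encode_update(old_block, bytes(new_block))
# The delta is far smaller than compressing the whole modified block:
assert len(wire) < len(zlib.compress(bytes(new_block)))
```

Knowing *which* blocks an application touched is exactly the kind of storage-aware context a general-purpose compressor lacks, which is why delta techniques can keep shrinking WAN traffic even on data that does not compress well.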
[FIGURE 2 OMITTED]
Universal Data Protection
Financial institutions have developed sophisticated, complex IT infrastructures that are deployed to address the full range of requirements of the organization. In the vast majority of cases, these infrastructures include a wide range of server, SAN and storage platforms, from a variety of vendors. As a result, selecting and deploying a data protection strategy is a complex (and costly) endeavor, since current technologies are often tied to the specific platform or vendor.
In contrast to many of today's storage-based and server-based data protection technologies, the appliance is network-based (i.e., located between the SAN and the IP infrastructure). As a result, it can offer an end-to-end solution for data replication across heterogeneous server and storage platforms, enabling a complete data protection solution for the entire enterprise. Accordingly, the storage systems at the primary and secondary sites do not have to be the same, offering the flexibility to deploy lower-cost storage or to leverage existing storage.
Business-Driven Data Protection
Financial institutions are faced with the challenge of protecting a wide spectrum of data and applications, ranging from business-critical core systems to important but less critical applications. It is important to have the flexibility of multiple replication modes to match the desired recovery objective of each application. Achieving this today, however, often requires multiple replication solutions, along with the overhead of manually managing the different options.
The appliance offers the full continuum of replication modes, from synchronous, to asynchronous, to small aperture-snapshot, to point-in-time. The replication process is managed automatically, with strict adherence to user-defined policies that are tied to business objectives. The system supports multiple policies, and adapts its replication dynamically, based on the available bandwidth and the application workload, to achieve the stated business objectives for each application. This greatly simplifies data and disaster recovery management for complex and heterogeneous environments.
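Policy-driven mode selection of this kind can be pictured as a simple decision function: each application declares a recovery point objective (RPO), and the system picks the least expensive replication mode that can still meet it under current conditions. The mode names mirror the continuum described above, but the thresholds and function names here are illustrative assumptions, not the product's actual policy engine.

```python
def choose_mode(rpo_seconds: float, bandwidth_ok: bool) -> str:
    """Pick the cheapest replication mode that satisfies the stated RPO."""
    if rpo_seconds == 0:
        # Zero data loss demanded: every write must be acknowledged remotely.
        return "synchronous"
    if bandwidth_ok and rpo_seconds < 60:
        return "asynchronous"              # near-real-time shipping of writes
    if rpo_seconds < 3600:
        return "small-aperture snapshot"   # frequent snapshots, seconds apart
    return "point-in-time"                 # periodic scheduled copies

# Hypothetical per-application policies:
policies = {
    "trading": dict(rpo_seconds=0,     bandwidth_ok=True),
    "crm":     dict(rpo_seconds=30,    bandwidth_ok=True),
    "archive": dict(rpo_seconds=86400, bandwidth_ok=False),
}
for app, p in policies.items():
    print(app, "->", choose_mode(**p))
```

In a real system the `bandwidth_ok` input would be fed by live measurements, which is what lets the replication mode adapt dynamically as workload and link conditions change.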
Efficient, Non-Disruptive Testing
A critical component of any business continuity plan includes the regular testing of disaster recovery plans. The April 2003 "Interagency Paper on Sound Practices to Strengthen the Resilience of the U.S. Financial System" states that one of three key business continuity objectives of special importance for all financial firms is "a high level of confidence, through ongoing use of robust testing, that critical internal and external continuity arrangements are effective and compatible."
A solution should enable direct Read/Write access to data at the secondary site, without the need to first make an additional copy and without disrupting the replication process. As a result, disaster recovery plans can be tested at the secondary site on a regular basis, without requiring additional storage and without impacting ongoing replication. Other workloads that can be deployed in the same non-disruptive manner at the secondary site include development and backups.
Rapid Recovery from Data Corruption and Rolling Disasters
Many disasters and other failures can cause data corruption at the primary site; unfortunately, such failures may go undetected for some time, during which the corrupted data is replicated to the secondary site. In these situations, backups or past snapshots have to be restored, which often means a long delay in recovery time.
Any worthwhile solution efficiently maintains a snapshot history that allows convenient rollback to any point in time, enabling quick recovery. It supports multiple transactionally consistent snapshots at the remote site, allowing reliable recovery in database environments. In addition, frequent, small-aperture snapshots (seconds apart) minimize the risk of data loss due to data corruption. In the event of a rolling disaster, the data can be reassembled from the most recent uncorrupted snapshot, dramatically reducing recovery time.
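The rollback logic can be sketched as a walk backwards through a timestamped snapshot history to the newest copy that still passes a consistency check. The checksum-based check below is an illustrative assumption standing in for real transactional-consistency validation, and all names are hypothetical.

```python
import hashlib

def checksum(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

class SnapshotHistory:
    """Toy point-in-time snapshot store with rollback to a consistent copy."""

    def __init__(self):
        self.snapshots = []   # (timestamp, data, checksum) in time order

    def take(self, ts: int, data: bytes) -> None:
        self.snapshots.append((ts, data, checksum(data)))

    def recover(self):
        # Walk backwards to the newest snapshot whose data still matches
        # its recorded checksum, i.e. was not corrupted after it was taken.
        for ts, data, digest in reversed(self.snapshots):
            if checksum(data) == digest:
                return ts, data
        raise RuntimeError("no consistent snapshot available")

history = SnapshotHistory()
history.take(100, b"balance=500")
history.take(110, b"balance=750")
history.take(120, b"balance=900")
# Simulate a rolling disaster corrupting the newest snapshot after the fact:
history.snapshots[-1] = (120, b"balance=\x00\x00", history.snapshots[-1][2])
ts, data = history.recover()
assert (ts, data) == (110, b"balance=750")   # falls back to the last good copy
```

Keeping snapshots only seconds apart bounds the data lost in such a fallback, which is exactly the point of small-aperture snapshots.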
In response to the events of September 11th, federal organizations have proposed a set of guidelines and sound practices that are designed to "strengthen the resilience of the U.S. financial system."
Again, one of the requirements included the need to establish backup sites 200-300 miles away from primary data centers. Many responses from industry leaders indicated that this requirement was not realistic--that it was technically impossible to maintain a no-data-loss environment at such distances without cost-prohibitive investments. Faced with these technology and cost limitations, federal regulators dropped the specified mileage requirement, giving financial organizations the flexibility to "manage costs effectively and allow for technological improvements."
Mehran Hadipour is vice president, product marketing, at Kashya (San Jose, CA).
Title Annotation: Storage Networking
Publication: Computer Technology Review
Date: Nov 1, 2003