
Storage over SONET/SDH connectivity. (Internet).

More and more information is becoming digital and needs to be reliably and securely stored. This information includes X-rays, MRIs, ultrasounds, video, financial data, and the list goes on. This new reality has vaulted storage systems into being an independent "core system" within today's IT architectures, with very specific requirements that need to be addressed.

One such requirement is the realization that we are living in an era of increasing threats, both man-made and natural, that place all this information at serious risk of being lost. As digital content represents the lifeblood of companies, and given the potentially disastrous results that would occur should this data ever be lost or compromised for any reason, Disaster Recovery and Business Continuance have become critical. To mitigate the risk of losing data, enterprises are increasingly adopting storage extension technologies to replicate their business critical data to a secondary remote site. Transmitting this information over distance requires a carrier grade environment with zero data loss, scalable throughput, low latency, low jitter, high security and the ability to span long distances.

This article will discuss the business drivers and challenges for extended Storage Area Network architectures, the advantages of implementing a storage extension strategy over a SONET/SDH architecture and the critical limitations in IP storage architectures.

The Enterprise Storage Challenge

Storage systems have rarely been viewed as a "core system" within the enterprise. As a direct result, many companies have neglected to implement a Disaster Recovery (DR) plan: studies on this topic show that only about one in four companies even has a disaster recovery plan and, of those that do, many have never tested it.

This reality has CEOs/CFOs and insurance companies now asking the question, "How fast can we recover from an unforeseen event?" The word "recovery" has gained a new level of respect and is correspondingly receiving new scrutiny within the enterprise.

These realities have enterprise companies reviewing their storage strategy for the following reasons:

* Comply with new regulatory requirements (HIPAA, SEC),

* Improve business continuity with one common plan and infrastructure that includes remote mirroring, backup, disaster recovery and other aspects,

* Reduce operation costs by consolidating and managing a single storage infrastructure in lieu of multiple islands with multiple staff, and

* Enhance productivity with information sharing and improved SLAs.

To address these realities, there are four basic architectures for storage extension available today:

* Legacy access solutions that provide a low-cost solution for connecting storage systems with low bandwidth requirements.

* Storage over WDM (lambda) provides the highest bandwidth, scalability and reliability for the most demanding storage applications. This solution is dependent on the availability of fiber and is more costly to deploy.

* Storage over IP, which is becoming more common with the advent of the Internet and the prevalence of IP. These are low reliability solutions, due to the unpredictability of IP traffic, and are ill-suited for mission critical storage traffic.

* Storage over SONET/SDH is rapidly becoming a popular choice for sub-gigabit data rate storage requirements. Storage over SONET/SDH provides carrier grade reliability, predictability and flexibility across multiple carriers and over extended distances. This approach leverages the incumbent SONET/SDH infrastructure installed today, enabling economical, reliable storage services to Enterprise customers.

Table 1 illustrates the storage extension trade-offs that need to be considered before selecting a storage connectivity solution.

Today's Storage Area Networks

Many larger enterprise companies have already implemented storage extension utilizing WDM technology as shown in Figure 1.

In this configuration, the various enterprise sites have their Fibre Channel storage networks provisioned directly over a wavelength. This ensures that storage applications maintain their inherent bandwidth and avoid any bandwidth restrictions due to metro link speeds that are below the native Fibre Channel data rate of 1 (FC100) or 2 (FC200) gigabits per second. This configuration is common for large enterprise organizations that have intensive and business critical storage applications that need interconnecting. These requirements also justify the higher costs associated with deploying a dedicated wavelength storage connectivity service.

To address the lower storage bandwidth needs of smaller or remote regional offices, a different extended storage architecture is required. This architecture is storage extension via SONET/SDH, which is a more cost-effective infrastructure for these sub-gigabit rate storage requirements.

Flow Control Mechanisms

When implementing a remote storage area network, the transition from a 1-gigabit native FC rate in the local storage network to a slower sub-gigabit WAN link creates a certain degree of congestion and blockage due to the link speed mismatch. Therefore, flow control mechanisms are needed to properly groom the storage traffic and prevent traffic congestion.

The two flow control mechanisms that will be discussed here are the native FC and TCP congestion avoidance/control mechanisms.

The Fibre Channel (FC) flow control protocol is a credit-based mechanism, which creates additional challenges that must be planned for when extending FC over long distances. With FC flow control, when a source storage device intends to send data to a target storage device, the initiating storage device must receive credits from the target device. For every credit the initiating device obtains, it is permitted to transmit one FC frame. This frame can vary in length from 36 bytes to a full 2K (2112 bytes) frame. By using a credit-based approach, congestion is always avoided in the network and the traffic is groomed to the actual bandwidth.

To avoid speed bumps from the latency created by distance extension, sufficient buffer credits must be available to compensate. This is vital to achieving maximum link efficiency and throughput. In order to perform this buffer credit calculation, two factors must be taken into account. The first is that light travels 1 km every 5 microseconds in fiber, which is slower than the speed of light in a vacuum. The second is the FC line encoding mechanism of 8B/10B, which adds 25 percent more bits when transmitting the data, for clock recovery purposes.

Given this information, the fiber distance required to transmit a FC frame can be computed. This is needed to determine the recommended amount of FC buffer credits to ensure optimal performance over distance, which can mathematically be expressed as follows:

Fiber Distance (km) = [Frame Size (bytes) x 8 (bits per byte) x 1.25 (8B/10B encoding overhead)] / Line Rate (bps) / 5 microseconds per km

which can be simplified as:

Fiber Distance (km) = [Frame Size (bytes) x 2 x 10^6] / Line Rate (bps)

The actual amount of buffer credits required can be calculated by extending the above formula as follows:

Buffer Credits Required = [One-way Fiber Length (km) x 2] / ([Avg. Frame Size (bytes) x 2 x 10^6] / Line Rate (bps)) + 1

If the average FC frame size is 1KB and the one-way fiber length is 80KM, utilizing this formula would yield a recommendation of 84 buffer credits for ensuring maximum link efficiency and optimum performance.
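As a sketch, the calculation above can be expressed in code. The function names are mine, and the FC100 serial line rate of 1.0625 Gbps is assumed for reproducing the 1KB/80KM example:

```python
import math

def fiber_distance_km(frame_bytes, line_rate_bps):
    """Length of fiber one FC frame occupies, in kilometers.

    frame_bytes * 8   -> payload bits
    * 1.25            -> 8B/10B line encoding adds 25 percent
    / line_rate_bps   -> seconds needed to serialize the frame
    / 5e-6            -> light covers 1 km per 5 microseconds in fiber
    """
    return frame_bytes * 8 * 1.25 / line_rate_bps / 5e-6

def buffer_credits(one_way_km, avg_frame_bytes, line_rate_bps):
    """Credits needed to keep the round-trip pipe full, plus one."""
    round_trip_km = 2 * one_way_km
    per_frame_km = fiber_distance_km(avg_frame_bytes, line_rate_bps)
    return math.floor(round_trip_km / per_frame_km) + 1

# The article's example: 1KB average frames over an 80KM one-way link,
# assuming the FC100 line rate of 1.0625 Gbps.
print(buffer_credits(80, 1024, 1.0625e9))  # -> 84
```

Each full frame occupies roughly 1.9 km of fiber at FC100, so 160 km of round-trip fiber holds 83 in-flight frames, and one spare credit brings the recommendation to 84.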

In addition to supporting buffer credits, various storage centric WAN networking vendors support an extended flow control mechanism that further compensates for the latency induced by distance extension. These flow control mechanisms enable efficient storage extension over distances exceeding thousands of kilometers.

A credit-based self-regulating flow control mechanism prevents receiver buffer overruns from occurring and thus avoids frame loss in the storage fabric due to congestion.

This is not true in the case of TCP, which utilizes a sliding window algorithm for congestion management and flow control that attempts to find a steady state threshold that can be maintained by all data flows. In a bursty packet orientated network, congestion occurs and packets are discarded (dropped). When this occurs, the TCP windows throttle back to reduce the permissible data flow, allowing the TCP network to discover a new maintainable steady state. This flow control approach tends to induce global synchronization of IP flows and creates wide variances in delay that are unacceptable in storage networks. Should this variance become extreme, the SAN fabric would fail the remote device and remove it as an active available storage device.

In addition to the flow control mechanisms for IP previously discussed, Gigabit Ethernet at the link layer implements a flow control mechanism that is completely independent and unaware of the IP flow control mechanism. Gigabit Ethernet uses 'pause' frames (IEEE 802.3x) to inform the upstream network element (typically a switch) to stop sending any more data. Although this flow control mechanism is initiated before congestion occurs in the local switch, it tends to create significant head-of-line blocking on the upstream switches, which affects all flows in the network and not just the ones destined to this switch. In addition, this link layer approach has no standards based mechanism available to inform the storage application to throttle back to avoid future occurrences of link congestion.

It is significant to realize that IP's flow control is a reactive mechanism, activated only after congestion and packet loss have been incurred. This is in complete contrast to a properly designed credit-based storage flow control mechanism, which ensures congestion never occurs and thus protects the storage network.
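The contrast can be illustrated with a toy simulation of credit-based flow control (a deliberate simplification, not a real FC stack): because the sender may transmit only while it holds credits, and each credit corresponds to one free receive buffer, the receiver can never be overrun and no frame is ever dropped.

```python
from collections import deque

def run_credit_flow(frames_to_send, rx_buffers, drain_per_tick):
    """Simulate a credit-based link; returns the number of drops."""
    credits = rx_buffers        # receiver advertises one credit per buffer
    rx_queue = deque()
    sent = 0
    dropped = 0
    while sent < frames_to_send or rx_queue:
        # Sender: transmit only while credits remain.
        while credits > 0 and sent < frames_to_send:
            credits -= 1
            rx_queue.append(sent)
            sent += 1
        # Overrun check: cannot trigger, since credits + queued == rx_buffers.
        if len(rx_queue) > rx_buffers:
            dropped += 1
        # Receiver: drain some frames, returning one credit per frame.
        for _ in range(min(drain_per_tick, len(rx_queue))):
            rx_queue.popleft()
            credits += 1
    return dropped

print(run_credit_flow(1000, rx_buffers=8, drain_per_tick=3))  # -> 0
```

A TCP sender in the same situation would keep growing its window until packets were dropped, then throttle back; the credit scheme never reaches that point because the invariant (credits plus queued frames equals receive buffers) holds at every step.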

SONET/SDH: King of the Core

It seems a lifetime ago, but in 1989 SONET and SDH were celebrating their standardization within ANSI and CCITT (now ITU) respectively. By 1996, the standard was so widely accepted that SONET and SDH products dominated the fiber optic transport market and, in 2001, SONET technologies held 61 percent of the North American metro core, a share that continues to grow steadily.

Next Gen SONET/SDH systems are providing enhanced standardized functionality, including Generic Framing Procedure (GFP) and Virtual Concatenation (VCAT), which will be discussed later in this article. Any rumors or predictions on the demise of SONET/SDH would indeed not be grounded in reality, as no alternative standardized technologies can match the universality, robustness, security and Quality of Service that SONET/SDH systems provide today, across distance and multiple carriers. This is why SONET/SDH is ideally suited for carrying storage traffic.

A typical SONET-based network is shown in Figure 2. (See page 25.) This consists of a regional/national SONET ring and a Metro access SONET ring. This configuration is optimized to provide carrier grade 99.999 percent availability for guaranteed point-to-point (circuit) services such as voice, dedicated T1/T3 connections, Ethernet and now Storage private line services.

SONET/SDH networks provide complete failover detection and protection within 50msec of a fault occurring. This is in contrast to TCP/IP networks, where fault detection alone can take minutes, excluding the time required for actual fault correction.

As SONET/SDH systems continued to evolve, multiple proprietary approaches for mapping Fibre Channel and Ethernet over SONET/SDH became available. As a result, the ITU undertook an initiative for a standardized mapping of multiple protocols (Ethernet, FC, FICON, ESCON, etc.) into SONET/SDH. This mapping became known as the Generic Framing Procedure (GFP), or ITU-T G.7041.

The IP SAN Tax

Figure 3 illustrates how GFP provides a very efficient and direct mapping of multiple SAN protocols into SONET/SDH without imposing any IP SAN tax.

The IP SAN tax is the extra protocol overhead that is inherited whenever data is translated to IP from a native storage protocol. For example, taking a highly efficient Fibre Channel frame and translating that frame into IP imposes the first level of SAN taxation, by inheriting IP inefficiencies for the handling and encapsulation of FC storage traffic. A second level of taxation is incurred when mapping this adapted FC-over-IP frame into smaller Ethernet frames: Fibre Channel can transport 2112 data bytes in a single FC frame, which must be fragmented for transmission on Ethernet, whose maximum payload is only 1500 bytes. This represents the second level of IP SAN taxation and increased inefficiency for transporting FC over IP. A final IP SAN tax is incurred when Ethernet is converted into PPP for Packet over SONET/SDH, then an HDLC frame for onward transmission on SONET/SDH. This three-tier taxation for carrying storage over IP over the incumbent WAN SONET/SDH infrastructure introduces serious inefficiencies, especially when transporting storage traffic over sub-gigabit services.

With GFP, the Fibre Channel frame is efficiently and directly mapped into SONET/SDH with no IP SAN taxation being incurred.
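A back-of-the-envelope comparison makes the tax concrete. The header sizes below are typical values assumed for illustration (IPv4 and TCP without options, Ethernet header plus FCS, and a minimal GFP core-plus-payload header):

```python
import math

FC_PAYLOAD = 2112        # data bytes in one full FC frame
ETH_MTU = 1500           # standard Ethernet payload limit
IP_TCP_HDRS = 20 + 20    # IPv4 + TCP headers, no options (assumed)
ETH_HDRS = 14 + 4        # Ethernet header + FCS (assumed)
GFP_HDRS = 8             # GFP core header + payload header (assumed)

# IP path: the FC payload must be fragmented across Ethernet frames,
# each carrying its own TCP/IP and Ethernet overhead.
per_frame = ETH_MTU - IP_TCP_HDRS
frames = math.ceil(FC_PAYLOAD / per_frame)
ip_wire_bytes = FC_PAYLOAD + frames * (IP_TCP_HDRS + ETH_HDRS)

# GFP path: one frame, one small header, no fragmentation.
gfp_wire_bytes = FC_PAYLOAD + GFP_HDRS

print(frames, ip_wire_bytes, gfp_wire_bytes)  # -> 2 2228 2120
```

On these assumptions, one 2112-byte FC payload costs two Ethernet frames and 116 bytes of overhead over IP, versus a single GFP frame carrying just 8 bytes of overhead.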

The Unpredictability of IP Network Flows

In addition to the IP SAN tax, these IP-based approaches do not resolve the underlying unpredictability and security weaknesses of an IP network infrastructure.

We were reminded of this aspect on January 25, 2003, when the Slammer worm hit the Web and private networks. This resulted in various ATM machines, most likely on private VPN IP networks, being taken offline, and even caused the cancellation of various airline flights. During this attack, packet loss on core Internet routers averaged 20 percent, and the attack traffic actually doubled in volume every 8.5 seconds, according to U.S. security experts. This attack once again illustrates the fragility of an IP infrastructure: once a denial of service (DoS) attack commences, any traffic in that IP path will be affected, including prioritized MPLS VPN traffic.

Although these attacks are few and far between, storage applications are business critical. Companies need to be able to recover data from the remote backup site at all times and not just when the IP network is operating normally. Informing a CEO/CFO that she cannot recover her business critical data until the IP network worm attack has subsided, which "should" be somewhere in the next 24 to 48 hours, would not be an acceptable DR/BC solution.

Part of this fragility results from the connectionless nature of IP where IP ports can burst to the full line rate, thereby affecting other IP data flows competing for service on that IP network. In addition, the management control plane is contained within the data plane. As a result, when the data plane is being attacked, the management plane equally becomes inaccessible for management purposes. Various management data such as routing tables and name servers can also be equally attacked (poisoned), thereby creating IP path disruptions and instability. None of these problems exist in a carrier grade circuit-orientated SONET/SDH infrastructure, which makes SONET/SDH ideal for transportation of critical storage information.

As the storage network is extended to greater and greater distances over IP, more and more IP routers are now involved in the end-to-end path. Every additional router in the path unfortunately lowers the overall reliability of the end-to-end path and becomes an attack point for any Internet worm, denial of service attack or hacker. As the routed end-to-end path length increases, so does the overall latency and jitter, since each router in the path adds an unpredictable amount of latency and jitter (see Figure 4).

The other aspect to consider is the overall security of systems attached to connectionless IP networks. Although security has improved, we are reminded of the risks by events such as the one that occurred on February 18, 2003, when an Internet hacker successfully accessed an estimated 8 million credit cards at a credit card processing center. This was a processing center that would have had significant security systems in place to protect this financial data; however, even the best of security systems can be breached, as it was in this case. Because IP is a connectionless technology, little information is left available for tracing the connection and finding the perpetrator.

These are the realities of today's IP networks that cannot be overlooked. If a storage network were to use IP technology then someone needs to ask the question "What would happen if a hacker successfully remote-mounted and completely copied all the storage volumes within the SAN?" If that sounds implausible, keep in mind that SANs over IP is still very embryonic, and lest we forget the 8 million credit cards recently stolen.

In contrast, what makes SONET/SDH the ideal infrastructure for transporting storage securely is its circuit-orientated nature, complete with carrier-grade circuit parameters. SONET/SDH circuits remain contained within their path, completely unaffected by other circuit traffic. The SONET/SDH control plane is completely separated from the data plane and is accessible only by carriers/service providers, not by the customer traffic on the circuit, unlike IP, which has no isolation between these planes.

How Much Bandwidth?

The determination of which sub-gigabit service rate is appropriate depends on the actual storage application requirements that will be transported over the infrastructure. Table 2 provides a simplified perspective for implementing sub-gigabit services. The first column is the amount of data in gigabytes that needs to be transported across the network. The second column is the committed information rate in megabits per second, shown at STS-n increments. The third column is the time that would "ideally" be required to transport that amount of data at the specified information rate across a non-blocking network. (Please note the estimated time is a theoretical estimation that does not take into account any path/protocol overhead.)

The decision of which service data rate to subscribe to will be based on the storage application requirements and the economics of that solution. For example, if the application is a tape vaulting application where 10 gigs of data will be remotely backed up and restored, then an STS-1 (51.84Mbps) service can perform a 10-gig restore within a 32-minute window. This 10-gig restore time could be reduced to 16 minutes with an STS-1-2v (103Mbps) service, hence the choice of service rate depends on how long the business can reasonably wait to have this data restored.
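The arithmetic behind these estimates can be sketched as follows (a hypothetical helper, not from any planning tool; note that Table 2's published times run somewhat longer than this ideal line-rate calculation, suggesting the table folds in an effective payload rate below the raw STS-n line rate):

```python
def transfer_minutes(data_gb, rate_mbps):
    """Ideal minutes to move data_gb gigabytes at rate_mbps,
    ignoring all path/protocol overhead (1 GB taken as 10^9 bytes)."""
    bits = data_gb * 8e9
    return bits / (rate_mbps * 1e6) / 60

# A 10-gig restore over STS-1 vs. STS-1-2v: doubling the committed
# rate halves the ideal restore window.
for rate in (51.84, 103.68):
    print(rate, round(transfer_minutes(10, rate), 1))
```

The useful takeaway survives any overhead assumption: restore time scales linearly with data size and inversely with the committed information rate, so the service rate can be sized directly from the business's acceptable restore window.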

Other client/server-based ERP/CRM applications are not as data intensive from a client side perspective. In these applications, the client will typically send and receive small blocks of data that can readily be handled by a lower rate service. An equally important factor in this type of application is the delay the application can actually tolerate, which will be dependent on how the application was actually implemented.

Finally, if the application requires real-time mirroring, a non-blocking storage infrastructure or high storage bandwidth, then full FC line rates will be required, which are best implemented over a wavelength orientated service. This approach eliminates the need for traffic shaping by ensuring the availability of full FC line rates, as compared to a lower sub-gigabit SONET/SDH bandwidth service.


Storage over SONET/SDH and Storage over WDM enable the Enterprise to reliably and securely implement their BC and DR applications over longer distances than previously available. These new sub-gigabit storage service offerings further enable smaller regional branch offices to have their lower bandwidth storage requirements consolidated into the corporate SAN, resulting in reduced operational costs.

As increasingly more data is digitized and stored, cost effective carrier-grade storage services become paramount, especially for implementing Disaster Recovery and Business Continuance strategies. There are many extended storage architectures being promulgated by numerous vendors, but only a few architectures can deliver guaranteed, secure, reliable, fault-tolerant, tax-free, multi-carrier storage connectivity.

Selecting the proper extended storage architecture is significant for companies attempting to maximize their cost effectiveness, and it is critical for companies implementing DR and BC applications. As these applications represent the last line of defense should trouble occur, this extended storage infrastructure must perform when called upon. Informing the CIO that one needs to wait for an IP storm to pass before customers can place any orders could certainly result in career limitations.
Table 1
Storage extension trade-offs

                              Access      Storage over       Storage      Storage over
                              Solutions   IP                 over WDM     SONET/SDH

 Economical extended          YES         YES                NO           YES
 long haul distances

 Performance                  FAIR        POOR               GOOD         GOOD
 (latency, jitter)

 Throughput                   LOW         Weakest IP link,   FC rates     Scalable from ...
                                          unpredictable                   to OC-...

 Economics for medium         POOR        GOOD               FAIR with    GOOD
 throughput                                                  CWDM

 Economics for high           POOR        POOR               GOOD         GOOD
 requirements - e.g.
 financial institutions

 Performance monitoring       --          --                 --           GOOD
 and SLA statistics
 gathering

 Flexibility to provision     POOR        POOR               GOOD         GOOD
 and scale

 <50msec protection           YES         NO                 YES          YES
 and restoration

 Reliability                  --          --                 --           GOOD

 Security                     --          --                 --           GOOD

 Available now                YES         YES                YES          YES

Table 2

Data Size   Transport BW         Time Required
(in GB)     (Mbps)               (in mins)

 1 51.84 (STS-1) 3.2
 10 51.84 (STS-1) 32.2
 100 51.84 (STS-1) 321.5
 1,000 51.84 (STS-1) 3215.0

 1 103.68 (STS-1-2v) 1.6
 10 103.68 (STS-1-2v) 16.1
 100 103.68 (STS-1-2v) 160.8
 1,000 103.68 (STS-1-2v) 1607.5

 1 207.36 (STS-1-4v) 0.8
 10 207.36 (STS-1-4v) 8.0
 100 207.36 (STS-1-4v) 80.4
 1,000 207.36 (STS-1-4v) 803.8

Al Lounsbury is a senior manager for Optical Enterprise Technical marketing at Nortel Networks (Brampton, Ontario, Canada)
COPYRIGHT 2003 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2003, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Author: Lounsbury, Al
Publication: Computer Technology Review
Date: May 1, 2003

