Maintaining quality of service for WAN storage over IP
As wide area networking of storage proliferates in today's demanding business continuity environment, quality of service (QoS) for Fibre Channel over IP (Internet Protocol) is a topic that many IT organizations are addressing--or soon will be.
IP is a mature, widely used network protocol that's emerging as the network of choice for storage-over-WAN applications. It wasn't too many years ago that IP was used only for file sharing and messaging between users. As more open systems platforms were deployed and IP enhancements were implemented, IP became the network of choice for data transfer to remote peripheral applications (even mainframe-based). Now, some of the largest banks in the U.S., as just one example, run their entire banking operations via IP networks. Until recently, the exception to this has been business continuity (disk mirroring) and disaster recovery (tape backup/restore) applications. These large, block-oriented, synchronous/semi-synchronous applications traditionally had to be tightly coupled directly to the processor owning the data. But improved QoS is now enabling widespread use of IP networks for wide area storage applications.
[FIGURE 1 OMITTED]
This article should give you a better understanding of the issues involved in Fibre Channel over IP QoS for storage applications, and introduce you to solutions that address these issues.
If your organization uses a dedicated, private network, there are distinct steps you and your consultants can take to design your network and select the right products that will enable you to manage these QoS issues. You should take care to design your network down to the application level--which is a big topic all by itself (and thus beyond the scope of this article). In particular, you should consider how you can design your network to use the least amount of bandwidth--the most expensive part.
If you use a public network, the QoS issues we discuss here are largely out of your control. You're guaranteed a certain QoS in the service level agreement (SLA) you have with your provider. However, by understanding the issues presented here, you'll be in a much better position to negotiate that QoS in the first place. (In other words, forewarned is forearmed!) For example, storage applications require less than 1% packet loss, and you will find that many providers simply cannot provide that level of service.
Storage Traffic Is Demanding
Applications such as disk replication and remote tape backup are high-speed streaming applications that can generate data streams exceeding 100 megabytes per second (MBps). When translated into networking terms, that's approximately 800 megabits per second (Mbps). Network speeds can be virtually unlimited, but the costs can be very prohibitive to end users, who typically have 3-5 year contracts for bandwidth that require monthly payments. Costs vary depending on the parameters the application requires. Some of these parameters include network speed (in Mbps), length of network route, packet loss, and jitter, to name a few. All of these parameters are considered when defining a service level agreement (SLA), to which the provider must adhere or the end user can impose penalties.
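The MBps-to-Mbps conversion above is plain arithmetic (8 bits per byte). The sketch below works it through and shows how much of a link such a stream would occupy. The function names and the 1,000 Mbps link size are illustrative assumptions, and protocol overhead is ignored.

```python
def mbytes_to_mbits(mbytes_per_sec: float) -> float:
    """Convert megabytes per second (MBps) to megabits per second (Mbps)."""
    return mbytes_per_sec * 8  # 8 bits per byte


def link_utilization(stream_mbytes_per_sec: float, link_mbps: float) -> float:
    """Fraction of a link consumed by a stream, ignoring protocol overhead."""
    return mbytes_to_mbits(stream_mbytes_per_sec) / link_mbps


# A 100 MBps replication stream expressed in networking terms.
stream_mbps = mbytes_to_mbits(100)
# Share of a hypothetical 1,000 Mbps circuit that stream would occupy.
share = link_utilization(100, 1000)
```

Under these assumptions a single 100 MBps stream needs 800 Mbps, most of a gigabit circuit, which is why bandwidth dominates the cost discussion.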
QoS Issues for Storage Over IP Networks
There are several issues any enterprise IT organization should be aware of to sustain high-speed storage over an IP network. In large part these issues are manageable, or are becoming more so, as new advances in storage networking are developed.
Networks that people believe are error-free do, in fact, have errors. In the world of IP, lots of things can cause network impairment or stress when it comes to wide area storage networks, including packet loss, jitter, and latency. Packet loss, for example, is common in any discussion of IP networks--in fact, it's a long-known IP fact-of-life. But what some might consider "normal" levels of packet loss are simply not acceptable when it comes to storage applications over wide area networks.
The existence of packet loss does not negate the viability for storage traffic over IP. There are technologies to reduce the impact of packet loss on storage applications by performing error recovery and retransmission.
Data Integrity Is Critical
One of the most important issues regarding using IP networks for storage over wide area networks is reliable data transfer, or data integrity. Making sure the data is received and delivered intact creates a strong component of QoS, even though it is not associated with "standards" on IP networks. (One must remember that there is a QoS that has to transfer from the server HBA, through the fabric and the WAN network, and back again.)
IP networks are known for operating in a "send and forget" mode. That is, the sending device assumes the data will arrive at its destination, so it moves on to the next task without checking on the success or failure of what was sent previously. Error recovery in this case is usually an email from the intended recipient asking that the data be resent.
For storage over WAN, this is obviously unacceptable. The integrity of the data transfer is tightly coupled to the application, wherein the application has preset timers that tick away waiting for a response from the storage device. If the timer expires, the application will resend the data block. If network errors occur in large numbers, the performance of the application will drop dramatically. If errors to a specific storage device are persistent enough, the application may flag the storage device as being "dead" and no longer attempt data transfers to it--not good!
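The timer-and-resend behavior described above can be sketched roughly as follows. This is a simplified illustration, not any particular application's logic: `write_block`, `FlakyLink`, and the three-attempt limit are hypothetical, and the transport is modeled as a callable that returns True when a response arrives before the timer expires.

```python
from typing import Callable, Tuple


def write_block(send_and_wait: Callable[[bytes], bool],
                block: bytes,
                max_attempts: int = 3) -> Tuple[bool, int]:
    """Send a block, resending each time the response timer expires.

    Returns (delivered, attempts_used). A False result is the point at which
    an application might flag the storage device as "dead".
    """
    for attempt in range(1, max_attempts + 1):
        if send_and_wait(block):  # True means a response beat the timer
            return True, attempt
    return False, max_attempts


class FlakyLink:
    """Stand-in transport that times out a fixed number of times, then works."""

    def __init__(self, failures: int):
        self.failures = failures

    def __call__(self, block: bytes) -> bool:
        if self.failures > 0:
            self.failures -= 1
            return False
        return True
```

With two timeouts the block still gets through on the third attempt; with persistent timeouts the caller gives up, which corresponds to the "dead" device case.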
To solve the issue of data integrity, a technology called "cyclical redundancy check" (CRC) is employed. In order for CRC checking to be effective in a networked storage environment, the storage router must calculate the CRC as the data block is being received from the sending device and append the CRC to the data block for transfer across the network. The receiving storage router then calculates a CRC on its own and compares it to the CRC in the data block. If they match, data integrity is assured. If they don't match, an error is flagged, the data block is discarded, and a retransmission of the original block takes place from the sending device. The CRC process creates an extremely high level of data integrity assurance.
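The append-and-verify flow described above can be sketched in a few lines. The article does not say which CRC polynomial or frame layout a storage router uses, so the CRC-32 from Python's standard `zlib` module and the 4-byte trailer here are assumptions, and the function names are illustrative.

```python
import struct
import zlib
from typing import Optional

CRC_LEN = 4  # assumed trailer size: one 32-bit CRC


def frame_block(data: bytes) -> bytes:
    """Sending side: compute a CRC over the block and append it."""
    crc = zlib.crc32(data) & 0xFFFFFFFF
    return data + struct.pack(">I", crc)


def check_block(frame: bytes) -> Optional[bytes]:
    """Receiving side: recompute the CRC and compare it to the trailer.

    Returns the data if the CRCs match, or None if the block must be
    discarded and retransmitted.
    """
    if len(frame) < CRC_LEN:
        return None
    data, trailer = frame[:-CRC_LEN], frame[-CRC_LEN:]
    (received_crc,) = struct.unpack(">I", trailer)
    if zlib.crc32(data) & 0xFFFFFFFF != received_crc:
        return None
    return data
```

A frame that arrives unchanged passes the check; flipping even a single bit in transit makes the recomputed CRC differ, so the receiver discards the block and the sender retransmits.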
The most advanced storage routers available today (such as those available from my company, CNT) perform error recovery and retransmission at the IP packet level, resulting in much less retransmitted data and much quicker recovery. CNT's solution also uses an adaptive recovery algorithm that accommodates the kinds of loss common in most IP networks. So, while packet loss will reduce overall throughput, its impact can be made much smaller compared to other forms of recovery or other storage networking products.
In our extensive experience in storage networking at CNT, we've found the threshold of packet loss to be a maximum of 1% for an effective storage network when using our products and technology. Less than 1% packet loss can usually yield a workable storage network. More than 1% causes more packet error retries, and more of the accompanying latency variability, than is viable for most storage applications.
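As a quick illustration of the 1% rule of thumb, the sketch below computes packet loss from sent and received counts and compares it to the threshold. The function names are hypothetical, and the 1% default simply mirrors the figure quoted above.

```python
def loss_percent(sent: int, received: int) -> float:
    """Packet loss as a percentage of packets sent."""
    if sent <= 0:
        raise ValueError("sent must be positive")
    return 100.0 * (sent - received) / sent


def within_threshold(sent: int, received: int,
                     max_loss_percent: float = 1.0) -> bool:
    """True if measured loss is at or below the assumed 1% ceiling."""
    return loss_percent(sent, received) <= max_loss_percent
```

For example, 5 packets lost out of 1,000 is 0.5% and falls inside the threshold, while 20 lost out of 1,000 is 2% and does not.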
An Important New QoS Capability: Rate Limiting
While both packet loss and data integrity are critical factors that can affect QoS, another factor, as mentioned previously, is the tendency of storage applications to "hog" bandwidth on a shared network resource. Storage applications can generate intense data rates--so much so that, if allowed, storage traffic can consume all the network bandwidth to which it is connected.
An IP technique that provides users the ability to utilize shared network capacity with multiple applications (or users) is called "rate limiting." It is specifically useful in storage over shared IP networks.
[FIGURE 2 OMITTED]
[FIGURE 3 OMITTED]
SLAs needed for high-end data traffic applications can increase costs for some organizations beyond their means. However, if the network resource can be shared among several applications, it becomes an expense that is more manageable, since it can be spread across several IT initiatives, users, or applications.
If a single network resource can be "allocated" or sized according to the different application needs, it can suffice for those multiple applications. However, when storage is introduced, that single network resource, which may share web access, client/server traffic, and FTP traffic, could be easily "run over." That means there is too much traffic and the network circuit becomes over-subscribed. When a circuit becomes oversubscribed, some if not all of the applications will have trouble completing, if they ever do in fact complete.
Rate limiting is a parameter that can be set at an end point to limit the amount of network capacity a given application can use. So, for instance, if a single 1Gbps IP resource is supporting three applications that would consume its capacity in normal usage, each application would be assigned only that amount of resource it is "sized" to use. Using this technique, then, provides a quality of service and SLA for each individual user (or application). Rate limiting therefore protects the network from being oversubscribed when storage traffic is introduced.
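One common way to enforce a per-application rate is a token bucket; the article does not say how any particular product implements rate limiting, so the sketch below is a generic illustration. The class and function names, the 10 ms burst allowance, and the example shares are all assumptions.

```python
from typing import Dict


class TokenBucket:
    """Admits traffic only while it stays within a configured rate."""

    def __init__(self, rate_bps: float, burst_bits: float):
        self.rate_bps = rate_bps      # sustained rate for this application
        self.burst_bits = burst_bits  # largest burst admitted at once
        self.tokens = burst_bits
        self.last = 0.0

    def allow(self, size_bits: float, now: float) -> bool:
        """Refill tokens for the elapsed time, then admit or reject."""
        elapsed = now - self.last
        self.last = now
        self.tokens = min(self.burst_bits,
                          self.tokens + elapsed * self.rate_bps)
        if size_bits <= self.tokens:
            self.tokens -= size_bits
            return True
        return False  # over the limit: caller delays or drops the traffic


def build_limiters(link_bps: float,
                   shares_bps: Dict[str, float]) -> Dict[str, TokenBucket]:
    """Create one limiter per application, refusing to oversubscribe the link."""
    if sum(shares_bps.values()) > link_bps:
        raise ValueError("allocations exceed link capacity")
    # Assumed burst allowance: 10 ms worth of traffic at the assigned rate.
    return {app: TokenBucket(rate, rate * 0.01)
            for app, rate in shares_bps.items()}


# Hypothetical sizing of a 1 Gbps resource across three applications.
limiters = build_limiters(1_000_000_000, {
    "storage": 500_000_000,
    "web": 300_000_000,
    "ftp": 200_000_000,
})
```

Because the allocations are checked against link capacity up front, storage traffic that tries to exceed its share is held back at the end point instead of crowding out the other applications.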
Using rate limiting will augment the ability for users to apply multiple layers of QoS for expensive, shared-network resources. Figures 1 and 2 illustrate how assignment of network capacity can be divided within a single network resource.
Storage Router Advancements
Several key technologies have been developed over the years to deal with network impairment issues in wide area networking in the mainframe environment. Recently, CNT has incorporated many of these features into a new, flexible, and highly manageable platform, the UltraNet Edge 3000 storage router, making these capabilities available for the first time to the broader storage networking market.
This new storage router platform offers users several advancements to enhance QoS:
* The ability for applications to share bandwidth--which is, by far and away, the most expensive aspect of any wide area storage network
* Rate limiting, as discussed above
* Client traffic prioritization, which provides user-specified application priority for network access
The new management capabilities of this storage router help IT organizations drive down the cost of their storage infrastructure while meeting service level agreements to their internal customers. By far, the single largest cost of a remote storage solution is the cost of bandwidth. The CNT UltraNet Edge 3000 minimizes that investment by means of state-of-the-art hardware compression technology. Via the GUI, customers can allocate portions of their available bandwidth to specific applications during production hours, or during different times of day, ensuring quality of service across a shared network. These new capabilities help IT organizations reduce storage infrastructure costs while providing required data availability and protection. In essence, this new storage router technology enables what we call "flexible QoS."
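To make the idea of time-of-day allocation concrete, the sketch below models a schedule as a simple table and looks up each application's share for a given hour. It does not represent the router's actual configuration format; the schedule layout, application names, and percentages are all hypothetical.

```python
from typing import Dict, List, Tuple

# (start_hour, end_hour_exclusive, percent of link per application).
# All values are invented for illustration.
Schedule = List[Tuple[int, int, Dict[str, int]]]

SCHEDULE: Schedule = [
    (0, 8, {"disk_mirroring": 30, "tape_backup": 60, "client_traffic": 10}),
    (8, 18, {"disk_mirroring": 40, "tape_backup": 10, "client_traffic": 50}),
    (18, 24, {"disk_mirroring": 30, "tape_backup": 50, "client_traffic": 20}),
]


def shares_at(hour: int, schedule: Schedule = SCHEDULE) -> Dict[str, int]:
    """Return the percentage allocation in force at the given hour (0-23)."""
    for start, end, shares in schedule:
        if start <= hour < end:
            return shares
    raise ValueError(f"no allocation defined for hour {hour}")


def allocated_mbps(hour: int, link_mbps: float,
                   schedule: Schedule = SCHEDULE) -> Dict[str, float]:
    """Translate percentage shares into Mbps on a link of the given size."""
    return {app: link_mbps * pct / 100
            for app, pct in shares_at(hour, schedule).items()}
```

In this invented schedule, client traffic gets half the link during production hours and tape backup gets the larger share overnight.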
Storage QoS Can Be Effectively Managed
While IP service providers have made gigantic improvements in the quality of service in their backbone networks, there is still more improvement to be made. For e-mail or Web browsing, the quality is pretty good; for more demanding high-performance storage networks, it pays to be careful.
Network traffic for daily client/server communications is usually sparse, intermittent and of brief duration, and is thus relatively impervious to congestion, packet loss, latency instability, and other issues commonly inherent to IP networks.
Storage traffic, on the other hand, consists of large chunks of densely packed data moving over long periods of time. These characteristics make storage traffic susceptible to network inconsistencies that can dramatically decrease data throughput. If affected storage traffic shares the same network with normal IP communications, then both types of network traffic are negatively impacted.
Most storage applications are more bandwidth intensive and latency intolerant than other applications that can use IP circuits. Applications such as OLTP (online transaction processing) databases simply cannot tolerate the latency and jitter that web or e-mail users consider common.
But, by taking advantage of the latest storage networking technology, as discussed above, IT organizations can ensure a predictable, manageable quality of service for their wide area storage applications.
Brian Larsen is senior director of Connectivity and Extension Products at CNT (Minneapolis, MN).