The cold hard truth about TCP/IP performance over the WAN.

TCP is the transport protocol most commonly used to move data electronically between servers and other devices, and many storage vendors are beginning to use it for replicating data between storage devices. They are finding, however, that TCP has basic limitations that cause many applications to perform poorly, especially over distance. TCP/IP performs well enough in short-distance LAN environments, but it was not designed for transmission over Wide Area Networks (WANs). This article explores the challenges of TCP performance over the WAN and ways to mitigate them with new data center appliances.

TCP Challenges

Window Size Limitations

Window size is the amount of data that the transport software allows to be outstanding (in flight) at any given point in time. The window a given pipe can absorb is the bandwidth-delay product: the link rate multiplied by the round-trip delay, or latency. A cross-country OC-3 link (approximately 60ms based on a total 6,000-mile round trip) yields an available data window of 155 Mbps X 60ms = roughly 1,163 Kbytes. A DS3 satellite connection (540ms round trip) yields an available data window of 45 Mbps X 540ms = roughly 3,038 Kbytes.
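The bandwidth-delay product arithmetic above can be sketched in a few lines (a minimal illustration; the link rates and RTTs are the article's example figures):

```python
# Bandwidth-delay product (BDP): the amount of data that can be "in flight"
# on a link, i.e. link rate multiplied by round-trip time.

def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Return the bandwidth-delay product in bytes."""
    return bandwidth_bps * rtt_s / 8  # divide by 8 to convert bits to bytes

# Cross-country OC-3: 155 Mbps at ~60 ms RTT
oc3 = bdp_bytes(155e6, 0.060)
print(f"OC-3 BDP: {oc3 / 1e3:,.1f} KB")   # ~1,163 KB

# DS3 over satellite: 45 Mbps at ~540 ms RTT
ds3 = bdp_bytes(45e6, 0.540)
print(f"DS3 BDP:  {ds3 / 1e3:,.1f} KB")   # ~3,038 KB
```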

Contrast this with standard and even enhanced versions of TCP, and there is a very large gap between the available window and the window actually used. Most standard TCP implementations are limited to a 64-Kbyte (65,535-byte) window. A few enhanced TCP versions can use windows of 512 Kbytes or larger. Either way, a large fraction of the pipe is "dead air": bandwidth utilization is inefficient and performance is poor for applications that are typically mission-critical.
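Because a fixed window allows at most one window's worth of data per round trip, the achievable rate is capped at window/RTT no matter how fast the link is. A quick sketch using the article's OC-3 example:

```python
# With a fixed window W and round-trip time RTT, TCP can send at most
# W bytes per RTT, regardless of how fat the pipe is.

def window_limited_bps(window_bytes: int, rtt_s: float) -> float:
    """Maximum achievable throughput (bits/s) for a given window and RTT."""
    return window_bytes * 8 / rtt_s

link_bps = 155e6                       # OC-3
rtt = 0.060                            # ~60 ms cross-country

for window in (65_535, 512 * 1024):    # classic 64-KB limit vs. a 512-KB window
    tput = window_limited_bps(window, rtt)
    print(f"{window / 1024:6.0f} KB window -> {tput / 1e6:5.1f} Mbps "
          f"({tput / link_bps:.1%} of the OC-3)")
```

The 64-KB window yields under 9 Mbps, about 6% of the OC-3, consistent with the utilization figures cited later in the article.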

Slow Start by Design

TCP data transfers start slowly and ramp up to their maximum transfer rate, resulting in poor performance for short sessions. Slow start is designed to avoid congestion, on the assumption that large numbers of sessions will be competing for the bandwidth.
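An idealized model of the ramp-up (a sketch that assumes the congestion window doubles every round trip from one segment, and ignores receiver limits and losses):

```python
import math

# Slow start roughly doubles the congestion window every round trip,
# starting from one segment. Count the RTTs spent ramping up to the
# point where the window covers the bandwidth-delay product.

def rtts_to_fill_pipe(bdp_bytes: float, mss: int = 1460) -> int:
    segments_needed = math.ceil(bdp_bytes / mss)
    return math.ceil(math.log2(segments_needed))

bdp = 155e6 * 0.060 / 8                # OC-3 at 60 ms RTT: ~1.16 MB
rtts = rtts_to_fill_pipe(bdp)
print(f"~{rtts} round trips (~{rtts * 60} ms) spent ramping up")
```

Roughly ten round trips, over half a second, pass before the pipe is full; a short session may finish before ever reaching full rate.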

[GRAPHIC OMITTED]

Inefficient Error Recovery

When recovering from an error, TCP retransmits the stream from the lost segment onward in its entirety. Under high bit-error rates or packet loss, large amounts of bandwidth are wasted resending data that has already been received successfully, and every retransmission pays the full latency of the path. Each retransmission is also subject to the slow-start penalty explained above.
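The cost difference is easy to quantify with a toy model (the segment counts are hypothetical, chosen only for illustration):

```python
# Sketch: how much data gets resent when segment k of n is lost.
# Without selective acknowledgement, go-back-N style recovery resends
# everything from the lost segment onward; selective retransmission
# resends only the missing segment.

def goback_resent_bytes(lost_index: int, total_segments: int, mss: int = 1460) -> int:
    return (total_segments - lost_index) * mss

def selective_resent_bytes(mss: int = 1460) -> int:
    return mss

n = 1000                               # segments in the stream's tail
k = 100                                # the 100th segment is lost
print(f"go-back-N resends {goback_resent_bytes(k, n):,} bytes")
print(f"selective resends {selective_resent_bytes():,} bytes")
```

Losing one early segment forces over a megabyte of already-delivered data back across the WAN, which is the waste the article describes.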

Packet Loss is Disruptive

Packet loss describes an error condition in which data packets appear to be transmitted correctly at one end of a connection, but never arrive at the other end. This is mainly due to:

* Poor network conditions causing damage to packets in transit.

* The packet was deliberately dropped by a router and/or switch because of WAN congestion.

Packet loss can be disruptive to applications that must move data within fixed windows of time. Because the volume of data to be moved keeps growing while backup windows do not, packet loss can prevent many organizations from meeting service-level agreements and production schedules.

The Figure shows a standard TCP stream of data running over an OC-12 (622 Mbps).

Session Free-For-All is Not Free

Each TCP session is throttled and contends for network resources independently, so many sessions together can over-subscribe resources that no individual session sees as scarce.

The net result of these issues is very poor bandwidth utilization. The typical bandwidth utilization for large data transfers over long-haul networks is usually less than 30%, and more often less than 10%. As fast as bandwidth costs are dropping, they are still not free.

How to Mitigate TCP/IP Performance Issues

Consider Using an IP Application Accelerator (Appliance)

Many new data center appliances are being used to optimize data delivery for IP applications. Some mitigate performance issues simply by caching and/or compressing data prior to transfer. Others can mitigate several of the TCP issues themselves because of their architecture.

Whatever technology is used, it is important that the appliances can mitigate latency issues, compress the data and shield the application from network disruptions. It is also important that these new data center appliances are transparent to operations and equally transparent to the IP application.

Transport Protocol Conversion

Some data center appliances provide alternative transport delivery mechanisms between appliances. In doing so, they receive the optimized buffers from the local application and deliver them to the destination appliance for subsequent delivery to the remote application process. The alternative transport is responsible for acknowledging data buffers and resending buffers when required. It is important to maintain a flow-control mechanism on each connection, so that each connection's performance can be tuned to the available bandwidth and network capacity.

Some appliances provide a complete transport mechanism for managing data delivery and use UDP socket calls as an efficient, low overhead, data streaming protocol to read and write from the network.
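The appeal of UDP here is its minimal per-datagram overhead: no handshake, no kernel-managed windows. A minimal loopback sketch of the socket calls involved (purely illustrative; it shows none of the acknowledgement, retransmission, or flow-control machinery an appliance would layer on top):

```python
import socket

# One datagram is handed to the kernel with a single sendto() and read
# with a single recvfrom() -- no connection setup, no congestion state.
# Reliability must be supplied by the application built on top.

receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))                 # let the OS pick a free port
addr = receiver.getsockname()

sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"replication buffer 42", addr)   # one call, no handshake

data, _ = receiver.recvfrom(2048)
print(data)

sender.close()
receiver.close()
```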

Compression Engine

A compression engine as part of the data center appliance compresses the aggregated packets that are in the highly efficient IP accelerator appliance buffers. This provides an even greater level of compression efficiency since a large block of data is compressed at once rather than multiple small packets being compressed individually. Allowing compression to occur in the LAN connected appliance frees up significant CPU cycles on the server where the application is resident.
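The advantage of compressing one aggregated buffer rather than many small packets can be demonstrated with a standard compressor (a sketch using Python's zlib on made-up, log-like payloads; the packet contents are hypothetical):

```python
import zlib

# Compressing one large aggregated buffer lets the compressor exploit
# redundancy *across* packets; compressing each small packet separately
# resets the compression dictionary every time.

packets = [f"2004-08-01 12:00:{i:02d} host-a replicate volume7 block {i} OK\n"
           .encode() for i in range(60)]

per_packet = sum(len(zlib.compress(p)) for p in packets)
aggregated = len(zlib.compress(b"".join(packets)))

print(f"per-packet compressed total: {per_packet} bytes")
print(f"aggregated compressed:       {aggregated} bytes")
```

With similar content repeated across packets, the aggregated buffer compresses to a fraction of the per-packet total.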

Overcoming Packet Loss

The largest challenge in improving TCP/IP performance centers on packet loss. Packet loss is caused by network errors or changes, better known as network exceptions. Most networks have some packet loss, typically 0.01% to 0.5% in optical WANs and 0.01% to 1% in copper-based TDM networks. Either way, losing even one packet in every hundred causes TCP to retransmit packets, slow the transmission rate from a given source, and re-enter slow-start mode each time a packet is lost. This error-recovery process can drive the effective throughput of a WAN down to as little as 10% of the available bandwidth between two sites.

IP application accelerators, such as HyperIP from NetEx Software, optimize blocks of data traversing the WAN by maintaining acknowledgements of the data buffers and resending only the buffers that did not arrive, not the whole stream. This allows the use of a transport protocol that does not retransmit already-delivered data or fall back into slow start. The more efficient transport has lower overhead and streams data on reads and writes from source to destination, completely transparently to the process running a given server application.
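The throughput collapse under loss can be estimated with the widely cited Mathis et al. approximation for steady-state TCP throughput, rate ≈ MSS / (RTT × √p). This is the simplified form without the constant factor; it is a rough model, not a measurement:

```python
import math

# Mathis et al. approximation: steady-state TCP throughput under random
# loss rate p is roughly MSS / (RTT * sqrt(p)). Even modest loss
# collapses throughput on a long, fat pipe.

def mathis_bps(mss_bytes: int, rtt_s: float, loss_rate: float) -> float:
    return mss_bytes * 8 / (rtt_s * math.sqrt(loss_rate))

link = 155e6                        # OC-3
for p in (0.0001, 0.001, 0.01):     # 0.01%, 0.1%, 1% loss
    tput = min(mathis_bps(1460, 0.060, p), link)
    print(f"{p:.2%} loss -> {tput / 1e6:6.1f} Mbps "
          f"({tput / link:.0%} of the OC-3)")
```

At 1% loss the model predicts under 2 Mbps on a 155 Mbps link, which is consistent with the "as little as 10%" figure above.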

Summary and Conclusion

Business continuity realities, regulations and requirements mean organizations must implement some form of storage-to-storage replication. Most will turn to TCP/IP WANs and will be disappointed with either the performance or the bandwidth costs. Implementing an RFC 3135 TCP performance-enhancing proxy such as HyperIP can eliminate that disappointment.

www.netex.com

Steve Thompson is director of storage networking at NetEx Software (Maple Grove, MN).
COPYRIGHT 2004 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2004, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Storage Networking; Wide Area Network; Transmission Control Protocol/Internet Protocol
Author:Thompson, Steve
Publication:Computer Technology Review
Date:Aug 1, 2004
Words:1179