Overcoming TCP/IP distance and BER limitations.
The Need for Enhancement
Several characteristics of TCP/IP cause it to perform poorly over high bandwidth and long distances:
Window Size: Window size is the amount of data allowed to be outstanding (in flight) at any given time. The window a link can accommodate is its bandwidth multiplied by the round-trip delay, also known as the bandwidth-delay product. A cross-country OC-3 link (approximately 60 ms round trip, based on a total distance of 6,000 miles) has an available data window of 155 Mbps x 60 ms = 1,163 Kbytes. A DS3 satellite connection (540 ms round trip) has an available data window of 45 Mbps x 540 ms = 3,038 Kbytes.
Compared with what standard and even enhanced versions of TCP can use, there is a very large gap between the available window and the window actually utilized. Most standard TCP implementations are limited to 65-Kbyte windows. A few enhanced TCP versions can use windows of 512 Kbytes or larger. Either case leaves a large amount of "dead air" and very inefficient bandwidth utilization.
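The arithmetic above can be checked with a short sketch. It computes the bandwidth-delay product for the two links in the text and the best-case throughput of a sender limited to a single 65,535-byte window per round trip. The function names are illustrative, and the model assumes no loss and no protocol overhead.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Bandwidth-delay product: bytes that must be in flight to fill the pipe."""
    return bandwidth_bps * rtt_s / 8


def window_limited_bps(window_bytes: float, rtt_s: float) -> float:
    """Best-case throughput when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_s


# Link figures come from the text. 65,535 bytes is the classic TCP window
# limit when window scaling is not in use.
LINKS = {
    "OC-3 cross-country": (155e6, 0.060),
    "DS3 satellite": (45e6, 0.540),
}
TCP_WINDOW = 65_535

if __name__ == "__main__":
    for name, (bandwidth, rtt) in LINKS.items():
        pipe = bdp_bytes(bandwidth, rtt)
        rate = min(window_limited_bps(TCP_WINDOW, rtt), bandwidth)
        print(f"{name}: pipe holds {pipe / 1000:,.1f} KB; "
              f"a 65 KB window uses {rate / bandwidth:.1%} of the link")
```

Under these idealized assumptions, the single-window sender reaches roughly 6% of the OC-3 link and roughly 2% of the DS3 satellite link.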
Acknowledgement Scheme: When a portion of the stream is lost, TCP retransmits everything from the lost portion onward. In high bit-error-rate (BER) scenarios, this wastes large amounts of bandwidth resending data that has already been successfully received, and every resend incurs the long latency of the path. Each retransmission is additionally subject to the performance penalty of "Slow Start."
Slow Start: TCP data transfers start slowly to avoid congestion when many sessions may be competing for the bandwidth, then ramp up to their maximum transfer rate. The result is poor performance for short sessions.
Session Free-For-All: Each TCP session is throttled and contends for network resources independently, which can cause over-subscription of resources relative to each individual session.
The net result of these issues is very poor bandwidth utilization. Bandwidth utilization for large data transfers over long-haul networks is usually less than 30% and often less than 10%. As fast as bandwidth costs are dropping, bandwidth is still not free.
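The cost of Slow Start on a long path can be estimated with a rough model. This is a minimal sketch that assumes the window starts at one segment, doubles every round trip, and meets no loss. The 1,460-byte segment size and the function name are assumptions, not figures from the article.

```python
import math


def slow_start_rtts(target_window_bytes: float, mss: int = 1460,
                    initial_segments: int = 1) -> int:
    """Round trips for a doubling window to first reach the target size."""
    target_segments = math.ceil(target_window_bytes / mss)
    window = initial_segments
    rtts = 0
    while window < target_segments:
        window *= 2  # idealized slow start: window doubles each round trip
        rtts += 1
    return rtts


if __name__ == "__main__":
    # (bandwidth in bits per second, round-trip time in seconds), from the text
    for name, bandwidth, rtt in [("OC-3 cross-country", 155e6, 0.060),
                                 ("DS3 satellite", 45e6, 0.540)]:
        pipe_bytes = bandwidth * rtt / 8
        n = slow_start_rtts(pipe_bytes)
        print(f"{name}: {n} round trips, about {n * rtt:.2f} s to fill the pipe")
```

In this model the OC-3 path needs 10 round trips (about 0.6 s) and the satellite path 12 round trips (about 6.5 s) before the pipe is full, and each loss that restarts the ramp pays that cost again.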
Implications For Storage-to-Storage Replication
New regulations, business continuity, and disaster recovery have led to a surge of storage-to-storage replication applications over the WAN. Both man-made and natural disasters have driven demand. TCP/IP has quickly become the preferred WAN protocol for storage-to-storage replication. There are three reasons for this:
* The market perceives bandwidth as essentially free: This is because the TCP/IP WAN bandwidth already exists for interactive traffic. Conventional wisdom is that storage-to-storage replication applications run at night or on weekends, when the majority of users are not utilizing the network. Existing TCP/IP bandwidth can therefore be leveraged by the storage-to-storage replication applications without negatively impacting current applications.
* Dedicated, separate storage-to-storage replication WANs are not required: This also eliminates separate WAN management.
* Additional bandwidth implemented for the storage-to-storage replication applications will be shared by the interactive TCP/IP applications.
The facts show that TCP/IP bandwidth is not free, and there is typically not enough of it to complete the storage-to-storage replication in the window of time allotted. TCP/IP bandwidth utilization and long-haul issues are rarely taken into account when calculating bandwidth requirements. The most likely result is a bandwidth shortfall. The user then faces a conundrum: either the storage-to-storage application cannot complete within the window of time allotted, or more bandwidth must be purchased.
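To see how utilization changes a sizing exercise, the sketch below computes the raw link rate needed to move a given volume of data in a given time window. The 500 GB volume and 8-hour window are hypothetical values chosen only for illustration, as is the function name.

```python
def required_link_mbps(data_gb: float, window_hours: float,
                       utilization: float) -> float:
    """Raw link rate (Mbps) needed to move data_gb within window_hours
    when only the given fraction (0-1) of the link is effectively used."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    bits = data_gb * 8e9  # decimal gigabytes to bits
    return bits / (window_hours * 3600) / utilization / 1e6


if __name__ == "__main__":
    # Hypothetical job: 500 GB to replicate in an 8-hour overnight window.
    for utilization in (1.0, 0.9, 0.3, 0.1):
        mbps = required_link_mbps(500, 8, utilization)
        print(f"{utilization:.0%} utilization: {mbps:,.0f} Mbps of raw bandwidth")
```

For this hypothetical job, the requirement grows from about 139 Mbps at full utilization to about 463 Mbps at 30% and about 1,389 Mbps at 10%.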
A Cost-Effective Solution
NetEx HyperIP was designed specifically to move large amounts of data over high-bandwidth, long-distance links and to remain highly efficient regardless of the BER. HyperIP is a standard TCP/IP network node requiring no modifications to LAN/WAN infrastructures and no proprietary hardware. It provides transparent "acceleration" across long-haul, high-bandwidth WANs. HyperIP provides the following benefits:
Window size: The HyperIP transport protocol keeps the available network bandwidth pipe full. The result is link utilization of 90% or more. It eliminates the discrepancy between maximum available bandwidth and the results provided by native TCP/IP.
[FIGURE 1 OMITTED]
Acknowledgement scheme: The HyperIP transport protocol retransmits only the NAK'd segments, not all the data that has already been successfully sent.
Slow Start: Configuration parameters allow HyperIP to start transmissions at a close approximation of the available session bandwidth.
Dynamic adjustments: When feedback from the receiver in the acknowledgement protocol is received, HyperIP quickly "zeroes-in" on the appropriate send rate for current conditions.
Session pipeline: The HyperIP design allows traffic from multiple TCP sessions to be aggregated over a smaller set of connections between the HyperIP devices. This enables more efficient use of the bandwidth and reduces the protocol overhead of acknowledging many small messages for individual connections.
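The difference between the two acknowledgement schemes described above can be illustrated with a simplified model. This sketch is not HyperIP's actual protocol. It counts the segments resent in one recovery round when everything from the first loss onward is resent, versus when only the lost (NAK'd) segments are resent. The function names and the 800-segment window are assumptions.

```python
from typing import Iterable


def go_back_resend(window_segments: int, lost: Iterable[int]) -> int:
    """Segments resent if everything from the first loss onward is resent."""
    lost = set(lost)
    if not lost:
        return 0
    if min(lost) < 0 or max(lost) >= window_segments:
        raise ValueError("lost segment index outside the window")
    return window_segments - min(lost)


def selective_resend(lost: Iterable[int]) -> int:
    """Segments resent if only the NAK'd segments are resent."""
    return len(set(lost))


if __name__ == "__main__":
    # Hypothetical case: about 800 segments in flight, two of them lost.
    window, lost = 800, {100, 450}
    print("resend from first loss:", go_back_resend(window, lost))
    print("resend NAK'd only:     ", selective_resend(lost))
```

With two losses in an 800-segment window, the first approach resends 700 segments and the second resends 2, which is why the gap widens as the window and the bit error rate grow.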
TCP/IP with Extensions vs. HyperIP: A portion of HyperIP's benefits can also be achieved by implementing TCP/IP with Extensions, or TCP/IP(e). This is a complex task requiring fairly significant skills. It requires system-level kernel and/or TCP/IP stack tuning on each of the servers. The tuning must be completed on ALL servers in the data path, which adds to the complexity because different skill sets are required for different server platforms. Some server platforms do not have this capability at all. TCP/IP Extensions also require application changes to enable their use by that application. This is not possible in all cases, since the user may not have control of the application.
Also, all network routers in the path must support TCP/IP Extensions end-to-end and must have the latest microcode version installed throughout the network.
HyperIP provides all of the above benefits without TCP/IP Extensions. It is totally independent of the platform, server, and OS, and it requires no application changes or network router upgrades.
HyperIP Test Results
EMC tested HyperIP with SRDF over TCP/IP on their Symmetrix GigE Director. The tests did not include the HyperIP compression engine and the test results were still quite impressive. EMC witnessed Symmetrix Remote Data Facility (SRDF) achieving very high bandwidth utilization consistently from distances of hundreds of miles (with high bit error rates on dirty lines) to as far as geosynchronous satellite distances.
Telstra's testing of HyperIP with VERITAS Volume Replicator showed that native performance was maintained regardless of distance and bit error rates.
Additional compression testing in the NetEx Software labs increased the effective data transfer rates approximately 200% on high-speed circuits such as DS3 and OC3. The higher the capacity of the line, the less impact the compression had.
Storage-to-storage replication applications can now meet the allocated time windows. Effective bandwidth actually does become free.
Storage-to-Storage Replication Applications Performance Enhanced
Storage-to-storage replication applications that benefit the most from HyperIP have the following characteristics:
* Utilize TCP/IP natively
* Typically asynchronous
* Require the movement of large amounts of data over high-bandwidth, long-haul links
* Throughput sensitive
Tested storage-to-storage replication applications with significant performance throughput improvements include:
* EMC SRDF adaptive copy and SRDF/A in TCP and GigE director
* VERITAS Volume Replicator
* NSI Double-take
* DataCore Software SAN Symphony
* Sterling Commerce Connect:Direct
HyperIP Software Components
IP-Packet Edge Intercept
This component intercepts IP packets, optimizes for performance, and reroutes over the HyperIP protocol on the network.
When a message is intercepted and rerouted, the original IP addressing information is retained and sent as additional protocol information. This allows each message to be reconstructed with the original addressing information at the destination side. A pre-built configuration file describing the HyperIP configuration is processed at initialization.
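One way to picture how original addressing can travel with a rerouted message is a small header prepended to the payload. This is a minimal sketch under assumptions: the 12-byte header layout, the IPv4-only addressing, and the function names are hypothetical and are not taken from HyperIP.

```python
import socket
import struct
from typing import Tuple

# Hypothetical header: original source IPv4, original destination IPv4,
# payload length. Network byte order, 12 bytes in total.
HEADER = struct.Struct("!4s4sI")


def wrap(src_ip: str, dst_ip: str, payload: bytes) -> bytes:
    """Prepend the original addressing so it survives rerouting."""
    header = HEADER.pack(socket.inet_aton(src_ip),
                         socket.inet_aton(dst_ip),
                         len(payload))
    return header + payload


def unwrap(message: bytes) -> Tuple[str, str, bytes]:
    """Recover the original addresses and payload at the destination side."""
    src, dst, length = HEADER.unpack_from(message)
    payload = message[HEADER.size:HEADER.size + length]
    return socket.inet_ntoa(src), socket.inet_ntoa(dst), payload


if __name__ == "__main__":
    message = wrap("10.1.1.5", "10.2.2.9", b"replication block")
    print(unwrap(message))
```

The receiving side needs only the header to rebuild the message with its original source and destination, independent of the addresses used between the two intermediate nodes.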
The IP Accelerator component establishes and maintains connections with other HyperIP nodes on the IP network. It receives intercepted packets from each of the Edge processes, aggregates these packets into more efficient buffers, and then passes these buffers to the HyperIP Transport component, which sends them to the HyperIP node on the other side of the network.
The remote HyperIP receives these aggregated buffers on the network and passes them on to the IP Accelerator, which sends the packets from the buffer on to the appropriate Edge Interface process.
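The aggregation and de-aggregation steps described above can be sketched with length-prefixed records. The 2-byte length prefix, the buffer size limit, and the function names are assumptions made for illustration; the article does not describe the actual buffer format.

```python
import struct
from typing import Iterable, List

LEN = struct.Struct("!H")  # hypothetical 2-byte length prefix per packet


def aggregate(packets: Iterable[bytes], max_buffer: int = 65_000) -> List[bytes]:
    """Pack length-prefixed packets into as few buffers as possible."""
    buffers: List[bytes] = []
    current = bytearray()
    for packet in packets:
        record = LEN.pack(len(packet)) + packet
        # Start a new buffer when this record would overflow the current one.
        if current and len(current) + len(record) > max_buffer:
            buffers.append(bytes(current))
            current = bytearray()
        current += record
    if current:
        buffers.append(bytes(current))
    return buffers


def split(buffer: bytes) -> List[bytes]:
    """Recover the individual packets from one aggregated buffer."""
    packets, offset = [], 0
    while offset < len(buffer):
        (length,) = LEN.unpack_from(buffer, offset)
        offset += LEN.size
        packets.append(buffer[offset:offset + length])
        offset += length
    return packets


if __name__ == "__main__":
    small = [bytes([i % 256]) * 1000 for i in range(100)]
    buffers = aggregate(small, max_buffer=10_000)
    restored = [p for b in buffers for p in split(b)]
    print(len(small), "packets ->", len(buffers), "buffers;",
          "round trip ok" if restored == small else "mismatch")
```

Fewer, larger buffers mean fewer units for the transport layer to track and acknowledge, which is the efficiency the aggregation step is after.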
The transport component provides the transport delivery mechanism between HyperIP nodes. It receives the optimized buffers from the IP Accelerator and delivers them to the destination HyperIP node for subsequent delivery to the end destination. It is responsible for maintaining acknowledgements of data buffers and resending buffers when required.
It maintains a flow control mechanism on each connection, and optimizes the performance of each connection to match the available bandwidth and network capacity.
Since HyperIP provides a complete transport mechanism for managing data delivery, it uses UDP socket calls as an efficient, low overhead, data streaming protocol to read and write from the network.
The HyperIP LZO-based compression engine compresses the aggregated packets that are in the highly efficient IP Accelerator buffers. This provides an even greater level of compression efficiency since a large block of data is compressed at once rather than multiple small packets being compressed individually.
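The benefit of compressing one large block rather than many small packets can be demonstrated with any general-purpose compressor. The sketch below uses zlib as a stand-in because the Python standard library has no LZO binding, so the absolute numbers will differ from LZO. The synthetic packet contents are invented for the demonstration.

```python
import zlib
from typing import List, Tuple


def sample_packets(count: int = 500) -> List[bytes]:
    """Small, similar-looking packets, loosely resembling replication records."""
    packets = []
    for i in range(count):
        tag = (i * 2654435761) % 2**32  # cheap deterministic pseudo-checksum
        line = f"volume=vol01 block={i:08d} status=OK tag={tag:08x}\n"
        packets.append(line.encode())
    return packets


def compare(packets: List[bytes]) -> Tuple[int, int]:
    """Return (bytes when each packet is compressed alone,
    bytes when all packets are compressed as one block)."""
    individually = sum(len(zlib.compress(p)) for p in packets)
    as_one_block = len(zlib.compress(b"".join(packets)))
    return individually, as_one_block


if __name__ == "__main__":
    packets = sample_packets()
    raw = sum(len(p) for p in packets)
    individually, as_one_block = compare(packets)
    print(f"raw: {raw} bytes")
    print(f"compressed per packet: {individually} bytes")
    print(f"compressed as one block: {as_one_block} bytes")
```

Compressed alone, each small packet pays fixed per-stream overhead and cannot exploit redundancy shared with its neighbors. Compressed together, the repeated structure across packets is encoded only once.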
Business continuity realities, regulations, and requirements mean organizations must implement some form of storage-to-storage replication application. Most will turn to TCP/IP WANs and will be disappointed with either the performance or the bandwidth costs. Implementing an RFC 3135 TCP/IP performance enhancing proxy such as HyperIP will eliminate that disappointment.
Steve Thompson is director of storage networking at NetEx Software, Inc. (Maple Grove, MN).
Publication: Computer Technology Review
Date: Nov 1, 2003