The value of compression for data protection over TCP/IP WANs.
Contrary to popular belief, the vast majority of data is neither transmitted nor stored in compressed form. Data is usually transmitted or stored in whatever form is easiest for an application to use. Examples include ASCII text for e-mail, word processing, and spreadsheets, or executable binary code for a computer operating system. Typically, these easy-to-use encodings produce data files that are 2x to 40x (or more) larger than the information requires. Data compression optimizes data for compactness. Data decompression restores the data to its original form.
There are two principal types of data compression/decompression that address this situation. The first is lossless data compression. Just as it sounds, lossless data compression/decompression means that the restored data is identical to the original data. It is used for data that must not change by even a single bit, and it is the standard for business data. LZO is an example: a portable lossless data compression library written in ANSI C that provides fast compression and very fast decompression. Its decompression requires no working memory beyond the source and destination buffers.
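The lossless round trip described above can be sketched in a few lines. This is only an illustration: it uses zlib from the Python standard library as a stand-in for LZO (which is not in the standard library), and the sample invoice text is made up.

```python
import zlib

# Made-up "business data": repetitive ASCII text, as in e-mail or spreadsheets.
original = b"Invoice 1001, ACME Corp, net 30 days, amount due 1250.00\n" * 200

compressed = zlib.compress(original)      # optimize for compactness
restored = zlib.decompress(compressed)    # restore the original form

# Lossless means the restored data is identical to the original, bit for bit.
assert restored == original

ratio = len(original) / len(compressed)
print(f"{len(original)} bytes -> {len(compressed)} bytes ({ratio:.1f}x smaller)")
```

Highly repetitive text like this compresses far better than typical mixed data, so the ratio printed here should not be read as representative.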
The second type of data compression is called lossy. Lossy data compression means that the restored data may not be completely identical to the original data. It is usually used for data that already contains some random noise, where additional noise or loss will not matter. Photos and video are examples of this type of data, and JPEG (photo) compression and MPEG (video) compression are examples of lossy data compression. Both provide very fast compression and decompression. Lossy data compression is not appropriate for business data. Both lossless and lossy compression can be found in software, drivers, firmware, and in some cases even ASICs. Each has its place.
This article will focus on the use and implementation of lossless data compression, specifically when used with data protection applications over TCP/IP WANs.
Data Protection, Bandwidth and TCP/IP
WAN bandwidth costs have been in a steep decline since the Telco bubble burst at the turn of the century. If unemployment numbers in Dallas, RTP, Ontario, Morristown, and San Jose are any indication, Telco has not yet made a dent in its supply of bandwidth and costs continue to decline worldwide. Even with the decreasing costs, bandwidth is one of the largest operating expenses for the IT organization. It is not free.
TCP/IP is the WAN protocol of choice for data protection applications. This is largely because of the continuing myth that TCP/IP makes bandwidth free for data protection applications. Conventional wisdom is that data protection applications usually run at night or on weekends, when the TCP/IP network is sparsely utilized. In this way, they piggyback on the same WAN links at no additional charge. Hence the perception that the bandwidth is free. The logic is flawed.
Data protection applications have significantly greater requirements than day-to-day business applications. In some cases, they have been known to overwhelm the IP routers. It is also a false notion that data protection applications will run only in the "off" hours. Depending on the type of data needing protection, the regulations involved, and the requirements for recovery, these applications will also run during the prime business day.
The Market Problem: Data Protection Throughput Over TCP/IP
Another common myth is that standard TCP/IP will always meet the unique needs of these applications. Although this is true for the most part, it is not always true. TCP/IP over the WAN was never designed to handle the large amounts of bulk data that a data protection program can and often does generate. And when that TCP/IP WAN has the typical packet loss of approximately 1%, data protection windows for operations such as complete volume replications can be, and are, missed.
Packet loss is a direct result of bit error rate (BER), jitter, network congestion, distance, router buffer overruns, and multiple service providers. One way to mitigate the packet loss problem is through lossless data compression.
TCP Bandwidth Long Haul Problems That Limit Data Protection Throughput
Several characteristics of TCP/IP cause it to perform poorly over high bandwidth and long distances.
Packet Loss: Most TCP/IP WANs are designed around an average packet loss of 1%. This is a relatively low number for standard interactive business traffic. It is a high number for storage data protection applications. Packet loss increases when there is a high bit error rate, or BER (10^-10 to 10^-6), when jitter becomes an issue, or when congestion is high. Multiple service providers typically have different network vendors, which increases the probability of bit errors and jitter.
Window Size: Window size is the amount of data allowed to be outstanding (in-the-air) at any given point in time. The available window size on a given bandwidth pipe is the speed of the bandwidth times the round-trip delay or latency. Using a cross North American continent OC-3 link (approximately 60ms based on a total 3000-mile roundtrip) creates an available data window of 155Mbps X 60ms = 1,163 Kbytes. A DS3 satellite connection (540ms roundtrip) creates an available data window of 45Mbps X 540ms = 3,038 Kbytes.
When this is contrasted with standard and even enhanced versions of TCP, there is a very large gap between the available window and the window utilized. Most standard TCP implementations are limited to 65 Kbyte windows. There are a few enhanced TCP versions that may be capable of using up to 512 Kbytes or larger windows. Either case means an incredibly large amount of "dead air" and very inefficient bandwidth utilization. The amount a packet can be compressed is very dependent on the size of the packet. The larger the window size, the larger the packet. The larger the packet, the more it can be compressed.
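The window arithmetic above can be reproduced with a short calculation. The sketch below assumes decimal units (1 Kbyte = 1,000 bytes), as the figures in the text appear to use, and takes 65,535 bytes as the standard TCP window, which is the largest value the 16-bit window field can carry without window scaling. The function names are illustrative only.

```python
def bdp_bytes(bandwidth_bps: float, rtt_s: float) -> float:
    """Available window: bandwidth times round-trip delay, in bytes."""
    return bandwidth_bps * rtt_s / 8


def window_limited_bps(window_bytes: float, rtt_s: float) -> float:
    """Best-case throughput when at most one window is sent per round trip."""
    return window_bytes * 8 / rtt_s


TCP_WINDOW = 65_535  # bytes; maximum window without window scaling

links = {
    "OC-3, 60 ms round trip": (155e6, 0.060),
    "DS3 satellite, 540 ms round trip": (45e6, 0.540),
}

for name, (bandwidth, rtt) in links.items():
    available = bdp_bytes(bandwidth, rtt)
    ceiling = min(bandwidth, window_limited_bps(TCP_WINDOW, rtt))
    print(f"{name}: available window {available / 1000:,.1f} Kbytes, "
          f"standard-window ceiling {ceiling / 1e6:.2f} Mbps "
          f"({ceiling / bandwidth:.1%} of the link)")
```

The second function shows where the "dead air" comes from: a sender that may have only one window outstanding per round trip cannot exceed window size divided by round-trip time, no matter how fast the link is.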
Acknowledgement Scheme: When any portion of a stream is lost, TCP retransmits the entire stream from the lost portion onward. In high bit-error-rate scenarios this wastes large amounts of bandwidth resending data that has already been successfully received, all with the long latency of the path. Each retransmission is additionally subject to the performance penalty of "Slow Start".
Slow Start: TCP data transfers start slowly to avoid congestion due to possible large numbers of sessions competing for the bandwidth, and ramp-up to their maximum transfer rate, resulting in poor performance for short sessions.
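To give a feel for the ramp-up cost, here is a deliberately simplified model of slow start. It assumes the sender starts with one 1,460-byte segment, doubles its window every round trip, and sees no loss. Real TCP stacks differ in their initial window and in what they do after loss, so treat the numbers as rough.

```python
def slow_start_rtts(target_window_bytes: int, mss: int = 1460,
                    initial_segments: int = 1) -> int:
    """Round trips needed for a doubling window to reach the target size."""
    cwnd = initial_segments * mss
    rtts = 0
    while cwnd < target_window_bytes:
        cwnd *= 2   # simplified: the window doubles once per round trip
        rtts += 1
    return rtts


# Ramp-up to a standard 65,535-byte window, and to the full available
# window of the OC-3 link (1,162,500 bytes at 60 ms round trip).
for label, window, rtt_s in [
    ("standard window, 60 ms RTT", 65_535, 0.060),
    ("full OC-3 window, 60 ms RTT", 1_162_500, 0.060),
    ("standard window, 540 ms RTT", 65_535, 0.540),
]:
    n = slow_start_rtts(window)
    print(f"{label}: {n} round trips, about {n * rtt_s:.2f} s before full rate")
```

On a long-latency path, each of those round trips is expensive, which is why short sessions and frequent restarts after loss perform poorly.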
Session Free-For-All: Each TCP session is throttled and contends for network resources independently, which can cause over-subscription of resources relative to each individual session. This increases the congestion and packet loss.
Lossless Compression: One Piece in Solving the Problem
Lossless data compression can mitigate some of the throughput decreases caused by TCP/IP packet loss. It does that by increasing the payload of each packet.
There are limits to how much lossless data compression can compress and increase the payload of a TCP/IP data packet. If the data within the packet is already compressed, the answer is not much. If there is a lot of null data (blanks) within that block, the answer becomes quite a bit. The amount a lossless compression algorithm can compress a packet is also dependent on the size of the packet. The larger the packet, the more likely it is that measurable compression gains will take place. Small packets do not compress well. And TCP/IP local networks are typically limited by Ethernet's maximum packet size of 1500 bytes and by TCP/IP window sizes.
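Both effects, content and size, are easy to observe. The sketch below uses zlib as a generic lossless compressor. The sample data is made up, random bytes stand in for already-compressed data, and 1,460 bytes is assumed as the TCP payload that fits in a 1500-byte Ethernet packet.

```python
import os
import zlib


def compression_ratio(data: bytes) -> float:
    """Original size divided by zlib-compressed size (higher is better)."""
    return len(data) / len(zlib.compress(data))


PACKET = 1460        # assumed TCP payload inside a 1500-byte Ethernet packet
BLOCK = 64 * 1024    # assumed larger block, for comparison

samples = {
    "null data (blanks)": b" " * BLOCK,
    "log-style text": (b"2004-06-01,backup,volume07,status=OK,bytes=1048576\n"
                       * 2000)[:BLOCK],
    "already compressed (random stand-in)": os.urandom(BLOCK),
}

for name, data in samples.items():
    small = compression_ratio(data[:PACKET])   # one packet's worth
    large = compression_ratio(data)            # the whole block
    print(f"{name:38s} packet: {small:7.1f}x   block: {large:7.1f}x")
```

The already-compressed stand-in stays at roughly 1x at either size, while the blanks and the repetitive text gain much more when the compressor is given the larger block.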
Some data protection network vendors have developed very clever lossless data compression schemes that overcome the limits of TCP/IP packets. One of these is Network Executive Software (Maple Grove, MN) with their HyperIP compression engine.
HyperIP Compression Engine
The HyperIP compression engine compresses aggregated (concatenated) HyperIP packets rather than individual TCP/IP packets, and it works with much larger window sizes. The increased compression comes from compressing the gaps between the packets as well as the packets themselves. This implementation can increase throughput by up to fifteen times more than standard lossless data compression. FTP test results at a very large insurance company (see Table 1) demonstrated effective throughput of up to approximately 10 times the available bandwidth and 187 times the native TCP throughput. These are impressive numbers.
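The general benefit of compressing an aggregate instead of individual packets can be shown with a generic compressor. This is not HyperIP's algorithm, which is not described here. It is a zlib-based sketch with made-up record data and an assumed 1,460-byte packet payload.

```python
import zlib

PACKET = 1460  # assumed payload per packet


def make_record(n: int) -> bytes:
    """Made-up replication record with some fields that vary per record."""
    checksum = (n * 2654435761) % 2**32
    return f"volume{n % 16:02d},block={n:08d},checksum={checksum:08x}\n".encode()


stream = b"".join(make_record(n) for n in range(20000))
packets = [stream[i:i + PACKET] for i in range(0, len(stream), PACKET)]

# Each packet compressed on its own: no shared history between packets.
per_packet = sum(len(zlib.compress(p)) for p in packets)

# The same packets concatenated and compressed as one unit.
aggregated = len(zlib.compress(b"".join(packets)))

print(f"original:            {len(stream):>9,} bytes")
print(f"compressed per pkt:  {per_packet:>9,} bytes")
print(f"compressed together: {aggregated:>9,} bytes")
```

Compressing the concatenated stream lets the compressor reuse patterns it has already seen in earlier packets and pay its fixed per-block overhead once instead of once per packet.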
Summary and Conclusion
Data protection applications such as backup, volume replication, snapshot and mirroring typically use standard TCP/IP WANs for long haul. Standard TCP/IP WANs are less than ideal for the throughput required. TCP/IP packet loss limits data protection throughput to the point where protection windows can be, and often are, missed.
Lossless data compression can increase packet payload so that it mitigates throughput decreases from TCP/IP packet loss. Smart implementations such as Network Executive Software's HyperIP increase throughput up to eight times greater than standard lossless compression.
Lossless data compression is only part of the solution to increase throughput for data protection applications over TCP/IP WANs. Ultimately, the total solution must shield the application from the impact of TCP/IP WAN packet loss while maximizing bandwidth utilization.
Table 1: HyperIP(R) Compression Test Results vs. Native TCP
(35 Mbps bandwidth with 60 ms one-way delay; throughput in Mbps)

Throughput             FTP Native TCP   FTP w/HyperIP   FTP w/HyperIP & Compression
w/no bit errors        33.3             34.2            327
w/0.01% bit errors     18               34.2            324
w/1% bit errors        1.73             34.2            324

Note: Table made from bar graph.
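The headline ratios quoted in the article follow directly from the Table 1 values. The short calculation below simply divides the table's throughput figures. The variable names are illustrative, and the numbers are the ones read from the bar graph.

```python
LINK_MBPS = 35.0  # link bandwidth from Table 1

# Throughput in Mbps, per Table 1: (native TCP, HyperIP, HyperIP + compression)
results = {
    "no bit errors":    (33.3, 34.2, 327.0),
    "0.01% bit errors": (18.0, 34.2, 324.0),
    "1% bit errors":    (1.73, 34.2, 324.0),
}

for condition, (native, hyperip, compressed) in results.items():
    print(f"{condition:17s} "
          f"compressed vs link: {compressed / LINK_MBPS:5.1f}x   "
          f"compressed vs native TCP: {compressed / native:6.1f}x   "
          f"HyperIP vs native TCP: {hyperip / native:5.1f}x")
```

The 1% bit error row is where the "187 times" figure comes from (324 / 1.73), and the no-error row gives the roughly tenfold gain over the raw link bandwidth (327 / 35).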
Steve Thompson is director, Storage Networking, NetEx Software, Inc. (Maple Grove, MN)
Title Annotation: Disaster Recovery & Backup/Restore
Publication: Computer Technology Review
Date: Jun 1, 2004