Increasing network performance using molecular sequence reduction technology.


This article is the first in a two-part series.

How fit is your network? Enterprise wide area networks (WANs) are a mission-critical resource and must be tuned for optimum performance. Network managers are often asked to evaluate the performance of their enterprise network. The network "health" metrics that they most commonly use for this purpose are link utilization, round-trip delay, and packet loss.

Unfortunately, these traditional measures do not capture the inherent redundancy of the data being transported over today's networks. During my tenure at Stanford University and as the co-founder of Peribit Networks, my research team and I discovered that most networks contain numerous repetitive data patterns that span virtually all applications and user sessions, severely degrading network performance. This means that both transmission links and routers continuously process vast amounts of redundant data. In fact, measured results from over 100 networks show average repetition rates of 60 to 90 percent. Therefore, most WANs are not running anywhere near their true potential.
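
To make the stakes concrete, here is a back-of-the-envelope sketch of what such repetition rates could mean for a single link. It assumes an idealized model, not described in this article, in which every repeated byte can be removed before transmission at no cost. The function name and the 256Kbps example link are illustrative only.

```python
def effective_capacity_kbps(link_kbps: float, repetition_rate: float) -> float:
    """Idealized application throughput of a link once repeated bytes are removed.

    Assumption (not from the article): a fraction `repetition_rate` of all bytes
    is redundant and can be eliminated with zero overhead, so only the remaining
    (1 - repetition_rate) share has to cross the wire.
    """
    if not 0.0 <= repetition_rate < 1.0:
        raise ValueError("repetition_rate must be in [0, 1)")
    return link_kbps / (1.0 - repetition_rate)


if __name__ == "__main__":
    for rate in (0.60, 0.90):
        capacity = effective_capacity_kbps(256, rate)
        print(f"{rate:.0%} repetition: 256Kbps link behaves like ~{capacity:.0f}Kbps")
```

Under that assumption, repetition rates of 60 to 90 percent correspond to roughly 2.5 to 10 times the nominal capacity. Real gains would be lower once encoding overhead and processing delay are counted.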

Some network managers know they have a capacity problem because of congestion, packet loss, and end user complaints. Other network managers may find that their WAN links are within performance targets. However, even these WAN links likely transport repetitive data and therefore could be downsized to cut expenses while still delivering the same (or higher) network performance.

Let's compare network performance to health and fitness. You may believe you are healthy because you are not ill. However, this does not mean that you are fit and in optimal physical condition. By analogy, your network may have acceptable delays and appear to be healthy. However, the network could in fact be far from its maximum potential due to the many repetitions that are wasting network resources.

What causes repetitive network traffic? There are three common sources of repetitive data traversing corporate networks:

* Business process flows

* Application overhead

* Commonly used strings, phrases, or objects

Business Process Flows

Common business practices and workflows generate huge amounts of repetitive data in networks. Employees frequently copy or forward email messages with attachments, resulting in multiple repeated transmissions of the same or similar data. Organizations typically maintain centralized databases and servers that are frequently accessed by employees. Many queries to these databases retrieve the same information, e.g., when salespeople pull up contact, account, or status update information. Finally, frequently used information like HR benefits is often posted to an internal Web site as a means to efficiently disseminate data to all employees. These files are frequently downloaded, causing repeated transmission of the same data within the enterprise.

Application Overhead

Distributed applications are designed to be easy to use and must guarantee the reliability and consistency of data. Thus enterprise applications are often very "chatty" and frequently communicate with distributed end points to ensure that they are correctly synchronized and that data consistency is maintained. These update messages along with full database replications are typical of virtually all enterprise applications and generate a high degree of repetitions across the WAN. In addition to the internal application traffic, user communications with distributed applications are also very redundant. All requests to an application server or database must follow a fixed format and protocol and the response from the application must also be in a fixed recognizable format. These application protocols are designed to be easy to use, fault-tolerant, portable, and very extensible and thus often generate significant communication overhead that is highly redundant.
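
The redundancy of fixed-format requests can be illustrated with a small sketch. Everything in it is hypothetical: the request layout, host name, and account numbers are invented for the example, and Python's standard `zlib` module stands in for whatever a real device would use. It compresses a second request on its own and then again with the previous request supplied as history (via the `zdict` parameter of `zlib.compressobj`).

```python
import zlib


def make_request(account_id: int) -> bytes:
    """Build a hypothetical fixed-format request; only the account id varies."""
    return (
        f"GET /crm/accounts/{account_id}/status HTTP/1.1\r\n"
        "Host: crm.example.internal\r\n"
        "User-Agent: SalesClient/4.2 (Windows NT)\r\n"
        "Accept: application/xml\r\n"
        "Cookie: session=8f3a9c1e7b2d4f60a1c5\r\n"
        "\r\n"
    ).encode("ascii")


def compressed_size(data: bytes, history: bytes = b"") -> int:
    """Size of `data` after DEFLATE, optionally primed with earlier traffic."""
    if history:
        compressor = zlib.compressobj(zdict=history)
    else:
        compressor = zlib.compressobj()
    return len(compressor.compress(data) + compressor.flush())


first = make_request(10482)
second = make_request(10483)

alone = compressed_size(second)
with_history = compressed_size(second, history=first)
print(f"raw: {len(second)} bytes, alone: {alone}, with history: {with_history}")
```

Because the two requests differ only in the account number, nearly all of the second request can be expressed as a reference to the first, which is the kind of protocol overhead described above.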

Commonly Used Strings, Phrases, or Objects

In English, the words "the," "and," "to," and "you" occur very often in normal conversation. Likewise, many common phrases exist that are supersets of these words, e.g., "Talk to you soon." Furthermore, inside companies and other communities there are even more common phrases that are used over and over again like "quarterly financial results," "project management status update," and "company confidential." These commonly used patterns can range in size from a few words (such as the previous examples) to large paragraphs (e.g., a common company disclaimer or backgrounder).

In addition to the repetitions that appear in text, there is often a much greater degree of redundancy due to the many common objects that are accessed and transferred throughout an organization. These repeated objects could range from common images that are embedded in various documents to tables and slides that are used in multiple presentations. Users typically generate data through an "evolutionary" process where previous instances of the file or object are gradually modified and combined to create new versions. Hence, the combined data generated by all the applications and users on a network can be very highly repetitious.

The above analysis makes it clear that repetitions in network traffic could range in size from small repeated words or phrases to entire database replications or large files. These repetitions may be transmitted over high-speed headquarters links (e.g., 45Mbps) or low-speed remote office links (e.g., 256Kbps). The challenge in identifying and removing these redundancies is that they exist across applications and across user sessions and vary in length from several bytes to several megabytes.
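
One simple way to estimate how much repetition crosses a link is sketched below. It is not the technique this series describes: it is a deliberately naive approach, assumed here for illustration, that splits each message into fixed-size chunks, remembers a hash of every chunk seen so far across all sessions, and counts the bytes that reappear. The function name and the 64-byte chunk size are arbitrary choices.

```python
import hashlib
from typing import Iterable


def repetition_rate(messages: Iterable[bytes], chunk_size: int = 64) -> float:
    """Fraction of bytes that belong to a chunk already seen earlier in the stream.

    Naive by design: chunks are cut at fixed offsets, so a repeat that is shifted
    by even one byte is missed, and repeats shorter than `chunk_size` are ignored.
    """
    seen = set()
    total = 0
    repeated = 0
    for message in messages:
        for start in range(0, len(message), chunk_size):
            chunk = message[start:start + chunk_size]
            digest = hashlib.sha256(chunk).digest()
            total += len(chunk)
            if digest in seen:
                repeated += len(chunk)
            else:
                seen.add(digest)
    return repeated / total if total else 0.0


# Hypothetical traffic: the same 1KB attachment forwarded in three messages.
attachment = b"".join(i.to_bytes(4, "big") for i in range(256))
rate = repetition_rate([attachment, attachment, attachment])
print(f"repetition rate: {rate:.0%}")
```

The fixed chunk boundaries show why the problem is hard: real repetitions start at arbitrary offsets and vary from a few bytes to megabytes, so a practical detector cannot rely on a single chunk size or alignment.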

Traditional Solutions: More Bandwidth and Compression

Typically an enterprise will notice that links are at maximum capacity via network performance management systems or end user complaints. When networks are capacity constrained, most customers choose the common alternative of purchasing additional bandwidth, thereby increasing their monthly transmission costs. In addition to the increase in recurring monthly expenses, network upgrades may take months to get installed, may require a router upgrade or network interface card purchase, and may cause network downtime to install.

Another option is to use compression that is available in many routers to increase the throughput of the wide area network links. Most compression technologies are based on the Lempel-Ziv algorithms developed in the late 1970s. The degree of data reduction that can be achieved by these compression algorithms is a direct function of the number of repeated patterns they can discover. The number of discovered patterns depends, in turn, on the amount of "historical" data the algorithm can store as well as the computational complexity required to search for these patterns in this buffer. Most compression algorithms are limited to a finite search buffer size and can therefore only "look back" within a small window of data to search for repetitions. As network speeds increase, this finite buffer size significantly limits the number of patterns that can be discovered and therefore the effectiveness of the compression algorithms. If the search buffer space is increased to accommodate faster WAN links, the processing power required by the compression technology increases dramatically, which in turn significantly degrades the throughput of the algorithm.
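
The effect of a finite look-back window can be reproduced with Python's standard `zlib` module, used here only as a convenient stand-in for router compression. It implements DEFLATE, a member of the Lempel-Ziv family that can reference at most 32KB of earlier data. The sketch below sends the same 16KB block twice, once back to back and once separated by 64KB of unrelated traffic. The block sizes and the seeded random filler are assumptions chosen for the demonstration.

```python
import random
import zlib

WINDOW = 32 * 1024  # DEFLATE can reference at most 32KB of prior data


def random_bytes(rng: random.Random, n: int) -> bytes:
    """Incompressible filler, generated deterministically from a seeded RNG."""
    return bytes(rng.getrandbits(8) for _ in range(n))


def ratio(data: bytes) -> float:
    """Compressed size divided by original size (lower is better)."""
    return len(zlib.compress(data, 9)) / len(data)


rng = random.Random(0)
block = random_bytes(rng, 16 * 1024)    # the "repeated object"
filler = random_bytes(rng, 2 * WINDOW)  # unrelated traffic in between

near = block + block           # repeat starts 16KB back: inside the window
far = block + filler + block   # repeat starts 80KB back: outside the window

print(f"repeat inside window:  ratio {ratio(near):.2f}")
print(f"repeat outside window: ratio {ratio(far):.2f}")
```

When the repeat falls inside the window, the second copy costs almost nothing and the stream shrinks to about half. When it falls outside, the compressor has already forgotten the first copy and the data is effectively sent twice.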

Compression techniques also typically introduce significant latency into the network due to the non-incremental nature of their data processing and encoding algorithms. These compression technologies are therefore constrained by both small buffer sizes that prevent the discovery of widely separated patterns and by unacceptable latency and network delay. As a result, compression reduction rates are frequently at or below 10 percent on links with speeds of 256Kbps or greater. Anecdotally, we find that most networks today do not even activate the common compression algorithms that exist in most routers for link speeds above 56 or 128Kbps.

www.peribit.com

Amit P. Singh is co-founder and CTO of Peribit Networks (Santa Clara, CA).
COPYRIGHT 2002 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2002, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Internet
Author:Singh, Amit P.
Publication:Computer Technology Review
Date:Feb 1, 2002
Words:1215