Making Sense Of New Network Chip Technology.
New Gigabit Ethernet chipsets offer a range of options for today's network mega-switch architects. Powerful network processors, low-cost transceivers, multi-channel serial backplane interconnects, and high-speed synchronous serial switch fabrics simplify designers' hardware implementation tasks, allowing them to concentrate on their value-adds such as higher layer switching capability, media flexibility, and network management.
Synchronous serial backplane switch fabric/transceiver chipsets (Fig 1) maintain faster switching and configuration speeds and are highly scalable as compared to switch ICs that employ shared memory, asynchronous, or parallel switch fabrics. Configurations supporting over 32Gbit/sec of data bandwidth, capable of up to one hundred million connections per second, are possible. These devices can operate in a fully synchronous configuration of parallel chipsets, which makes multiple-hundred gigabit switches cost-effective and easier to design. They can switch either self-routing, variable length packets or fixed length cells, making them ideal for most datacom switch equipment applications, including Gigabit Ethernet, ATM, IP, and Fibre Channel.
New network processor chips will form the basis of the next generation of high performance switching/routing systems. These chips will perform switching/routing functions with QoS and will integrate multiple active flow processors for packet/cell header processing, including data payload, multiway lookup for policy enforcement, and packet transformations. They can be used in single chip designs, in back-to-back mode, or with external switching fabric components via the CSIX common switch interface standard.
Additionally, an expanding family of CMOS ICs provides gigabit-speed physical layer parallel-serial connectivity for a variety of Gigabit Ethernet, Fibre Channel, and Serial Backplane Interconnect applications. They are available in low-cost, single or quad packages and comply with IEEE and ANSI standards.
These devices typically operate in the speed range of 1Gbit/sec to 1.5Gbit/sec and may or may not include encoding. The common encoding scheme employed is referred to as 8b/10b and was originally developed by IBM. This scheme converts a byte (8 bits) of data to a 10-bit symbol with far superior run-length and transition density characteristics, compared to 8-bit NRZ (Non-Return-to-Zero) data. This facilitates the symbol's undistorted propagation along a transmission line and, hence, its recovery by a receiver.
Byte-wide or multiple byte-wide parallel data are thus converted to 10-bit symbols, which are transmitted serially over metallic or fiber optic media. This encoding scheme was originally employed in Fibre Channel, carrying data at 100MB/sec. With 25 percent overhead for 8b/10b conversion and 6.25 percent for framing, this results in an encoded line rate of 1.0625Gbit/sec. The IEEE 802.3 Committee adapted the same 8b/10b encoding scheme for Gigabit Ethernet, but it carries 1Gbit/sec of framed data, which results in an encoded line rate of 1.25Gbit/sec.
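The line-rate arithmetic above can be checked with a short sketch. The function name and its parameters are illustrative only and do not come from either standard; framing overhead is expressed as a fraction of the payload rate.

```python
def encoded_line_rate(payload_bits_per_sec, framing_overhead=0.0):
    """Return the serial line rate after framing and 8b/10b encoding.

    8b/10b sends 10 line bits for every 8 data bits (25 percent overhead).
    framing_overhead is an illustrative parameter: extra bits as a
    fraction of the payload rate.
    """
    framed = payload_bits_per_sec * (1 + framing_overhead)
    return framed * 10 / 8


# Fibre Channel: 100MB/sec of data plus 6.25 percent framing.
fibre_channel = encoded_line_rate(100e6 * 8, framing_overhead=0.0625)

# Gigabit Ethernet: 1Gbit/sec of data that is already framed.
gigabit_ethernet = encoded_line_rate(1e9)

print(fibre_channel / 1e9, "Gbit/sec")     # Fibre Channel line rate
print(gigabit_ethernet / 1e9, "Gbit/sec")  # Gigabit Ethernet line rate
```

The only difference between the two cases is where framing is counted: Fibre Channel adds it on top of the 100MB/sec data rate, while the Gigabit Ethernet figure of 1Gbit/sec already includes it.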
Today's high-speed, high-capacity switch architectures must convey multiple channels of Gigabit Ethernet, IP, ATM, or Fibre Channel data to and from a switch core, typically utilizing serial backplane interconnect transceivers. The serial backplane interconnect transceiver eliminates the high pin count connectors, trace routing, and signal skewing problems associated with parallel interconnect methods. These architectures may require signaling information to be included along with the data, which can easily increase the serial interconnect transceiver's operating line rate to 1.5Gbit/sec or higher.
Today, virtually all Gigabit Ethernet and Fibre Channel MACs perform the 8b/10b conversion and present the 10-bit symbols on a synchronous parallel data bus, or TBI (Ten Bit Interface). A transceiver or Serializer/Deserializer (SerDes) converts the parallel data to serial data. (Actually, there are two defined 10-bit symbols for each byte of data--one with a positive running disparity and the other with a negative running disparity. The 8b/10b encoder keeps track of the running disparity and ensures that a positive symbol is followed by a negative symbol. This preserves a tight DC balance in the serial data stream, thereby minimizing signal jitter.) Transceivers suitable for both Gigabit Ethernet and Fibre Channel applications are widely available in either single or quad packages. The quad package supports higher circuit density and simplifies trace routing--especially useful in single-board switch designs.
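The running-disparity bookkeeping described in the parenthetical can be sketched as follows. This is a minimal illustration, not a full encoder: the table holds only two symbols (the K28.5 control symbol and the D21.5 data symbol), whereas a real 8b/10b encoder covers every byte value plus the control symbols. The function names and table layout are assumptions made for the example.

```python
def disparity(symbol):
    """Ones minus zeros in a 10-bit symbol given as a bit string."""
    return symbol.count("1") - symbol.count("0")


# Illustrative two-entry excerpt of the code table. Each entry holds
# (symbol sent when running disparity is negative,
#  symbol sent when running disparity is positive).
CODE_PAIRS = {
    "K28.5": ("0011111010", "1100000101"),
    "D21.5": ("1010101010", "1010101010"),  # balanced: same in both columns
}


def encode_stream(names, running_disparity=-1):
    """Pick the symbol variant for each name and track running disparity."""
    symbols = []
    for name in names:
        rd_negative, rd_positive = CODE_PAIRS[name]
        symbol = rd_negative if running_disparity < 0 else rd_positive
        symbols.append(symbol)
        # An unbalanced symbol flips the running disparity;
        # a balanced symbol leaves it unchanged.
        if disparity(symbol) != 0:
            running_disparity = -running_disparity
    return symbols, running_disparity


symbols, final_rd = encode_stream(["K28.5", "D21.5", "K28.5", "K28.5"])
for s in symbols:
    print(s, disparity(s))
```

Because an unbalanced symbol with more ones is always followed, at the next unbalanced position, by one with more zeros, the cumulative difference between ones and zeros on the line never drifts, which is the DC balance the text refers to.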
Serial Backplane Interconnect Transceivers, however, frequently interface to ASICs or FPGAs, which do not perform the 8b/10b conversion; this function is performed by the transceiver. A typical configuration may utilize a single Interconnect Transceiver on a line card and may require 16, 32, or more such transceivers at the switch core. Here, too, quad transceiver packages permit higher circuit density and simplified circuit routing. The most sophisticated interconnect transceivers include elastic buffers for inter/intra-chip cable de-skewing and channel-to-channel alignment and may include redundant high-speed serial I/Os to support high-reliability/high-availability system architectures.
Optimized Network Processors
A variety of factors are currently driving the need for standardized, programmable network--or communications--processors in the market. Increased pressure on time to market for new products, as well as the need to rapidly deploy software upgrades to accommodate new standards or enhanced features, all argue for a flexible, highly programmable computer engine. Further, as the complexity and diversity of the various protocols and switching algorithms continue to grow, many OEMs must now face the need to focus on those areas where they can offer significant product differentiation with limited engineering resources.
Increasingly, even the very largest OEM vendors are now turning to optimized processor architectures. Just as the diversity of protocols and algorithms has driven the need to adopt already developed processors from merchant silicon suppliers, those same factors are now driving the development of those processors along a number of diverse architectural paths, each optimized for a particular class of applications.
It's well known that a competitive Layer 3 switch must now accommodate a variety of QoS functions in order to improve throughput and reduce latency for critical data servicing real-time applications such as video and VoIP. A variety of architectural implementations have been developed to support these functions and still maintain line rates up to OC-48 (2.5Gbit/sec). In each instance, one or more optimized high bandwidth memory interfaces are required for packet classification and modification. Architectures employing multiple processors must also share this same bandwidth between processors as data packets are routed over common buses between processors and memory.
Likewise, intelligent multicast handling is essential for maintaining efficient throughput when high volumes of multicast and broadcast traffic are present. In many instances, multicast packets must be recognized by the traffic classification functions of the processor, but efficient handling must also be tightly coupled with the switch fabric.
Traffic shaping also plays a key role in achieving QoS functionality in Layer 3 switching devices. The ability to classify packets into a variety of queues based on explicit or implied priorities, coupled with algorithms such as Weighted Fair Queuing (WFQ) and Random Early Discard (RED), is a necessity for maintaining real-time service. Intelligent buffering in this manner reduces Head Of Line (HOL) blocking by less critical packets. Processors implementing traffic shaping must also guarantee sufficient queue depth to prevent lost packets in the event of traffic congestion.
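As one illustration of the queue management mentioned above, the sketch below shows the basic drop decision used by RED in its simplest textbook form: no drops below a minimum threshold, a linearly rising drop probability between the thresholds, and unconditional drops above the maximum. The thresholds and maximum probability are hypothetical values, and the function is not taken from any particular network processor.

```python
def red_drop_probability(avg_queue_depth, min_threshold, max_threshold,
                         max_probability):
    """Simplified RED: probability of discarding an arriving packet.

    avg_queue_depth is assumed to be a smoothed average maintained by
    the caller; all parameter names here are illustrative.
    """
    if avg_queue_depth < min_threshold:
        return 0.0  # queue is short: never drop
    if avg_queue_depth >= max_threshold:
        return 1.0  # queue is too long: always drop
    span = max_threshold - min_threshold
    return max_probability * (avg_queue_depth - min_threshold) / span


# Hypothetical thresholds, in packets.
for depth in (10, 50, 80):
    print(depth, red_drop_probability(depth, 20, 80, 0.1))
```

Discarding a small fraction of packets early, before the queue is full, signals senders to slow down while queue depth is still available for higher priority traffic.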
The diversity of algorithms has also moved up the OSI stack to embrace Layer 7 (application layer) functionality, as well. Processors must look deep within the packet to find the necessary data for decisions about packet classification and modification. Common examples of adding Layer 7 functionality might include Web site URL translation, bandwidth management between multiple servers, and email filtering based on recipients' employment status.
Although the potential applications may be limitless, hardware implementation of these functions requires specific capabilities. Processors must be able to buffer and parse deep within long packets, rather than just the headers, to find the relevant data. Text comparisons must often be made on a byte-by-byte basis across large user data blocks to sift through irrelevant information. A number of processors, which will attempt to implement some of this functionality, are nearing introduction. What isn't yet clear, however, is whether a meaningful number of Layer 7 operations can be simultaneously performed in a cost effective manner, given the hardware required to perform those functions at or near wire rates.
Synchronous Serial Switch Fabrics
High performance backplanes are used in application areas that require switching or routing of high bandwidth data streams from one of several data inputs to one or more data outputs while supporting various classes of service. Another requirement is that the switch can be reconfigured at a rate consistent with the size of the smallest data packet for a given port. In enterprise applications where equipment cost is more important than link utilization efficiency, packets can be sent directly through a self-routing fabric. In core or edge routers, where link utilization and QoS are paramount, segmenting the packets allows for a more efficient utilization of the switch fabric and allows for fine-grained bandwidth tailoring.
Current parallel backplane solutions include shared bus, shared memory, parallel switch fabrics, and various combinations. All of these solutions are limited in bandwidth due to limitations in bus speed and bus width. Other limitations include high component count and high pin count on backplane connectors. Today, many applications serialize parallel data across the backplane. Although this reduces the connector pin count and backplane trace count, it does not solve the problem of high component count.
An alternative method is to use asynchronous serial crosspoint switches to break through the bandwidth barriers of parallel approaches, but with high bandwidth come several limitations. With an asynchronous crosspoint, as the switch is reconfigured, the delay path changes through the switch. This requires a fast phase recovery circuit on the receiving port card, which reduces the available data bandwidth. In addition, the signal integrity is reduced through the switch, so this approach cannot be scaled to larger switch fabrics without the use of retimer circuits between each device.
Using a fully synchronous design can solve many of the problems associated with asynchronous serial crosspoint switches. In this design, the serial data from the port card is retimed when it is received at the switch chip and is also retimed when it is sent from the switch chip as shown in Fig 2.
With permanent bit timing established between the switch chip and all the transceivers, no phase acquisition time is required after each reconfiguration of the switch. In addition, an arbitrarily large fabric can be created without the need for retimers. Another advantage of this synchronous approach is that, in addition to bit synchronization, word and cell synchronization can be performed between the transceivers and the switch chip.
Word synchronization allows data and command words to be passed between the port card and switch card. For a self-routing system, connection requests can be sent from the port card to an arbiter located inside the switch chip. Grant information can be passed back from the switch to the port cards. This allows a 32Gbit/sec switch fabric to route variable length packets at up to 100 million connections per second with support for multicast, virtual output queuing, and priorities. In addition, a fully cell synchronous switch fabric can be produced, allowing fixed length cells or segments to be routed through the switch timed to a master cell clock.
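The request/grant exchange described above can be sketched with a simple arbiter. The article does not say which arbitration policy the switch chip uses; round-robin selection per output port is assumed here purely for illustration, and the function and variable names are hypothetical.

```python
def arbitrate(requests, num_ports, pointers):
    """Grant at most one requesting input per output port.

    requests: dict mapping input port -> requested output port.
    pointers: per-output round-robin pointers (the input to favor
              next); updated in place. Round-robin is an assumption,
              not the documented behavior of any particular chip.
    Returns a dict mapping output port -> granted input port.
    """
    grants = {}
    for output in range(num_ports):
        contenders = [i for i, o in requests.items() if o == output]
        if not contenders:
            continue
        start = pointers[output]
        # Choose the contender closest to the pointer, wrapping around.
        winner = min(contenders, key=lambda i: (i - start) % num_ports)
        grants[output] = winner
        pointers[output] = (winner + 1) % num_ports
    return grants


pointers = [0, 0, 0, 0]
requests = {0: 2, 1: 2, 3: 0}  # inputs 0 and 1 both want output 2
print(arbitrate(requests, 4, pointers))  # first arbitration cycle
print(arbitrate(requests, 4, pointers))  # input 1 now wins output 2
```

Each input in this sketch requests a single output, so it can receive at most one grant per cycle; supporting multicast, as the fabric described above does, would require a request to name several outputs.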
The newly emerging CSIX interface standard allows a mix and match of various off-the-shelf network processors and switch fabric solutions. It defines a fixed length segment that is sent to the switch fabric for forwarding. The segment contains a header with information such as destination port and priority. Parallel interfaces are defined for both 2.5Gbit/sec and 10Gbit/sec bandwidth support.
For high bandwidth switch fabrics, these parallel interfaces cannot run across the backplane due to signal density limitations. Therefore, a synchronous serial switch fabric with a transceiver on the port card and a serial switch on the switch card is an ideal choice for CSIX-based applications. The transceiver can contain one or multiple CSIX parallel interfaces on one side and multiple high-speed serial links across a passive backplane to the switch device.
Standard high-density, high-performance connectors from several vendors can support data rates up to 2.0Gbit/sec. Also, FR-4 can be used to route these signals up to 20 inches on a passive backplane. Two of these links in parallel (4Gbit/sec) can be used to support the bandwidth of a 2.5Gbit/sec CSIX interface. Eight links in parallel (16Gbit/sec) will support a 10Gbit/sec CSIX interface. In both of these cases, a speed-up factor of 1.6 is supported in the switch fabric to ease congestion and reduce latency.
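The speed-up factor quoted above follows directly from the link counts and rates. A short check, with an illustrative function name and rates given in Gbit/sec:

```python
def speedup_factor(links, link_rate_gbps, csix_rate_gbps):
    """Ratio of aggregate serial link bandwidth to CSIX interface rate."""
    return links * link_rate_gbps / csix_rate_gbps


# Two 2Gbit/sec links behind a 2.5Gbit/sec CSIX interface.
print(speedup_factor(2, 2.0, 2.5))

# Eight 2Gbit/sec links behind a 10Gbit/sec CSIX interface.
print(speedup_factor(8, 2.0, 10.0))
```

Both configurations give the fabric 60 percent more bandwidth than the interface feeding it, which is the headroom used to drain queues after contention.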
Network processors will classify the packets before sending them in segmented form to the switch fabric. The switch fabric must be capable of routing these segments to the required output port while providing low latency to high priority traffic. Crossbar-based fabrics can suffer from head of line blocking. To minimize this, virtual output queues can be provided in the transceiver with each queue divided into various traffic priority levels (classes). This, combined with the 1.6x speedup factor in the serial links, ensures that high priority traffic will reach its destination with minimal latency. The CSIX interface also defines per-class flow control. This flow control information is passed from the egress network processor to the fabric in order to minimize packet loss and optimize the fabric utilization.
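A minimal sketch of virtual output queuing with priority classes follows. The class name, method names, and the convention that class 0 is the highest priority are all assumptions made for illustration; they do not describe any specific transceiver.

```python
from collections import deque


class VirtualOutputQueues:
    """One queue per (output port, traffic class) on an ingress port."""

    def __init__(self, num_outputs, num_classes):
        self.queues = [[deque() for _ in range(num_classes)]
                       for _ in range(num_outputs)]

    def enqueue(self, output, traffic_class, segment):
        self.queues[output][traffic_class].append(segment)

    def dequeue(self, output):
        """Return the oldest segment of the highest priority class
        waiting for this output, or None if nothing is waiting.
        Class 0 is treated as the highest priority (an assumption)."""
        for class_queue in self.queues[output]:
            if class_queue:
                return class_queue.popleft()
        return None

    def pending_outputs(self):
        """Outputs that have at least one segment waiting."""
        return [o for o, classes in enumerate(self.queues) if any(classes)]


voq = VirtualOutputQueues(num_outputs=4, num_classes=2)
voq.enqueue(1, 1, "low-a")   # arrives first, bound for output 1
voq.enqueue(3, 1, "low-b")   # arrives second, bound for output 3
voq.enqueue(1, 0, "high-a")  # arrives last, high priority for output 1

# Even if output 1 is busy, the segment for output 3 is not held up.
print(voq.dequeue(3))
# When output 1 is granted, the high priority segment goes first.
print(voq.dequeue(1))
```

With a single first-in, first-out queue, "low-b" would wait behind "low-a" whenever output 1 was busy; keeping a separate queue per output removes that dependency, and the per-class split lets high priority segments overtake lower priority ones bound for the same output.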
The Road Ahead
As the growth in data traffic continues to outpace that of voice traffic on today's WAN infrastructure, there are efforts on multiple fronts to shape the WAN of tomorrow into a more data-centric, or packet-based, infrastructure. IP over SONET at up to 10Gbit/sec (OC-192) is now a reality. The IEEE 802.3 Committee's Higher Speed Study Group (HSSG) meets regularly, with the objective of defining a 10Gbit/sec Ethernet standard by mid-2002. While there are those who clamor to push Gigabit Ethernet out to the desktop, the reality is that, in the near term, Gigabit Ethernet will find a home mainly in corporate and ISP network backbones and server connections.
Gary Lee is the director of the Switch Fabric Group, Rob Sturgill is the director of marketing for Advanced Network Products, and Fred Weniger is the Gigabit product marketing manager at Vitesse Semiconductor Corp. (Camarillo, CA).