InfiniBand Gathers Powerful Support

This article is the second in a three-part series. The first part appeared in the April issue of CTR.

Initially, the industry was faced with two competing switched fabric proposals: Next Generation I/O, championed by Intel, Sun, and Dell, and Future I/O, championed by IBM, Compaq, and HP. In August 1999, the two camps announced their intent to combine the best of both proposals into a common standard called System I/O, later renamed InfiniBand. These companies plus Microsoft now make up the InfiniBand Steering Committee, chartered to promote the vision and shepherd standards development and market adoption. The committee's makeup is a powerful statement as to the breadth of industry support. Among them, Sun, IBM, Compaq, HP, and Dell account for 79% of the RISC/Unix and 73% of the IA server markets as calculated by IDC.

InfiniBand addresses a broad range of price/performance points via a single standard with multiple link technology options. Three options (see Table) are defined so far: 1X, 4X, and 12X, where X denotes link width. All are based on 2.5Gbaud signaling. The basic 1X physical definition is consistent with Gigabit Ethernet and Fibre Channel (e.g., 8B/10B serial encoding) to leverage existing technology experience and economies of scale. 4X and 12X are multiples of 1X, using byte interleaving across the multiple physical "lanes" to accomplish higher data rates.
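To make the idea of byte interleaving across lanes concrete, here is a minimal Python sketch. It assumes the simplest possible scheme, in which successive bytes are dealt to the lanes in round-robin order. The function names `stripe_bytes` and `merge_lanes` are hypothetical, and the sketch does not model 8B/10B encoding, padding, or the exact lane ordering an actual InfiniBand link would use.

```python
def stripe_bytes(payload: bytes, lanes: int) -> list[bytes]:
    """Deal payload bytes round-robin across `lanes` lanes (illustrative only)."""
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    # Lane i carries bytes i, i + lanes, i + 2*lanes, ...
    return [payload[i::lanes] for i in range(lanes)]


def merge_lanes(lane_data: list[bytes]) -> bytes:
    """Rebuild the original byte order from the per-lane streams."""
    lanes = len(lane_data)
    out = bytearray(sum(len(d) for d in lane_data))
    for i, d in enumerate(lane_data):
        # Put lane i's bytes back at positions i, i + lanes, ...
        out[i::lanes] = d
    return bytes(out)


if __name__ == "__main__":
    data = bytes(range(10))
    striped = stripe_bytes(data, 4)   # a 4X link has four lanes
    assert merge_lanes(striped) == data
    for lane, chunk in enumerate(striped):
        print(f"lane {lane}: {list(chunk)}")
```

Because each lane carries roughly one byte in every `lanes` bytes, a 4X link moves about four times the data of a 1X link in the same time, which is the scaling the link options rely on.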
To compatibly mix and match link widths (and eventually faster link signaling), the end-points of a link are expected to auto-negotiate a mutually acceptable width and speed.

InfiniBand uses an intelligent, layered message-passing model capable of remote DMA. The basic transport mechanisms support a wide variety of communication models: reliable connection, datagram, multicast, etc. This is to be expected, given the challenge InfiniBand faces in transporting a wide variety of traffic types. Like VI, InfiniBand will allow high-performance IPC between cluster nodes via application-to-application channels that bypass the operating system.

The InfiniBand Architecture

Fig 1 captures InfiniBand's value proposition, namely modular, scalable "rack and stack" computing. Gone is the "monolithic monster" server: a large, inflexible cabinet stuffed with processors, memory, I/O adapters, and disks. This shift is a response to the ISP and electronic commerce-driven need for a server architecture that a) accommodates a tremendous rate of growth, b) adapts to rapidly changing market conditions, c) provides 7x24 availability, and d) leverages standard, high volume building blocks. Internet growth alone is seriously straining today's architecture. Web site traffic is doubling every 100 days and it's not uncommon for an ISP to see 200% growth in a year.
Simply put, the industry needs an architecture that can keep up.

InfiniBand supports rack-and-stack computing by eliminating PCI slots and decoupling I/O from the server. With no PCI slots, powerful SMP server nodes with tens of gigabytes per second of I/O bandwidth can easily fit in 1U (1.75 inches, a standard rack-mount measure), enabling 30 or more servers in a single rack. Decoupling server and I/O using a multistage switched fabric results in an architecture that orthogonally scales in all directions, in "pay as you go" fashion. In other words, you add just what you need, nothing more. If you need processing power, add servers (or replace existing servers with faster ones). If you need LAN/WAN or storage I/O channels, add TCA (target channel adapter) racks and boards. If you need storage capacity, add drives, drive enclosures, and RAID controllers. If you need InfiniBand bandwidth and/or ports, add switches and links. An inherent advantage of the point-to-point topology is that critical links (bandwidth bottlenecks) can be upgraded as technology evolves without having to touch the rest of the fabric.
As shown in Fig 2, InfiniBand already has a built-in migration strategy. Assume that early-market implementations are 1X top-to-bottom: i.e., 1X channels out of the server, 1X switches, and 1X TCAs. As CPU technology evolves and servers get faster, servers will move quickly to 4X, then 12X. However, you won't have to discard your 1X I/O. Users can upgrade the critical parts of the fabric close to the server and use bandwidth aggregation equipment to accommodate previous-generation I/O. Critical systems like RAID may move quickly to the higher bandwidth interface. Less critical I/O can remain on "thinner," cheaper links indefinitely.

A switched fabric is robust compared with a shared bus. Message-based communication isolates server and I/O memory spaces. Point-to-point links are naturally hot-pluggable. Setting up multiple paths (redundant links and switches) between nodes for both reliability (no single point of failure) and performance (load sharing) is straightforward. The architecture lends itself to modules that can be safely added or replaced in the field or even by the end user. (In contrast, replacing a PCI adapter usually means bringing the server down and opening the cabinet, which is especially difficult in a rack-mount environment.) If well executed, the end state is an architecture that will significantly reduce the cost and shorten the time needed to configure, expand, and service server installations.

InfiniBand is driven by the need to improve the scalability of servers under tremendous pressure from Internet growth, and it is backed by industry heavyweights. Obviously the bus is history, right? Not so fast.
Despite its limitations (particularly in scalable environments), PCI is fast, simple, and cheap. InfiniBand is a significantly more complex technology, and the transition from bus to fabric is a major industry undertaking. The final part of this article (to appear next month) will explore the technical and market hurdles InfiniBand must overcome before the bus architecture can be put out to pasture.

Tom Hell is the senior systems architect of the storage components division at LSI Logic (Fort Collins, CO).

InfiniBand Link Options

Width   Signals                           Full Duplex Bandwidth
1X      1 differential transmit pair,     500 Mbytes/sec
        1 differential receive pair
4X      4 differential transmit pairs,    2 Gbytes/sec
        4 differential receive pairs
12X     12 differential transmit pairs,   6 Gbytes/sec
        12 differential receive pairs
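
The bandwidth figures in the table can be reproduced from two numbers given in the article: 2.5Gbaud signaling per lane and 8B/10B encoding. The short Python sketch below does that arithmetic. It assumes that each baud carries one line bit, that "Mbytes" means 10^6 bytes, and that "full duplex bandwidth" is the sum of both directions. The function name is hypothetical, and protocol overhead beyond 8B/10B is ignored.

```python
SIGNALING_RATE_BAUD = 2_500_000_000  # 2.5 Gbaud per lane (from the article)
DATA_BITS, LINE_BITS = 8, 10         # 8B/10B: 8 data bits per 10 line bits


def full_duplex_mbytes_per_sec(width: int) -> int:
    """Aggregate data rate of a link `width` lanes wide, both directions summed."""
    # Usable data bits per second in one direction, after 8B/10B overhead.
    data_bits_per_sec = SIGNALING_RATE_BAUD * DATA_BITS // LINE_BITS * width
    bytes_per_sec_one_way = data_bits_per_sec // 8
    # Transmit and receive pairs run at the same time, so count both.
    return 2 * bytes_per_sec_one_way // 1_000_000


if __name__ == "__main__":
    for width in (1, 4, 12):
        print(f"{width}X: {full_duplex_mbytes_per_sec(width)} Mbytes/sec")
    # 1X: 500 Mbytes/sec
    # 4X: 2000 Mbytes/sec
    # 12X: 6000 Mbytes/sec
```

Under these assumptions a 1X link delivers 250 Mbytes/sec in each direction, and the 4X and 12X figures are simple multiples of that.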