InfiniBand today. InfiniBand is one of a few I/O (Input/Output) architectures initially developed to address the high-bandwidth, low-latency requirements of High Performance Computing (HPC) environments. While early HPC deployments may have used Ethernet interconnects, a certain amount of latency inherent to TCP/IP limited the overall potential performance of the clusters. Since compute-intensive applications do not need all the features of TCP/IP, development began on streamlined I/O architectures. The resulting solutions, like Myrinet and InfiniBand, support a Message Passing Interface (MPI) over high-bandwidth (10 gigabits per second, Gb/s), very low latency transport architectures.

Recognizing the broad potential for performance gains and cost savings to general data center customers, several systems manufacturers, including Dell, IBM, Hewlett-Packard, Sun, Intel, and Cisco, have participated in research that resulted in the final specification of InfiniBand. The InfiniBand technology we see today results from merging designs developed by Compaq, IBM, and Hewlett-Packard (Future I/O) and by Intel, Microsoft, and Sun Microsystems (Next Generation I/O). Originally called System I/O, InfiniBand is now a trademarked term.

Two competing storage protocols are currently used on InfiniBand. The first, the SCSI Remote Direct Memory Access (RDMA) Protocol (SRP), is the older of the two. Many of the early attempts to connect storage directly to the InfiniBand fabric used SRP as the protocol. Although SRP products are shipping today, the protocol lacks discovery and management features and is unable to accept unsolicited data. These deficiencies may have contributed to the decline in follow-on efforts to develop the SRP2 standard. The second, and potentially more successful, protocol is an InfiniBand version of the iSCSI Extensions for RDMA (iSER).
This emerging standard seems to have traction with the InfiniBand Trade Association (IBTA), which is working to define the iSER on InfiniBand mappings.

The InfiniBand specification provides a protocol standard and a physical connectivity standard with three connection models. The standard requires the base 1x connection to sustain a transfer rate of 2.5 Gb/s. Also available are 4x and 12x connections. The 4x connection provides four "lanes" of wires for 10 Gb/s (4 x 2.5 Gb/s) serial data transport. Increased clock speeds in InfiniBand components have yielded Double Data Rate (DDR) operation, so 4x devices can achieve a transport performance of 20 Gb/s.

Lanes (or channels) are created between InfiniBand Host Channel Adapters (HCAs) and Target Channel Adapters (TCAs). HCAs are similar to a host initiator device in a storage network. A target device is simply a mode change within the same type of adapter residing on the recipient server or storage device. An InfiniBand switch can be used to interconnect the various devices. When two or more InfiniBand switches are deployed, a 30 Gb/s interconnect is used between them. Collectively, the adapters, wires, interconnected systems, switches, and storage can be referred to as an InfiniBand infrastructure or fabric. InfiniBand is capable of supporting tens of thousands of nodes in each subnet, providing high performance and enabling scalability and redundancy. This type of infrastructure is well suited to HPC, cluster, and grid requirements, which can call for thousands of nodes.

The ideal storage architecture to support such a multi-node server fabric is a Storage Area Network (SAN). SANs allow multiple nodes to access the same storage simultaneously, which is often critical for HPC applications. SANs also provide a more scalable storage architecture, since the data center manager can usually add more space to the storage area network without service interruption. Typically, SANs are deployed with high-bandwidth Fibre Channel arrays and accessed through a Fibre Channel switch. However, the cost of a Fibre Channel fabric is quite high, while the cost of an InfiniBand fabric is more consistent with deploying Ethernet switches. To reduce costs and simplify management effort, customers ranging from those performing HPC simulations to those managing data with Oracle solutions have been pressing for a means to integrate their Fibre Channel storage directly into the InfiniBand fabric. Generally, customers will not choose InfiniBand specifically for storage purposes. Those who have already implemented an InfiniBand infrastructure in their computing environments, however, will find it more efficient and cost effective to connect their storage to that existing infrastructure.
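The link-rate arithmetic quoted earlier for the 1x, 4x, and 12x connection models can be checked with a short Python sketch. It assumes only the figures given in the article (2.5 Gb/s per lane, doubled for DDR components); the names `BASE_LANE_GBPS`, `LINK_WIDTHS`, and `link_rate_gbps` are illustrative and do not come from any InfiniBand software.

```python
# Per-lane rate the article quotes for a base 1x connection, in Gb/s.
BASE_LANE_GBPS = 2.5

# Connection models named in the article, mapped to their lane counts.
LINK_WIDTHS = {"1x": 1, "4x": 4, "12x": 12}


def link_rate_gbps(width: str, ddr: bool = False) -> float:
    """Return the raw link rate in Gb/s for a connection model.

    With ddr=True the per-lane rate is doubled, as the article
    describes for Double Data Rate components.
    """
    lanes = LINK_WIDTHS[width]
    per_lane = BASE_LANE_GBPS * (2 if ddr else 1)
    return lanes * per_lane


if __name__ == "__main__":
    # Print the single-rate and DDR figure for each connection model.
    for width in LINK_WIDTHS:
        print(width, link_rate_gbps(width), link_rate_gbps(width, ddr=True))
```

Under these assumptions a 4x link works out to 10 Gb/s (20 Gb/s with DDR), and a 12x link at the base rate gives the 30 Gb/s figure cited for switch-to-switch interconnects.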
There are two mechanisms to connect Fibre Channel storage fabrics to an InfiniBand infrastructure: bridges and native interconnects.

Bridges. The first mechanism, the Fibre Channel to InfiniBand bridge, is still somewhat costly and creates a bottleneck that limits Fibre Channel access to speeds lower than the array is typically capable of delivering.

Native interconnects. A more cost-effective, easier-to-manage solution is to integrate the arrays directly into the InfiniBand fabric. This is accomplished with native InfiniBand interconnects that replace, or in some cases coexist with, Fibre Channel interfaces on the RAID controller. Connecting storage directly to InfiniBand fabrics also allows applications to leverage InfiniBand's built-in support for RDMA. This is particularly beneficial to HPC environments, allowing the applications to fetch data, compute, and put intermediate results into memory for other processes to complete the computation. InfiniBand implementations allow the server processors to post to a queue and continue processing, rather than managing I/O to a memory-mapped device. Much work has yet to be done to bring RDMA to the storage level.

Today, the adoption of InfiniBand is driven largely through the IBTA and OpenIB. The IBTA governs the physical specifications of InfiniBand, while OpenIB is dedicated to a common, open source InfiniBand software stack implementing all relevant protocols. LSI Logic is a founding member of both organizations and holds a seat on the Board of Directors of OpenIB. LSI's participation reflects the evolution of this interconnect to the storage portion of the data center or cluster fabric. InfiniBand is still a relatively new technology, and today it is supported only in homogeneous networks based on a Linux operating system.
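The "post to a queue and continue processing" model mentioned above can be illustrated with a toy Python sketch. This is not the InfiniBand verbs API or any real library: `WorkRequest`, `ToyQueuePair`, `post_send`, `adapter_step`, and `poll_completions` are hypothetical names, and the adapter is simulated by an explicit method call rather than by hardware.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class WorkRequest:
    wr_id: int      # hypothetical identifier chosen by the application
    payload: bytes  # data the adapter should move


@dataclass
class ToyQueuePair:
    """Toy model of a send queue plus a completion queue (not a real API)."""
    send_queue: deque = field(default_factory=deque)
    completion_queue: deque = field(default_factory=deque)
    remote_memory: dict = field(default_factory=dict)

    def post_send(self, wr: WorkRequest) -> None:
        # Posting only enqueues the request and returns immediately.
        self.send_queue.append(wr)

    def adapter_step(self) -> None:
        # Stands in for the channel adapter acting independently of the CPU:
        # it moves one queued payload and records a completion.
        if self.send_queue:
            wr = self.send_queue.popleft()
            self.remote_memory[wr.wr_id] = wr.payload
            self.completion_queue.append(wr.wr_id)

    def poll_completions(self) -> list:
        # Drain and return the ids of all finished work requests.
        done = list(self.completion_queue)
        self.completion_queue.clear()
        return done


def run_demo() -> tuple:
    qp = ToyQueuePair()
    qp.post_send(WorkRequest(1, b"intermediate result A"))
    qp.post_send(WorkRequest(2, b"intermediate result B"))
    # The processor keeps computing while the requests sit in the queue.
    partial_sum = sum(range(1000))
    before = qp.poll_completions()  # the simulated adapter has not run yet
    qp.adapter_step()
    qp.adapter_step()
    after = qp.poll_completions()
    return partial_sum, before, after
```

The point of the sketch is only the division of labor: the application's call returns as soon as the request is queued, and it learns of the transfer later by checking for completions instead of driving the I/O itself.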
As early adopters in HPC and data center environments continue to deploy it and reap the benefits of greatly increased speed and low latency, InfiniBand will eventually become more mainstream. It is expected to be adapted, in time, for use in more general-purpose computing environments, and it even has the potential to replace the PCI bus architecture in high-end servers and PCs.

Dave Ellis is the director of HPC Architecture, Engenio Storage Group, LSI Logic (www.lsilogic.com).