Optimizing network storage processing performance: programmable multi-processors and high-speed I/O enable new levels of flexibility and performance for network storage systems.


Storage system architects today face a dilemma. Overall data proliferation, coupled with the rapid emergence of new technologies such as IP storage and storage virtualization, has created the need for faster, higher performance storage network solutions. Today's Storage Area Network (Fibre Channel-based, iSCSI, or FCIP) and Network Attached Storage (NAS) system designs often do not deliver enough performance, flexibility, or scalability to keep pace with growing demands for capacity and throughput.

To address next generation storage processing requirements, a new category of powerful, programmable multi-processors with high levels of I/O and memory bandwidth is required; moving forward, these processor solutions will play a critical role in enabling new levels of performance and flexibility for network storage systems.

Technical Challenges

Storage system designers and architects face a number of critical technical challenges today:

* Processing performance

* Flexibility

* I/O and memory bottlenecks

* Fixed power and space constraints

Processing Performance. Storage systems must be able to handle more complex packet processing than traditional networking systems. Storage protocol mediation requires termination and regeneration of each protocol; furthermore, packets must be reformatted from one protocol to the other (e.g. IP, SCSI). In more advanced storage applications, the actual implementation and processing of the upper layers of the protocol (SCSI) may have to be done at the processor level, making this a highly compute-intensive task.

To address the need for high performance processing in the data plane (in-line) and the control plane (exception path), system designers need to evaluate not only the speed, or frequency, of the processor solution, but also the headroom and intelligence on the chip to support features such as TCP checksum and deep packet examination and manipulation.
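As a rough illustration of the per-packet work involved, the checksum used by TCP is a 16-bit one's-complement sum over the data, as described in RFC 1071. The Python sketch below is illustrative only: the function name is an assumption, and it checksums a bare byte string rather than a complete TCP segment with its pseudo-header.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


sample = bytes.fromhex("0001f203f4f5f6f7")  # example bytes from RFC 1071
print(hex(internet_checksum(sample)))       # prints 0x220d
```

Every payload byte has to be touched once, which is why doing this in line at Gigabit rates consumes meaningful processor headroom unless the chip assists with it.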

Furthermore, since storage applications are especially sensitive to latency considerations, integration at the silicon level becomes key. In a multi-chip solution, there is added latency each time a task gets passed to another chip. Higher levels of on-chip integration can speed up overall system communication, thus reducing response time and improving overall processing performance.

Flexibility. While new trends such as iSCSI and FCIP are exciting, these specifications are still evolving and are not yet finalized. To support new protocols and emerging standards in a constantly evolving industry environment, storage equipment vendors will require processor solutions that can be programmed and updated easily using widely available software architectures and tools. In this way, system vendors can do software-based field upgrades to boost performance and deliver new features and thereby maximize the "time in market" of their system solutions.

I/O and Memory Bottlenecks. With processor clock speeds reaching 1GHz and higher, system designers face a new problem: these high performance processors need equally fast access to I/O and memory to sustain wire-speed performance. For example, the PCI bus is widely used today as the primary I/O interface for chip-to-chip interconnect within a system. However, PCI bus bandwidth quickly saturates at very high sustained data rates, especially with added peripherals such as Gigabit Ethernet PCI cards. Processor solutions that integrate high-speed network I/O interfaces on-chip therefore become very attractive.
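A back-of-the-envelope calculation shows why the bus saturates. The sketch below assumes a 32-bit/66MHz PCI bus (the configuration named later in this article) and Gigabit Ethernet ports running at full duplex line rate. It uses theoretical peak figures and ignores bus arbitration and protocol overhead, so real headroom is smaller. The helper name is illustrative.

```python
def parallel_bus_peak_gbps(width_bits: int, clock_mhz: float) -> float:
    """Theoretical peak of a parallel bus moving one word per clock, in Gbps."""
    return width_bits * clock_mhz / 1000.0


pci_gbps = parallel_bus_peak_gbps(32, 66)  # shared by every device on the bus
gige_ports = 3                             # assumed number of GbE cards
gige_load_gbps = gige_ports * 1.0 * 2      # 1 Gbps in each direction, per port

print(f"PCI peak:            {pci_gbps:.3f} Gbps")
print(f"{gige_ports} x GbE full duplex: {gige_load_gbps:.1f} Gbps")
print(f"Load / bus capacity: {gige_load_gbps / pci_gbps:.2f}x")
```

Under these assumptions, three Gigabit Ethernet ports could offer several times more traffic than the shared bus can carry even at its theoretical peak.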

Storage processing is also bound by memory considerations--not only is support for large external memory (DRAM) required, but there is also a need for high-speed access to memory to minimize latency and improve response time for data requests. Next-generation systems will need to address memory bandwidth to keep pace with advances in system I/O and overall processing performance.

Space and Power Limitations. In the datacenter, there are severe space and power constraints. System architects must balance system size and power consumption against performance when developing next-generation system designs. Processor solutions that optimize power and performance while maintaining a high level of integration are key.

A New Category of Processors

A new generation of MIPS-based processors from Broadcom delivers the industry's highest performance at low power while achieving a high level of integration. These processors offer a promising combination of high performance and flexibility ideally suited to address the needs of next-generation storage systems.

Broadcom's SiByte processors integrate high-performance cores that can scale up to 1GHz, with high-speed I/O interfaces and peripherals. The first member of the SiByte family, the Broadcom BCM1250, features two MIPS64 cores that scale from 600MHz to 1GHz, up to 128Gbps bus bandwidth, a high-speed memory subsystem that supports up to peak 50Gbps memory bandwidth, and up to 30Gbps total I/O bandwidth tightly integrated onto a 60 million transistor chip. Key interfaces include three integrated Gigabit Ethernet MACs, PCI, HyperTransport, and generic I/O (Figure 1). The BCM1250 is targeted at mid to high-end storage applications including SAN routers, gateways, directors, switches, and NAS appliances. Recently, the single-core BCM1125 was announced to provide OEM vendors with a powerful alternative for lower cost devices such as HBAs or TCP offload engines.

[FIGURE 1 OMITTED]

The SiByte processors have a number of advantages over both general-purpose processors and traditional network processors for storage system applications. Unlike most general-purpose processors to date, they have the high integration and intelligence on chip to support in-line packet processing at wire-speed rates ranging from 1Gbps to 2.4Gbps. At the same time, the SiByte processors offer a high degree of programming flexibility, one of the key benefits of building on a well-understood general-purpose architecture such as MIPS. For example, the two cores on the BCM1250 are fully software programmable and can be partitioned to manage tasks as required--one core may handle iSCSI protocol processing while the other terminates TCP traffic or handles exception processing.
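One way to picture this partitioning is a small dispatch routine that steers each incoming packet to a core by task. The sketch below is purely illustrative and is not Broadcom's API: the `Packet` type, the core numbering, and the classification rule are all assumptions made for the example. The only external fact it relies on is that 3260 is the IANA-registered TCP port for iSCSI.

```python
from dataclasses import dataclass

ISCSI_PORT = 3260        # IANA-registered iSCSI port
CORE_ISCSI = 0           # hypothetical: core dedicated to iSCSI processing
CORE_TCP_EXCEPTION = 1   # hypothetical: core for other TCP and exception traffic


@dataclass
class Packet:
    dst_port: int
    malformed: bool = False


def select_core(pkt: Packet) -> int:
    """Pick the core that should handle this packet."""
    if pkt.malformed:
        return CORE_TCP_EXCEPTION  # exception path
    if pkt.dst_port == ISCSI_PORT:
        return CORE_ISCSI          # in-line iSCSI processing
    return CORE_TCP_EXCEPTION      # all other TCP traffic


queues = {CORE_ISCSI: [], CORE_TCP_EXCEPTION: []}
for pkt in [Packet(3260), Packet(80), Packet(3260, malformed=True)]:
    queues[select_core(pkt)].append(pkt)

print({core: len(q) for core, q in queues.items()})  # prints {0: 1, 1: 2}
```

Because the split lives in software, the rule can be changed later, for example to balance load differently, without touching the hardware.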

Programming is made easier by the ready availability of software tools. Since the SiByte processors leverage the standard MIPS architecture, developers can use existing third party MIPS tools, such as compilers, assemblers, and debuggers. In addition, Broadcom provides firmware, as well as software drivers for the most popular operating systems including VxWorks, Linux, and NetBSD. This comprehensive software development platform provides high programming flexibility and allows customers to develop their own software applications such as virtualization protocols, storage management, load balancing, and SCSI or InfiniBand command translations on top of the SiByte platform.

Integrated I/O

SiByte processors are designed for maximum I/O throughput, with three 10/100/1000 Ethernet MACs, a 32-bit/66MHz PCI host bridge, HyperTransport host bridge, SMBus, GPIO, flash interface, interrupts, and DMA all integrated onto the single chip. These integrated I/O functions eliminate the need for a separate system controller, and provide flexibility in design. The Gigabit Ethernet MACs can also be configured as three 8-bit or two 16-bit packet FIFO interfaces. Two serial ports are available to use as UARTs for console ports or as synchronous interfaces. The HyperTransport and PCI interfaces enable customers to build high bandwidth, multiprotocol solutions using Gigabit Ethernet, Fibre Channel, iSCSI, InfiniBand, or any other emerging protocols.

HyperTransport for High-Speed Data Communication

The HyperTransport Host Bus provides a high-speed chip-to-chip interface for connecting coprocessors (such as encryption engines), peripherals, or multiple SiByte processors. With a peak data transfer rate of 6.4GB/sec, the HyperTransport specification provides better than an order of magnitude increase in bus transaction throughput over existing bus architectures such as PCI, PCI-X, and AGP.
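To put the order-of-magnitude claim in numbers, the sketch below compares the 6.4GB/sec HyperTransport peak quoted above with the theoretical peak of a 32-bit/66MHz PCI bus. Both are peak figures. The PCI configuration is an assumption chosen for comparison (it matches the host bridge mentioned in this article), and wider or faster bus variants would narrow the gap.

```python
import math

hypertransport_peak = 6.4e9     # bytes/sec, peak figure quoted in the text
pci_peak = (32 // 8) * 66e6     # 4 bytes per clock at 66MHz, in bytes/sec

ratio = hypertransport_peak / pci_peak
print(f"PCI 32-bit/66MHz peak: {pci_peak / 1e6:.0f} MB/sec")
print(f"HyperTransport / PCI:  {ratio:.1f}x")
print(f"Orders of magnitude:   {math.log10(ratio):.2f}")
```

Against this particular PCI configuration the ratio comes out above ten, which is consistent with the "better than an order of magnitude" description.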

Typical Storage Applications for Advanced Multi-Processors

Highly integrated network processor solutions such as the SiByte family of processors can be implemented in various applications to address the need for high speed, high capacity storage in data centers, including SAN and NAS. One example, shown in Figure 2, is a line card in a SAN switch capable of running network and storage protocols (TCP, NFS, CIFS) on one processor while processing iSCSI on the other. For data entering through the Gigabit Ethernet interfaces, non-encrypted packets are processed by the first BCM1250, while encrypted packets are passed to the second BCM1250 and terminated. The BCM5840 delivers security encryption and authentication for iSCSI packets. The network and storage protocols could also be distributed across both BCM1250s.

[FIGURE 2 OMITTED]

Combining multiple processors and I/O with an industry-leading level of integration, Broadcom's SiByte processors are an example of a new generation of programmable processors that storage system architects can use today to achieve new heights of performance, flexibility, and scalability. They enable development of compact, lower power systems that provide greater capacity and throughput, thus addressing the seemingly conflicting challenges of today's storage system design.

Kim Chan is the product line manager at Broadcom Corp. (Irvine, CA).

www.broadcom.com
COPYRIGHT 2002 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2002, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Storage Networking
Author:Chan, Kim
Publication:Computer Technology Review
Date:Jan 1, 2002
Words:1404