
Scalable network storage architectures.


The network storage world has long been polarized into two distinct camps--Network Attached Storage (NAS) and Storage Area Networks (SAN)--with the historical limitations of NAS restricting it to specific markets. Customers with performance-sensitive applications or high-growth storage environments have predominantly turned to SAN solutions, finding traditional NAS offerings incapable of meeting business requirements. One of the primary reasons SAN solutions are deployed for high-performance or high-scalability applications is that they use hardware-based RAID controllers for data movement. Traditional NAS solutions instead use software running on a general-purpose, CPU-based computer, which is more flexible and feature rich but has proved far less able to cope during periods of heavy load.

The general-purpose computer within a traditional NAS solution provides a high degree of manageability and value-added features, including snapshots, multi-protocol support and file sharing, all of which have made NAS the preferred solution for any network storage environment that does not depend on high performance or high scalability. Yet traditional NAS implementations fail to scale because they are bound by the CPU cycles their host system can produce. This limits system throughput: as system load and capacity increase, software overhead increases with them, causing performance degradation and increased latency. As business environments continue to see exponential growth in storage capacity requirements, NAS solutions are failing to keep pace. As a result, the customer is left with a difficult choice: either add an increasing number of NAS systems or move to a SAN-based solution.

BlueArc has recognized these scalability and performance limitations of traditional CPU-based NAS offerings, and has architected a solution that combines the ease of management commonly associated with NAS with unparalleled performance and scalability. BlueArc's third-generation SiliconServer, Titan, is the first NAS system to have its file system completely implemented in hardware. It is designed to eliminate the inherent limitations of CPU-bound NAS, providing an alternative to deploying SAN solutions.

[FIGURE 1 OMITTED]

A Close Look at CPU Architecture

With the exception of BlueArc, every NAS device available to consumers today is implemented using a CPU-based architecture. There is little difference between these devices and the PC on your desk. This traditional approach to data delivery is failing to scale. The software controlling the file system is nearing its algorithmic maximum efficiency, so future performance improvements will be incremental in nature: computer systems tend to scale at around 15%-30% each year in terms of real end-user performance. In fact, these speed enhancements are decreasing over time. In the 2003-2004 period, CPU manufacturers saw performance improvements significantly below those forecast by Moore's Law, as the move from 130nm to 90nm proved more difficult than anticipated.
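To put the 15%-30% figure in perspective, the short Python sketch below compounds those annual rates and compares them with a doubling every 18 months, the cadence popularly attributed to Moore's Law. The five-year horizon and the 18-month doubling period are assumptions chosen only for illustration; they are not measurements from this article.

```python
# Illustrative arithmetic only: the horizon and the doubling period are assumptions.

def cumulative_speedup(annual_rate: float, years: float) -> float:
    """Speedup after `years` of compounding at `annual_rate` (0.15 means 15%)."""
    return (1.0 + annual_rate) ** years


def doubling_speedup(doubling_months: float, years: float) -> float:
    """Speedup after `years` if capability doubles every `doubling_months`."""
    return 2.0 ** (years * 12.0 / doubling_months)


HORIZON_YEARS = 5  # assumed planning horizon

low = cumulative_speedup(0.15, HORIZON_YEARS)   # low end of the quoted range
high = cumulative_speedup(0.30, HORIZON_YEARS)  # high end of the quoted range
moore = doubling_speedup(18, HORIZON_YEARS)     # assumed 18-month doubling

print(f"15% per year for {HORIZON_YEARS} years: {low:.2f}x")
print(f"30% per year for {HORIZON_YEARS} years: {high:.2f}x")
print(f"doubling every 18 months:  {moore:.2f}x")
```

Under these assumptions, five years of real-world gains amount to roughly 2x to 3.7x, against roughly 10x for an 18-month doubling, which is the gap the paragraph above describes.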

[FIGURE 2 OMITTED]

In order to create a software appliance that extracts as much capability out of the underlying hardware as possible, one must begin with the operating system itself. General-purpose operating systems are built to be extremely flexible and contain many elements in support of many applications and hardware devices. However, when such a system is used as a NAS server, this flexibility amounts to overhead unrelated to the server's core task and yields only marginal performance. A superior approach is to implement a microkernel, or lightweight operating system, designed for the specific purpose of serving data. Additionally, some software-based vendors have taken the step of designing a specialized file system, coupled with RAID management, to provide data protection services. These three elements (a lightweight operating system, a specialized file system and RAID management) characterize the overall approach of the fastest software-based appliances.

As shown in Figure 1, when a client machine makes a request to the software appliance, every attempt is made to carry that request as far through the compute process as possible. The steps run from the network card's device driver initially receiving the request, through error checking, to translation of the request into the file system interface. However, this is a best-effort strategy. In fact, it can only be a best-effort strategy, because the CPU at the heart of a software appliance is limited to performing only one task at a time. Looking at the simplified diagram, you can see that there are several things competing for time on the CPU:

* Each request from a client competes with every other request

* The file system semantic layer must decode the request

* Error processing may be required for bad requests

* Writes into NVRAM (Non-Volatile RAM) are competing for time

* Writes into the RAM buffer are competing for time

* The block allocation layer must formulate a storage request

* The RAID manager must execute the storage request

All of these complex tasks require many CPU cycles to complete, but that alone doesn't paint the entire picture. There are a host of blocking activities that keep data from moving along briskly. Besides the processes mentioned above, consider some of the other things going on inside the software appliance:

* General OS housekeeping

* Snapshot

* Snap mirror

* NVRAM flushing

* NVRAM Mirroring (when deployed in a cluster)

* Heart Beating (when deployed in a cluster)

* NDMP (Network Data Management Protocol) Backup

* Administrative Commands (which take priority over everything else)

* RAID Parity Generation

* RAID Parity Rebuild (when a drive fails, this process causes a severe CPU drain)
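The effect of the two lists above can be sketched with a toy model in Python. It assumes a single CPU that runs one task at a time to completion, and every task name and microsecond cost is a hypothetical placeholder rather than a measured value. The point is only that a client request's latency includes whatever blocking work happens to be ahead of it.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    cpu_us: int  # hypothetical CPU cost in microseconds


def finish_time_us(tasks: list[Task]) -> int:
    """Time at which the last task completes when one CPU runs the tasks in order."""
    clock = 0
    for task in tasks:
        clock += task.cpu_us  # the CPU can only do one thing at a time
    return clock


# Steps on the path of a single client write (costs are made up).
request_path = [
    Task("file system semantic layer decodes request", 200),
    Task("write into NVRAM", 100),
    Task("write into RAM buffer", 100),
    Task("block allocation layer formulates storage request", 200),
    Task("RAID manager executes storage request", 400),
]

# Blocking activities that happen to be queued ahead of the request (also made up).
background = [
    Task("RAID parity rebuild slice", 5000),
    Task("NVRAM flush", 1000),
]

idle_latency_us = finish_time_us(request_path)
busy_latency_us = finish_time_us(background + request_path)

print(f"latency on an idle CPU: {idle_latency_us} us")
print(f"latency behind background work: {busy_latency_us} us")
```

With these placeholder numbers the same request takes seven times longer when it queues behind a rebuild slice and a flush, even though the request itself has not changed.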

One new tactic is to implement multiple servers (usually between 3 and 10) in order to improve performance and scalability, with the goal of introducing parallelism into the NAS file-serving process by having many CPUs work together on the problem. Unfortunately, the complexity associated with these clusters of interacting servers severely undermines their reliability and imposes overwhelming management overhead. In addition, for general NAS file serving, it has been shown that each server added to the cluster brings a diminishing return because of inter-server interaction. Typically, by the fifth deployed server, the performance increase can be as little as 5%.
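One way to picture this diminishing return is with Gunther's Universal Scalability Law, a general capacity model that this article does not itself use. In the Python sketch below, the contention and coherency coefficients are hypothetical, picked so that the fifth server adds about 5% to mirror the figure quoted above. They are not derived from any real cluster.

```python
# A sketch using the Universal Scalability Law (an outside model, not the author's).
# Both coefficients are hypothetical and chosen only to illustrate the shape.

def relative_throughput(servers: int, contention: float, coherency: float) -> float:
    """Throughput of `servers` nodes relative to a single node.

    contention: penalty for work that must be serialized across servers.
    coherency:  penalty for pairwise server-to-server coordination.
    """
    return servers / (
        1.0 + contention * (servers - 1) + coherency * servers * (servers - 1)
    )


CONTENTION = 0.10  # assumed
COHERENCY = 0.025  # assumed

throughputs = [relative_throughput(n, CONTENTION, COHERENCY) for n in range(1, 6)]

# Fractional gain from adding the 2nd, 3rd, 4th and 5th server.
gains = [throughputs[i] / throughputs[i - 1] - 1.0 for i in range(1, len(throughputs))]

for n, gain in zip(range(2, 6), gains):
    print(f"adding server {n}: +{gain:.1%} throughput")
```

In this model each additional server pays a coordination cost with every server already present, so the gains shrink quickly. With the assumed coefficients the second server adds about 74% and the fifth only about 5%.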

While the cluster approach can be an extremely expensive way to scale, parallelism is certainly the key to scalability. Taking a lesson from switching technology, BlueArc's Titan SiliconServer is a modular bladed server that incorporates a high-speed backplane and offers massive parallelization through the use of solid-state Field Programmable Gate Arrays (FPGAs).

[FIGURE 3 OMITTED]

The fastest, most reliable and longest-lived technology in the data center is typically the network switch, whether it is Ethernet or Fibre Channel. A switch purchased five years ago is still fast and useful because it was built with scalability in mind. The speed, reliability and scalability of the network switch are directly attributable to the parallelism inherent in the solid-state chip implementation, the high-speed backplane and the replaceable blade design.

It is telling that core network switching in most enterprises is generally a single or dual switch implementation with a dense port count and high-speed backplane. In the past, the switch cascading approach was attempted (much like the server clusters discussed above), but it was short-lived because of the many challenges around reliability and management. Today, most enterprises use a single director-class switch for their core networking rather than multiple lower-performance devices.

Introduction to FPGAs

BlueArc's SiliconServer technology is based on a new type of chip called a Field Programmable Gate Array (FPGA). FPGAs are like factories with a number of loading docks called input/output blocks, workers called logic blocks, and programmable interconnects, the equivalent of assembly lines, connecting everything. As data enters through an input block, much like a receiving dock, it is examined by a logic block and routed along the programmable interconnect to another logic block. The data continues to be routed from logic block to logic block until it is finally ready for shipping via an output block.

Imagine a large factory where there is a lot of activity as workers do their own particular tasks simultaneously; the same holds true inside an FPGA. Each worker, or logic block, is capable of doing its job unfettered by whatever else is happening. These are simple tasks, such as looking for a particular pattern in a data stream or performing math functions. In addition, all of the logic blocks are synchronized so that with each clock cycle each logic block completes its specialized, individual task. The clock rate within the FPGA is 50 million cycles per second. Given the 40,000 logic blocks inside an FPGA, this yields a peak processing capability of 2 trillion tasks per second--nearly 600 times as many as today's fastest CPUs are capable of achieving (as of this writing, Intel's fastest microprocessor is rated at 3.4 billion instructions per second).
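The peak figure follows directly from the numbers quoted above, as the short Python check below shows. It treats one logic-block operation per clock cycle as equivalent to one "task", which is the article's own framing and a peak rather than a sustained rate.

```python
# Back-of-the-envelope check using only figures quoted in the text.
CLOCK_HZ = 50_000_000       # FPGA clock: 50 million cycles per second
LOGIC_BLOCKS = 40_000       # logic blocks per FPGA
CPU_IPS = 3_400_000_000     # quoted CPU rating: 3.4 billion instructions per second

# Peak assumes every logic block completes one task on every clock cycle.
peak_tasks_per_second = CLOCK_HZ * LOGIC_BLOCKS
ratio_to_cpu = peak_tasks_per_second / CPU_IPS

print(f"peak FPGA tasks per second: {peak_tasks_per_second:,}")
print(f"ratio to the quoted CPU rating: {ratio_to_cpu:.0f}x")
```

The product is 2,000,000,000,000 tasks per second, and the ratio to the quoted CPU rating comes out at about 588.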

FPGAs have another leg up on alternative solutions; they can be completely reprogrammed (retooled, in effect) as often as you like, within just a few seconds. This capability allows BlueArc to continually improve and add to the functionality of the SiliconServer.

A Scalable NAS Architecture

BlueArc uses more than a dozen FPGAs in its SiliconServer architecture. This allows distribution of the processing of various elements to achieve additional parallelization. For example, on Titan's Network Interface Module, there are resources dedicated to Ethernet, TCP and UDP processing. In the system's Protocol and File System Modules, there are FPGAs dedicated to further high-level protocol processing and the file system itself. Finally, on the Storage Interface Module, Caching, Locking and Fibre Channel requests are processed. All of these elements on all of the modules are simultaneously doing their work with no effect on the other modules.

Between these small factories, there is a data super highway. Just as the lanes of a super highway are unidirectional, so are the lanes of the SiliconServer's data pipeline. Data moves through the FPGAs along these pipelines and is processed appropriately as it flows. This is radically different from traditional implementations, where data sits in one place and must be operated upon by a CPU, one instruction at a time.
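The difference between the two approaches can be sketched with a simple clock-tick count in Python. This is a generic pipeline model and not a description of Titan's actual internals. The stage count and request count are hypothetical, and each stage is assumed to take exactly one tick.

```python
from typing import Optional

# Generic pipeline model; stage and request counts below are hypothetical.


def serial_ticks(num_items: int, num_stages: int) -> int:
    """One worker performs every stage for one item before starting the next item."""
    return num_items * num_stages


def pipelined_ticks(num_items: int, num_stages: int) -> int:
    """Simulate a one-way pipeline in which every stage works on every tick."""
    stages: list[Optional[int]] = [None] * num_stages  # item held by each stage
    next_item = 0
    completed = 0
    ticks = 0
    while completed < num_items:
        ticks += 1
        # Every item moves one stage forward along the unidirectional lane.
        for i in range(num_stages - 1, 0, -1):
            stages[i] = stages[i - 1]
        # A new item enters the first stage if any are still waiting.
        if next_item < num_items:
            stages[0] = next_item
            next_item += 1
        else:
            stages[0] = None
        # All occupied stages do their work during this tick; the item in the
        # final stage leaves the pipeline when the tick ends.
        if stages[-1] is not None:
            completed += 1
    return ticks


REQUESTS = 1000  # assumed
STAGES = 8       # assumed

print(f"serial:    {serial_ticks(REQUESTS, STAGES)} ticks")
print(f"pipelined: {pipelined_ticks(REQUESTS, STAGES)} ticks")
```

Once the pipeline is full, one request completes on every tick, so the total approaches the number of requests rather than the number of requests multiplied by the number of stages.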

[FIGURE 4 OMITTED]

Conclusion

Years ago, network switch manufacturers abandoned the CPU-based approach to delivering scalability improvements, moving instead to a hardware-based approach. This migration from software to hardware has allowed enterprises to have simple, high-performance networks, which today are able to move data far faster than the servers they connect. The time has come for the network storage manufacturers to follow this lead and abandon their own CPU-based approach. BlueArc's SiliconServer is the logical evolution of storage servers, delivering a product that offers massive parallelization by using new reprogrammable FPGA technology to move the traditional NAS software into a true hardware implementation, allowing NAS to scale for the first time and provide a sensible alternative to SAN.

BlueArc has delivered the first storage server to provide the levels of performance and scalability lacking in traditional CPU-bound solutions and encourages customers to choose the technology that meets their business needs while providing the lowest total cost of ownership.

www.bluearc.com

By Dr. Geoff Barrall

Dr. Geoff Barrall is executive vice president and CTO of BlueArc (San Jose, CA)
COPYRIGHT 2004 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2004, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Storage Networking
Author:Barrall, Geoff
Publication:Computer Technology Review
Geographic Code:1USA
Date:Aug 1, 2004
Words:1792