Smart object-based storage cluster computing.


The advent of Linux compute clusters has forever changed the high-performance computing landscape. Instead of using proprietary, expensive supercomputers to solve the most challenging computing problems, nearly every new supercomputing system installed today is composed of thousands of low-cost Linux servers united into a cluster. To unlock the full potential of these Linux compute clusters, a complementary data storage solution is needed. There is just such a solution: object-based storage clustering. Object-based storage clustering systems have the intrinsic ability to linearly scale in capacity and performance to meet the demands of the most powerful Linux-based clusters.

There are many technical and commercial applications that are benefiting from the "scale out" Linux clustering wave. For example, geophysicists are developing more capable seismic analysis techniques to image the earth's substructure and guide oilfield drilling, resulting in a 25% higher predictability of energy discovery.

Pharmaceutical companies mine massive genomic datasets to provide better insight into human diseases and develop more personalized therapies. Commercial aircraft and automobile designers develop extensive computer simulations to make transportation faster, safer, and more comfortable. Internet portals such as Google index the content from hundreds of millions of Web pages that comprise the Internet.

All of these applications are recognized for their computational complexity. However, often underestimated is the equally ravenous appetite they have for high-performance data access. Without rapid and efficient access to data, scarce computing resources sit idle. Unfortunately, traditional networked storage systems are simply incapable of providing the data throughput needed to keep ever-growing Linux clusters operating efficiently.

Equally important, these massive datasets need to be made globally available to all processes executing across the compute cluster to simplify application development and to ease the burden of managing data repositories. Here again, traditional networked storage systems fall short: they are incapable of scaling capacity within a single namespace and thereby increase the time and complexity of managing networked data.

Data Access Patterns

To understand the need for a new approach to scalable storage, we first explore the manner in which many cluster computing applications address the storage bottleneck. Linux cluster applications use a scale-out approach to parallel computing. In this model, applications employ a 'divide-and-conquer' approach, decomposing the problem to be solved into thousands of independently executed tasks. The most common decomposition approach exploits a problem's inherent data parallelism--breaking the problem into pieces by identifying the data partitions that comprise the individual task, then distributing each task and corresponding partition to the compute nodes for processing.
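The data-parallel decomposition described above can be sketched in a few lines of Python. This is only an illustration: the function names, the node names and the round-robin placement policy are assumptions made for the example, not details of any particular cluster scheduler.

```python
from typing import List, Sequence, Tuple


def partition(items: Sequence[int], num_tasks: int) -> List[Sequence[int]]:
    """Split items into num_tasks contiguous partitions of near-equal size."""
    base, extra = divmod(len(items), num_tasks)
    parts, start = [], 0
    for i in range(num_tasks):
        size = base + (1 if i < extra else 0)   # first `extra` parts get one more item
        parts.append(items[start:start + size])
        start += size
    return parts


def assign_round_robin(num_tasks: int, nodes: Sequence[str]) -> List[Tuple[int, str]]:
    """Pair each task index with a compute node, cycling through the nodes."""
    return [(task, nodes[task % len(nodes)]) for task in range(num_tasks)]


frames = list(range(10))                        # hypothetical work items
parts = partition(frames, 4)                    # one data partition per task
placement = assign_round_robin(len(parts), ["node-a", "node-b"])  # hypothetical nodes
```

Each task needs only its own partition, which is why the tasks can run independently once the data has reached the nodes.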

For example, animation-rendering applications distribute scene generation tasks to hundreds of cluster compute nodes--each generating an individual frame of the final segment. Shared scene and character information and per-frame rendering instructions must be distributed to each of the compute nodes, and each node generates as much as 50 MB of output per frame. The individual frames are then sequenced and assembled into their final form for review. This is a common data access scenario across many cluster-computing applications.

The natural inclination of cluster computing developers is to deploy a networked storage solution that can be accessed by all nodes in the cluster. Such a solution greatly simplifies management of the compute jobs as all data partitions and replicas can be made available to all nodes, and hence any of the tasks can be computed on any node. Additionally, the output of these jobs can then be used directly elsewhere--in post-processing, visualization or even as the input to the next processing task in a computational pipeline.

Unfortunately, standard shared-storage solutions provided by NFS file servers are only sufficient for small clusters of 10 to 20 nodes. Larger clusters require more scalable storage solutions. Storage Area Networks (SANs) and Network Attached Storage (NAS) architectures have been employed for modest-sized clusters of approximately 20 to 50 nodes. However, these architectures have severe limitations as clusters become larger.

Neither SAN nor NAS architectures support the aggressive concurrency and per-client throughput requirements of scalable cluster computing applications. SANs were designed to provide a modest number of application servers with high-performance, highly reliable access to a shared pool of storage devices (e.g., for enterprise transactional databases). SANs improve the storage provisioning process, allowing disks to be moved among application servers to address changes in capacity requirements; but this leads to application server-based islands of data. NAS systems, on the other hand, were designed to afford widespread data sharing on heterogeneous platforms with relatively modest per-user I/O requirements (e.g., for user home directory storage).

Because of these limitations, organizations are forced to adopt a process in which data from a shared storage system is staged (copied) to the compute nodes, processing is performed, and results are destaged from the nodes back to shared storage when done. In many applications, the staging setup time can be appreciable: up to several hours for large clusters.
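A back-of-the-envelope estimate shows how staging time can reach hours when every node is fed through one shared server. The figures below are illustrative assumptions rather than measurements from any real system, and the helper name is hypothetical.

```python
def staging_hours(nodes: int, gb_per_node: float, server_mb_per_s: float) -> float:
    """Hours needed to copy gb_per_node to every node through one shared server."""
    total_mb = nodes * gb_per_node * 1000        # decimal units: 1 GB = 1000 MB
    return total_mb / server_mb_per_s / 3600     # seconds -> hours


# Illustrative figures only (assumptions, not from the article):
# 1,000 nodes, 2 GB staged per node, 200 MB/s aggregate from the file server.
hours = staging_hours(nodes=1000, gb_per_node=2, server_mb_per_s=200)
print(f"{hours:.1f} hours")
```

With these assumed figures the estimate comes to just under three hours, and it grows linearly with the node count because the server's aggregate bandwidth stays fixed.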

Changing the Storage Landscape

Fortunately, object-based storage clustering offers a new approach to fill the expanding gap between Linux cluster growth and the lack of scalable data access. Object-based storage offers high-bandwidth parallel data access between thousands of Linux cluster nodes and a unified storage cluster over standard TCP/IP networks. It is a solution in which the storage system's scalability can be precisely matched and then scaled to the needs of the cluster computer. Together, Linux clusters and object-based storage clusters deliver commodity-like supercomputers able to keep pace with increasingly voracious applications.

At the core of an object storage architecture are dynamic, self-managing data objects stored across a cluster of "smart drives" known as Object Storage Devices. Data objects are fundamental containers that house both application data (including metadata describing the "mapping" of object data to physical disk drives) and an extensible set of storage attributes. User and application files are decomposed into a set of data objects and distributed across one or more Object-based Storage Devices (OSDs). Each OSD is an easily scalable cluster element: it includes one or more disk drives, local processing to manage data flow, memory for data caching, and a high-speed network connection.
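As a rough sketch of how a file might be decomposed into data objects and spread across OSDs, the Python below cuts a byte string into fixed-size pieces and deals them out round-robin. The stripe size, the round-robin layout and the function names are assumptions chosen for illustration; the article does not specify the placement policy a real system uses.

```python
from typing import Dict, List


def stripe(data: bytes, stripe_unit: int, num_osds: int) -> Dict[int, List[bytes]]:
    """Cut data into stripe_unit-sized pieces and deal them round-robin to OSDs."""
    layout: Dict[int, List[bytes]] = {osd: [] for osd in range(num_osds)}
    for index, offset in enumerate(range(0, len(data), stripe_unit)):
        layout[index % num_osds].append(data[offset:offset + stripe_unit])
    return layout


def reassemble(layout: Dict[int, List[bytes]], num_osds: int) -> bytes:
    """Inverse of stripe(): read the pieces back in their original order."""
    pieces: List[bytes] = []
    round_no = 0
    while any(round_no < len(layout[osd]) for osd in range(num_osds)):
        for osd in range(num_osds):
            if round_no < len(layout[osd]):
                pieces.append(layout[osd][round_no])
        round_no += 1
    return b"".join(pieces)


layout = stripe(b"abcdefghij", stripe_unit=3, num_osds=3)   # toy-sized example
restored = reassemble(layout, num_osds=3)
```

Because consecutive pieces land on different devices, a large sequential read or write touches every OSD rather than queuing on one.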

Together, data objects stored on object storage devices form the core of a scalable storage system. Uniquely, each object-based cluster element has the "smarts" to deliver data directly to the Linux cluster. This is how highly parallel data access is achieved: Linux cluster nodes can securely read and write data objects in parallel to all object storage devices in the storage cluster. The "smarts" of this system offer further benefits: all data is virtualized into a single seamless namespace for ease of manageability and the entire system can be dynamically rebalanced to ensure ongoing, self-managed operational efficiency.
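The parallel access pattern can be imitated with a thread pool, where each worker fetches one object from a different device. The in-memory dictionaries below are hypothetical stand-ins for networked OSDs and the object names are invented; a real client would issue network requests instead.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple

# Hypothetical stand-ins for three OSDs, each holding one object of a file.
osds: Dict[int, Dict[str, bytes]] = {
    0: {"obj-0": b"smart "},
    1: {"obj-1": b"object "},
    2: {"obj-2": b"storage"},
}
object_map: List[Tuple[int, str]] = [(0, "obj-0"), (1, "obj-1"), (2, "obj-2")]


def read_object(location: Tuple[int, str]) -> bytes:
    """Fetch one object directly from the OSD that holds it."""
    osd_id, object_id = location
    return osds[osd_id][object_id]


def parallel_read(locations: List[Tuple[int, str]]) -> bytes:
    """Issue all object reads concurrently and join the results in map order."""
    with ThreadPoolExecutor(max_workers=len(locations)) as pool:
        return b"".join(pool.map(read_object, locations))


content = parallel_read(object_map)
```

`pool.map` returns results in the order of the input map, so the file is reassembled correctly even when individual reads finish out of order.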

While object storage devices form the foundation of a massively parallel storage architecture, they do not comprise a storage system. To deliver a complete system, a scalable file-level metadata management layer must be developed. Metadata Managers in an object-based system manage information such as directory membership, file ownership and permission attributes. Metadata Managers are responsible for striping data objects (portions of files) across OSDs and ensuring file-level data integrity (e.g., by computing and storing parity objects that implement RAID-5 redundancy).
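The parity idea behind RAID-5 redundancy can be shown with byte-wise XOR: the parity object is the XOR of the data objects in a stripe, so any single lost object equals the XOR of the survivors and the parity. The sketch below assumes equal-length objects and uses invented function names; it shows only the parity arithmetic and does not model how RAID-5 rotates parity among devices.

```python
from functools import reduce
from typing import List, Optional


def xor_blocks(blocks: List[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(blocks: List[Optional[bytes]], parity: bytes) -> List[bytes]:
    """Recover at most one missing block (None) from the survivors and parity."""
    missing = [i for i, block in enumerate(blocks) if block is None]
    if len(missing) > 1:
        raise ValueError("single parity can recover only one lost block")
    repaired = list(blocks)
    if missing:
        survivors = [block for block in blocks if block is not None]
        repaired[missing[0]] = xor_blocks(survivors + [parity])
    return repaired


data = [b"\x01\x02", b"\x04\x08", b"\x10\x20"]   # toy data objects in one stripe
parity = xor_blocks(data)                         # the stripe's parity object
recovered = rebuild([data[0], None, data[2]], parity)
```

Single parity tolerates the loss of one object per stripe; losing two at once is not recoverable with this scheme, which is why `rebuild` raises an error in that case.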

In an object-based design, Metadata Managers represent the control path between the Linux cluster and the storage cluster. This is the path through which compute cluster nodes make requests (e.g., to open or close files), are authenticated, and receive authorization credentials and a map of object locations and their host OSDs. The node then uses the map and credentials to securely access the cluster of OSDs, reading and writing file data without additional intervention by the Metadata Manager. Metadata Managers can be clustered just as easily as object storage devices for optimal performance.
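The split between control path and data path can be sketched as follows. This is a simplified illustration, not the actual protocol: the class names, the pre-shared key and the HMAC-signed credential format are all assumptions, and authentication of the requesting node is left out.

```python
import hashlib
import hmac
from typing import Dict, List, Tuple

SHARED_KEY = b"demo-key-shared-by-manager-and-osds"   # assumption: pre-shared secret


def sign(object_id: str, rights: str) -> str:
    """Credential binding an object ID to the rights granted on it."""
    message = f"{object_id}:{rights}".encode()
    return hmac.new(SHARED_KEY, message, hashlib.sha256).hexdigest()


class MetadataManager:
    """Control path: hands out the object map plus one credential per object."""

    def __init__(self, file_maps: Dict[str, List[Tuple[int, str]]]) -> None:
        self.file_maps = file_maps

    def open(self, path: str, rights: str) -> List[Tuple[int, str, str]]:
        return [(osd_id, object_id, sign(object_id, rights))
                for osd_id, object_id in self.file_maps[path]]


class ObjectStorageDevice:
    """Data path: verifies the credential itself, then serves the object."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self.objects = objects

    def read(self, object_id: str, credential: str) -> bytes:
        if not hmac.compare_digest(credential, sign(object_id, "r")):
            raise PermissionError(f"invalid credential for {object_id}")
        return self.objects[object_id]


# Hypothetical two-OSD cluster holding one file split into two objects.
osds = {0: ObjectStorageDevice({"a": b"hello "}), 1: ObjectStorageDevice({"b": b"world"})}
manager = MetadataManager({"/data/file": [(0, "a"), (1, "b")]})

grant = manager.open("/data/file", "r")            # one round trip on the control path
content = b"".join(osds[osd_id].read(object_id, credential)
                   for osd_id, object_id, credential in grant)
```

After the single `open` call, the node talks only to the OSDs, and each OSD can check the credential without contacting the Metadata Manager.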

An Inevitable Industry Direction?

Object-based storage clustering is core to the on-demand computing wave, where IT infrastructure is presented as a unified resource that is self-provisioning, highly scalable and dynamically self-managing. Given its advantages, many companies are actively contributing to the revolution of object-based storage. These include established companies such as IBM, EMC, Sun, and Intel, as well as emerging companies such as Panasas and Lustre that are likely to be first to exploit this opportunity.

In addition, an ANSI T10 standard for Object Storage Devices is being defined with the aid of the Storage Networking Industry Association (SNIA). The standard includes a command set designed for the iSCSI transport layer--in essence providing object extensions to the traditional iSCSI block command set. The effort has the support of many leading storage companies including EMC, HP, IBM, Intel, Seagate, Sun, and Veritas. Version 1 of the standard will be adopted this year.

At Panasas, we are actively developing the premier storage system for scalable Linux clusters. In short, our systems are able to scale data throughput an order of magnitude beyond competitive systems. The Panasas system seamlessly scales capacity from gigabytes to petabytes within a single unified namespace. And it is dynamically self-manageable. These remarkable results are only achieved by changing the storage landscape to a smart object-based clustering design.

www.panasas.com

Rod Schrock is CEO and president of Panasas Corporation (Fremont, CA).
COPYRIGHT 2003 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2003, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Storage Networking
Author:Schrock, Rod
Publication:Computer Technology Review
Date:Oct 1, 2003
Words:1502