High Performance Computing: past, present and future.

Cluster computing has forever altered the landscape of High Performance Computing (HPC). From humble beginnings as part of a NASA project in the early 1990s, Beowulf clusters have now secured their place as the predominant high-performance computing architecture. Advances in scalable storage architectures promise further changes that will continue to drive supercomputing into more commercial IT organizations.

High Performance Computing: A Landscape in Transition

Only a few years ago, the typical supercomputer was built using custom silicon, proprietary high performance interconnects and specialized storage subsystems. Companies such as IBM, Amdahl, Cray and Fujitsu developed complex systems that took years to bring to market and cost millions to tens of millions of dollars. Such systems were available only to national and international government-funded research centers. Compare that with the current trend in high performance computing: cluster computers based on personal computer architectures, commodity microprocessors, Gigabit Ethernet and standard networked storage architectures. These systems represent the growing wave of commodity supercomputers.
They are developed and sold by some of the traditional high-performance computing vendors, such as IBM; by high volume PC manufacturers, such as Dell and HP; and by a new breed of cluster computing vendors that includes LinuxNetworks, Rackable Systems and RackSaver. These systems can be acquired for as little as a hundred thousand dollars, making them available to a wide range of government, industry and academic institutions.

Cluster Computing: To Infinity and Beyond

The growth of cluster computing has been fueled by a number of key technology developments in recent years. First is the rapid advancement of CPU technology, in accordance with Moore's Law. Proprietary vector processors have given way, first to RISC-based processors employed in HPC systems of the 1980s and early 1990s, and more recently to commodity Intel processors whose integer and floating-point performance improvements have outpaced more specialized processors. Second is the commoditization of high performance networking technology, which provides the interconnection network required for cluster computers to communicate with one another. Last is the maturation of the software infrastructure required to orchestrate the activities of hundreds or thousands of cluster computers, making the task of effectively harnessing these hardware advancements accessible to a larger group of programmers. This last area has been instrumental in simplifying the installation, management and use of cluster computing systems. Together, these software developments make up a layer known as cluster middleware.
Cluster management and monitoring software is now a common component of cluster vendors' offerings. This software allows for remote management of the cluster computing resources, including provisioning of additional nodes, node imaging (remote loading of the operating system) and application provisioning. Distributed resource management (DRM) software, such as Platform Computing's LSF, provides a means for managing and monitoring the distribution of computing jobs across the cluster from a single point of control. Parallel programming libraries and compiler enhancements, such as MPI and OpenMP, and distributed debugging and monitoring tools support the development of parallel computing applications that are able to effectively leverage these cluster computer architectures.

The result of these activities has been a broadening of high-performance computing. Traditional scientific high-performance computing applications such as high-energy physics research, environmental science and weather prediction, aerospace engineering, seismic data analysis, and signal and image processing, have been joined by a new wave of industrial applications. Drug discovery, circuit design, automobile design, financial analyses and digital media applications all benefit from the use of large cluster computing configurations.

Scalable Storage: The Final Frontier

These technology advances have removed many of the impediments encountered by early adopters of cluster computing technology. Still, one crucial development wave remains to complete the accessibility of cluster computing: the development of scalable commodity storage architectures to complement the scalable computing and networking architectures (Figure 1).
[FIGURE 1 OMITTED]

[FIGURE 2 OMITTED]

Today's cluster computing approaches employ a scale-out or data parallel computing model. In this model, applications apply a 'divide-and-conquer' approach: the computing problem is decomposed into pieces by identifying individual data partitions that comprise an individual task. Each of those tasks is then distributed to one of the compute cluster nodes for processing. Program inputs and outputs are typically maintained in centralized datasets residing on long-term storage subsystems, so data parallel applications must typically scatter portions of the input dataset to each compute node, then gather partial results and combine them into the final output.

The scatter-gather process depicted in Figure 2 represents a typical bioinformatics sequence analysis processing scenario. Sequence analysis locates similarities between DNA or protein sequences that can be used to understand genetic similarities and differences from individual to individual and across species. It is at the core of current genomic research efforts. In this depiction, the genomic sequence database of interest is partitioned and distributed (scattered) amongst a number of cluster computing nodes. Query sequences are broadcast (copied) to the nodes, where they are compared with the respective target partitions. The partial results generated on each node are then combined (gathered) to form the final result set. This scattering and gathering of data partitions is time consuming and susceptible to program errors or compute node failures, making the overall process difficult to manage.
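The scatter, broadcast and gather steps described above can be illustrated with a minimal Python sketch. It is not taken from any particular sequence analysis package: the function names (`scatter`, `search_partition`, `gather`, `scatter_gather_search`), the toy position-match score and the use of local worker threads in place of real cluster nodes are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def scatter(database, n_nodes):
    """Split the target database into n_nodes partitions (round-robin)."""
    return [database[i::n_nodes] for i in range(n_nodes)]


def search_partition(partition, query):
    """Work done on one node: score each target sequence against the query.

    The score is a toy stand-in for a real alignment score: the number of
    positions at which the query and the target carry the same letter.
    """
    return [
        (sum(a == b for a, b in zip(query, target)), target)
        for target in partition
    ]


def gather(partial_results, top=3):
    """Merge the per-node hit lists into one ranked result set."""
    merged = [hit for hits in partial_results for hit in hits]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(merged, key=lambda hit: (-hit[0], hit[1]))[:top]


def scatter_gather_search(database, query, n_nodes=4, top=3):
    """Scatter the database, broadcast the query, gather the ranked hits."""
    partitions = scatter(database, n_nodes)
    # Worker threads stand in for compute nodes (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_nodes) as pool:
        # Broadcast: every worker receives the same query sequence.
        partial = list(pool.map(lambda p: search_partition(p, query), partitions))
    return gather(partial, top)


if __name__ == "__main__":
    targets = ["ACGT", "ACGA", "TTTT", "ACCT", "GGGG"]
    print(scatter_gather_search(targets, "ACGT", n_nodes=2))
```

Even in this toy form, the result depends on every partition being processed and every partial result being collected, which is why a failed node or a lost partition complicates the real process.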
Additionally, the massively parallel nature of large cluster computing environments (with thousands of processing nodes) is such that the number of concurrent accesses to the file system can be very high. Moreover, many parallel applications process data in somewhat synchronous waves where all of the processors are performing I/O operations (sometimes to the same files) at nearly the same time. These two characteristics manifest themselves as two unique storage system requirements: highly concurrent access to shared data and high aggregate throughput.

To serve the needs of growing cluster computer installations, such storage needs to be scalable (both in capacity and in bandwidth) so that it can balance activity across storage system components and ensure adequate performance. Traditional storage approaches alone, including direct attached storage (DAS), network attached storage (NAS), and storage area networks (SAN), are insufficient to address many of these key requirements.

Scalable Storage Approaches

In the last few years, a number of storage systems have been developed to complement cluster-computing solutions.
These include parallel and distributed file systems (sometimes called clustered file systems), such as DFS, PVFS, Sistina's GFS, Silicon Graphics' CXFS and IBM's GPFS. Each of these attempts to eliminate storage bottlenecks by distributing data to a number of storage devices and partitioning file system access among a number of "metadata servers."

The most recent, and perhaps most promising, advances are coming out of an area known as Object Storage. Object storage provides a new way of distributing and parallelizing file systems to support highly scalable configurations--precisely those needed by cluster computing systems. An emerging standard, known as Object-based Storage Devices (OSD), has been developed by a technical working group within the Storage Networking Industry Association (SNIA), and is working its way through the American National Standards Institute (ANSI) standards process. This standard defines an object storage SCSI command set that can be implemented over iSCSI to implement intelligent network attached storage devices that form massively parallel storage systems.

Panasas, a new network storage company deploying an object storage file system, recently demonstrated scalability for sequential I/O loads in excess of 10 gigabytes per second in cluster computing configurations. Additionally, the system posted record-breaking SPECsfs results indicative of the system's ability to support general networked file system loads.

Into the Future: Diskless Clusters

High performance shared storage architectures capable of effectively serving thousands of compute nodes are further changing the HPC landscape. In an effort to improve manageability and increase system reliability, a number of researchers are developing diskless clusters that exploit these shared storage architectures. Diskless clusters simplify the node architecture by removing local disk and other non-essential components, significantly reducing the probability of node failure. This approach further reduces per-node costs and increases node density to provide increased levels of price-performance.
To exploit these improvements, new operating system approaches such as the Beowulf Distributed Process Space project (BProc) are under way to further simplify cluster management. Proponents claim that such an approach provides higher availability and improves reliability, resource utilization and overall system performance. These benefits, together with a simplified management model, accelerate commodity supercomputing.

www.panasas.com

Bruce Moxon is manager, vertical marketing, for Panasas, Inc. (Fremont, CA)