Lock Onto The Grid!
Distributed processing to reinvent the Internet
At the end of February I was one of several journalists flown to Markham, Ontario (sometimes called Silicon Valley North) to view a presentation on an emerging distributed computing infrastructure. The concept, known as The Grid, is widely viewed as the next logical step in the evolution of the Internet. While still relatively new, The Grid promises a true paradigm shift in the way people use computing resources, and in the underlying architecture of the Internet itself. While we often hear about WAP, Internet2 and other solutions aimed at expanding the reach and throughput of the Internet, The Grid is no less important, and in fact may play a significantly more critical role in computing's future.
The Grid is based on the concept that distributed computers, connected to one another--via LANs, WANs, and even the Internet--can share computing tasks, and intelligently divide computing jobs among hundreds or thousands of machines, based on available resources. The idea is best explained in the book The Grid: Blueprint for a New Computing Infrastructure, edited by Ian Foster and Carl Kesselman, winners of the prestigious 1998 Global Information Infrastructure Next Generation Award. Vint Cerf has termed it "a source book for the history of the future." This article will discuss the ideas behind the technology, and how one company is moving the idea from theory to reality.
Imagine a national power grid designed like the current Internet.
Each home would have a generator in the basement, cranking out electricity during the day for use only in that home. At night, when electricity demands were low, the generator would be idle or running at very low power. If the generator was battery powered instead of gas powered, it might be connected to a central power source which would allow it to recharge when needed. Other than this connection, the home would create its own energy without any need for external support. This is basically the design of the current Internet. PCs crunch bits locally, and connect to a Web server when additional information is needed or desired.
In essence, computing has moved from non-distributed datacenter (mainframe to terminal) to networked distribution (server to client with internal datacenter) to externally distributed (multiple Web servers to multiple clients). Some might argue that the rise of the ASP indicates a move back to central processing, but really the ASP simply represents a geographic shift in processing--from an in-house to an external datacenter. But in all cases, the actual processing paradigm is the same: processing is still not shared; that is, data is processed in one place and then sent to a remote system for use.
The Grid represents the next logical step in the development of distributed computing, and that is distributed processing.
Back to the power grid analogy. Power is created in numerous power plants, and then distributed where it's needed. Since the distribution structure is redundant, if one generation facility produces less power or a recipient needs less, resources are redistributed easily and transparently. Of course, the power sometimes goes out; no system is foolproof. The point is that the power level in our homes and businesses remains constant. And, when supply outstrips demand--like in the middle of the night--prices are cheaper.
The authors of The Grid believe that the critical next phase in information technology lies in computational grid environments, sometimes referred to as distributed supercomputing. In this model, connected computers perform super-massively parallel symmetrical multiprocessing on a vast scale: hundreds, thousands, potentially even millions of machines working in concert to perform tasks too complex for any one system (no matter how powerful). In effect, what the book suggests is a system of networks--of which the Internet is just one piece--similar to today's power grid: a massively redundant environment of shared systems, where resources and data are redistributed on-the-fly to the systems where they can best be utilized at any given moment.
Ready, SETI...
Let's briefly examine a real-world example, one which isn't perfect but effectively illustrates the idea. The Search For Extraterrestrial Intelligence project (SETI) takes a constant stream of data coming from space and analyzes it for indications of intelligent life in the universe. Because of the nature of the data--terabyte after terabyte that never stops flowing--SETI came up with a novel idea: Why not let idle computers around the world help to process the data? So, anyone whose computer meets the system requirements can download the SETI software, which will crunch the numbers while the CPU is otherwise idle, effectively creating a supercluster with millions of nodes.
Symmetrical multiprocessing and superclustering are not new concepts, of course. Complex modeling, simulation, and engineering tasks are often broken into pieces and assigned to multiple systems. But The Grid throws several new ingredients into the mix: multiple operating system support, dynamic task allocation and monitoring, and geographic dispersion.
That's where Markham, Ontario comes in. Canada's Platform Computing expects to be at the forefront of software development for distributed supercomputing. The company's primary software suite, called LSF (Load Sharing Facility), is used to manage clustered systems in high-performance/high-availability environments. With version 4.0, the company is taking a major step forward by adding advanced monitoring tools which can manage thousands of systems and dynamically alter and assign workflow.
While the software shouldn't by any means be considered part of The Grid as envisioned by Foster and Kesselman, it does offer some intriguing glimpses of what future networks based on Grid technology might look like.
"The Grid represents an evolution in computing," says David Wilmering, director of product management at Platform. "The Web has evolved into a commerce tool, and e-commerce is driving extranets. The next step is The Grid, which is being driven by the enormous increase in data created by the Web. The amount of data to be processed is outstripping processing power, even as processing power adheres to Moore's Law. This is what The Grid seeks to address."
Like ARPANet, which eventually evolved into the modern Internet, The Grid today is primarily the realm of academics, computer scientists, and the government. Some pilot Grid projects, most notably NASA's Information Power Grid (IPG) and the Department of Energy's Distance and Distributed Computing and Communication (DisCom2), are in development and testing right now. These architectures generally combine geographically dispersed Unix boxes to run various processor-intensive tasks, though most have only a handful of machines on their networks.
Platform Computing is actively involved in several supercomputing projects, including the NCSA's 512-processor NT Supercluster and the decoding of Human Chromosome 22 at the Sanger Centre in the U.K. Platform's LSF software is more than a utility which allows processor-intensive data to be spread across multiple CPUs; after all, clustered systems have been performing such tasks for years.
What's different about LSF 4.0 is that it is performing what Wilmering calls application resource management. The software monitors all machines in a cluster and dynamically assigns processing tasks depending on the state of each machine's resources--hardware and software. If it sees that one machine is swapping to disk excessively, it transfers work to a machine with more free memory. Its management module can spot a machine that's just been upgraded with more RAM or a new processor and dynamically reassign tasks based on the new configuration. LSF can analyze network performance by gathering stats on idle time, CPU, disk, and memory utilization, and software license availability, and generate reports--or simply alter processing based on its findings. And, in the event of a host failure, the software will transparently transfer host duties to the machine with the most available resources. (And it will repeat the process if the back-up host fails.) What's more, it supports heterogeneous networks, and can run on various flavors of Unix, Linux, and the previously mentioned NT.
The Grid is probably at least five years down the road in terms of general availability. But, like the Internet, its potential is practically limitless, since--by definition--it is an infinitely scalable technology. With broadband connectivity in homes expected to accelerate, we are not very far from an Internet running end-to-end at LAN speeds. When this happens, distributed processing will move from academia to reality.
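Sidebar: resource-based job assignment in miniature. The kind of scheduling described for LSF above (send work to the machine with the most suitable free resources, and move it elsewhere when a host fails) can be illustrated with a few lines of Python. This is only a rough sketch under stated assumptions, not Platform's code or LSF's actual algorithm: the names Host, pick_host, assign and fail_over are hypothetical, and the selection rule (enough free memory, then the most idle CPU) is a simplification chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """Hypothetical snapshot of one machine's state (not an LSF data structure)."""
    name: str
    free_mem_mb: int
    cpu_idle: float              # fraction of CPU idle, 0.0 to 1.0
    up: bool = True
    jobs: list = field(default_factory=list)


def pick_host(hosts, mem_needed_mb):
    """Return the live host with enough free memory and the most idle CPU, or None."""
    candidates = [h for h in hosts if h.up and h.free_mem_mb >= mem_needed_mb]
    if not candidates:
        return None
    # Prefer the most idle CPU; break ties by the most free memory.
    return max(candidates, key=lambda h: (h.cpu_idle, h.free_mem_mb))


def assign(hosts, job, mem_needed_mb):
    """Place a job on the best available host and reserve its memory."""
    host = pick_host(hosts, mem_needed_mb)
    if host is None:
        raise RuntimeError(f"no host can run {job!r} right now")
    host.jobs.append((job, mem_needed_mb))
    host.free_mem_mb -= mem_needed_mb
    return host


def fail_over(hosts, failed):
    """Mark a host as down and move each of its jobs to another host."""
    failed.up = False
    orphaned, failed.jobs = failed.jobs, []
    return {job: assign(hosts, job, mem).name for job, mem in orphaned}


if __name__ == "__main__":
    cluster = [Host("a", 512, 0.2), Host("b", 2048, 0.9), Host("c", 1024, 0.5)]
    print(assign(cluster, "render", 1000).name)   # host chosen for the job
    print(fail_over(cluster, cluster[1]))         # where the job moves after a failure
```

A real scheduler would refresh these figures continuously and weigh many more factors (disk, swap activity, software licenses), but the core loop is the same: measure, choose, and choose again when conditions change.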