Clustering Strategies For Web Environments

Traditional computer and network architectures work well for most enterprise applications, but begin to break down when applied to the task of reliably serving up Internet content. Providing static and streaming content in an Internet environment requires seamless scalability, manageability, distributed storage, and impressive levels of reliability. These goals are very difficult to achieve using the "big iron" approach. Better suited to this application space is server clustering, where multiple specialized servers (or server appliances) are harnessed to work together as one.

Clusters are constructed by connecting different types of systems and other devices (such as storage arrays) to one another. The elements that make up a cluster, called nodes, come in several different types. There are compute nodes that provide raw computation. These are typically server-class computers that contain a large amount of memory and have multiple high-speed processors. Scientific computing labs often have clusters of nothing but compute nodes, with parallel processing software distributing a single application between computers. These may also be general-purpose computers that are applied to solving specific problems. Another common type of cluster element is the storage node.
The storage node may offer up its content via the normal TCP/IP network. This type of storage delivery is known as a Network Attached Storage (NAS) device. Alternatively, the storage node may be attached to the other nodes within a cluster directly through a Storage Area Network (SAN) such as Fibre Channel or directly to a NAS device via a SCSI channel. This type of device is called a storage array.

Other types of nodes can be lumped into the category of application-specific server appliances. These nodes are named for their specific function. For example, a basic Web-serving appliance might become a Web server node. A computer outfitted with encryption acceleration hardware would be a commerce accelerator node. The range of server appliances being delivered to solve particular problems is ever growing.

There are several views into the underlying technologies employed in building a cluster. Some approach clustering simply as an exercise in reliability, using fail-over pairs to provide redundancy between nodes. Others view clustering as a single point of management for a grouping of machines.
Clustering also takes on the issues of load balancing and cluster-level storage networks, where many servers share responsibility for serving content for a single Web site from a common view of the content. Add to this mix the idea of specialized appliances for SSL encryption and streaming media acceleration and the choices grow quickly.

Management Clusters

The most basic and common type of clustering is the management cluster. A management cluster is the grouping of nodes into a single entity that is easy to manage. There are two principal types of management employed in these sorts of clusters: in-band and out-of-band management.

In-band management is the ability to monitor and administer a cluster of computers using on-board software called agents. This is applicable in a management cluster if the agents can be partitioned into logically managed communities of machines. This type of management is called "in-band" because it utilizes the resources of the underlying operating system. A basic example of a management cluster using in-band management is the SNMP community. SNMP allows an administrator to logically associate nodes in order to apply monitoring and management policies as a group. SNMP is not the only in-band management path for cluster management. There are a number of proprietary approaches offered by the various clustering vendors.

Monitoring and administering clusters using out-of-band management leverages maintenance processors that are independent of the cluster node being monitored. This is a "back door" into the server hardware that allows access regardless of the state of the operating system and applications software running on the node. Most maintenance processors provide control of the front panel functions of the node, allowing for things like remote power-off and power-on. They are also often tied into the System Management Bus (SMBus) on the node being monitored, allowing remote monitoring of on-board sensors (such as voltage, temperature, and fan speed). Communication between the maintenance processors and the administering machine is provided by an external connection dedicated to the maintenance processors. Examples of these interconnects include serial ports, dedicated network interfaces, and extended I2C buses.
The most flexible interconnects for building clusters are those that allow daisy chaining of machines into a physical cluster. Other interconnects may require concentrators or other external hardware.

Effective management of a cluster involves the integration of in-band and out-of-band management paths. This allows the administrator to control the cluster node in the event of a hard failure and also allows the OS to be managed. This is illustrated in Fig 1.

Availability Clusters

High availability refers to the notion that nodes within a cluster can automatically take over for nodes that fail. There are typically two views of availability within a cluster: redundancy and fail-over.

A properly partitioned cluster serving static Web content through a load balancer provides automatic redundancy. If a Web-serving node fails, then the load balancer just stops sending traffic to that node. This provides a level of innate redundancy that wouldn't be available outside of a cluster.

The more popular view of availability clustering is that of fail-over. Using fail-over technology, machines within a cluster are assigned back-up nodes. Each protected machine sends periodic "heartbeat" messages to its back-up node. If the heartbeat fails to appear in some configurable time period, then the applications (and other resources such as the node's virtual IP address) are taken over by the back-up node.
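
The two views of availability described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not any vendor's implementation: the class names `HeartbeatMonitor` and `RoundRobinBalancer`, the round-robin policy, the node names, and the injectable clock are all hypothetical choices made for the example. A real fail-over product would also move the virtual IP address and restart applications, which is reduced here to setting a flag.

```python
import itertools
import time


class HeartbeatMonitor:
    """Back-up node's view of one protected node (hypothetical helper)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s   # the configurable time period
        self.clock = clock           # injectable so the logic can be tested
        self.last_seen = clock()
        self.taken_over = False

    def heartbeat(self):
        """Record a heartbeat message from the protected node."""
        self.last_seen = self.clock()

    def check(self):
        """Return True once the back-up node should own the resources."""
        if self.clock() - self.last_seen > self.timeout_s:
            # Stand-in for claiming the virtual IP and starting applications.
            self.taken_over = True
        return self.taken_over


class RoundRobinBalancer:
    """Hands out nodes in turn, skipping any node marked as failed."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.failed = set()
        self._cycle = itertools.cycle(self.nodes)

    def mark_failed(self, node):
        self.failed.add(node)

    def next_node(self):
        # Look at each node at most once per request.
        for _ in range(len(self.nodes)):
            node = next(self._cycle)
            if node not in self.failed:
                return node
        raise RuntimeError("no healthy nodes left")
```

In this sketch a take-over is latched: once `check()` has returned True it keeps doing so, on the assumption that failing back to the original node is a deliberate administrative step rather than something triggered by a late heartbeat.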
When nodes that are actively running applications back each other up, this is known as an "active-active" configuration (since both machines are actively doing work). The opposite of this is "active-standby," where a back-up machine is dedicated to no other purpose than providing back-up to an application server. The benefit of an active-active fail-over solution is that each node within a cluster is doing work. Setting up an active-active configuration requires some amount of capacity planning to ensure that the workload each node would assume in the case of a failure doesn't overwhelm the machine.

Applications that migrate from a failing node to a back-up node may require the application's data to be available on both nodes. This is typically achieved through shared storage. This storage may be a shared SCSI disk drive or virtual shared storage through volume-level or file-level disk mirroring software.

Storage Architectures

Cooperating nodes within a cluster often require access to the same data files. In the case of static content such as Web pages, these files may be periodically distributed to each node within a cluster. This is achieved using replication software. Dynamic data such as a shared database or rapidly changing data files must reside on storage that is easily accessible to each node within a cluster. There are a couple of ways to achieve this. A cluster can contain a Network Attached Storage device or may access shared drives on a storage area network.

Network Attached Storage (NAS) is, perhaps, the simplest and most flexible method to add storage to a cluster.
A NAS device is a specialized appliance that controls some amount of storage and looks to the nodes within a cluster like a collection of file shares. Most NAS appliances export NFS (Network File System, for sharing file systems in UNIX environments) and CIFS (the file sharing mechanism used in the Windows world). Back-end storage on a NAS device could be anything from a collection of disks to high-level RAID arrays. The choice of which to use is highly dependent upon the problem being solved by the cluster. The network connection between the storage nodes and the other nodes within a cluster is, typically, a high-speed network such as 100Mbit or Gigabit Ethernet.
The other common storage architecture used in clusters is the Storage Area Network (SAN). The most popular type of SAN is the Fibre Channel Arbitrated Loop (FCAL). SANs allow a direct, very high-speed connection between the storage and the cluster nodes. SAN architectures provide a very high-end and expensive solution to the storage problem. They are most valuable in applications that require very fast access to shared drives. In contrast, NAS devices are economical and usually offer more than enough performance for Web content serving applications, where the performance of the application is often bounded by the speed of the incoming Internet connection.

Application Specific Server Appliances

Some tasks within a content serving cluster cannot be satisfied using generic servers or storage nodes. These problems are often solved with server appliances tuned to solve a particular problem. Examples of application-specific server appliances include devices such as load balancers, commerce accelerators, and content cache engines.

Load balancers are widely used in Web serving clusters. These appliances provide a single point of interface to an IP address. Incoming requests to a Web site or IP address are then routed by the load balancer to one of an array of Web serving nodes within the cluster.

Web sites that require the user to enter confidential information such as credit card numbers or stock-trading information use an encrypted communication path known as Secure Sockets Layer (SSL).
Commerce acceleration engines provide a highly tuned hardware and software package to very quickly process SSL traffic. Combined with load balancers that can differentiate SSL traffic from non-SSL traffic, commerce acceleration appliances can provide great benefit by directing traffic to the servers most capable of handling it.

Content caching engines provide storage for very frequently accessed Web content. These appliances live between the back-end Web servers and a user's Internet Service Provider to offer up Web pages without impacting the back-end servers. Properly configured and placed, these can offer a great performance boost to an active Web site.

Building the Cluster

There are no great secrets to building an effective cluster. Choose wisely and building a cluster becomes the simple matter of plugging the necessary components together. A typical cluster is illustrated in Fig 2.

The important thing is to understand the problem to be solved. Research and fully understand the benefits offered by the underlying technologies (this article has barely scratched the surface). High traffic Web sites that perform a large number of SSL transactions may benefit from commerce accelerators and caching engines, while application service providers may need high-speed SAN storage arrays.

Whatever cluster is developed, it should have a consistent management interface. Out-of-band management is essential to limit the amount of time required for a technician to be standing in front of a machine. This becomes crucial as co-location sites become more and more prevalent for content serving on the Web.

Steven McDowell leads the platform software team at Network Engines, Inc. (Canton, MA).