Shared Data Clusters: Achieving Application Scalability And Availability With A SAN

Paul Massiglia, Anne Janzer

Examining the requirements for implementation

This article is the first in a two-part series. The second part will appear in the February issue of CTR.

The emergence of Storage Area Networks (SANs) is enabling new system configurations that can leverage shared storage. One common system configuration is the "shared nothing" or availability cluster, in which multiple computers share common storage and client access and can "take over" for a computer (node) in the cluster that fails. Shared nothing clusters can be implemented on Windows or Unix systems using products like VERITAS Cluster Server or Microsoft Cluster Server. The servers in these clusters do not access the same data concurrently. Instead, one server owns and accesses data in the common storage. That capability can be passed to another server if the first should fail.

A shared data cluster takes a different approach, allowing nodes in a cluster to access the same data at the same time. Although sharing data introduces challenges of maintaining data integrity, it also leverages the capabilities of the SAN to consolidate data, improve availability, and support scalable solutions.

This article discusses the requirements for implementing shared data clusters and the challenges and benefits of this approach. It also discusses the VERITAS SANPoint Foundation Suite HA to illustrate how these challenges can be addressed with a packaged software solution. The second part of this article (to appear in the February issue of CTR) will describe sample applications for shared data clusters, focusing on those kinds of applications that can derive the greatest benefit from this configuration.

These articles concentrate on applications that use file system data; they do not discuss the separate but important issue of clustered databases. Databases, whether residing in file systems or on raw partitions, require clustered database managers such as Oracle Parallel Server to provide the highly granular locking and integrity capabilities necessary in a multi-instance environment. This subject is beyond the scope of this discussion.

Background: Clusters And Shared Storage

A cluster is generally defined as a grouping of computer systems that cooperate to work as a single entity in some capacity from a client's perspective. These systems must communicate with each other and share access to common storage. The essential components of a cluster include (Fig 1):

* Multiple, independent servers
* Common client access
* Commonly-accessible storage
* Software that manages the "clustering" behavior

Although they have been around for quite a while in proprietary or high end applications, clusters are an increasingly popular system design today for a number of reasons.

Availability is a key concern not just for a few specialized applications, but also for business applications ranging from financial institutions to online storefronts to departmental file servers. A cluster enhances application availability because it provides redundant servers accessing common storage; if one server fails, another can pick up its processes.

Improvements in processors and systems are making it possible to create clusters of desktop or midrange systems that can serve very demanding environments.

SANs, which may be adopted for a variety of reasons, provide readily-available shared storage for cluster configurations. Clusters do not actually require SANs--you can build a cluster using a simple switched SCSI device. But the SAN makes it possible to share much larger storage pools between more servers. And clusters can help organizations leverage the benefits of the SAN implementation.

Shared Storage vs. Shared Data

Although all clusters share access to common storage, not all clusters actually share the data itself. The most common cluster architecture today is a "shared nothing" cluster, which means that the systems in the cluster do not share memory or concurrent access to data on the common storage. Although data resides on common storage (the SAN), only one system "owns" and accesses the data at any time.
Another system (or node) on the cluster only accesses that data if it is taking over the tasks of the original node (Fig 2).

A shared nothing cluster is useful for improving application availability. Any application can run in this configuration transparently. If there is a failure, another node in the cluster can start the application and serve new requests.

Many web farms adopt this basic architecture. Each node (web server) accesses its own read-only copy of a web site. Load balancing applications (such as Cisco LocalDirector) direct clients to web servers. Scaling up is as easy as adding another server with its own copy of the site.

A shared data cluster actually lets multiple nodes in the cluster access shared data concurrently. The basic hardware configuration is unchanged. The difference lies in the software managing the shared data--the clustering software as well as the clustered volume and file system software (Fig 3). Sharing concurrent access to data offers many benefits.

* Storage consolidation: Administrators only need to maintain a single copy of data that is shared. For example, web farms only keep one copy of the web site data instead of creating one for each web server.
* Fast failover: If a file system is mounted and shared by several nodes, then failover is accelerated. If one node fails, another node that has mounted the same file system can take over its applications without waiting to mount the file system.
* Scalability: If multiple servers can perform the same basic work accessing shared data, this configuration provides scalability to handle increased loads.

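The fast failover benefit listed above can be illustrated with a small model. This is a minimal sketch under stated assumptions, not any vendor's implementation: the `Node` class, the `fail_over` function, and the file system and application names are all hypothetical. It only records the steps a surviving node would perform in each configuration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node; tracks mounted file systems and running apps."""
    name: str
    mounted: set = field(default_factory=set)
    apps: set = field(default_factory=set)


def fail_over(app, fs, failed, survivor):
    """Move `app` from `failed` to `survivor` and return the steps performed."""
    steps = []
    failed.apps.discard(app)
    failed.mounted.discard(fs)
    if fs not in survivor.mounted:
        # Shared nothing: the survivor must first take over the file system.
        survivor.mounted.add(fs)
        steps.append(f"mount {fs}")
    survivor.apps.add(app)
    steps.append(f"start {app}")
    return steps


# Shared nothing: only node A has the file system mounted.
a = Node("A", mounted={"/web"}, apps={"httpd"})
b = Node("B")
shared_nothing_steps = fail_over("httpd", "/web", a, b)

# Shared data: both nodes already have the file system mounted.
c = Node("C", mounted={"/web"}, apps={"httpd"})
d = Node("D", mounted={"/web"})
shared_data_steps = fail_over("httpd", "/web", c, d)

print(shared_nothing_steps)
print(shared_data_steps)
```

In this model the mount step disappears in the shared data case, which is the source of the faster failover; the time a real mount takes depends on the file system and is not modeled here.
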
Challenges Of Shared Data Clusters

The obvious challenge of shared data clusters is maintaining data integrity without degrading application performance. How do you ensure that the data remains consistent if several nodes can access and update it at the same time? There are several specific challenges to maintaining data integrity in a shared environment:

* Consistency of file system meta data. The clustered environment must maintain the consistency and integrity of the file system itself. If one node adds a file, others must see it. A node cannot delete a file that another node is working on. Maintaining the integrity of the file system meta data is a key challenge for shared data clusters.
* Preventing conflicting updates. In a single-server environment, the file system itself ensures that multiple users do not write conflicting updates to a file. The traditional solution at the file system level is file-level locking. A clustered file system must protect files from conflicting updates from multiple nodes, typically by extending a locking scheme to cover multiple servers.
* Cache coherence. Between memory and storage lies another level of data, the system cache. Each node in a cluster maintains its own cache.
Once a node reads data from storage, that data may remain in cache for some period of time, to accelerate future calls to that information. If another node updates the data in storage, then the data in the cache is no longer accurate. This, simplified, is the problem of cache coherence.

The magnitude of these challenges has prevented widespread adoption of shared data clusters, except in high end or proprietary implementations or for specialized scientific purposes. But today new clustering and shared file system solutions are making it possible to create shared data clusters with heterogeneous storage hardware and basic Sun Solaris or Windows systems.

VERITAS SANPoint Foundation Suite HA

VERITAS SANPoint Foundation Suite HA is an integrated suite of products from VERITAS Software that enable shared data clusters on Sun Solaris platforms. It extends the VERITAS core volume management, file system, and cluster server solutions to support shared data clustering. The product builds on a foundation of the VERITAS File System and VERITAS Volume Manager, and adds the following cluster-specific components:

* Cluster Volume Manager (CVM). The cluster enhancements to the VERITAS Volume Manager enable multiple nodes in a cluster to share disk groups across the cluster. Disk groups may include mirrored, striped, and concatenated logical volumes for optimized performance and reliability. Failover is very fast, as multiple servers can import the same disk group.
* Cluster File System (CFS). Using the Cluster File System, different nodes in the cluster have direct access to a file system that is designated as shared. The nodes actually access the data itself directly, but send meta data information over the LAN to a single node that is designated as a primary node for the file system. (This capability can fail over if necessary.)
* VERITAS Cluster Server (VCS) offers instant, automated application failover capabilities for clustered environments. The SANPoint Foundation Suite HA uses VCS to manage failover, internode communications, and cluster membership.

Addressing Data Integrity Challenges

We have already highlighted some of the data integrity challenges for shared data which every shared file system or shared data cluster solution must address. The VERITAS solution handles these issues as follows:

* Ensuring consistent meta data. The Cluster Volume Manager protects the consistency of the logical volume meta data, while the Cluster File System protects file system meta data. At the volume level, a single server acts as the primary server for a shared disk group. This server uses distributed transaction techniques to ensure that all nodes have a synchronized view of the disk group and supervises all changes to volume meta data. Changes to volume meta data would include creating volumes within disk groups, resizing volumes, or removing mirrored copies. This capability rolls to another server if the primary server fails. At the file system level, the first server to mount a VERITAS Cluster File System is the primary server for that file system and maintains its intent log. Only the primary server can update the file system meta data. Other nodes using that file system communicate with the primary server for any meta data changes. If this server fails, another server is elected to serve as primary server for the file system.
* Preventing conflicting updates. The SANPoint Foundation Suite uses the VERITAS Global Lock Manager to prevent conflicting updates from multiple nodes using file-level locking. The management of the locks is split between the nodes in the cluster; if a node fails, then the other nodes acquire its locks.
* Providing cache coherence. The internode communications provided by VERITAS Cluster Server are used to ensure cache coherence. If a node writes changed data to its cache, the data is written to storage before another node can acquire a lock for the data.

Other Issues For Shared Data Clusters

Maintaining file system and data integrity is obviously the critical first step in implementing a shared data cluster. Other considerations when implementing shared data solutions include transparency, manageability, and availability.

Transparency.
If a solution requires that applications be heavily rewritten to work in a shared data environment, then the usefulness of the solution is limited. A transparent solution allows an application to run in a shared data environment without modification. This enables organizations to put web servers, file servers, and other common applications in a shared data environment, and gain the benefits of shared data clusters without changing production applications. SANPoint Foundation Suite HA implements file sharing in a way that is completely transparent to the applications accessing the files.

Manageability. In implementing any production solution, manageability is a key concern. The shared file system implementation should simplify the management of shared data. SANPoint Foundation Suite HA presents a consistent image of a shared device to all servers accessing it, offering centralized management of diverse shared storage assets. It also provides the online management capabilities of the core VERITAS software.

Failover. One of the key benefits of either shared nothing or shared data clusters is enhanced availability. The clustering software must leverage this by automating application-level failover in the cluster, with application-specific agents and configurable failover policies. In the SANPoint Foundation Suite HA, VERITAS Cluster Server monitors the resources required by an application and offers configurable failover capabilities when required.

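The monitoring and policy behavior described under Failover can be sketched as follows. This is an illustrative model only: it does not use or reproduce VERITAS Cluster Server configuration or APIs, and `ServiceGroup`, `choose_target`, and `monitor_once` are hypothetical names. It assumes the failover policy is expressed as a priority-ordered list of nodes.

```python
from dataclasses import dataclass


@dataclass
class ServiceGroup:
    """Hypothetical application with its required resources and failover order."""
    app: str
    resources: list      # names of the resources the application needs
    node_priority: list  # preferred nodes, highest priority first
    running_on: str


def choose_target(group, healthy_nodes):
    """Return the highest-priority healthy node other than the current one."""
    for node in group.node_priority:
        if node != group.running_on and node in healthy_nodes:
            return node
    return None


def monitor_once(group, resource_status, healthy_nodes):
    """Check the group's resources once; fail over if any resource is down."""
    if all(resource_status.get(r, False) for r in group.resources):
        return group.running_on  # everything is healthy, nothing to do
    target = choose_target(group, healthy_nodes)
    if target is not None:
        group.running_on = target
    return group.running_on


web = ServiceGroup("httpd", ["ip", "disk"], ["n1", "n2", "n3"], "n1")
# The disk resource has failed on n1 and n2 is also down, so n3 is chosen.
result = monitor_once(web, {"ip": True, "disk": False}, {"n3"})
print(result)
```

A priority list is only one possible policy; the same structure could pick the least loaded healthy node instead by changing `choose_target`.
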
Shared data clusters offer significant benefits over simple shared nothing clusters. They can improve application scalability, simplify administration through storage consolidation, and even speed failover for highly available systems. The second part of this article will describe how these benefits are derived in specific applications for shared data cluster configurations.

Paul Massiglia is the technical director, engineering, at VERITAS Software (Colorado Springs, CO) and Anne Janzer is a technical marketing consultant based in Mountain View, CA.