Continuous Availability: Either You're Online Or You're Out Of Business
Availability is quickly becoming the primary criterion for platform selection as 24x365 operation requirements spread to all levels of computing. In the Internet/e-commerce economy, availability is no longer an option; you're either online or out of business. The requirement for continuously available systems applies to a larger number of business applications than ever before. Continuously available systems consist of hardware and software designed to protect against component and system-level failures. The complexity and cost of these solutions depend on the number of users, the types of services provided, and the definition of what is an acceptable versus an unacceptable outage. Higher availability comes at a higher cost.
Every step taken outside of the traditional centralized data center has offered the potential of greater productivity, functionality, and convenience for IT organizations. Along with these advantages, however, distributed and network-based computing has exposed companies to much greater vulnerability than ever before. From Local Area Networks to centralized computing platforms and now the Internet and Intranets, the level of exposure has increased while disaster preparation has decreased. Major studies show that, while 12% of businesses have an effective recovery program for their enterprise computing systems, the vast majority, 82%, have ineffective recovery programs. The other 6% have only partially effective disaster recovery programs. The estimated average costs of system failures can be nearly fatal to some companies and often range to over $10 million per hour of downtime or lack of access to the IT function.
A server that is 99% available may seem highly available, but it will actually be unavailable for roughly 5,256 minutes, or 3.65 days, per year. Availability figures range from 95+% for newer NT-based servers to 99.999% for enterprise (OS/390 Parallel Sysplex) based servers. For every nine of availability added, the cost increases. For every nine after five nines (99.999%), it soars. This is called the "path to high nines." Not all downtime is accidental. The number of minutes per year of unavailability and the related impact can be misleading.
With escalating storage management costs for distributed computing facing the industry, many IT managers are beginning to strive for a "mainframe" mentality to reduce their management costs and attain significantly higher availability levels. Attaining mainframe-class availability levels has become a goal for an increasing number of critical applications. Storage architectures are returning to a data center model, whereby storage (mainly disk storage) is the centerpiece of the computing environment.
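The arithmetic behind the "nines" mentioned above is simple enough to sketch. The snippet below is only an illustration, assuming a 365-day year and treating availability as a plain percentage of total time; the function name is hypothetical and not taken from any product or standard.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the minutes per year a system is unavailable at a given availability."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    return MINUTES_PER_YEAR * (100.0 - availability_percent) / 100.0


if __name__ == "__main__":
    # Walk up the "path to high nines" and print the downtime each level allows.
    for pct in (95.0, 99.0, 99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(pct)
        print(f"{pct:>7}% available -> {minutes:10.1f} min/year ({minutes / 1440:.2f} days)")
```

Each added nine divides the allowed downtime by ten, which is one way to see why each step up the path costs more than the last.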
Centralized management significantly lowers the costs associated with managing storage while increasing the availability level of the IT function. In this model, storage becomes the center of the information universe, a role that servers previously occupied. This model, of course, is the SAN model. The role of SAN and NAS in high availability systems is becoming increasingly compelling for many reasons, but there are some current issues to overcome.
The Storage Area Network has become the most significant factor in redefining the traditional storage model. SANs commonly use high-speed storage interconnects such as Fibre Channel for most large sub-nets, with xGigE (Ethernet), using the TCP/IP protocol (over fiber), quickly gaining momentum for distributed systems. A fundamental requirement for the SAN is continuous operation. 24x365 availability is not optional for most users, and certainly not for online reservations, credit cards, e-commerce, and e-business companies with a global presence. A global presence means that a significant number of transactions occur outside of local business hours.
eBay, for example, had a well-publicized outage lasting 22 hours and experienced tangible losses, but did not go out of business. When high availability is absent, an IT organization can experience loss of customer confidence, loss of investor confidence, loss of market share, regulatory violations, and loss of competitive position in the marketplace, not to mention the obvious loss of revenues. A recent study from IDC indicated that 85% of data loss was caused by hardware failure (42%), human error (30%), and software errors (13%). SANs and NAS must obviously be designed with continuous availability clearly in view.
Storage consolidation resulting from SAN implementation implies that all servers have access to a common storage repository, meaning disk storage and, most likely, tape storage. Consolidation introduces a security issue that requires software to establish restrictions on which server or application can access which data. Locking mechanisms are required to synchronize and manage updates to files. SAN management software has now become the key for SANs to progress, not only from a management perspective, but also in enabling today's high availability goals to be attained by implementing a SAN.
Implementing a highly available SAN requires that all potential points of failure be eliminated and that any failures be repaired immediately. Failure points exist throughout the SAN: storage adapters, fiber optic connections, interconnect devices, HBAs, and SAN software.
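One rough way to reason about points of failure is the textbook series/parallel availability model, sketched below. It assumes component failures are independent and that fail-over between redundant paths is instantaneous and always succeeds, which real SANs only approximate. The component names and availability figures are hypothetical, chosen purely for illustration.

```python
from math import prod


def series_availability(components):
    """Availability of one path in which every component must be working."""
    return prod(components)


def redundant_availability(path_availability, copies):
    """Availability of `copies` independent, identical paths where any one suffices."""
    return 1.0 - (1.0 - path_availability) ** copies


# Hypothetical per-component availabilities for a single server-to-storage path.
path = {"hba": 0.999, "fiber_link": 0.9995, "switch": 0.999, "storage_adapter": 0.999}

single = series_availability(path.values())
dual = redundant_availability(single, copies=2)

print(f"single path: {single:.5f}")
print(f"dual paths:  {dual:.7f}")
```

A chain is less available than its weakest component, while duplicating the whole chain pushes availability well above that of any single part. Shared elements, such as common SAN software, are not helped by path duplication in this simple model.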
Implementing redundancy and fail-over architectures can minimize or eliminate the effects of hardware failures and is a necessary step in making the SAN as bulletproof as possible. Does anyone really know the availability index, or the number of nines of availability, of their SAN? Is this the most important measure?
Beyond the number of nines as a measure of availability, we find a new set of metrics emerging, which further define the impact of lost availability on the level of service delivered. Often referred to as "QoS," for Quality of Service, these metrics give us a way to look at the type of service being delivered when a failure actually occurs. Take a large retail outlet, for example. If twenty checkout counters are in the store and only one is open, the store is still available, but the time it takes customers to check out is lengthened and the customer's experience is poor. Quality of Service takes the availability percentages to the next level and begins to add meaning to the impact of an outage.
In a working SAN, when a path becomes unavailable, it may or may not result in degraded performance, depending on the workload and whether an idle or alternate path is available. Minimizing path degradation (and server busy time) is a good reason why outboard or "server-less" backup, using a direct transfer of data from a disk storage device to a tape storage device, is important for SANs to progress further up the QoS chain. Application performance should suffer far less degradation than before during backup.
Server-less backup support is not yet available for many applications. The more redundancy built into a SAN, the less the impact of a failure on the QoS delivered. QoS provides a very helpful metric for specifying the upper bounds of degradation for a given SAN implementation. The objective is to minimize the number of SAN paths impacted by a failure and to reduce the minutes needed to repair a failed component, including SAN management software. QoS levels for a SAN can range from Failure Sensitive designs providing no redundancy to fully Fault Tolerant designs providing fully redundant paths and interconnects while minimizing the number of times path fail-over is activated.
The "Availability Value Chain" for data storage consists of several technologies that improve data availability. The chain consists of 1) Electronic vaulting (typically remote tape libraries), 2) Electronic journals, 3) Shadowing, 4) Mirroring, and 5) Hot standby. Each technology level increases in cost.
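The QoS idea discussed above, that a SAN can be "available" yet badly degraded, can be illustrated with a small sketch. It assumes, simplistically, that all paths carry equal capacity; the function names, the status labels, and the 50% target are hypothetical and not part of any SAN product or standard.

```python
def remaining_capacity(total_paths: int, failed_paths: int) -> float:
    """Fraction of path capacity still in service, assuming equal-capacity paths."""
    if total_paths <= 0 or not 0 <= failed_paths <= total_paths:
        raise ValueError("need total_paths > 0 and 0 <= failed_paths <= total_paths")
    return (total_paths - failed_paths) / total_paths


def qos_status(total_paths: int, failed_paths: int, min_capacity: float) -> str:
    """Classify service against a chosen lower bound on remaining capacity."""
    capacity = remaining_capacity(total_paths, failed_paths)
    if capacity == 0:
        return "unavailable"
    if capacity < min_capacity:
        return "below QoS target"
    return "within QoS target"


if __name__ == "__main__":
    # The retail analogy: 20 checkout counters, only one open.
    print(remaining_capacity(20, 19), qos_status(20, 19, min_capacity=0.5))
    # A four-path SAN that loses one path.
    print(remaining_capacity(4, 1), qos_status(4, 1, min_capacity=0.5))
```

A pure availability percentage would count both cases as "up"; expressing a minimum remaining capacity is one way to state the upper bound on degradation that a design must honor.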
Today, many people are still trying to justify the cost of high availability. In the very near future, the same group will have to justify low availability. Just as an enterprise would be out of business without electricity and telephones, it will not be able to conduct business operations without its critical applications. The real effort here is to determine the minimum level of availability that enables business continuance. The ultimate goal for IT architectures is to deliver self-healing systems that predict failures and correct them before they actually occur. These are now on our distant, but achievable, horizon.