SAN Performance Management Issues
You have to be "up" to deliver a quality e-business experience in both the Business-to-Business (B2B) and Business-to-Consumer (B2C) environments. The power, reliability, and extensibility of Storage Area Networks (SANs) are driving significant interest in this solution as companies rush to implement e-business in every industry segment.
However, the level of complexity typical of e-business and SAN systems results in major challenges to management. Even more important, managers are discovering that being able to see and manage their entire solution's performance is a critical success factor when it comes to realizing the benefits that a reliable, redundant SAN can bring them.
The implementation of high capacity, redundant storage arrays and SANs provides major benefits in areas including cost per byte, performance, and availability. However, achieving those benefits in the more complex SAN environment means putting in place technology designed to provide visibility throughout the SAN. Where once the manager could easily access performance and utilization information about storage directly from the server or host, storage arrays and SANs remove this visibility by placing the storage at a distance from the host. The option of not implementing redundant disk arrays (RAID) and SANs is not viable due to the exponential growth of high performance database activity, coupled with the need for accurate inventory, fast order response, and virtually 100% uptime, all in real time.
As a result, today's managers are beginning to learn that deploying a SAN is just the beginning. The ability to see all of the SAN components and to tune them to achieve optimal performance and response times represents the next major step in SAN deployment. This article will:
* Identify how to "see" and manage the performance of a SAN.
* Spotlight roadblocks to achieving high SAN performance and provide some real-world advice for achieving performance breakthroughs.
* Examine issues when managing a typical SAN, including storage devices, switches, multiple database servers, web servers, and application servers.
* Provide recommendations for attaining optimum e-business performance from the application layer through the SAN.
* Provide an update on SAN management standards and a look at what managers are hearing from industry groups such as the SNIA and FibreAlliance.
Despite its additional complexity, a SAN requires the same basic systems management disciplines as a non-SAN environment, e.g., Event and Fault management, Configuration management, Change management, Asset and Accounting management, Performance management, and Security management. In the context of this article, the focus is on Performance management, which includes monitoring, graphing, data correlation, and analysis of performance data from all elements of the SAN including the applications running on each Server.
Seeing A Way To Optimize SAN Performance
The two major factors that affect SAN performance in any enterprise are easily identified. First, the typical SAN includes multiple hosts attached to the same storage arrays. Second, the server and storage consolidation that is itself a benefit of the SAN environment inevitably results in larger applications and a greater number of applications competing for the same network resources.
Before SANs and intelligent storage arrays such as EMC's Symmetrix products arrived on the scene, there was a relatively simple one-to-one relationship between servers and storage. Since most systems had a consistent mapping of physical to logical disks, there was a selection of performance management tools that were able to provide adequate information to manage the performance of the system. With intelligent storage arrays and SANs, in contrast, the logical disk in a system could very easily be the combination of many physical disks in one storage array or in multiple storage arrays and, in the case of a SAN, behind more than a single switch. In effect, everything attached to the Fibre Channel interface of the host is hidden in a black box. In today's complex environment, therefore, it becomes absolutely critical to see each disk not only from the operating system and each application's point of view, but also from the storage, switch, and fabric perspectives, to ensure that data is flowing smoothly and that switching congestion can be managed for optimal performance of the overall system.
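The mapping problem described above can be pictured as a small data structure. The following is only a sketch: the class names, the device name "hdisk4", and the array and switch names are hypothetical and are not taken from any vendor's tools. It shows how one logical disk seen by the host may resolve to several physical disks, arrays, and switches, each of which would need to be watched.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class PhysicalDisk:
    array: str    # storage array holding the disk (hypothetical name)
    disk_id: str  # disk identifier inside that array
    switch: str   # switch the host traverses to reach the array


@dataclass
class LogicalDisk:
    name: str                                    # device name as the host OS sees it
    extents: list = field(default_factory=list)  # PhysicalDisk objects backing it

    def elements_to_monitor(self):
        """Return every array, switch, and physical disk behind this logical disk."""
        arrays = sorted({d.array for d in self.extents})
        switches = sorted({d.switch for d in self.extents})
        disks = sorted(f"{d.array}/{d.disk_id}" for d in self.extents)
        return {"arrays": arrays, "switches": switches, "disks": disks}


# One host-visible volume spread over two arrays behind two switches.
vol = LogicalDisk("hdisk4", [
    PhysicalDisk("array-A", "d01", "switch-1"),
    PhysicalDisk("array-A", "d02", "switch-1"),
    PhysicalDisk("array-B", "d17", "switch-2"),
])
view = vol.elements_to_monitor()
```

In this illustration a single host device turns out to depend on two arrays, two switches, and three physical disks, which is the visibility a host-only tool would miss.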
In order to see all the various components that impact performance, each element of the environment must be instrumented to deliver performance information. This includes the applications, operating systems, switches, hubs, adapters, intelligent storage arrays, front-end controllers, cache and backend controllers, and the physical disks themselves. With access to all of this performance data, managers will be able to "see" how the entire infrastructure is performing. Only when that is accomplished can a manager begin to monitor, analyze, and eventually optimize performance throughout the SAN.
SAN Performance Roadblocks
Roadblocks can come in many different forms in a complex SAN infrastructure. System servers, ports on a switch, an entire switch, the overall SAN switching fabric, and the various elements of the storage arrays are just some of the components that can have a significant negative impact on overall performance. The most common method of resolving performance bottlenecks is to change the access patterns of the applications so the associated data will take a new path. Products such as PowerPath from EMC and Dynamic Multi-Path from Veritas provide load balancing. In a SAN environment, it becomes critical for the application load to be spread across as many connection points as possible so the load is balanced for the overall system. Since these technologies are server-centric, the use of performance management software will enable managers to view the SAN as a whole and to determine how to keep the infrastructure in balance.
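The idea of spreading load across connection points can be illustrated with a short sketch. This is not how PowerPath or Dynamic Multi-Path work internally. It is a generic "least outstanding I/O" policy, and the path names (hba0, hba1, hba2) and function names are hypothetical.

```python
def pick_path(outstanding_io):
    """Return the path with the fewest outstanding I/Os (ties broken by name)."""
    return min(sorted(outstanding_io), key=lambda p: outstanding_io[p])


def dispatch(requests, paths):
    """Assign each request to the currently least-loaded path.

    Completions are ignored so the sketch stays short; a real multipath
    driver would decrement the counter when an I/O finishes.
    """
    outstanding = {p: 0 for p in paths}
    assignment = {}
    for req in requests:
        path = pick_path(outstanding)
        outstanding[path] += 1
        assignment[req] = path
    return assignment, outstanding


assignment, outstanding = dispatch(range(6), ["hba0", "hba1", "hba2"])
```

Because a policy like this only sees the paths of one server, it cannot tell when several servers are converging on the same switch port, which is why a SAN-wide view is still needed.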
One popular application for SANs is backup that doesn't involve production servers (off-host or server-less backup). These backup activities are running on the same physical fibre as production servers, so these types of applications will consume bandwidth through the switches and overall fabric. This type of deployment highlights the requirement to pull together in one view all the elements that will impact performance of the overall SAN.
SAN Performance Management Issues
There are two different approaches to managing a SAN: Storage Centric or Server/Application Centric. In a Storage Centric view, all tools deal with the storage object (array, switches, etc.). In the Server/Application Centric view, the storage objects are one more element needed to get a complete view of the system environment.
One of the challenges arising with SANs will be how to access the multitude of performance data items as geographically separated SANs are deployed. Most management tools use out-of-band collection via SNMP, which means that all the storage objects must be network accessible. One solution to this problem is the use of in-band collection methods. The downside to in-band collection is that every OS vendor will need to provide for SCSI pass-through, and each tools vendor will need to support all the pass-through implementations, since to date there are no standards for in-band collection.
A Roadmap To Optimum e-business Performance With A SAN
The value of using a SAN-aware performance management solution (and these are available today) is being able to evaluate the performance of all the elements in the SAN, including Application Servers, Database Servers, Backup Servers, Switches, Hubs, the Fabric, and Storage Array (See Fig). Equally critical, the data from these disparate and sometimes geographically distant elements must be consolidated into a single view of the e-business system. In addition, processes from various applications on each type of server should be grouped into a single workload element and, for each of these workloads, a variety of performance metrics such as CPU, Memory, and I/O thresholds will be collected. To ensure that the health of the SAN can be reviewed at any time, the appropriate SAN performance management technology would also utilize a rules engine with rules tailored for each SAN element and enable thresholds to be checked and analyzed on an as-needed basis.
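A rules engine of the kind described above can be sketched in a few lines. Everything here is an assumption made for illustration: the element types, metric names, threshold values, and element names are hypothetical and do not come from any particular product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    element_type: str  # kind of SAN element, e.g. "switch_port" or "server"
    metric: str        # metric the rule applies to
    limit: float       # raise an alarm when a sample exceeds this value


# Hypothetical thresholds, chosen only for illustration.
RULES = [
    Rule("server", "wait_on_io_pct", 20.0),
    Rule("switch_port", "utilization_pct", 80.0),
    Rule("array_director", "busy_pct", 70.0),
]


def evaluate(samples, rules=RULES):
    """Check samples against the rule for their element type.

    samples: list of (element_type, element_name, metric, value) tuples.
    Returns one alarm string per exceeded threshold.
    """
    limits = {(r.element_type, r.metric): r.limit for r in rules}
    alarms = []
    for etype, name, metric, value in samples:
        limit = limits.get((etype, metric))
        if limit is not None and value > limit:
            alarms.append(f"{name}: {metric}={value} exceeds {limit}")
    return alarms


samples = [
    ("server", "web01", "wait_on_io_pct", 35.0),
    ("switch_port", "sw1/p7", "utilization_pct", 60.0),
    ("array_director", "FA-3", "busy_pct", 91.0),
]
alarms = evaluate(samples)
```

Keeping the rules as data, keyed by element type, is what allows a different threshold for a switch port than for a server without changing the checking logic.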
Applying these requirements and the appropriate performance management technology to an enterprise-scale e-business infrastructure should deliver positive results in real-world scenarios that are becoming all too familiar to today's managers. Suppose, for example, that the Web Application server is not responding fast enough to all of the customers visiting a company's website. An alarm has been reported that the threshold for "wait on I/O" has been exceeded. At this point, performance management technology can be used to correlate all the performance data from each element in the SAN. This information would be processed and compared to the behavior of "wait on I/O" for the application server mentioned above.
With a list of metrics and their correlation values, it becomes possible to recognize that a particular port on the switch is being saturated and the front-end director of the storage array is also overloaded. The question becomes "what caused this problem?" The answer: two other, unrelated activities over the SAN are actually causing the bottleneck. A batch update was running on the database server and a backup was running from the backup server. This is the level of performance management technology that is needed to bring together this seemingly unrelated data and enable synthesis and analysis of that data to optimize SAN performance in the highly complex world of e-business.
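The correlation step in this scenario can be sketched as follows. This is one plausible approach, a plain Pearson correlation of each SAN metric against the server's "wait on I/O" series, and not a description of any specific product. The metric names and sample values are invented for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0  # a constant series carries no correlation information
    return cov / (sx * sy)


def rank_by_correlation(target, candidates):
    """Sort candidate metrics by how strongly they track the target series."""
    scored = [(name, pearson(target, series)) for name, series in candidates.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Invented samples taken at the same seven intervals.
wait_on_io = [5, 6, 5, 30, 35, 32, 6]
candidates = {
    "switch1/port7 utilization": [20, 22, 21, 90, 95, 93, 22],
    "array director FA-3 busy": [30, 31, 30, 85, 90, 88, 31],
    "web01 cpu": [40, 42, 38, 41, 39, 40, 43],
}
ranked = rank_by_correlation(wait_on_io, candidates)
```

With these invented numbers the switch port and the array director rank well above the server's own CPU. A high correlation only points to where to look; identifying the batch update and the backup as the cause still requires knowing what was running at the time.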
Through the use of SAN-aware management tools, managers have the ability to link together all relevant data to find the root cause of problems. The complexity of the SAN environment makes it absolutely essential to look at all of the interrelationships in the system from the individual users to the storage array.
SAN Standards: The Current Scene
At this writing, no final standards exist for SAN management and interoperability in a heterogeneous environment. With that said, several groups are working on standards for Storage Area Networks, including the Storage Networking Industry Association (SNIA), the Fibre Channel Industry Association (FCIA), the Internet Engineering Task Force (IETF), Jiro from Sun Microsystems, and the FibreAlliance, a subset of SNIA and FCIA members.
At a recent (late 1999) SAN conference in Seattle, fifteen companies participated in the kickoff of the SNIA Interoperability Lab effort, demonstrating interoperability between all types of SAN-related equipment based on communicating over a common Fibre Channel fabric. Attendees also viewed the first implementation of Internet Web-Based Enterprise Management (WBEM), which managed heterogeneous storage systems spanning two continents.
In an effort to accelerate the development of standards for SANs, the FibreAlliance was formed in early 1999. This open consortium now numbers more than 30 leading and emerging Fibre Channel vendors that share a common commitment to help customers more rapidly and effectively deploy and operate heterogeneous SANs.
At press time, the FibreAlliance had recently completed Phase two of a Management Information Base, or MIB. By employing this powerful, flexible, enabling software in their products, multiple vendors can develop Fibre Channel-based SAN devices that customers can manage in a simplified and integrated manner. The MIB enables SAN administrators to gain a detailed, topographic view of the storage network, obtain detailed SAN performance information from across the enterprise, and launch the management software of each SAN component. The consortium has submitted the new MIB specification to the Internet Engineering Task Force (IETF) as an open standards draft.
A fundamental element of FibreAlliance membership is each company's commitment to adhere to the FibreAlliance specification and to implement the MIB, protocol enhancements, and APIs. To date, 11 FibreAlliance companies have announced products that today or in the near future will implement the FibreAlliance SAN management framework.
Even with all this work, SANs deployed today will require the use of some vendor-specific tools. The standards will provide a common set of information. However, management tools will be forced to work in non-standard ways in order to maximize unique, product-specific functionality, at least for the foreseeable future. Therefore, industry experts are recommending that, when deploying a SAN, it is most effective to work with a supplier that has tested, and will certify, that all components, including management tools, will interoperate.
Bill Martinson is the vice president of business development at Datametrics Systems Corp. (Fairfax, VA).