Path managers keep SANs on the right track, guiding storage admins through a forest of network devices.
A network path is: 1) the route that data follows between its host and target storage, and 2) any route that carries inter-device communication over the fabric. A storage network path consists of HBAs (host bus adapters), LUNs (logical unit numbers assigned to devices and partitions), a route through the host-storage interconnect, and the device controllers. The Gartner Group, never shy with definitions, stated that optimal path management should:
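The components listed above -- HBA, fabric route, device controller, and LUN -- can be sketched as a simple data model. This is a minimal illustration with hypothetical names; no vendor API is implied:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class StoragePath:
    """One route from a host to its storage: HBA port, fabric hops, controller, LUN."""
    hba: str            # host bus adapter port, e.g. "hba0"
    fabric_hops: tuple  # switch ports the route traverses through the fabric
    controller: str     # storage device controller at the target
    lun: int            # logical unit number on the target

    def components(self):
        """Every element whose failure would break this path."""
        return (self.hba, *self.fabric_hops, self.controller)

# A primary path and a fully redundant alternate to the same LUN:
primary = StoragePath("hba0", ("sw1/p3", "sw2/p7"), "ctrlA", 12)
secondary = StoragePath("hba1", ("sw3/p1", "sw4/p5"), "ctrlB", 12)
```

The two paths share nothing but the LUN they address, which is exactly the property the redundant-fabric failover routes above depend on.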
1) Understand all the application's data and all the possible paths to that data, including alternate routes, failover routes such as secondary paths through a redundant fabric, local or remote replicas, point-in-time copies, etc.
2) Dynamically create optimal paths based on user-defined policy, which sets their application's quality of service (QoS) requirements.
Managing paths in message networks is not as difficult as in storage. Messaging environments operate one-to-one, where buses support a finite number of servers and storage devices. When a data transfer fails, the message network simply resends it. But storage area networks, with their many-to-many configurations, offer a variety of components, connections, and pathways. In SANs, however, failure is not an option--if a communication path fails, the system can crash. To guard against this kind of system failure, SNIA (Storage Networking Industry Association) believes that storage network routing must have redundant, well-defined paths, be capable of fast path changes, and be able to quickly update changed topology information.
Path management software presents a logical view of the physical layout of devices and connections. This logical view allows storage managers to visualize how application data is moving through the storage networking infrastructure. Once the visualization is in place, path management software will, ideally, be able to create, measure, and re-map paths by dynamically responding to an application's stated service level.
Managing a path includes monitoring loads, making load-based routing assignments, enhancing zoning, improving failover, and activating path health validation. These features fall into two main camps: protecting against single points of failure (mainly consisting of not having any), and availability and performance considerations such as network loads. Bob Rogers, BMC's chief storage technologist, described path management this way: "You've got a fabric and, first of all, you need to be able to detect single points of failure. Assuming you've architected so that you, hopefully, don't have a single point of failure, now you need to be able to consider availability considerations, performance considerations, and move on."
Ideally, storage administrators should guard against single points of failure by building their storage networks so data moves across multiple switches. Realistically, many SANs still have single points of failure as a result of storage networking's legacy model. As companies moved from DAS environments into SANs, they modeled their SAN topologies on what they knew--departmental servers with their direct-attached storage. The small SANs they created stayed essentially the same: arrays attached to SAN servers, still owned by that department or departmental server cluster. Now, in light of resource sharing and centralization pressures, firms are starting to consolidate their SANs. But even many of these larger SANs are still architected with single points of failure, because they weren't built for high availability situations. Granted, not every department will suffer deeply from a temporarily downed SAN. But if a firm is storing mission-critical, must-have data, that firm must build a high availability architecture. And that means admitting no single points of failure.
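The single-point-of-failure check described above reduces to a set intersection: any component that appears in every path to the same data is, by definition, a component whose loss severs them all. A minimal sketch, with illustrative component IDs:

```python
def single_points_of_failure(paths):
    """Find components present in EVERY path to the same data.
    Such a shared component is a single point of failure: if it dies,
    no route survives. `paths` is a list of sets of component IDs
    (HBA ports, switch ports, controllers)."""
    if not paths:
        return set()
    common = set(paths[0])
    for p in paths[1:]:
        common &= set(p)
    return common

# Two routes to the same LUN through separate switches -- but both
# terminate on the same controller, so the controller is a SPOF.
paths_to_lun12 = [
    {"hba0", "sw1/p3", "sw2/p7", "ctrlA"},  # primary route
    {"hba1", "sw1/p4", "sw2/p8", "ctrlA"},  # "redundant" route
]
```

A path manager running this kind of analysis would flag `ctrlA` as the obstacle to remove, which is exactly the report-and-remediate loop the next paragraph describes.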
A good path management tool will report single points of failure and allow the storage administrator to go into the HBA and connect to a redundant location, removing that pesky obstacle. The path manager should also be able to monitor the data rate across paths to make sure the operating system's load balancing is working, and alert the administrator when the server needs additional access points. For example, when a storage administrator must manage a significant I/O workload from a particular server--such as a very large Oracle database--putting everything across a single HBA port is a really bad idea. Path management software could add dynamic pathing capabilities by identifying a failing switch or HBA and immediately shunting the data onto a working path.
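The load-monitoring behavior above can be sketched as a simple imbalance check across HBA ports. The ratio threshold here is an illustrative assumption, not a vendor default:

```python
def imbalance_alert(port_throughput_mbps, ratio=3.0):
    """Flag when one HBA port carries `ratio` times the load of the
    least-used port -- a sign that the OS load balancer is not spreading
    I/O, or that the server needs another access point.
    `port_throughput_mbps` maps port name -> measured throughput."""
    busiest = max(port_throughput_mbps, key=port_throughput_mbps.get)
    idlest = min(port_throughput_mbps, key=port_throughput_mbps.get)
    low = max(port_throughput_mbps[idlest], 1e-9)  # avoid divide-by-zero
    if port_throughput_mbps[busiest] / low >= ratio:
        return f"imbalance: {busiest} vs {idlest}"
    return None
```

In practice a path manager would sample these counters continuously and raise the alert through its console rather than return a string, but the decision logic is the same.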
Karen Dutch, director of marketing at InterSAN, described dynamic path management this way: "The first thing is: path management is all about managing that relationship between the application and data storage and the SAN environment, and being able to mask off the complexity of devices. Dynamic path management says you're able to be predictive and proactive about your paths." Path management measures services, matches them against thresholds, responds to path degradation, and automatically reroutes the path through other devices, as needed.
Path management software can also help reconfigure paths in the event of storage network changes. SANs are not static. Storage administrators are constantly adding, deleting, and reconfiguring their storage area networks: upgrading to 2Gb HBAs, consolidating servers, adding new servers, installing new applications, installing new ports, and rewiring. But in a storage network, even a single application might be pathing all over the place--running over multiple HBAs and switches, being shunted about for load balancing, or following automatic failover paths. When IT makes changes to the SAN, it must also remap the new pathways. Dutch estimates that manual remapping takes as much as 15 percent to 25 percent of an expert's time on a yearly basis, let alone the less skilled administrator's schedule. Path management can automate most remapping, freeing up significant portions of the SAN administrator's time.
Another path management feature is supporting service level requirements by auditing performance against policies. IT connects its path management application with its storage networking policy definitions. The path manager periodically tests set paths to make sure they are supporting policy thresholds for best practices. If it discovers non-compliant paths, such as partial paths operating on outdated or incomplete information, it visualizes the discrepancies and notifies the administrator. In a similar situation, there may be some degradation in the infrastructure and ports are going down. The path management product reports and visualizes the problem by displaying red switches on the network topology screen. Sophisticated path management tools can additionally identify the failing ports' affected paths, and which paths are routing through those ports. The path manager checks to see if the degradation is impacting service levels. It may not be: if no critical applications are moving through that port and the degradation is due to temporary heavy I/O, it is unnecessary to establish new paths. But if the degradation or failure is impacting service levels, then the path management software will report the failure.
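The policy audit described above amounts to comparing measured path metrics against each application's service-level threshold and surfacing only the violations. A minimal sketch, assuming latency is the measured service metric (the names and numbers are illustrative):

```python
def audit_paths(measurements, policy):
    """Compare measured per-path latency against the service-level ceiling
    of the application each path serves; return only non-compliant paths.
    `measurements` maps path -> (application, latency_ms);
    `policy` maps application -> maximum acceptable latency_ms."""
    violations = {}
    for path, (app, latency) in measurements.items():
        limit = policy.get(app)
        if limit is not None and latency > limit:
            violations[path] = (app, latency, limit)
    return violations

# Sample audit: the Oracle path is blowing its threshold, mail is fine.
measurements = {
    "path-a": ("oracle", 12.0),  # measured latency, ms
    "path-b": ("mail", 3.0),
}
policy = {"oracle": 8.0, "mail": 10.0}  # per-application ceilings, ms
```

Only the violating path would be escalated -- matching the article's point that degradation on a path carrying no critical application need not trigger a reroute.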
For example, one IT organization installed dual redundant fabrics for high availability. But when it provisioned storage, the data only pathed through one fabric--and IT had no way of knowing. When the organization used a path manager to optimize its storage network, that application revealed the failed redundant path. No harm, no foul--otherwise they would not have known about the failed pathway until their active fabric died or degraded, and their high availability measures would have died on the vine.
Path management can also help re-establish working paths within set policy. This is called dynamic path or self-healing path management, where servers and storage devices have multiple pathways with no single point of failure. Storage administrators often replace older or failing HBAs, but must manually remap those paths to new HBAs--and a 300GB database might take as long as 24 hours to completely remap. A path management tool can automatically remap those HBAs, cutting the process down considerably. Failover procedures can also benefit from automated remapping. A slowed application might be degrading because its primary path has already failed and it has switched successfully to its secondary path which, as a result, is experiencing high workloads and sluggish performance. Because the path manager has alerted the storage administrators to the situation, it is simple to reboot the primary server and reattach its storage devices using the path manager to quickly reestablish the primary paths.
Path management's path verification feature should not be limited to disk. According to Kevin Honeycutt, ADIC's executive director of product management, path verification is particularly important for tape paths in Fibre Channel environments. Failed paths between servers and disk-based storage are often obvious--the application suddenly can't find its data, and everything comes to a grinding halt amid the groans of anguished users. IT immediately begins to repair the direct path and all is well. But in the case of backup procedures, the storage administrator might not find out until the next morning, or even the next week, that the backups aren't running. And even following the alert, many IT shops have a narrow backup window. The storage administrator will recover or re-establish the backup path, but if the backup didn't run on time, that's too bad. The window is shut. To avoid these kinds of nasty surprises in storage networks, some tape libraries provide their own path verification by sending an in-band data packet at frequent intervals to test the path. Path management software can also continuously monitor the state of all data paths in the fabric, including between the tape devices, their connectors, source arrays, and hosts.
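The in-band probing described above can be sketched as a loop that tests every path at a fixed interval and collects the ones that fail to answer. The `probe` callable here stands in for whatever in-band test packet a given library actually sends; nothing about a real tape library's protocol is implied:

```python
import time

def verify_paths(paths, probe, interval_s=0.0, rounds=1):
    """Probe every path at a fixed interval and return the set of paths
    that failed to answer on any round -- so a dead backup path is caught
    now, not the morning after the backup window has closed.
    `probe` is any callable taking a path ID and returning True on success."""
    dead = set()
    for _ in range(rounds):
        for path in paths:
            if not probe(path):
                dead.add(path)
        if interval_s:
            time.sleep(interval_s)
    return dead

# Illustrative tape paths; a real probe would send an in-band packet.
tape_paths = {"tape0", "tape1", "tape2"}
```

A scheduler would call this continuously; anything in the returned set gets repaired before, not after, the backup window.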
Organizations used to judge their storage administrators' performance mostly on successful device tracking and low MTBF (mean time between failures). More and more frequently, they now base their analysis on how well IT meets application service levels. Path management is a useful tool to help manage service levels, especially with shrinking headcount and growing storage needs. Future development may dynamically re-route and activate data paths to local replicas before a failure occurs. This feature will require adding file system capabilities to path management and similar SRM products, so the software can accurately measure the service that paths are delivering to an application, provision multiple paths, and accurately perform impact analysis on failed devices.
It's a promising field for sophisticated storage customers: Gartner Dataquest has predicted that path management will grow to $7 billion by 2003, since it incorporates some pretty necessary SAN requirements such as updating topologies, managing data communication and speed, checking pathway health, and automating virtual and self-healing pathing. Given the demands that even smaller SANs place on their administrators, this is an attractive market for storage vendors.
Author: Chudnow, Christine Taylor
Publication: Computer Technology Review
Date: Mar 1, 2003