Storage area networking and data security: choosing the right solution for maximizing storage and integrity. (SAN/NAS Backup).
However, some customers fail to realize that plugging everything into the same switched network merely provides physical connectivity, better bandwidth and better problem isolation. Left at that, the possibilities for unintentional and malicious data access and corruption in this many-to-many network actually increase. That's why securing, centralizing and consolidating control and monitoring of the network storage resources requires comprehensive storage management--a layer of software that oversees all facets of data storage and retrieval and enforces site-specific policies for who gets what.
Storage security concerns need to be addressed at the host, management and storage levels. This article concentrates on the core capabilities and benefits of SANs, current trends in "virtualized storage," methods of implementing open storage networks and some of the impacts on security resulting from these implementation choices.
The Path to Storage Networks
Storage configurations, in the absence of storage networking, are constrained by cabling and bandwidth requirements of Small Computer System Interface (SCSI) hardware: short segment lengths of 30 meters or less, limited ability to address multiple devices per host bus adapter (HBA), and exposure and sensitivity to hardware failure and downtime. Inherently, SCSI disk devices are slaves to individual servers because of this wiring limitation. Resources are bound to specific hosts with no way to reallocate the spare capacity. In practice, it is common to have one server completely exhaust its disk space and incur downtime for reconfiguration, while adjacent servers have surplus space going to waste.
Fibre Channel: A Step in the Right Direction
With Fibre Channel products came the introduction of SANs, and connectivity became much easier: fewer cables, stretching dramatically longer distances and attaching numerous devices. Fibre Channel switches also addressed the need to manage and manipulate paths to storage and troubleshoot failures in an intelligent way. More disks could be consolidated into larger arrays that offered some centralization of management and uniformity of administration.
Still, in a Fibre Channel-based SAN, disk capacity remains enslaved to individual servers, with no relief for applications that are running out of space. The enhanced connectivity of the storage network does not change the fact that each application host expects to "own" a disk resource, dedicated exclusively for its use. As a result, the greater connectivity of a storage network makes it possible for hosts to access not only their dedicated storage, but the storage of all the neighboring hosts as well! Hosts will unknowingly compete for the same storage space if given the opportunity. So, while Fibre Channel makes physical configuration and implementation easier, it does not provide true networking capabilities.
Obviously, this is not only a major security problem but a data integrity issue as well. One approach to these concerns is to "zone" Fibre Channel switches, i.e. segregate data paths using a management interface for the switch, so that competing servers are blocked from seeing the same storage device. While this prevents one server from viewing and tampering with another server's data, the process requires a lot of individual attention to each storage path and practically negates any management advantages of networked storage. In large enterprises, it's not uncommon for IT staff to dedicate a significant amount of time to simply managing the zones of switches and changing the configuration for various reconfiguration and maintenance tasks.
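The effect of zoning can be pictured with a minimal sketch. The zone names, port names and the `can_see` helper are hypothetical and do not correspond to any switch vendor's management interface; the sketch assumes only that two ports may communicate when they share at least one zone.

```python
# Hypothetical zone table: zone name -> set of member ports (illustrative names only).
zones = {
    "zone_mail": {"host_mail", "array_a_port0"},
    "zone_db": {"host_db", "array_a_port1"},
}


def can_see(zone_table, port_a, port_b):
    """Return True if both ports are members of at least one common zone."""
    return any(
        port_a in members and port_b in members
        for members in zone_table.values()
    )


print(can_see(zones, "host_mail", "array_a_port0"))  # True
print(can_see(zones, "host_mail", "array_a_port1"))  # False
```

Every new server or storage path means another entry in a table like this one, which is where the administrative burden described above comes from.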
Storage Pooling: Key to the Value of SANs
Regardless of the type of SAN connectivity, the foundation for achieving simplified management and increased storage capacity utilization is the ability to consolidate a variety of storage assets into a "storage pool" from which one can freely allocate capacity wherever and whenever it's needed. Although storage area networking has evolved significantly, the ability to flexibly pool and allocate storage remains limited. Many early adopters assumed that the hardware advances of improved performance, longer cable lengths, and a more sophisticated network infrastructure alone would naturally include storage pooling.
Unfortunately, that is not the case. Fibre Channel-based SANs allow application servers to be physically separated by greater distances from the storage devices and enable multiple servers to attach to each disk array, but additional intelligence is needed to appropriately allocate and share disk space. This intelligence is the key to easing administration, automating tasks, eliminating planned downtime and fully utilizing available disk capacity.
And as one major healthcare provider learned, even though a high-end, SAN-connected, intelligent storage subsystem may have its capacity fully allocated to applications, this is very different from being fully utilized. Some applications didn't use all the space given, while others used it faster than anticipated. As a result, this healthcare organization faced the need to purchase an additional, expensive and identical piece of proprietary hardware, because much of the available existing disk space was simply inaccessible. They instead opted to implement storage pooling technology.
Storage pooling is the result of a technique called "storage-virtualization," which breaks the one-to-one relationship between servers and disks and treats all storage as a consolidated resource, making it easier to allocate volumes of capacity and to perform maintenance and other common tasks nondisruptively. Virtualization enables additional layers of features, functions and automation that further ease the task of administration.
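As a rough illustration of pooling, the sketch below treats several disks as one consolidated resource and carves volumes out of it, spanning devices when necessary. The `StoragePool` class, the disk names and the capacities are assumptions made for this example and do not describe any particular product.

```python
class StoragePool:
    """Toy pool: treats several disks as one consolidated resource."""

    def __init__(self, disks_gb):
        # disks_gb: mapping of hypothetical disk name -> free capacity in GB
        self.free = dict(disks_gb)
        self.volumes = {}  # volume name -> list of (disk, gb) extents

    def free_gb(self):
        """Total unallocated capacity across all disks in the pool."""
        return sum(self.free.values())

    def allocate(self, name, size_gb):
        """Carve a volume out of the pool, spanning disks if necessary."""
        if name in self.volumes:
            raise ValueError(f"volume {name!r} already exists")
        if size_gb > self.free_gb():
            raise ValueError("not enough free capacity in the pool")
        extents, remaining = [], size_gb
        for disk, avail in self.free.items():
            if remaining == 0:
                break
            take = min(avail, remaining)
            if take > 0:
                extents.append((disk, take))
                self.free[disk] -= take
                remaining -= take
        self.volumes[name] = extents
        return extents

    def release(self, name):
        """Return a volume's capacity to the pool."""
        for disk, gb in self.volumes.pop(name):
            self.free[disk] += gb


pool = StoragePool({"array_a": 100, "array_b": 50})
print(pool.allocate("vol_mail", 120))  # [('array_a', 100), ('array_b', 20)]
print(pool.free_gb())                  # 30
```

The server that receives `vol_mail` sees a single 120 GB volume; the fact that it spans two arrays is hidden by the virtualization layer.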
There are many ways to define and describe storage-virtualization and a few different methods of implementation. In this discussion, two important things should be kept in mind:
* Not all vendors implement virtualization uniformly. Many confine its use to very narrow hardware or software environments. This article mentions many capabilities of virtualization, but not all products will necessarily have all these features.
* Virtualization, while vital to many of the benefits of SANs, is not a solution in and of itself, any more than an engine is the equivalent of a car. Features such as replication, point-in-time volume copy, remote mirroring, caching, and automation layered on virtualized underpinnings complete the value proposition of consolidated SANs.
Storage-Based Virtualization and Management: On the Move
Historically, storage management has been a core feature of intelligent storage arrays. Advanced features such as automated provisioning of disk space, protection of data by keeping current redundant copies, etc. are appealing and successful at reducing some of the management burden. The resident volume allocation features of the storage array in combination with switch zoning provide a measure of volume-access security.
The primary drawback of intelligent arrays is that the functionality is built on proprietary, custom-configured hardware and embedded firmware with very narrow coverage. This design choice has important ramifications for end-users:
* If more than one storage device is in place, as is true for nearly any enterprise, the management tools and benefits of array-based storage virtualization often only apply to the single device from that vendor (in other cases, users are restricted to using the same exact model of storage to retain cross-hardware management functionality). This creates a lock-in condition for the customer, which means higher costs for upgrades, service and expansion.
* Feature upgrades are complex and costly for the vendor due to the proprietary nature of the storage hardware and firmware. These costs are naturally passed on to the customer. Upgrade cycles can be slow and major improvements in performance or technology generally correspond with "forklift" replacement when new technology finally arrives.
Limiting storage virtualization to the proprietary storage array makes it nearly impossible to maintain both backward compatibility and purchasing flexibility in future storage acquisitions. It is common for suppliers of these storage subsystems to recommend abandoning existing storage assets in order to install a new storage network.
The advanced functionality that these devices offer is appealing, but the costs increase with lock-in pricing and slow adoption of important new technologies. The inability to mix-and-match storage devices diminishes the ability to optimize utilization. Data that could be adequately served with midrange disk arrays is confined to the premium-priced storage. Although array-based virtualization is common today, the economic pressures of data storage growth and management are making sole reliance on this strategy untenable.
The good news is that, in recognition of the complications posed by the proprietary array-based approach, every major storage vendor has announced intentions of developing network storage virtualization engines that remove many of these complexities. While most in-house virtualization development and related initiatives are still at least a year in the making, many storage suppliers are OEMing and reselling virtualization software, and offering customers a wide range of options and benefits today.
Host-Dependent Approaches Are Complex to Manage, Scale and Secure
Some host volume management tools technically incorporate basic forms of virtualization--partitioning bigger disks under their control into smaller volumes and concatenating smaller disks into large volumes. Similarly, a few storage virtualization approaches depend on software agents on each application server to receive instructions from a management device elsewhere in the network. Still others propose to use embedded proprietary code in specific HBAs or software drivers. One thing remains constant with all these techniques: dependency on some host-based "agent," i.e. proprietary technology that resides on the server to help accept and enforce storage-related instructions, which enables virtualization. "LUN masking," "asymmetric virtualization," and "out-of-band virtualization" are common labels for these implementations.
These methods attempt to compensate for the fact that operating systems are not designed to share storage resources between servers. When connected to a storage network, each host would normally claim any visible storage as its own, leading to major security holes and data corruption. The server-based agent limits access to only the resources that each host "sees."
There are several security and management implications with host-dependent approaches:
* Security of the data path is left to an "honor system." It is possible for servers unequipped with these proprietary agents to wreak havoc on corporate data.
* Installation and management are required on every server, adding to the total time and cost to bring a system online and perform upgrades, as well as distributing responsibility for secure volume allocation. Each set of applications could potentially have different functional owners, requiring complex coordination of IT staff resources and planned downtime to ensure proper compliance and configuration of the agents.
* To be effective, the specialized agent must be available for each operating system version in use. The developer of the agent or driver can choose to stagger the release of support for various platforms, or not support a particular platform at all. This inherently exposes the corporation to the risk of introducing an unsupported platform and delaying needed system upgrades until the storage control agent is updated and tested. These types of dependencies slow growth and responsiveness to changing operational needs.
* The agent has the potential to steal processing cycles from applications. Feature-rich storage management capabilities at the host expose applications to bottlenecks, even though on paper they appear to be out of the data path.
A platform-independent approach, one that does not require any specialized agents or drivers on the application servers, isn't susceptible to these problems.
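The "honor system" weakness of host-dependent approaches can be shown with a small sketch. The volume names, host names and the `volumes_claimed` helper are hypothetical; the sketch assumes only that an installed agent filters a host's view according to a per-host mask, and that a host without the agent claims everything it can see.

```python
# All volumes physically reachable on the storage network (hypothetical names).
visible_volumes = ["vol_mail", "vol_db", "vol_web"]

# Per-host mask that an agent on each server is supposed to enforce.
masks = {"host_mail": {"vol_mail"}, "host_db": {"vol_db"}}


def volumes_claimed(host, has_agent):
    """Volumes a host will treat as its own.

    With the agent installed, the host only uses what its mask allows.
    Without it, nothing filters the view, so the host claims everything.
    """
    if has_agent:
        return [v for v in visible_volumes if v in masks.get(host, set())]
    return list(visible_volumes)


print(volumes_claimed("host_mail", has_agent=True))   # ['vol_mail']
print(volumes_claimed("host_new", has_agent=False))   # ['vol_mail', 'vol_db', 'vol_web']
```

Nothing in the network stops the second host; protection depends entirely on every server running the agent correctly.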
Last year, a large government agency was investigating ways to consolidate storage and storage management for a typically mixed bag of application servers that included HP-UX, Sun Solaris, Microsoft Windows and Novell NetWare. In the search, the agency explored a range of options, and every host-dependent storage management product was eliminated for a very basic reason: the lack of uniform support for all platforms in the environment. Even with promises of support for a given platform in the future, the shortcomings of this approach were apparent, and the potential security issues encouraged the agency to ultimately purchase a platform-independent, network-based solution.
Network-Based Virtualization and Control
A network-based solution centralizes virtualization and management services in independent management devices, i.e. servers dedicated to the task of managing the storage network, sometimes called "storage control nodes." This solution is best positioned to eliminate most dependencies and provide the most durable, flexible virtualization and management mechanism without sacrificing data security.
Independence from server and storage enables each to grow and change without adversely affecting the storage management scheme set in place and offers significant cost benefits. Properly designed, this solution leverages other manufacturers' expertise in hardware, infrastructure, connectivity and systems design, and delivers storage control functionality via software. The inherent compatibility with any vendor's storage and any operating system gives the customer a wide range of purchasing options to negotiate a satisfactory price. Finally, correctly implemented network-based virtualization can rapidly accommodate emerging connection technologies, without requiring a fundamental redesign of the network.
Centralizing responsibility and control for all storage volume allocations in the network, regardless of the application environment and the storage back-end, delivers significant management benefits. There is no need to install or manage agents at the hosts to prevent them from gaining unauthorized access to storage. With passwords and authentication limited to specific management nodes, ensuring physical and logical security over storage allocation is much easier.
In addition, some leading implementations of network storage management can generally eliminate the complications of switch zoning. Rather than creating many zones connecting some servers to some storage, advanced solutions create essentially two zones that entirely separate servers from storage. The path to storage is handled exclusively and centrally through the virtualization and management nodes, again resulting in consolidated, secure control over volume allocations.
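A minimal sketch of this centralized model follows. It assumes, for illustration only, that every read and write passes through a single node that checks a central allocation table; the `StorageControlNode` class and all names in it are hypothetical and do not describe any vendor's implementation.

```python
class StorageControlNode:
    """Toy in-band node: every request passes through it and is checked centrally."""

    def __init__(self):
        self.allocations = {}  # volume -> host allowed to use it
        self.blocks = {}       # (volume, block number) -> data

    def assign(self, volume, host):
        """Record, in one central place, which host owns a volume."""
        self.allocations[volume] = host

    def _check(self, host, volume):
        if self.allocations.get(volume) != host:
            raise PermissionError(f"{host} is not assigned {volume}")

    def write(self, host, volume, block, data):
        self._check(host, volume)
        self.blocks[(volume, block)] = data

    def read(self, host, volume, block):
        self._check(host, volume)
        return self.blocks.get((volume, block))


node = StorageControlNode()
node.assign("vol_db", "host_db")
node.write("host_db", "vol_db", 0, b"payload")
try:
    node.read("host_web", "vol_db", 0)
except PermissionError as err:
    print(err)  # host_web is not assigned vol_db
```

Because the servers sit in one zone and the storage in another, a host has no path to a disk except through a node like this one, so the check cannot be bypassed by leaving an agent off a server.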
The network-based model for consolidated storage management, enabled by host- and storage-independent virtualization, is arguably the fastest growing solution in SAN management. There is a noticeable trend taking hold: From a security management perspective, several major entities in the financial, healthcare, defense and administrative government sectors--all of which have strict security requirements--are some of the earliest adopters of network-based storage management solutions and the momentum is growing.
In-Band Virtualization Assists Security
Notice that the points above specifically describe an implementation where the data flows through the management nodes, known as an "in-band" approach. This is deliberate, because the choice of approach bears directly on security.
Implementations that use a sidelined management console while data flows directly from server to storage "out-of-band" sound comforting in concept, but there are limitations inherent in outfitting application servers with software or hardware elements that control authorized access to a particular volume:
* Managing servers is increasingly complex as the organization scales
* Security is an issue when relying on these distributed access-control mechanisms
* The drivers might not be available for all platforms and all versions of each OS
* Implementation and upgrades are more difficult when accounting for so many points of control
Differentiating In-Band Alternatives
In-band approaches must contend with latency and availability issues--will the storage control node slow response and what happens if the node should fail? These are valid questions, and each vendor's solution deals with these issues to varying degrees.
With regard to availability, well-designed solutions enable cost-effective "N+1" redundancy, exploiting the end-to-end design of the storage network to eliminate single points of failure. In this model, the solution is sized based on performance, bandwidth, and connectivity requirements; adding one more collaborating device covers the workload in the event of an outage. In other words, if five management nodes are needed to meet performance criteria, a sixth is added to protect against failures. There are several products that take a "2N" approach, where every device is backed up by a secondary device, but clearly these are more costly and difficult to scale.
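The difference in node count between the two schemes is simple arithmetic, sketched below. The `nodes_required` function is a hypothetical helper written for this illustration.

```python
def nodes_required(n_for_performance, scheme):
    """Total nodes to deploy under the two redundancy schemes named in the text."""
    if n_for_performance < 1:
        raise ValueError("at least one node is needed")
    if scheme == "N+1":
        return n_for_performance + 1
    if scheme == "2N":
        return 2 * n_for_performance
    raise ValueError(f"unknown scheme: {scheme}")


# Columns: nodes needed for performance, total with N+1, total with 2N
for n in (2, 5, 10):
    print(n, nodes_required(n, "N+1"), nodes_required(n, "2N"))
# 2 3 4
# 5 6 10
# 10 11 20
```

The gap widens as the configuration grows: the five-node example needs six devices under N+1 but ten under 2N.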
For performance, a few network-based storage management nodes implement caching algorithms, very similar to those found in all the leading enterprise storage arrays. With caching, not only can one eliminate latency issues, but users actually experience across-the-board performance improvements.
A printing and publishing business, after installing a storage networking platform that incorporates caching, documented a performance improvement of an astounding 300%--so tremendous that even the pre-press production staff complimented the IT director on the new efficiency. The workload, types of data transfer and configuration of the solution also affect the ability to achieve such results, but these types of acceleration techniques are viable.
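The article does not say which caching algorithms these nodes use. As one common possibility, the sketch below shows a least-recently-used (LRU) read cache placed in front of a slower backing store; the `ReadCache` class and its parameters are assumptions made for illustration.

```python
from collections import OrderedDict


class ReadCache:
    """Tiny LRU read cache in front of a slower backing store."""

    def __init__(self, backend_read, capacity):
        self.backend_read = backend_read  # callable: block number -> data
        self.capacity = capacity
        self.entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, block):
        if block in self.entries:
            self.entries.move_to_end(block)  # mark as most recently used
            self.hits += 1
            return self.entries[block]
        self.misses += 1
        data = self.backend_read(block)
        self.entries[block] = data
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used
        return data


cache = ReadCache(backend_read=lambda b: f"data-{b}", capacity=2)
for b in (1, 2, 1, 3, 2):
    cache.read(b)
print(cache.hits, cache.misses)  # 1 4
```

A request served from the cache never reaches the disk, which is how an in-band node can offset the latency it would otherwise add. How often that happens depends on the workload.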
Not all in-band offerings provide the same level of protection and performance enhancement; price to deliver availability and high performance can also vary significantly depending on the hardware platform used for the virtualization device.
It's Decision Time
Storage networks are a fast-growing reality in enterprise data centers. Many organizations are still making the transition from direct-attached storage to networked storage, and from network-connected to truly open, consolidated, managed networking.
With all this movement and diversity of hardware and software, one thing has remained constant: storage virtualization techniques, and the features and automation built on top of them, are critical to fully realizing the benefits of SANs.
For these reasons, it's essential to incorporate security measures in your decision-making criteria along with manageability, scalability, and performance--and to do so from the outset. Like quality, security is difficult to add on after the fact.
Calvin Hsu is product marketing manager at DataCore Software (Ft. Lauderdale, Fla.)
Publication: Computer Technology Review
Date: Dec 1, 2002