Storage virtualization--architectural considerations, Part 2 of 3.


In Part One of this series, we discussed how storage virtualization can deliver a non-disruptive operating capability, enabling users to address the increasingly critical challenge of reducing the planned downtime associated with making changes to their storage infrastructure.

By way of review, storage virtualization provides a physical-to-logical storage device abstraction. It presents a simple, consistent representation of a complex infrastructure to the entities consuming resources. Virtualization is an inherent capability of enterprise storage arrays, which aggregate capacity from multiple fixed disk drives in a single physical frame and present logical volumes for host access. More recently, a new class of virtualization technology has emerged that virtualizes capacity from multiple heterogeneous arrays across the entire SAN and manages a logical representation of this capacity from a single point.

However, there are many different approaches and challenges associated with this new class of virtualization. Architecture plays a defining role as to the ultimate value of these solutions in enterprise environments.

Host-based Storage Virtualization

One solution for addressing the challenge of aggregating and managing capacity from multiple SAN-based devices is one already deployed in many end-user environments: the host-based logical volume manager (LVM). Indeed, LVMs are becoming a standard part of most modern server operating systems. LVMs are software utilities that manage logical volumes presented from various storage devices, configuring capacity to suit the needs of an application. For example, an LVM may concatenate a set of volumes configured at a small size at the array level to present a single large volume, slice a large array volume into several more manageable units, or stripe data across a number of array volumes for performance reasons, all while maintaining a single representation of the capacity to the application.
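
The three LVM operations described above (concatenate, slice, stripe) are all address translations from a logical block to a block on some array volume. The following is a minimal sketch of that idea, not the behavior of any particular volume manager; the names ArrayVolume, concat_map, slice_map and stripe_map are hypothetical, and striping assumes equal-size member volumes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArrayVolume:
    """A volume presented by a storage array (hypothetical model)."""
    name: str
    blocks: int  # capacity in blocks


def concat_map(volumes, lba):
    """Concatenation: walk the member volumes in order until the block fits."""
    offset = lba
    for vol in volumes:
        if offset < vol.blocks:
            return vol.name, offset
        offset -= vol.blocks
    raise ValueError("logical block is beyond the end of the volume")


def slice_map(volume, slice_start, slice_blocks, lba):
    """Slicing: a logical volume is a contiguous window of one array volume."""
    if not 0 <= lba < slice_blocks:
        raise ValueError("logical block is outside the slice")
    return volume.name, slice_start + lba


def stripe_map(volumes, lba, stripe_blocks):
    """Striping: consecutive stripe units rotate across the member volumes."""
    stripe_index, within = divmod(lba, stripe_blocks)
    row, column = divmod(stripe_index, len(volumes))
    physical = row * stripe_blocks + within
    if physical >= volumes[column].blocks:
        raise ValueError("logical block is beyond the end of the volume")
    return volumes[column].name, physical
```

With three 100-block volumes named a, b and c, concat_map resolves logical block 150 to block 50 of volume b, while stripe_map with a 16-block stripe unit resolves the same logical block to block 54 of volume a. In both cases the application sees only one volume.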

While LVMs provide some of the benefits of larger-scale, multi-device virtualization, they carry with them an intrinsic limitation--they are host-based and, as such, configuration and deployment must be done individually for each host. This is not an issue if there are a small number of hosts, but in an enterprise setting, where there are typically hundreds or even thousands of hosts accessing SAN-based storage, the manageability of this distributed capability quickly becomes challenging. This issue is exacerbated in environments with a large degree of change, which necessitate frequent configuration modifications. Manageability also is a challenge if different LVMs are deployed across different operating systems, requiring administrators to be proficient with multiple toolsets. Other challenges that emerge when using host-based approaches are interoperability (making sure that third-party LVMs stay compatible with operating system revisions and new devices) and performance (some intensive LVM operations can sap host processing cycles).

Network-based Storage Virtualization

Network-based virtualization architectures attempt to address some of the challenges inherent in the host-based model. Putting the virtualization functionality in a layer between the hosts and the storage subsystems centralizes it for easier manageability. There are two architectural approaches: in-band and out-of-band.

In-Band Approaches

In-band architectures insert a virtualization device in the network data path (or "in-band") between the hosts and the arrays. These devices typically offer volume management and other complementary functionality, such as data movement and copy services. In effect, they act as replacement storage controllers for the devices they are virtualizing. The virtualization device itself can be a dedicated server running virtualization software installed on top of a standard operating system, a dedicated appliance running embedded code, or even an array controller "front-end" with a back-end that permits connection of additional array frames. The primary advantage of this approach is simplicity--one (new) self-contained device can be deployed to act as a central point of management for multiple connected devices.

One basic disadvantage of the in-band approach is the addition of an extra "hop" to the network path, which adds latency between the hosts and the physical storage. Some in-band devices attempt to address the added latency by employing caching within the device itself. Caching within the network, however, carries with it additional complications. For high-availability environments that demand redundancy, preserving cache coherency between a pair of in-band devices requires cache mirroring, which adds back some latency. It also requires robust error and failure handling logic to ensure that cached and acknowledged I/O is stored safely on the back-end device.

A more significant disadvantage of in-band virtualization architectures is a limit to scalability. Since all I/O within the virtualization domain needs to go through the in-band device, it can become a bottleneck, either in terms of bandwidth or processing power. Once either resource becomes depleted, a scaling strategy must be employed. As we noted above, the need to mirror cache across in-band nodes makes "scale out" strategies (in which n-number of additional nodes are added for scaling) impractical. Instead, the only practical recourse is a "scale up" strategy that calls for bigger and bigger in-band nodes to deliver in-band virtualization on a large scale. At some point for a large environment, even a "scale-up" strategy will prove to be insufficient and a new in-band device will need to be deployed.
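
The bottleneck argument can be made concrete with a little arithmetic: because every host's I/O crosses the in-band device, the device must absorb the sum of all host demand in both bandwidth and processing. The sketch below is only an illustration of that reasoning; the Load type, the saturated_resources function and all figures are hypothetical and are not measurements of any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Load:
    """A bandwidth and I/O-rate pair (hypothetical units)."""
    mb_per_s: int
    iops: int


def saturated_resources(host_loads, device_limit):
    """Return which in-band device resources the combined host demand exceeds."""
    total_bandwidth = sum(h.mb_per_s for h in host_loads)
    total_iops = sum(h.iops for h in host_loads)
    saturated = []
    if total_bandwidth > device_limit.mb_per_s:
        saturated.append("bandwidth")
    if total_iops > device_limit.iops:
        saturated.append("processing")
    return saturated


# Illustrative numbers only: ten hosts funnel through one in-band device.
hosts = [Load(mb_per_s=50, iops=2000)] * 10
device = Load(mb_per_s=800, iops=15000)
bottlenecks = saturated_resources(hosts, device)
```

In this made-up case the device still has bandwidth to spare (500 of 800 MB/s) but its processing capacity is exceeded (20,000 of 15,000 I/Os per second), so it is the point at which a scaling strategy would have to be applied.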

Out-of-Band

Out-of-band approaches are designed to avoid the performance challenges inherent in an in-band architecture by separating the management information from the data flow itself. In an out-of-band architecture, a separate piece of hardware called the metadata server, which contains information about the logical-to-physical relationships of the virtualized storage, communicates with each server, informing it where to direct its I/O requests. This communication is done over an independent network, separate from the Fibre Channel network used by the data traffic--hence the out-of-band description.

Because the host issues requests for virtualized storage directly to the destination device, the I/O performance is free of additional latency or bandwidth bottlenecks. Thus, the out-of-band approach is theoretically more suitable for higher performance applications. It also avoids the data integrity issues inherent with the in-band approach. No "state" or version of the data is ever held in the network. Until the data is properly stored on the array, the host is not made to believe the job has been completed. However, this type of out-of-band approach reintroduces some of the manageability challenges of the host-based approach, namely the need to load, maintain and qualify host-based software.
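
The division of labor in this host-agent form of out-of-band virtualization can be sketched as two small components: a metadata server that owns the logical-to-physical map, and a host agent that fetches the map and then addresses the arrays directly. This is a simplified illustration under stated assumptions, not any vendor's protocol; Extent, MetadataServer, HostAgent and their methods are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    """Maps a run of virtual blocks onto one array (hypothetical layout)."""
    virtual_start: int
    length: int
    array: str
    array_start: int


class MetadataServer:
    """Holds the logical-to-physical map; never sees the data itself."""

    def __init__(self):
        self._maps = {}

    def define_volume(self, volume, extents):
        self._maps[volume] = list(extents)

    def get_map(self, volume):
        return list(self._maps[volume])


class HostAgent:
    """Host-resident software that decides where each I/O should go."""

    def __init__(self, server):
        self._server = server
        self._cache = {}

    def refresh(self, volume):
        # Control traffic: travels over the separate (out-of-band) network.
        self._cache[volume] = self._server.get_map(volume)

    def resolve(self, volume, lba):
        # Data traffic: the returned target is addressed directly over the SAN.
        if volume not in self._cache:
            self.refresh(volume)
        for ext in self._cache[volume]:
            if ext.virtual_start <= lba < ext.virtual_start + ext.length:
                return ext.array, ext.array_start + (lba - ext.virtual_start)
        raise ValueError("block is not mapped")
```

The sketch also shows where the manageability cost comes from: every host needs its own HostAgent, and each one must be kept current whenever the map changes.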

A refined out-of-band approach is emerging that addresses this manageability challenge. This approach leverages intelligent SAN switches as the platform for deployment of network-based storage virtualization. These switches have specialized port-level processors (frequently optimized ASICs, though they could also be FPGAs or network processors) that inspect and redirect I/O (translate from logical to physical addresses) at wire-speed. By incorporating these processors directly into the existing SAN fabric, the need to manage another layer of virtualization devices is obviated. The metadata that was formerly managed at the host by the host agent is loaded into flash memory at the intelligent port, obviating the need for host-based software. Instead of communicating with the hosts, the metadata server communicates with the intelligent ports, ensuring they always have the right mapping information for the hosts accessing storage through those ports. In sum, this refinement greatly improves manageability.

This switch-based out-of-band approach is also much more amenable to a "scale-out" scaling strategy. Because the vast majority of I/O processing (and, thus, the capacity of the implementation) is handled directly by the port-level processor in the intelligent switch, when increased scale is required, all that needs to be done is add more processors. This can be accomplished by adding another switch to the fabric or by adding another processing blade to an existing switch. The additional processors are still managed by the same metadata server, which does not need to scale nearly as often, as it does not handle I/O traffic, but rather only manages the metadata across the ports. In short, this architecture is theoretically capable of scaling to very large configurations, the kind of scale that will be needed to extend the benefits of storage virtualization across today's largest data centers.
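
The switch-based variant can be sketched in the same spirit: the metadata server pushes mappings to the intelligent ports, each port translates addresses on its own, and scaling out amounts to registering more ports with the same server. The code below is a toy model under those assumptions; IntelligentPort, FabricMetadataServer and the tuple layout of an extent are hypothetical and do not describe a real switch interface.

```python
class IntelligentPort:
    """Stands in for a switch port with its own port-level processor."""

    def __init__(self, name):
        self.name = name
        self._maps = {}  # volume -> extents, as last pushed by the server

    def load_mapping(self, volume, extents):
        self._maps[volume] = list(extents)

    def redirect(self, volume, lba):
        """Translate a virtual block to (array, physical block)."""
        for virtual_start, length, array, array_start in self._maps[volume]:
            if virtual_start <= lba < virtual_start + length:
                return array, array_start + (lba - virtual_start)
        raise ValueError("block is not mapped")


class FabricMetadataServer:
    """Manages mappings across ports; it is not in the I/O path."""

    def __init__(self):
        self._ports = []
        self._maps = {}

    def add_port(self, port):
        # Scale-out step: a new port receives every existing mapping.
        self._ports.append(port)
        for volume, extents in self._maps.items():
            port.load_mapping(volume, extents)

    def set_volume(self, volume, extents):
        # A mapping change is pushed to every port, never to the hosts.
        self._maps[volume] = list(extents)
        for port in self._ports:
            port.load_mapping(volume, extents)
```

Note that no host-side object appears anywhere in this sketch, which is the point of the refinement: hosts simply issue I/O to the virtual volume, and whichever port they pass through performs the redirect.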

Conclusion

There are a number of approaches to storage virtualization, each with its own attributes. As we have shown, architecture can be a key determinant of a storage virtualization solution's manageability, scale and, ultimately, value to its adopter. A full understanding of a solution's architecture should be a key consideration for any potential adopter of this technology.

Mark Lewis is executive vice president and chief development officer at EMC Corporation (Hopkinton, MA).

www.emc.com

Please note: This is the second article in Lewis' three-part series on virtualization. We will return to this question and pose several more that should be asked of any potential virtualization vendor in the next and final article in this series.
COPYRIGHT 2005 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2005, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Title Annotation:Storage Management
Author:Lewis, Mark
Publication:Computer Technology Review
Date:Jun 1, 2005
Words:1436