
Architecting a tiered data center: simple fundamentals bring great returns.

With the wide range of storage choices today and an ever-growing selection of vendor products, the appeal of building a tiered storage strategy lies in the potential savings that come from the cost-effective use of different types of storage.

Classifying storage resources into particular tiers is only the first step toward driving value from a tiered storage strategy. The real payoff comes from coordinating those storage tiers with data management strategies, and from a network infrastructure capable of delivering the corresponding network service levels. This article addresses some basic strategies for establishing storage tiers (even small SANs can benefit) and for maintaining those tiers with the methodologies and supporting technologies that help reap the value.

SANs don't have to be massive to benefit from a tiered storage strategy. Even the smallest SANs can benefit--especially when the strategy is implemented early and maintained as storage volume grows over time. The earlier a discipline is established, the easier it is to implement and maintain, so consider mapping out a strategy well ahead of the necessary deployment. Many customer proposals anticipate not just storage volume but additional storage tiers as the solution matures. This not only eases the migration when the time comes, but helps justify the up-front costs by showing the returns from storage tiers in future years.

When considering a tiered storage strategy, for either a new or an existing SAN, it helps to organize the coordinating efforts needed to use the tiers effectively. Tiered storage on its own won't drive much value unless it's coordinated with corresponding data and network service classifications, all maintained according to rules established from business-level requirements.

Step 1: Classifying Data

As with most technology solutions, much of the success is achieved in the planning stage, where forethought goes a long way toward accomplishing the project goals. A tiered storage strategy is no different, and a classification framework is needed before particular policies can be applied. The best results come from a classification of SAN assets that is simple enough to apply; too much detail makes the scheme too complicated to enforce. Starting with the data itself is often an effective way to establish a baseline on which other assets are aligned. With so many data types out there, both structured and unstructured, creating categories by which the data is classified helps organize it (at least in theory, if not in practice). All corporate data is, in principle, created and modified in the interest of the business, and may therefore be classified based on how it serves the organization.

For a long time, the simple designation of "mission-critical" meant that a data source and its supporting infrastructure were built and maintained with particular levels of performance and fault tolerance. While the term "mission-critical" will always apply and indicate the utmost in data protection and availability, a more stratified system of data classification lets us use all the variables across the data center to coordinate the most cost-effective resources. A good way to classify data is by the business's tolerance for loss of data availability. To quantify data importance, try asking: "How long can the organization do without access to this information or application?" This might lead to categories such as "continuous", "near-continuous", "reliable" and "deferrable", with qualifications for each based on how long it would take to recover lost data. Simply creating the categories and giving them some definition sets up the structure on which management practices may be built.
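To make this concrete, the categories above could be encoded as a small lookup keyed by tolerable downtime. This is only an illustrative sketch in Python; the threshold values are assumptions, not recommendations--real tolerances must come from the business.

```python
# Sketch of an availability-based data classification.
# The downtime thresholds are illustrative assumptions; real values
# must come from business-level requirements.

DOWNTIME_TOLERANCE_HOURS = {
    "continuous": 0,        # no tolerable outage
    "near-continuous": 1,   # up to an hour
    "reliable": 24,         # next business day
    "deferrable": 72,       # several days acceptable
}

def classify(max_outage_hours: float) -> str:
    """Return the strictest category whose tolerance covers the outage."""
    for category, tolerance in DOWNTIME_TOLERANCE_HOURS.items():
        if max_outage_hours <= tolerance:
            return category
    return "deferrable"
```

For example, `classify(0.5)` falls into "near-continuous", while anything the business can live without for more than 72 hours lands in "deferrable".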

Step 2: Classifying Resources

With some categories defined around business-level requirements, the next step is to associate resources with each data classification from each of the SAN asset groups, including servers, storage and the network. Servers vary in processing power, memory configuration, operating system, and a number of other characteristics that drive performance and reliability. These characteristics are typically shaped by the application the servers support. One of the key variables to consider when assigning servers to a particular category is the type of configuration. The higher categories may demand an active/active cluster configuration with adapters multi-homed to redundant fabrics, while servers supporting less sensitive applications may not need clustering or multi-path support.
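As a sketch, the configuration variables just mentioned can be captured in a per-category server profile. The specific settings below (cluster modes, fabric path counts) are hypothetical examples, not prescriptions:

```python
# Hypothetical server-configuration matrix per data category.
# Category names follow the classification scheme from Step 1;
# the settings themselves are illustrative assumptions.

SERVER_PROFILE = {
    "continuous":      {"cluster": "active/active",  "fabric_paths": 2},
    "near-continuous": {"cluster": "active/passive", "fabric_paths": 2},
    "reliable":        {"cluster": None,             "fabric_paths": 1},
    "deferrable":      {"cluster": None,             "fabric_paths": 1},
}

def requires_multipath(data_class: str) -> bool:
    """Multi-homing to redundant fabrics is only demanded at the top tiers."""
    return SERVER_PROFILE[data_class]["fabric_paths"] > 1
```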

Storage subsystems typically get the most attention when classifying resources in a tiered data center, likely because of the disparity among solutions across a broad range of price/performance metrics. With several data storage classifications, the goal is to make appropriate investments based on the particular need. Because storage solutions vary so widely--in disk drive types, protocols and supporting tools--it's common to start with only a couple of classifications. Even that coarse split yields a cost-effective separation into at least two tiers. Disk vs. tape, for example, is an appropriate first step toward classifying data storage based on the accessibility the user or application requires. Whatever classification is most appropriate at the outset, don't hesitate to identify future storage tiers that could round out an effective model for overall data management. That may help justify the purchase of a relatively expensive high-performance array when the longer-term plan includes augmenting it with less expensive storage that brings down the total cost of storage ownership over a two- to three-year period.
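The disk-vs.-tape starting point might be sketched as follows; the per-GB costs are made-up figures used only to show how even a two-tier split feeds a cost model:

```python
# Minimal two-tier (disk vs. tape) placement and cost sketch.
# Costs per GB are invented for illustration only.

STORAGE_TIERS = {
    "disk": {"cost_per_gb": 5.00, "access": "online"},
    "tape": {"cost_per_gb": 0.50, "access": "offline"},
}

def place(data_class: str) -> str:
    """Keep access-sensitive classes on disk; defer the rest to tape."""
    return "disk" if data_class in ("continuous", "near-continuous") else "tape"

def monthly_cost(gb_by_class: dict) -> float:
    """Total monthly storage cost given GB volumes per data class."""
    return sum(STORAGE_TIERS[place(cls)]["cost_per_gb"] * gb
               for cls, gb in gb_by_class.items())
```

Even with invented numbers, a model like this makes the savings from pushing "deferrable" data to the cheap tier easy to show in a proposal.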

A typically overlooked resource requiring classification is the network infrastructure connecting all these resources. An important consideration is that network capacity, measured as throughput in MB per second, is a fixed resource just as disk space and CPU cycles are. If data is categorized and supported by specific resources at specific times in its lifecycle, then the service that data receives from the network should be consistent with the program. With trends such as storage virtualization and network-based storage services, the infrastructure itself is responsible for carrying out many concurrent tasks, all fulfilling different data center requirements.

As we consolidate, virtualize, and further abstract storage management functions, it becomes increasingly important to protect the necessary network services for time- and performance-sensitive applications. Make sure the network supports variable service levels that align with the storage tiers, and that the bandwidth controls are dynamic enough to flex with changing priorities over time.
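One way to picture such variable, tier-aligned network service is a share-based allocation on a common link. The link speed and the shares below are assumptions for the sketch, not vendor figures:

```python
# Share-based bandwidth allocation sketch on a hypothetical 400 MB/s link.
# Per-tier shares are illustrative assumptions.

LINK_MBPS = 400

TIER_SHARE = {
    "continuous": 0.50,
    "near-continuous": 0.30,
    "reliable": 0.15,
    "deferrable": 0.05,
}

def bandwidth_mbps(tier: str, burst: bool = False) -> float:
    """Steady-state allocation; 'burst' doubles the share (capped at the
    full link) for acute needs such as an urgent recovery or migration."""
    share = TIER_SHARE[tier] * (2 if burst else 1)
    return min(share, 1.0) * LINK_MBPS
```

The burst flag is the kind of dynamic control the text calls for: a temporary reprioritization that doesn't disturb the steady-state shares.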

Step 3: Data Management and Migration

Once there is at least a framework for classifying data and the resources that correspond to each tier, the potential for real payoff exists. All the careful arrangement of variable resources doesn't return the investment until those resources are used in conjunction with one another, based on policies for managing data across the whole storage profile. This is where the concept of Information Lifecycle Management (ILM) comes into play.

While the storage tiers stay consistent (though likely to grow in capacity), it's the data itself that migrates from one tier to another as administrators try to keep the right data on the most appropriate resource at the right times. This process is in perpetual motion, which is why rules are needed to govern how data is migrated. One of the most basic rules is time-based: "After 120 days, offload data from the pricey storage to a less costly system." In some cases, data management rules reflect particular business priorities; in many others, regulations govern how available certain data must be. As with the classification of data and storage resources, the rules that govern how the two work together must be simple enough to apply and relevant enough to enable cost savings. Don't try to achieve too much at the outset. It's far easier to start with a few classifications and rules and enhance the solution over time than to configure and manage a comprehensive solution out of the gate.
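The 120-day rule above can be expressed as a simple policy check. The dataset record fields (`name`, `tier`, `last_modified`) are hypothetical names for the sketch:

```python
# Time-based migration sketch: flag tier-1 data untouched for 120 days
# (the rule quoted in the text); record field names are assumptions.

from datetime import date, timedelta

MIGRATION_AGE_DAYS = 120

def migration_candidates(datasets, today):
    """Return names of tier-1 datasets older than the age threshold."""
    cutoff = today - timedelta(days=MIGRATION_AGE_DAYS)
    return [d["name"] for d in datasets
            if d["tier"] == 1 and d["last_modified"] < cutoff]
```

A candidate list like this would then feed whatever copy or hierarchical storage mechanism actually moves the data.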

Theoretically, the more automated the data movement, the higher the value achieved. There are very comprehensive software tools that will assist with data management, but all automation must first be based on rules set out by the business. Again, start with some simple automation of the most regular housekeeping tasks, and then build in more automation as the processes mature.

One way enterprises achieve success through the gradual maturing of a tiered data center is to base all device interconnectivity on the same utility network infrastructure. With universal network availability among devices, classifications and rules for data management won't be bound by connectivity. So follow the same principles here: invest in a network solution that can scale, and that can support corresponding levels of network service depending on both the data and resource classification it serves and the particular rule in force at the time. The same network connection should be capable of providing variable levels of service (MB/s) to uphold the role of the tier. Variable levels of network service also help when a change is necessary to meet an acute business need. While steady-state rules and processes are the norm, consider the ability to respond quickly to a need to recover, migrate or prioritize particular data sets. The network can be the variable used to react non-intrusively to temporary change without disrupting the organization of the overall solution.

Next Steps

With a tiered structure laid out, and some processes in place for how to manage the lifecycle of data across those tiers, return will come from buying the most cost-effective resources based on a particular need. More advanced value can be achieved through the use of Service Level Agreements (SLAs) and accounting systems that take the tiered structure and represent it back to the business as a set of service options coupled with particular costs to the organization. The greater the visibility into resource utilization, and the greater the control to allocate those resources, the more effectively service, accounting and billing functions can be applied.
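A chargeback model of that kind could be sketched as below; the service-class names and rates are invented purely to illustrate representing tiers back to the business as priced options:

```python
# Hypothetical chargeback sketch: tiers surfaced to the business as
# priced service classes. Rates ($/GB/month) are invented figures.

SERVICE_RATES = {
    "gold": 6.00,
    "silver": 2.00,
    "bronze": 0.40,
}

def monthly_bill(usage_gb_by_class: dict) -> dict:
    """Per-class charges plus a total for one department's usage."""
    lines = {cls: gb * SERVICE_RATES[cls]
             for cls, gb in usage_gb_by_class.items()}
    lines["total"] = sum(lines.values())
    return lines
```

Surfacing the bill per service class gives departments the visibility into resource utilization that makes accounting and billing functions effective.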

In summary, tiered storage is a concept for cost-effective data storage and management. Even the most basic implementations, with a few simple rules, can return savings, with the opportunity to refine the operation for greater results over time. When evaluating and proposing components for the data center, consider how well they will contribute to the overall solution. You'll be rewarded when the data center becomes a driving force behind the business.

Eric Blonda is director of product marketing for Sandial Systems, Inc. (Portsmouth, NH)
COPYRIGHT 2004 West World Productions, Inc.

Article Details
Title Annotation: Storage Management
Author: Blonda, Eric
Publication: Computer Technology Review
Geographic Code: 1USA
Date: Mar 1, 2004

