Tiered storage: does all data have to fly first class?
Tiered storage can involve one or more strategies:
* Provisioning cheap vs. expensive disk, based on availability or performance needs
* Allocating different levels of data protection, based on recovery requirements
* Migrating data to different classes of storage as requirements evolve, i.e., information lifecycle management (ILM).
Each of these strategies comes with its own implementation issues and tradeoffs among savings, operational complexity, labor cost and risk.
Matching storage price/performance to application need has a very straightforward value proposition. Placing low-value data on inexpensive disk, such as ATA devices, can carry a hardware cost of less than $7 per gigabyte, compared with high-speed, high-reliability arrays that can run more than fifteen times as much. Allocating storage to applications from different device types is also a relatively straightforward management proposition from a policy compliance perspective. But cheap versus expensive is a strategy best suited to larger enterprises with enough data in different classes to justify the expense of acquiring different classes of storage devices, and of training staff on the different tools required to manage them.
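The arithmetic behind that value proposition is easy to sketch. The per-gigabyte figures below come from the paragraph above; the total capacity and the fraction of low-value data are hypothetical examples.

```python
# Illustrative hardware savings from placing low-value data on
# ATA-class disk instead of provisioning everything on enterprise
# arrays. Cost figures are from the article; data volumes are made up.

ATA_COST_PER_GB = 7.00                # low-cost ATA-class disk
ENTERPRISE_COST_PER_GB = 7.00 * 15    # high-speed, high-reliability array

def tiering_savings(total_gb: float, low_value_fraction: float) -> float:
    """Hardware savings versus an all-enterprise-array configuration."""
    all_enterprise = total_gb * ENTERPRISE_COST_PER_GB
    tiered = (total_gb * low_value_fraction * ATA_COST_PER_GB
              + total_gb * (1 - low_value_fraction) * ENTERPRISE_COST_PER_GB)
    return all_enterprise - tiered

# 10 TB of storage, 40% of it low-value data
print(round(tiering_savings(10_000, 0.4)))  # prints 392000
```

With 40% of a 10 TB estate on ATA disk, hardware savings approach $400,000 at these prices, which is why the strategy pays off only when the low-value pool is big enough to justify the second device class.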
Provisioning different levels of data protection offers perhaps the most widely applicable and cost-effective tiering strategy. Not all data requires mirroring or local or remote replication to meet availability and recovery requirements, yet arrays in most IT departments are configured to deliver at least mirrored, if not replicated, storage for all data. One storage administrator joked to me that this was his "make a copy of this before you throw it away" policy. Configuring LUNs with different levels of data protection and allocating them to applications based on business requirements can save as much as 30% compared with uniformly provisioned, high-performance storage capacity. But managing different levels of protection is a labor-intensive process, and tight policy control is required to ensure that critical business data is never under-provisioned.
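A back-of-the-envelope calculation shows where savings on that order come from. The protection multipliers (raw capacity consumed per usable gigabyte) and the data mix below are illustrative assumptions, not figures from any particular array.

```python
# Sketch: raw capacity consumed when every dataset gets top-tier
# protection versus a tiered mix. Multipliers and mix are assumptions.

PROTECTION_OVERHEAD = {
    "mirrored+replicated": 4.0,   # local mirror plus a mirrored remote copy
    "mirrored": 2.0,              # local mirror only
    "parity": 1.25,               # e.g., RAID-5-style protection
}

def raw_capacity(mix: dict, usable_gb: float) -> float:
    """Raw GB consumed when each fraction of the data gets its tier."""
    return sum(usable_gb * frac * PROTECTION_OVERHEAD[tier]
               for tier, frac in mix.items())

everything_top_tier = {"mirrored+replicated": 1.0}
tiered_mix = {"mirrored+replicated": 0.5, "mirrored": 0.3, "parity": 0.2}

baseline = raw_capacity(everything_top_tier, 10_000)
with_tiers = raw_capacity(tiered_mix, 10_000)
print(f"savings: {1 - with_tiers / baseline:.0%}")  # prints "savings: 29%"
```

Even with half the data still on the top protection tier, matching protection to requirement recovers capacity on roughly the scale the 30% figure suggests.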
Finally, migrating data from high-performance, high-availability storage to lower cost media as the data becomes less critical can further optimize asset utilization. However, this strategy assumes that tiering policies and practices for initial provisioning are in place, and that companies have a firm grasp on the metrics that determine when data evolves to require a lesser tier of storage service. More importantly, if data is migrated, what circumstances will dictate its return to "critical" status, and can re-provisioning be accomplished in a compliant fashion?
Increasing pressure to improve IT cost structure, coupled with the emergence of process automation tools that simplify operations and enforce policy compliance for service delivery, now presents a more practical risk-reward equation for tiered storage. Software to automate and control the process of storage service delivery differs from traditional monitoring tools by allowing IT departments to package the systems operations, human workflow and policies required to deliver a specific class of storage to a specific application--then deploy these "packaged services" as a repeatable, software-automated process. Process automation software can reduce the labor and complexity involved in supporting multiple tiers of storage service. More importantly, these tools effectively bind service tiers to applications, enforcing compliant service delivery. As the gap in price between classes of storage technology continues to widen, the justification for implementing a tiered storage strategy is becoming more compelling. But most IT departments today don't have documented processes and policies for service delivery. Many that do still struggle with enforcing and measuring operational compliance. And most automation has been implemented through scripting of practices and policies that have evolved in the context of a manual storage supply chain. From this starting point, how does one evolve to tiered storage services without placing the company's data at risk, or burying already shorthanded staff in a project heavy on documentation and light on implementation?
A practical tiered-storage strategy is best pursued incrementally: defining data classes and their associated tiers of storage service, then implementing policies and service delivery processes--one class of data at a time. This "crawl, walk, run" approach provides a low-risk, non-disruptive way to demonstrate immediate results while continuing to learn, adapt and improve. I break the initiative down into three phases: service mapping, process mapping, and implementation. No matter which spin on these steps is right for you, always think big, start small and scale fast.
Service mapping defines the business requirements for your data, the classes of storage service that meet those needs, and a policy framework to associate one with the other. Data should be classified according to its requirements for availability, application performance, protection, recovery time, and retention time. Service tiers, or classes, should be defined to match these needs. The number of tiers--that is, how finely service classes are matched to application requirements--varies. But as a general rule, it is best to start with a few (e.g., gold, silver, bronze) and expand the variety as you gain experience.
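One way to make a service-mapping policy concrete is as a simple table: data classes described by their requirements, and a rule that associates each with a tier. The class names, thresholds, and tier definitions below are hypothetical examples, not a prescribed taxonomy.

```python
# A minimal sketch of a service-mapping policy framework.
# Class names, thresholds, and tier contents are illustrative.
from dataclasses import dataclass

@dataclass
class DataClass:
    name: str
    availability: str     # "high", "medium", or "low"
    recovery_hours: int   # maximum tolerable recovery time
    retention_years: int

def assign_tier(d: DataClass) -> str:
    """Map a data class to a gold/silver/bronze service tier."""
    if d.availability == "high" and d.recovery_hours <= 1:
        return "gold"     # mirrored, remotely replicated, fast arrays
    if d.availability == "medium" or d.recovery_hours <= 24:
        return "silver"   # mirrored, tape backup
    return "bronze"       # ATA-class disk, tape backup only

orders = DataClass("order-entry", "high", 1, 7)
archive = DataClass("email-archive", "low", 72, 7)
print(assign_tier(orders), assign_tier(archive))  # prints "gold bronze"
```

Starting with three coarse tiers keeps the policy table small enough to enforce; finer-grained rules can be added to the function as experience accumulates.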
Process mapping defines the operational workflow, policy and systems operations required to ultimately match and deliver classes of storage to your data. These processes include initial provisioning, ongoing capacity and data management, as well as migration processes as data requirements evolve. Start with a process for a single data class for an application, and its respective storage service tier.
Map the end-to-end service delivery process--human workflow, system operations, and approval chain--all the way from the application (or service consumer) to the supporting infrastructure. Limiting your view of tiered storage to the SAN infrastructure ensures that LUNs of certain characteristics are "available," but misses the whole point of certifying that the right application actually receives its requested service in a compliant fashion. The end-to-end view typically involves operations performed not only by storage administrators but by systems, database and backup administrators as well. For example, adding space to a database on a new file system with mirrored storage, remote replication and tape backup is a process that spans all these departments, in multiple locations. Mapping all the local and remote operations, as well as the workflow and interaction between administrators, is the first step toward achieving compliant service delivery.
Define the control model: Who, specifically, is involved? Who will have control over policy definition and process execution, and for what infrastructure and operations in the service path? What policies do you want "fixed," or system-enforced, and which ones will humans make? What are the response times and escalation processes for exception and error handling? The service delivery process and control model not only provide a detailed definition of how service delivery objectives are met, but should also serve as a baseline for process improvement throughout the implementation phase.
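One way to capture both the process map and the control model is to record, for each step, its owner, whether its policy is system-enforced or left to human judgment, and where exceptions escalate. The steps and roles below are illustrative, loosely following the database-expansion example above.

```python
# Sketch of a mapped service-delivery process with its control model.
# Steps, owners, and escalation paths are hypothetical examples.
from dataclasses import dataclass

@dataclass
class Step:
    action: str
    owner: str             # which administration group performs it
    system_enforced: bool  # fixed (system-enforced) policy vs. human decision
    escalate_to: str       # who handles exceptions at this step

ADD_DB_SPACE = [
    Step("approve capacity request", "storage manager", False, "IT director"),
    Step("allocate mirrored LUN",    "storage admin",   True,  "storage manager"),
    Step("create file system",       "systems admin",   True,  "storage manager"),
    Step("extend database",          "database admin",  False, "DBA lead"),
    Step("add to backup schedule",   "backup admin",    True,  "storage manager"),
]

# The control model falls out directly: which decisions remain human?
human_steps = [s.action for s in ADD_DB_SPACE if not s.system_enforced]
print(human_steps)  # prints ['approve capacity request', 'extend database']
```

Writing the process down in this form also produces the baseline the implementation phase audits against: every step has a named owner and an explicit enforcement mode.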
In the implementation phase, start with automating management of the administrative workflow for a single end-to-end process. This is a simple non-disruptive approach to begin to enforce compliant practice, without jumping headlong into lights-out automation. Over time, incrementally integrate automated system operations into the workflow to offload human tasks to the system. With each iteration, it's important to audit performance against the baseline. What requests were serviced, what were response times, where were problems and exceptions encountered? This discipline will enable you to target areas and priorities for process improvement, as well as evolve policies to capitalize on improved control over the service delivery process.
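The audit questions above (what was serviced, how fast, where did exceptions occur) lend themselves to a simple computation over a request log. The log entries and the baseline threshold below are made-up examples.

```python
# Sketch of auditing service delivery against a baseline.
# Request data and the baseline target are illustrative assumptions.
from statistics import mean

# (request id, hours to fulfill, completed without exception?)
requests = [
    ("req-101",  4.0, True),
    ("req-102", 30.0, False),   # missed target; manual rework required
    ("req-103",  6.5, True),
    ("req-104",  8.0, True),
]

BASELINE_HOURS = 12.0  # response-time target from the process-mapping phase

avg = mean(t for _, t, _ in requests)
exceptions = [rid for rid, _, ok in requests if not ok]
late = [rid for rid, t, _ in requests if t > BASELINE_HOURS]

print(f"avg response: {avg:.1f}h, late: {late}, exceptions: {exceptions}")
```

Reviewing output like this after each iteration points directly at the steps to automate next--here, whatever stalled req-102 is the obvious candidate.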
So, does all data have to fly first class? Certainly not. Storage price/performance options and process automation software have evolved to the point where there is a compelling rationale for pursuing tiered storage strategies. Start with operational practices that tier classes of storage devices or levels of data protection. Enterprise-wide ILM implementations are not yet a practical reality, but the best way to achieve short-term results and get from here to there is to phase in process and policy incrementally--and continue to evolve.
Tad Lebeck is CTO of Invio Software (Los Altos, CA)
Title Annotation: Storage Networking
Publication: Computer Technology Review
Date: Feb 1, 2004