
Storage too complex, your customers say: listen to them.

It's good that we are now hearing a lot more about where the IT community is having 'storage pain' or problems with its storage infrastructure. This is healthy introspection. Yet, on one hand, I'm concerned that the information being collected and talked about is misleading and potentially distracting; on the other, I've researched this area for the last 14 years and see an important shift taking place that must become broadly recognized.

From 1990 to 2000, I reported multiple times annually on the top storage problems based on surveys and custom research. Listening to IT professionals, we consistently categorized the top problems during this period as managing disk space, running out of disk space, hard disk errors, and capacity planning, with the ubiquitous backup demon always holding the number two spot. It wasn't until the early 2000s that I began hearing these pleas in a different context, as "storage complexity". As operating environments continue to consolidate and centralize, they get bigger. The operating problem has grown from a small number of servers and network devices to hundreds of thousands of unique networked elements in large datacenters. Yes, this is also what we call a datacenter operations complexity problem, not just a storage problem. But let's talk about the smaller storage complexity dilemma first.

Today, the number one storage complaint (in frequency and in money allocated to fix it) is described by customers as "adding disk space". What does that mean? Is it hard to add disks? No, of course not. So, when you dig into this complaint you begin to hear the real issues--which is always the case in research: the surface data is usually misleading and not the real issue. There are many examples of this in recent history. In 1992, I began publishing the first large-scale research on the Cost of Managing Storage, which included IT operations downtime from storage sources. At that time, the number one storage downtime problem was initially reported by IT administrators as "processor failures"--which didn't make sense. So, digging in, we discovered that "processor failures" meant, "Oh, those are my disk drives," and that a disk failure had a profound effect on servers, operations, and costs. Until about 1998, disk failures and errors were very high on the problem list. This problem has now faded into ancient history with the broad adoption of RAID. (RAID was a classic solution to a complexity problem. The solution wasn't to make disk drives more reliable, but to change the architecture to transparently cope with disk reliability problems.)

In the late 1990s, we began hearing words from IT that caused many to think provisioning and storage management were the top problems. The industry responded with solutions such as virtualization and a plethora of point management tools such as SRM and auto-provisioning. Have these solved the problem? No; all of them are point solutions. Unfortunately, these vendors didn't hear the core problems. Why is that? I can only suggest that people didn't really study how IT operated. For example, it wasn't until 2002 that we even recognized that Visio drawings and Excel spreadsheets were (and still are) the top management tools in the datacenter, because we weren't asking the right questions. (And a datacenter today averages 20-30 management tools.) It is not about "How does IT manage storage?" but "How does IT operate storage?" and "What are its operating practices?" The differences are profound, and the process workflow is usually kept in a spreadsheet, hence its dominance.

Today's top storage problems are similar. I'm placing them into a category I call "Storage Complexity" on purpose, because we need to look at the solutions to this category of problems in an operations context. IT doesn't naturally use this term, so we're only just now beginning to hear it; but listen closely to what they say. When told that the big problem is some variant of "adding disk space," ask "What does that mean to you?" and "Why is it hard to add disk?" They will walk back the statement and recite a string of problems and difficulties that flow from the continual fires and the relentless process of expanding disk capacity. Figure 1 gives some examples of how this pain gets voiced.


I define storage complexity as the chaos of owning and operating thousands of storage elements. That sounds too vague to address, but that is exactly the point. We need a new approach to solving complexity problems. Just as RAID solved the "disk error" problem of the early '90s, and 'data protection' will solve the backup problem that persists, so too will 'workflow-based process automation' (operations management) solutions pave the way to solving the storage complexity problem of the 2000s. Why? First, look at what makes up storage complexity. It contains the components listed in Figure 2 and more. The process generally goes "add disk, then provision it (those 40-100 steps can take weeks of planned downtime), expand backup, maintain existing systems and operations while doing all this, establish data management services, deal with staffing shortages and the expertise needed to do this work, load balance and tune the systems and applications, zone, recover and reallocate unused space," and on and on it goes. These are complex operational processes that cross all boundaries, including business and ownership domains, and they are not addressed by a toolkit of traditional management tools. What happens if and when you skip a step such as expanding the backup?
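To make that last question concrete, here is a minimal sketch of the idea behind workflow-based process automation: model the expansion process as ordered steps with explicit prerequisites, so that a skipped step such as expanding the backup is caught before the change goes live. The step names and the WorkflowStep class below are hypothetical illustrations, not any particular product's workflow engine, and the five steps stand in for the 40-100 real ones.

from dataclasses import dataclass, field

@dataclass
class WorkflowStep:
    """One step in a capacity-expansion workflow (names are illustrative)."""
    name: str
    requires: list = field(default_factory=list)  # steps that must run first
    done: bool = False

# A drastically simplified stand-in for the real "add disk space" process.
steps = {
    "add_disk":      WorkflowStep("add_disk"),
    "provision":     WorkflowStep("provision", requires=["add_disk"]),
    "zone":          WorkflowStep("zone", requires=["provision"]),
    "expand_backup": WorkflowStep("expand_backup", requires=["provision"]),
    "go_live":       WorkflowStep("go_live", requires=["zone", "expand_backup"]),
}

def run(step_name: str) -> None:
    """Refuse to run a step whose prerequisites have been skipped."""
    step = steps[step_name]
    missing = [r for r in step.requires if not steps[r].done]
    if missing:
        raise RuntimeError(f"Cannot run '{step_name}': skipped steps {missing}")
    step.done = True
    print(f"completed: {step_name}")

# The administrator forgets to expand backup; the workflow catches it
# instead of silently leaving the new capacity unprotected.
for name in ["add_disk", "provision", "zone", "go_live"]:
    try:
        run(name)
    except RuntimeError as err:
        print(err)

The point is not the code itself but the operating model: the process, not the individual tool, is the unit that gets automated and enforced.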

Why is storage complexity an important concept? I reiterate, because the solution to storage complexity requires a new approach based on the application of a principle we haven't done well with in this industry.

Complexity Principle

If you want to solve a complexity problem, stop doing it! It seems blasphemous to shout, "Stop backing up!" or "Stop over-provisioning!", but these are correct principles, and they imply that appropriate alternative steps are taken. The solutions to storage complexity problems often begin with a professional services assessment and consulting engagement to develop a new approach to operations simplification. Don't get trapped into deploying a point solution or a couple of tools. Look instead at broad approaches to simplifying your operations and making them "on demand". "On demand" means an environment in which IT administrators can run custom but standard automated processes when needed, or in which the system operates on preset thresholds and policies and essentially adjusts itself--on demand.

Here is an example of how this approach works. An authorized storage administrator sets up standard, pre-authorized (probably workflow-driven) processes based on specific business policies and makes them available for use by other application or IT administrators. Processes can include utilization of hosts, network and storage elements, services, time, people, handoffs, workflow, and authorizations. Processes are sets of standard, on-demand operations, allowing any administrator or automated process to select from a pre-authorized set of storage services or practices. For example, virtualized or pooled storage operates this way, albeit inside a controlled domain. Data protection replaces backup just as RAID replaced JBOD and solved the disk error and failure problems. Data protection, correctly implemented, transparently creates an environment in which the data is "always there" and available for use, on demand.
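As a rough illustration of that pattern, the sketch below shows a pre-authorized service catalog that both a human administrator and a threshold-driven policy can invoke on demand. Everything in it is an assumption made for illustration: the StorageService class, the expand_volume service, the role names, the volume name, and the 85% utilization threshold are invented, not any vendor's API.

from dataclasses import dataclass
from typing import Callable, Dict, Set

@dataclass
class StorageService:
    """A pre-authorized, policy-bound storage operation (illustrative only)."""
    name: str
    authorized_roles: Set[str]
    action: Callable[[str, int], str]

def expand_volume(volume: str, gigabytes: int) -> str:
    # Stand-in for the real workflow-driven expansion process.
    return f"expanded {volume} by {gigabytes} GB"

# The storage administrator publishes the catalog once, under business policy.
catalog: Dict[str, StorageService] = {
    "expand_volume": StorageService(
        name="expand_volume",
        authorized_roles={"app_admin", "storage_admin", "policy_engine"},
        action=expand_volume,
    ),
}

def request(service: str, role: str, volume: str, gigabytes: int) -> str:
    """Any authorized administrator or automated policy invokes a service on demand."""
    svc = catalog[service]
    if role not in svc.authorized_roles:
        raise PermissionError(f"{role} is not authorized for {service}")
    return svc.action(volume, gigabytes)

# On-demand use by an application administrator:
print(request("expand_volume", "app_admin", "vol_payroll", 50))

# Policy-driven use: a utilization threshold triggers the same pre-authorized service.
UTILIZATION_THRESHOLD = 0.85
observed_utilization = 0.91   # e.g. a value reported by monitoring
if observed_utilization > UTILIZATION_THRESHOLD:
    print(request("expand_volume", "policy_engine", "vol_payroll", 50))

The design choice to note is that authorization and policy live in the catalog, not in each administrator's head, which is what lets the environment "adjust itself" without a change-control fire drill.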

You have the idea by now. Start thinking about the customer's world through an operations filter and you'll see it differently. The more we start talking about storage complexity, the sooner we will transform IT operations because it will take market education and market development to make this change. Unfortunately, there is a paradoxical barrier we have to overcome. It has long been IT's job to hide and mask complexity from the 'boss'. To admit that you had downtime was perilous--it was safer to hide things. Now, in the 'new economy' and with new operating rules, IT has to expose the costs of operations. Complexity has to be exposed as well, if we are to overcome it.

This goal of overcoming complexity is also what the SNIA's Data Management Forum has chosen as its Vision for ILM. We've defined ILM as:

"Information Lifecycle Management is comprised of the policies, processes, practices, and tools used to align the business value of information with the most appropriate and cost effective IT infrastructure from the time information is conceived through its final disposition. Information is aligned with business processes through management of policies and service levels associated with applications, metadata, information, and data."

This means a whole new way of operating the datacenter based on the value of information. The intent is ultimately to "stop managing". We also need to stop managing storage in order to solve the storage complexity problem. Start working with this concept. Where to start? Begin by really solving your customers' top pain problems--but listen carefully.
Figure 2

COMPONENTS OF STORAGE COMPLEXITY

What is Storage Complexity?
Percent of Reported Problems, Multiple Responses

  Capacity Growth       20.3%
  Process Complexity    17.9%
  Maintenance           17.1%
  Backup (B/U)           9.2%
  Disk Mgmt.             6.4%
  Data Mgmt. Services    6.0%
  Flexibility            6.0%
  Staff Shortages        4.8%
  TCO                    4.8%
  Other                  4.8%
  Disk Performance       2.8%

Strategic Research Corp.

Note: Table made from bar graph.


www.sresearch.com

www.snia.org/dmf

Figure 1

WHAT IS STORAGE COMPLEXITY?

Complexity means customers need a combination of solutions and professional services to solve a set of large-scale problems related to their storage infrastructure and storage operations.

* I work for a conglomeration that has a patchwork infrastructure and we spend too much time putting out fires.

* Many things, including server consolidation (minimize # of vendors and then ease the B/U situation, i.e. time for B/U), planning and executing DR plans, strategic planning for growth and archiving of data

* More efficient use of our storage, many pools of unused storage out there in different environments. Getting replication to work across data-centers reliably. It's not working the way it's purported to. $2M budgeted

* Management overload--with heterogeneous storage, not able to take advantage of vertical storage sub-ports; wasting storage = poor capacity utilization, we bought more than we can use; B/U & restore, no consistent policy so we can't always recover within the SLA periods. $2-3M budgeted

Strategic Research Corp.

Michael Peterson is program director of SNIA's Data Management Forum, as well as president and senior analyst of Strategic Research Corp. (Santa Barbara, CA)
