Tape and Backup Issues In Storage Area Networks.


Authors' note: This article addresses tape and backup issues in storage networks, with a focus on case studies and other consulting and testing experiences from Imation's Storage Network Solutions Lab, in Oakdale, MN.

As the adoption rate of storage networks continues to grow quickly, customers face fundamental challenges with the architecture and design of the SAN infrastructure. Moreover, there are a myriad of migration challenges--from understanding and mapping what is in place today (typically SCSI-based systems) to anticipating what will be installed tomorrow (based upon the latest industry reports, most commonly Fibre Channel systems).

Disk and tape are two of the most widespread storage devices in an open systems environment. Of the two, we've found that more storage networking connectivity issues are centered around the tape environment. This is of considerable concern, as it can be argued that the biggest "killer app" that can benefit from storage networking is the backup and restore application--an application relying heavily on the interaction between disk and tape.

Backup and restore applications are the thorn in the side of nearly every IT director. Paying for them is analogous to buying home or auto insurance--you hate to pay the monthly premium, but if you don't have it when you need it, the results can be disastrous. As corporations' storage requirements continue to double or triple each year, a backup and restore application designed to be extremely cost-effective three years ago is now bursting at the seams trying to accommodate these new storage volumes. If that backup infrastructure is primarily LAN-based, potentially over a shared 10Mbit Ethernet, it is not uncommon to hear stories of daily backup windows ranging from 13 to 23 hours and weekly backups taking the entire weekend. Backup and restore applications are well positioned to take advantage of every business and technical benefit that storage networks can provide. Because tape is the primary medium in these applications, we continue to uncover hidden challenges of configuring tape drives in storage networks as we test and stress these applications and environments in the lab.
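
As a rough illustration of why LAN-based backup windows grow so long, the sketch below computes an upper bound on how much data fits through a link in a given window. It assumes the link runs at its full nominal rate, which a shared Ethernet will not achieve in practice; the function name and the efficiency parameter are illustrative assumptions, not figures from the lab.

```python
def max_backup_gigabytes(link_mbit_per_s, window_hours, efficiency=1.0):
    """Upper bound on data (decimal GB) that fits through a link in a window.

    efficiency is an assumed fraction of the nominal rate actually achieved.
    """
    bytes_per_second = link_mbit_per_s * 1_000_000 / 8 * efficiency
    return bytes_per_second * window_hours * 3600 / 1e9


if __name__ == "__main__":
    # 10Mbit Ethernet over the 13-hour and 23-hour windows mentioned above.
    for hours in (13, 23):
        print(hours, "hours:", round(max_backup_gigabytes(10, hours), 1), "GB at most")
```

Even at the full nominal rate, a 10Mbit link moves only about 4.5 GB per hour, so a data set that doubles each year quickly outgrows any fixed nightly window.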

Use of Disk And Tape

Before we review some lab experiences, we'd like to note the subtle differences between disk and tape and the way they have been used in IT environments in the past and today.

Regardless of the connection method, disk gained the primary attention of high availability experts. Disk, with spinning parts and moving heads, was tagged as one of the first devices with the highest potential to fail over a specified period of time (a risk commonly expressed as Mean Time Between Failures--MTBF). That failure potential drove the advent of RAID levels, RAID disk subsystems, and more robust drivers developed with a built-in level of recoverability in the event that a hard or soft error occurs, which is often expected.

While tape also has moving parts and heads, a failure of a tape drive or tape media did not get the same attention as a failed, unprotected disk drive. Applications and data resided on online disk, and if it failed your application was down. Backup data resided on offline or nearline tape, and if it failed you simply re-ran the backup job.

In today's environment most, if not all, of a company's online disk is protected in some RAID fashion. A disk failure is now an afterthought, with customer-serviceable hot-swap disk drives, global hot spares, and algorithms to predict when a disk is going to fail before it actually fails so that it can be replaced ahead of time.

However, in the backup and restore environment, a failed I/O still equates to a failed backup job. Although this job can be automatically retried, it must start at the beginning. With the volume of data being backed up today, a single failed backup job could mean not meeting the nightly backup window. There also are protection schemes for tape, such as RAIT (Redundant Array of Independent Tape) and RAIL (Redundant Array of Independent (Automated) Libraries), but these solutions are expensive and therefore uncommon.
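
The effect of a restart-from-the-beginning retry on the backup window can be sketched with simple arithmetic. The function name and the sample numbers below are hypothetical, and the sketch assumes the retry starts immediately and runs at the same speed as the original attempt.

```python
def fits_window_after_retry(job_hours, failure_at_hours, window_hours):
    """Return (fits, total_hours) for a job that fails once and reruns in full.

    The time spent before the failure is lost, because the retried job
    must start again at the beginning.
    """
    total_hours = failure_at_hours + job_hours
    return total_hours <= window_hours, total_hours


if __name__ == "__main__":
    # Hypothetical 6-hour job in a 10-hour nightly window.
    print(fits_window_after_retry(6, 2, 10))  # early failure
    print(fits_window_after_retry(6, 5, 10))  # late failure
```

Under these assumptions, the later the I/O error occurs, the more likely the rerun is to miss the window, even when the job normally finishes with hours to spare.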

With the introduction of tape sharing in storage networks, the possibility of failed I/O has increased, and with it the potential for missed backup and/or restore windows has increased as well. As customers want to maximize the value of current backup assets, most commonly SCSI-based tape drives, they need more equipment between the disk storage and the tape storage such as hubs, switches, bridge/routers, etc., to enable advanced features and functionality. All of this further increases the complexity and the potential for problems.

The IT Director's back is to the wall when looking at backup and restore. Should he or she invest in old or new technology to accommodate this explosive data growth? For reasons that drive migration to new tape technologies, customers are also migrating to a newer backup/restore storage networking architecture. While we feel that this is the right decision for the right reasons, we will discuss some of the more interesting technical challenges specific to backup and restore applications in storage networks.

CHALLENGE #1

The first backup and restore storage networking challenge is specific to Windows/NT, reboots, and tape drive allocation. The issue is that multiple tape drives are allocated logical numbers sequentially, based upon the order of NT discovery. If a tape drive other than the last one discovered fails for any reason, all the tape drives discovered after it will "shift up" upon a reboot of the NT server.

Looking at a specific example, upon a server reboot, physical tape drive #3 will become logical tape drive #2 if physical tape drive #2 or physical tape drive #1 fails or becomes unavailable. While this is not a storage network-specific issue, we're seeing more of it given the nature of storage networks fostering and supporting multiple tape drives and tape sharing environments for Windows/NT. Problems other than a failed tape drive may occur, such as a tape drive being used by another server or a problem with the fabric connectivity to the tape drive.
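
The renumbering described above can be modeled in a few lines. This is only a simulation of sequential allocation in discovery order, not Windows/NT code; the function and drive names are hypothetical.

```python
def assign_logical_numbers(discovery_order, available):
    """Number drives 1, 2, 3... in discovery order, skipping unavailable ones."""
    logical = {}
    for drive in discovery_order:
        if drive in available:
            logical[drive] = len(logical) + 1
    return logical


if __name__ == "__main__":
    drives = ["physical1", "physical2", "physical3"]

    # All drives online at boot: logical numbers match physical numbers.
    print(assign_logical_numbers(drives, {"physical1", "physical2", "physical3"}))

    # physical2 is unavailable at the next reboot: physical3 shifts up to 2.
    print(assign_logical_numbers(drives, {"physical1", "physical3"}))
```

Any backup configuration that refers to drives by logical number would now point at the wrong physical drive, which is why the shift matters.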

While there is as yet no solid solution for this issue, the workaround is to ensure that a Windows/NT server is rebooted only when all of that server's tape drives are online and available. The lab engineers agree, however, that planning every single Windows/NT reboot is a challenge at best.
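
One way to apply this workaround is a check that runs before any planned reboot. The sketch below only compares two lists; how the list of currently visible drives is gathered depends on the environment and is deliberately left out. All names are hypothetical.

```python
def safe_to_reboot(expected_drives, visible_drives):
    """Return (ok, missing): ok is True only if every expected drive is visible."""
    missing = sorted(set(expected_drives) - set(visible_drives))
    return not missing, missing


if __name__ == "__main__":
    expected = ["tape1", "tape2", "tape3"]
    ok, missing = safe_to_reboot(expected, ["tape1", "tape3"])
    if not ok:
        print("Postpone reboot; drives not online:", ", ".join(missing))
```

A check like this cannot help with unplanned reboots, which is the limitation the lab engineers point out.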

CHALLENGE #2

The second backup and restore storage networking challenge is specific to Fibre Channel Arbitrated Loop and Fabric reconfiguration issues. Our first example is specific to a LIP (Loop Initialization Process; a means to get an AL_PA, to indicate a loop failure, or to reset a node) occurring over a Fibre Channel Arbitrated Loop while tape I/O is occurring. If this LIP occurs, the tape driver will report an I/O error and the current tape stream will fail and/or abort. Depending upon the backup software, either the failed stream or the entire backup/restore job will require a restart.

Our second example is more "bleeding edge" and will require additional work in determining the root cause, potentially in the FCP or FCP-2 protocol. At a high level, if any type of interruption is encountered in a Fibre Channel connection, most commonly through a fabric, the delay may cause an active tape drive to generate an I/O error and pause active read/write activities while the name server is updated in the fabric. As in the first example, either the failed stream or the entire backup/restore job will require a restart depending upon the backup software. The recurring theme in both of these examples is that tape I/O at this point is not as tolerant of interruptions as many would hope.
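
The difference between restarting one stream and restarting the entire job can be made concrete with a small sketch. The stream names and sizes are hypothetical, and the sketch assumes the worst case in which restarted work is rewritten from the beginning.

```python
def data_to_resend_gb(stream_sizes_gb, failed_stream, restart_scope):
    """Worst-case data (GB) rewritten after one stream fails.

    restart_scope is "stream" if the backup software can restart only the
    failed stream, or "job" if it must restart the whole job.
    """
    if restart_scope == "stream":
        return stream_sizes_gb[failed_stream]
    if restart_scope == "job":
        return sum(stream_sizes_gb.values())
    raise ValueError("restart_scope must be 'stream' or 'job'")


if __name__ == "__main__":
    streams = {"stream_a": 200, "stream_b": 150, "stream_c": 250}
    for scope in ("stream", "job"):
        print(scope, data_to_resend_gb(streams, "stream_b", scope), "GB")
```

With these sample numbers, a single interruption costs 150 GB of rework with stream-level restart and 600 GB with job-level restart, which is why the restart behavior of the backup software is worth checking.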

CHALLENGE #3

The third backup and restore storage networking challenge is specific to both the physical and logical mapping of network storage connected tape drives. We have observed many customers frustrated by the allegedly simple task of mapping a pool of tape drives through a storage network into the backup software configuration. This becomes even more of an issue when a storage network infrastructure is not currently in place.

For example, from a physical design perspective a customer may have 12 new tape drives in two new automated tape libraries. It can be very frustrating to map the drives in the specified libraries through bridges and/or routers, through a switched fabric, through host bus adapters and identified channels or paths into backup server(s), especially with all the new technology. Refer to Figure 1 for a graphical representation of this design.

Once those challenges are overcome, the customer must then map the pool of tape components into the chosen backup software. Refer to Figure 2 for additional details.
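
The configuration table in Figure 2 can be captured as a small lookup structure, which makes it easy to check that no drive or logical slot is listed twice. The values below are copied from Figure 2; the type and function names are hypothetical, and real backup software will have its own configuration format.

```python
from typing import NamedTuple


class DriveMapping(NamedTuple):
    physical: str  # physical tape drive label
    logical: int   # logical drive number within the library
    library: str   # automated tape library label


FIGURE_2 = [
    DriveMapping("T6", 0, "L1"), DriveMapping("T8", 1, "L1"),
    DriveMapping("T1", 2, "L1"), DriveMapping("T3", 3, "L1"),
    DriveMapping("T5", 0, "L2"), DriveMapping("T2", 1, "L2"),
    DriveMapping("T7", 2, "L2"), DriveMapping("T4", 3, "L2"),
]


def physical_drive(mappings, library, logical):
    """Return the physical drive behind a (library, logical number) pair."""
    for m in mappings:
        if m.library == library and m.logical == logical:
            return m.physical
    raise KeyError((library, logical))


def is_consistent(mappings):
    """True if no physical drive and no (library, logical) slot appears twice."""
    physicals = [m.physical for m in mappings]
    slots = [(m.library, m.logical) for m in mappings]
    return len(set(physicals)) == len(physicals) and len(set(slots)) == len(slots)


if __name__ == "__main__":
    print(physical_drive(FIGURE_2, "L2", 1))
    print(is_consistent(FIGURE_2))
```

Note that the physical and logical orders differ in Figure 2, so the mapping has to be recorded explicitly rather than assumed.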

A further challenge is determining which server or servers control the robotics and tape mount requests and how that may or may not map through the storage network. The complexity of these tasks should not be underestimated.

We believe that some backup vendors have addressed these challenges elegantly, while others lag well behind. It is up to the customer to closely investigate the storage network features and functionality of the backup software they're considering.

CHALLENGE #4

Our final backup and restore storage networking challenge is specific to tape drive sharing and multiple software applications sharing those same drives or associated robotics. An example is running both an HSM (Hierarchical Storage Management) application and a backup application concurrently in your environment.

You may assume that your single automated tape library with ten tape drives could be used for backups at night and HSM during the day. We have found this the exception rather than the rule, as this "sharing" of devices between applications is problematic for software and supported by very few vendors.

Our engineers view the use of the reverse SCSI protocol, which has not been widely developed to date, as one potential solution. Much like a file system controls access to a specific file to be shared, reverse SCSI incorporates locking functionality to facilitate the sharing of SCSI devices. The implementation of the reverse SCSI protocol would allow these separate applications to effectively share the same SCSI device, i.e., a tape drive, by "fencing" off a device in use. There also are new products on the market that allow tape virtualization--relieving the dependency on physical device availability. Regardless of the approach, the customer should recognize the vendors' potentially high licensing fees associated with either backup servers or backup devices being shared.
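
The idea of "fencing" off a device in use can be illustrated with a simple reservation table. This is an in-process model of the locking concept only, not an implementation of any SCSI protocol; the class, method, and application names are hypothetical.

```python
import threading


class DriveReservations:
    """Tracks which application, if any, currently holds each tape drive."""

    def __init__(self, drives):
        self._lock = threading.Lock()
        self._owner = {drive: None for drive in drives}

    def reserve(self, drive, app):
        """Fence the drive for app; fail if another application holds it."""
        with self._lock:
            if self._owner[drive] in (None, app):
                self._owner[drive] = app
                return True
            return False

    def release(self, drive, app):
        """Release the drive; only the current holder may do so."""
        with self._lock:
            if self._owner[drive] == app:
                self._owner[drive] = None
                return True
            return False


if __name__ == "__main__":
    table = DriveReservations(["drive0", "drive1"])
    print(table.reserve("drive0", "backup"))  # backup fences drive0
    print(table.reserve("drive0", "hsm"))     # HSM is refused
    print(table.release("drive0", "backup"))  # backup releases it
    print(table.reserve("drive0", "hsm"))     # HSM can now use it
```

The point of the sketch is that both applications must consult the same arbiter; two applications that each keep private bookkeeping cannot share a drive safely.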

Despite these challenges, we believe that the benefits outweigh the risks when moving your backup and restore environment into a storage networking architecture. We caution, though, that you need to be aware of the technical challenges associated with the implementation and architecture of this environment.

Bill Peldzus is the technology manager and Robert A. Jackson, Erik Jobannessen, Rich Mikulak, Mall Reller, and Jeff White are senior storage engineers at Imation (Oakdale, MN).

Fig. 2: Configuring Tape Components in Storage Networks
Configuration Table
Physical  Logical  Library
   T6        0       L1
   T8        1       L1
   T1        2       L1
   T3        3       L1
   T5        0       L2
   T2        1       L2
   T7        2       L2
   T4        3       L2
COPYRIGHT 2001 West World Productions, Inc.
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2001, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Title Annotation:Technology Information
Author:WHITE, JEFF
Publication:Computer Technology Review
Geographic Code:1USA
Date:Apr 1, 2001
Words:1842