Speeding up the network: D2D backup lets VARs beat the Bottleneck Bugaboo. (Nectivity).
VARs specializing in storage solutions increasingly face demands from their customers for more powerful backup solutions that can meet their growing need for data protection. The underlying problem is that many of the tape-centric backup solutions they currently sell are lacking in the critical area of performance.
VARs are hearing the demand for new and easy-to-implement solutions that take advantage of in-place networks and give IT managers higher levels of backup performance, faster restores and greater dependability.
These new strategies are in stark contrast to current data-protection solutions. VARs frequently hear customers complain that system-level bottlenecks hamper backup performance. So much new data is being produced that traditional strategies are losing their ability to cope with the volume, particularly as backup windows continue to shrink.
It is time for new thinking about a problem that will not go away by itself.
VARs Must Resolve the Tape and Network Bottlenecks
VARs can no longer offer solutions based exclusively on tape drives, libraries and backup software. VARs now have the option of offering traditional disk-to-tape solutions or a new generation of disk-to-disk (D2D) solutions. Backup appliances from companies like Okapi Software are giving VARs new sources of revenue while solving the critical problems of data protection and business continuance.
When a single computer in the network data path controls backup processing, that computer will ultimately become a bottleneck. But the easy fixes, such as faster processors and more memory, mean that companies will have to deal with other bottlenecks long before the backup server runs out of horsepower.
Fact: Tape drives are typically not as fast as the disks they are backing up. While the disk array (a collection of disk drives) used on most servers is typically capable of maintaining a data transfer rate of 50MB/sec or more, most tape drives transfer data at single-digit rates.
Depending on the amount of data being backed up and the amount of memory available on the computer involved, the backup process frequently has to wait for the tape to catch up. This increases the size of the backup "window" for each computer being backed up and for the entire backup cycle for multiple computers.
VARs have traditionally offered a straightforward workaround by simply installing more tape drives and making them available for simultaneous use. Two tape drives cut the backup window in half because two servers can back up their data to two separate tape drives at the same time. But if enough tape drives are connected to just one computer, that computer may then become the speed bottleneck.
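The effect of adding drives can be sketched with simple arithmetic. The snippet below is a rough model rather than a measurement: the tape rate, the host ceiling and the data volume are illustrative assumptions, and the helper names are hypothetical.

```python
def aggregate_rate_mb_s(n_drives: int, tape_mb_s: float, host_mb_s: float) -> float:
    """Combined rate of n parallel tape drives, capped by what the host can push."""
    return min(n_drives * tape_mb_s, host_mb_s)


def backup_window_hours(data_gb: float, rate_mb_s: float) -> float:
    """Hours needed to move data_gb at rate_mb_s (1 GB taken as 1000 MB)."""
    return data_gb * 1000.0 / rate_mb_s / 3600.0


TAPE_MB_S = 8.0    # assumption: a "single-digit" tape drive
HOST_MB_S = 30.0   # assumption: the most one backup computer can move
DATA_GB = 200.0    # assumption: nightly backup volume

# Backup window for 1, 2, 4 and 8 drives attached to the same computer.
windows = {
    n: backup_window_hours(DATA_GB, aggregate_rate_mb_s(n, TAPE_MB_S, HOST_MB_S))
    for n in (1, 2, 4, 8)
}

for n, hours in windows.items():
    print(f"{n} drive(s): {hours:.1f} h")
```

Under these assumptions, going from one drive to two halves the window, while going from four drives to eight changes nothing because the host is already saturated.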
It is more likely, though, that the network will become the next bottleneck. This is symptomatic of the cascade of bottlenecks that IT managers must deal with to back up their data effectively.
The typical corporate computer network (LAN) is far slower than the typical disk drive. A 100BaseT Ethernet LAN will usually deliver little more than 5MB/sec of throughput. (The theoretical maximum is only 12MB/sec.) So both the tape drive and the disk being backed up will find themselves waiting for data to move across the LAN. Even with multiple tape drives, the network is the bottleneck.
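The numbers above can be checked with a small calculation. This is only a sketch: the disk and LAN rates come from the text, the tape rate is an assumed "single-digit" value, and the function names are hypothetical. Dividing 100 Mbit/sec by 8 bits per byte gives 12.5MB/sec before any protocol overhead, in line with the roughly 12MB/sec ceiling quoted above.

```python
def line_rate_mb_s(mbit_per_sec: float) -> float:
    """Raw line rate in MB/sec: 8 bits per byte, ignoring all protocol overhead."""
    return mbit_per_sec / 8.0


def slowest_stage(stages: dict[str, float]) -> tuple[str, float]:
    """Return the (name, rate) of the stage that limits end-to-end throughput."""
    name = min(stages, key=stages.get)
    return name, stages[name]


stages_mb_s = {
    "disk array": 50.0,    # from the text
    "tape drive": 8.0,     # assumption: a "single-digit" rate
    "100BaseT LAN": 5.0,   # from the text: typical delivered throughput
}

fast_ethernet_max = line_rate_mb_s(100)
bottleneck, rate = slowest_stage(stages_mb_s)

print(f"100BaseT raw line rate: {fast_ethernet_max} MB/sec")
print(f"Bottleneck: {bottleneck} at {rate} MB/sec")
```

A backup path moves data no faster than its slowest stage, so adding tape drives does not help while the LAN remains the smallest number in the table.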
A new backup architecture is emerging that lets backups run more quickly and efficiently. Rather than coping with multiple tape drives (and tape cartridges that multiply like rabbits), the latest backup strategies use a dedicated backup network and D2D backup technology. Enter a new VAR opportunity.
The Case for D2D Backup, aka Faster Backup
Using D2D, the servers being backed up never need to wait for a tape drive. The result is shortened backups. For small data files that would normally require frequent tape drive starts and stops ("shoe-shining"), the performance improvement can be substantial. For large files that normally stream to tape at full speed, the higher bandwidth of the disk shortens the backup window. From the server's point of view, once the data is on the D2D backup disk, the backup is complete.
Properly configured D2D backup enables multiple simultaneous backup operations because the backup disk array can be written to simultaneously by many servers. This means that multiple servers can be backed up at the same time, to shorten the entire backup window.
The D2D disk array (or appliance disk) also enables an offline "secondary backup" to tape. This allows the data on the appliance disk to migrate to tape at a more leisurely, scheduled pace. It also eliminates the need for multiple tape drives to run in parallel. Since the primary backup to disk has already taken place in the shortest amount of time possible, the secondary backup to tape can happen at a much more relaxed pace.
D2D also enables much faster restore times. More than 90% of restores are requested within 48 hours of a backup. Just as backing up to disk is faster, restoring from disk is also faster. If the D2D appliance disk is large enough to hold two days' worth of backups, more than 90% of restores will occur at disk-to-disk speeds.
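A back-of-the-envelope sketch of what this means for sizing and restore time follows. Only the 90% figure and the two-day retention come from the text. The daily volume, restore size, transfer rates and tape mount delay are illustrative assumptions, and the function names are hypothetical.

```python
def appliance_capacity_gb(daily_backup_gb: float, days_retained: int) -> float:
    """Disk needed to keep the most recent backups online."""
    return daily_backup_gb * days_retained


def expected_restore_minutes(restore_gb: float, disk_mb_s: float, tape_mb_s: float,
                             tape_mount_min: float, fraction_on_disk: float) -> float:
    """Average restore time when a given fraction of restores is served from disk."""
    disk_min = restore_gb * 1000.0 / disk_mb_s / 60.0
    tape_min = tape_mount_min + restore_gb * 1000.0 / tape_mb_s / 60.0
    return fraction_on_disk * disk_min + (1.0 - fraction_on_disk) * tape_min


# Assumptions: 200 GB backed up per day, 10 GB typical restore,
# 50 MB/sec from disk, 8 MB/sec from tape, 2 minutes to mount and position.
capacity = appliance_capacity_gb(200, 2)
mixed = expected_restore_minutes(10, 50.0, 8.0, 2.0, fraction_on_disk=0.9)
tape_only = expected_restore_minutes(10, 50.0, 8.0, 2.0, fraction_on_disk=0.0)

print(f"Appliance capacity: {capacity:.0f} GB")
print(f"Average restore, 90% from disk: {mixed:.1f} min")
print(f"Average restore, tape only: {tape_only:.1f} min")
```

The averages depend entirely on the assumed rates. What the sketch shows is the shape of the argument: a modest amount of disk covers the restores that are requested most often.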
Finally, D2D backup virtually eliminates the uncertainty of backup processing. From their own experience, VARs are very familiar with the pain associated with uncompleted backups caused by a bad tape or a tape drive problem.
D2D Storage Choices: Network-Attached Storage or Block Storage Over the Network
Almost from habit, VARs have used network-attached storage (NAS) for Ethernet/LAN storage connectivity. NAS uses the heavyweight network storage protocols (NFS for Unix and CIFS for Windows). These heavyweight network storage protocols drive down performance as they drive processor utilization up. NAS performance is also adversely affected by the network congestion that arises from multiple data stream aggregation. In addition, all data stored on disk is always stored in blocks. The overhead penalty imposed by a NAS file system on block storage (particularly for Oracle, SQL and Exchange data) is both extensive and unnecessary.
On the other hand, NAS allows multiple servers, running various operating systems, to access the same disk volumes (and even the same files) at the same time. Windows 2000, Linux and other operating systems offer NAS support that allows the transparent sharing of storage resources across a network. And some form of volume/file sharing is required to move data from the D2D appliance disk to tape for long-term retention and/or archive.
There's a big "but" here: There is a huge performance penalty imposed by NAS resource sharing. Even using a dedicated GbE network, a mid-range Windows 2000 system will be unable to move data to and from a NAS storage device at more than about 10MB/sec.
Bottom line: There's more to a D2D backup-and-restore solution than just plugging in a NAS box.
The Dedicated Backup Network Problem
Because of the 24x7 operational requirement of most IT operations, dedicating the corporate GbE backbone network to backup is unacceptable. VARs must deal with backup traffic that saturates the corporate backbone network and brings all the other servers to their knees.
Solution: The obvious step is to install a second network exclusively for backups. This frees the primary LAN connection, making it available for normal data traffic.
Block Storage Is Faster Than NAS
Block storage data transfers, like those delivered by the Fibre Channel Protocol (FCP) in Fibre Channel and the Internet Small Computer Systems Interface (iSCSI) for GbE, are significantly more efficient data movers than NAS. Properly configured, both Fibre Channel and GbE networks are capable of wire-speed (Gigabit per second) data transfer. NAS does not run over Fibre Channel and achieves only a small fraction of line rate on GbE.
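To put "a small fraction of line rate" in numbers, the sketch below compares the roughly 10MB/sec NAS figure cited earlier for a mid-range Windows 2000 system with the raw Gigabit Ethernet line rate. The line rate ignores framing and protocol overhead, and the function name is hypothetical.

```python
GBE_LINE_RATE_MB_S = 1000 / 8   # 1 Gbit/sec = 125 MB/sec, ignoring overhead
NAS_OBSERVED_MB_S = 10.0        # figure cited earlier in the article


def line_rate_fraction(observed_mb_s: float, line_mb_s: float) -> float:
    """Share of the raw line rate that a transfer actually uses."""
    return observed_mb_s / line_mb_s


nas_fraction = line_rate_fraction(NAS_OBSERVED_MB_S, GBE_LINE_RATE_MB_S)
headroom = GBE_LINE_RATE_MB_S / NAS_OBSERVED_MB_S

print(f"NAS uses about {nas_fraction:.0%} of the GbE line rate")
print(f"Unused headroom: about {headroom:.1f}x")
```

On these figures, a NAS transfer leaves more than nine-tenths of a dedicated GbE link idle, which is the gap a block protocol is meant to close.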
D2D Using a Fibre Channel SAN
VARs know that the most expensive route to a backup network is through a dedicated Fibre Channel (FC) storage network (frequently referred to as a storage area network, or SAN). Using highly specialized and complex components available from a limited number of Fibre Channel vendors, the FC storage network connects multiple computers to centralized data storage devices.
A Fibre Channel storage network uses the same data protocol as the normal SCSI connection between computer and disks. But because a Fibre Channel storage network is really a set of point-to-point connections rather than a real network, it can support congestion-free data transfer rates up to 80 or 90MB/sec.
D2D Using a Dedicated GbE Network and iSCSI
It is now time for VARs to think about GbE-based storage networks. The least expensive--and by far the least complex--route to a dedicated backup network is through a dedicated GbE storage network (subnet) that connects multiple computers to centralized data storage devices--using standard Ethernet and the recently standardized iSCSI block storage protocol. The GbE storage network uses the same commodity Ethernet infrastructure components that companies are already using in their corporate LAN.
A GbE storage network also uses the same data protocol as the ubiquitous SCSI connection between computers and disks.
Dedicated Storage Networks: Part of the Problem and Solution
Unfortunately, whether using Fibre Channel or GbE, a storage network must be configured to provide each connected server with its own disk volume(s) on centralized storage. If a company has a hundred servers, the centralized storage device must be broken into at least one hundred disk partitions, each one formatted by the computer that owns it, using the file system it needs.
Windows 2000, Linux and other operating systems don't know how to make a block storage network do the same kind of heterogeneous storage sharing as NAS. They treat block storage on the network as if it were just another disk connection. Consequently, block storage networks only allow servers to access their own disk volumes.
This creates a huge problem for D2D backups and restores. When performed across a LAN, multiple computers can access the same disk volumes at the same time to deposit their backed-up data. A dedicated block storage network does not allow this to happen. Instead, with a dedicated block storage network, the permissions that determine which computer owns which disk volumes must be changed--one at a time. This is much like going all the way back to a regular LAN connection with just a single tape backup device!
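The difference between the two access models can be illustrated with a toy sketch. The classes below are hypothetical and do not represent any vendor's API or any real storage protocol. They only model the rule described above: a block volume has one owner at a time, while a file-level share accepts writes from any server.

```python
class BlockVolume:
    """A volume on a block storage network: exactly one server owns it at a time."""

    def __init__(self, name: str, owner: str) -> None:
        self.name = name
        self.owner = owner
        self.blocks: list[tuple[str, bytes]] = []

    def write(self, server: str, data: bytes) -> None:
        if server != self.owner:
            raise PermissionError(f"{server} does not own {self.name}")
        self.blocks.append((server, data))

    def reassign(self, new_owner: str) -> None:
        """Stand-in for the manual permission change described in the text."""
        self.owner = new_owner


class SharedVolume:
    """A file-level (NAS-style) share: any server may write at any time."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.files: list[tuple[str, bytes]] = []

    def write(self, server: str, data: bytes) -> None:
        self.files.append((server, data))


servers = ["mail", "db", "web"]   # hypothetical server names

nas = SharedVolume("backup-share")
for s in servers:
    nas.write(s, b"backup image")       # no ownership changes needed

blk = BlockVolume("backup-volume", owner=servers[0])
reassignments = 0
for s in servers:
    if blk.owner != s:
        blk.reassign(s)                 # ownership moves one server at a time
        reassignments += 1
    blk.write(s, b"backup image")

print(f"Shared volume writes: {len(nas.files)}, ownership changes: 0")
print(f"Block volume writes: {len(blk.blocks)}, ownership changes: {reassignments}")
```

In the toy model the shared volume takes all three backups with no coordination, while the block volume needs an ownership change before each new server can write, which serializes the backups.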
Five Key Tradeoffs
D2D backup sounds like a great idea, but there are several pieces still missing from the puzzle. VARs must be aware of the key tradeoffs that have to be made in selecting and implementing a D2D backup solution. These are:
For my D2D network, should I use file protocols (NAS) or block protocols (FC, iSCSI)? NAS can share files and volumes, allowing data to be backed up from the D2D storage to tape. But NAS performance lags well behind block storage performance. A dedicated block storage network gets data to the D2D disk far more quickly, but the data is marooned once it arrives. Tradeoff: Sacrifice data transfer speed for the ability to retain a copy of data on tape.
Should I install a GbE- or a Fibre Channel-dedicated backup network? GbE is far more familiar and far less expensive to install and maintain than Fibre Channel. But GbE's performance can be degraded by congestion. Tradeoff: Give up simplicity, familiarity and cost containment for higher performance.
Should my D2D solution emulate a tape drive or a disk drive? While tape emulation sounds "more natural" for a backup operation, disk emulation lets backup software take advantage of disk's direct access capabilities. Disk emulation also eliminates mount and positioning delays and eliminates the complexity of volume multiplexing. Tradeoff: Give up the tangible benefits of disk technology for the ephemeral benefit of tape emulation.
Which low-cost storage should I use: ATA/IDE or the latest in Serial ATA disk technology? ATA/IDE disk drives are in wide use today, but the industry is moving overwhelmingly to Serial ATA. Tradeoff: Embrace tomorrow's technology or stick with yesterday's technology.
Should I change my backup software? Many D2D solutions require that you abandon your currently installed backup software in favor of theirs. You probably have a considerable investment in training and operational procedures to support your currently installed backup software. Additionally, your backup media was probably recorded using the format supplied by your currently installed backup software. Tradeoff: Abandon a substantial investment in training, operational procedures and media when a D2D vendor promises better things to come.
Faced with more data and few options to implement critical, robust data protection, VARs can now sell a new generation of emerging D2D backup appliances.
RELATED ARTICLE: The Nine Primary Features-Benefits of D2D
Backup to disk is faster than backup to tape: Faster backup minimizes both the time data is unprotected and the time servers are unavailable for their primary task.
Backup to disk eliminates the mount and positioning delays associated with tape-based backups and restores: These delays typically last anywhere from a few seconds to a few minutes.
Multiple simultaneous (parallel) backups and restores: Multiple servers can be backed up at the same time; critical restores can take place while backups are being processed.
Creating duplicate copies of backup images from a disk storage unit to tape can be performed as an offline operation, independent of the original client that was backed up: Creating duplicate backups to enhance data availability is simple, and there is no additional impact on the client.
Faster restores from disk to disk: Just two days of backups maintained on disk means that 90% of restores will happen at disk-to-disk speeds, so business can continue sooner rather than later.
Reduced traffic on the LAN--LAN-free: The rest of the business can continue processing with undiminished performance while backups are being processed.
Backup to disk doesn't require multiplexing backups from many slow clients onto a single tape: Tuning multiplexing for optimal performance is no longer required, and the complexity of restoring from a multiplexed tape is eliminated.
D2D restores don't require a tape drive: Critical restore operations don't have to wait for a tape drive to free up.
Backups and restores using D2D are less prone to failure: Increased confidence in the ability to satisfactorily complete backups; fewer reruns required.
Bob Farkaly is chief marketing officer at Okapi Software (Poway, Calif.)