ENVISION: A Clear Look At The Future

This article is the first in a two-part series. The second part will appear in the July issue of CTR.

Data storage requirements are growing at an unprecedented rate. Industry analysts estimate that over 50% of current system spending is on data storage and that 70%-80% of that cost is associated with storage management. The analyst firm International Data Corporation (IDC) estimates that almost 10,000PB of storage will be shipped between 2000 and 2003. Very Large Databases (VLDBs) comprise approximately 65%-70% of all data on existing disk subsystems across distributed, midrange, and enterprise platforms, and VLDBs larger than 10TB are not uncommon. Disk storage capacities are increasing 60% or more per year and tape storage capacities are increasing 30%-50% per year, but as capacity grows, commensurate improvements in access time, reliability, and cost are necessary.

Although the per-unit cost of storage continues to decrease, the cost of server-attached storage often exceeds the cost of the server itself. As storage environments expand, so do the costs of managing the environment: the 70%-80% of the cost of storage associated with storage management makes the total cost of ownership up to seven times the original procurement cost.

A backup of data from locally attached or network-attached disk storage to locally attached tape devices begins when a host server initiates a backup command, reads data from disk devices, places the data into server memory, and then writes this data to tape.
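The data path just described (disk, then server memory, then tape) behaves like a serial pipeline, so its sustained rate is bounded by its slowest stage. The following is a minimal sketch of that idea only. The stage names and every MB/s figure are hypothetical values chosen for illustration, not measurements from this article, and which stage is slowest will differ from one environment to another.

```python
# Toy model of a conventional backup data path.
# All rates are assumed, illustrative values in MB/s (not from the article).

def pipeline_throughput(stage_rates_mb_s):
    """Sustained rate of a serial data path: limited by the slowest stage."""
    return min(stage_rates_mb_s.values())

def bottleneck(stage_rates_mb_s):
    """Name of the stage that limits the whole path."""
    return min(stage_rates_mb_s, key=stage_rates_mb_s.get)

conventional_backup = {
    "disk_read": 40.0,           # assumed disk subsystem read rate
    "server_memory_copy": 12.0,  # assumed: host is also running applications
    "tape_write": 15.0,          # assumed tape drive streaming rate
}

rate = pipeline_throughput(conventional_backup)
# Time to move 1 TB, taking 1 TB as 1,000,000 MB for simplicity.
hours_for_1tb = (1_000_000 / rate) / 3600

print(bottleneck(conventional_backup), rate, round(hours_for_1tb, 1))
```

Changing any single rate in the dictionary shows how the limiting stage, and with it the backup window, shifts.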
While a tape drive is usually considered the bottleneck in the operation, often it is actually the server that is the bottleneck, consuming inordinate amounts of CPU, memory, and server I/O. The method used by the backup command to access data from the disk may also limit the disk-to-tape transfer rates, since most file systems write files to disk using small block sizes. Also, in a typical backup operation, a backup client on the application host sends data to the backup server, which, in turn, writes the data to an attached tape drive. On a large application server, the backup client and server functions are typically deployed on the application-owning host, using significant CPU and bandwidth resources.

What do customers want? It's relatively simple, and hardly unexpected: immediate access to data that is always available and always retrievable from a storage device that is highly reliable, infinitely scalable, and can be purchased at the manufacturer's cost through a bid procurement process.

The Solution

This article introduces the concept of ENVISION: ENterprise VIrtual Storage Integration Over Networks. ENVISION integrates storage networking and virtualization into an intelligent, automated storage infrastructure. ENVISION employs three technologies as the foundation for a performance-centric, cost-effective approach to LAN-less, serverless tape backups over SANs: Storage Area Networks; Storage Resource Management; and Intelligent Virtual Tape Storage Devices (IVTSDs). ENVISION promises to provide a reliable, available, and scalable SAN architecture that truly allows enterprise-wide LAN-less, serverless tape backup.

A Review Of The Technologies

A SAN is a high-speed channel network, separate from a LAN, designed to establish an "any-to-any" direct connection between multiple clients, servers, and storage devices. SAN interfaces are typically Fibre Channel (FC) and SCSI, rather than Ethernet or token ring. Assuming that SANs will primarily use FC, the typical architecture incorporates multiple components depending on the required architecture, including hubs, switches, directors, routers, bridges, extenders, and Host Bus Adapters (HBAs). SAN technology promises the capability to connect any server to any other server or storage device over a high-speed channel separate from the LAN.
This capability eliminates the need for backups over the LAN, which are not only slow but also adversely affect production when they compete for LAN bandwidth. SANs provide increased flexibility, data availability, performance, and manageability. Use of FC removes previous limitations on distance, scalability, performance, and connectivity and can provide non-disruptive path-level fail-over and load balancing.

Storage Resource Management (SRM) is a software solution designed to provide an enterprise-wide view of the storage environment from a centralized location and from multiple clients. SRM allows configuration, visualization, monitoring, and management of the SAN and the physical and logical storage devices attached to the SAN. It is important to note that SRM only manages the storage resources, not the data itself.

In a SAN environment, serverless backup removes the application host and the backup server from the data path.
Backup data is written directly from disk to tape and the application server simply initiates the backup application and maintains the controlling software. Once the backup is initiated, the server is free to perform application processing without the backup operation tying up CPU cycles and memory. Currently, serverless backup over a SAN may be performed using a third-party copy function that allows a SAN-attached storage device such as a tape library to manage communication between the backup server and SAN disk storage. Within this infrastructure, the storage acts as a backup server on the SAN and transfers data directly from the SAN disk system to the tape library. Another way to perform serverless backups is to use new routing technologies that incorporate active agents with embedded SCSI extended-copy commands, which allow the router to manage the transfer of data directly between disk and tape without having to move data through servers. However, software to support the utility is required.

A serverless backup solution that offers significant promise is the use of Intelligent Virtual Tape Storage Devices (IVTSDs) within a SAN configuration. Virtual Tape Storage Devices (VTSDs) are physical disk subsystems that are "mapped" with multiple logical tape devices. This technology multiplies the number of tape storage devices available, often offering 64 or more virtual tape drives to the host and applications. The host and applications "see" the logical devices as physical devices and treat them as if they were real. A key advantage to deploying VTSDs is the ability to leverage most of the physical characteristics of the underlying physical device (which, for disk, includes higher reliability, improved throughput performance, and faster access to data).
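The mapping described above, one physical disk subsystem presented as many logical tape drives, can be illustrated with a small toy model. This is a sketch under stated assumptions and does not describe any vendor's implementation: the class name, the `vtapeN` drive names, the capacity figure, and the backup labels are all hypothetical.

```python
class VirtualTapeDevice:
    """Toy model: one disk-backed pool presented as several logical tape drives."""

    def __init__(self, logical_drive_count, disk_capacity_mb):
        self.disk_capacity_mb = disk_capacity_mb
        self.used_mb = 0
        # Each logical drive records the (label, size_mb) images written to it.
        self.drives = {f"vtape{i}": [] for i in range(logical_drive_count)}

    def write(self, drive, label, size_mb):
        """The host writes to what it sees as a tape drive; data lands on disk."""
        if drive not in self.drives:
            raise KeyError(f"unknown logical drive: {drive}")
        if self.used_mb + size_mb > self.disk_capacity_mb:
            raise RuntimeError("disk pool is full")
        self.drives[drive].append((label, size_mb))
        self.used_mb += size_mb


# Hypothetical configuration: 64 logical drives over a 500,000 MB disk pool.
vtsd = VirtualTapeDevice(logical_drive_count=64, disk_capacity_mb=500_000)
vtsd.write("vtape0", "db_full", 120_000)
vtsd.write("vtape1", "fs_incremental", 8_000)

print(len(vtsd.drives), vtsd.used_mb)
```

In this sketch the two writes target different logical drives but draw on the same disk pool, so the number of drives a host sees is independent of the number of physical devices behind them.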
Physical tape drives and automated tape libraries are integrated components of the VTSD architecture and provide the capability to perform "offline" migration of data to tape (meaning that the operation can be performed with little or no use of CPU cycles).

Intelligent Virtual Tape Storage Devices (IVTSDs) are the next generation of VTSDs. Instead of requiring a dedicated application server to perform backups, the IVTSD incorporates "intelligence" in the form of an Intelligent Virtual Storage Management Application (IVSMA) that is integrated into the device firmware. The IVSMA operates according to policies defined by the user. Intelligent agents continuously poll the architecture and the data management tasks are effectively taken away from the host, application, and server. This enables serverless, policy-based data management. The IVSMA communicates with all available hosts and any backup application using a standard data management protocol. Backup operations are initiated by the backup application and the IVTSD takes care of the rest. Because the IVTSD is SAN-attached, the LAN is not used to perform the backup, which frees up valuable LAN bandwidth and capacity. This technology allows for volume-based or "snapshot" backups, while still providing recovery at the file level.

Glenn Jacobsen is the senior partner at Trilliant Group (Denver, CO), a vendor-neutral consulting, integration, and education firm focused exclusively on storage and storage management.