Storage Intensive Applications

The unprecedented growth rates for digital storage applications have been well documented for the past few years. Annual storage growth rates can now be expected to range from 60% to over 100% for the next five years, depending on the prevailing economic conditions, the degree of effective storage management implementation, and the arrival rate of many new applications. At the heart of storage demand are numerous storage-intensive applications. Understanding storage-intensive applications means more than just quantifying high-capacity storage demands with variable performance requirements.

Databases now represent an estimated 65 to 70 percent of all (block format) data on disk subsystems across enterprise, midrange, and distributed computing platforms. As these databases begin to scale beyond 10 terabytes in size, the need for timely information delivery accelerates. Application requirements for video-on-demand, HDTV, video mail, and electronic security are pushing the envelope on server capacity and transfer rate. Multimedia applications using video, voice, and text will soon drive throughput requirements beyond 150 megabytes per second, but have relatively low I/O-per-second demands. At the other end of the application spectrum, data warehouse and OLTP (online transaction processing) applications can push I/O requirements beyond 25,000 I/Os per second, while transferring smaller data blocks. Projections for 100-terabyte applications requiring at least 100-gigabit (10 gigabyte) per second transfer rates by the year 2010 are now on the long-range planning horizon.

Sizing Storage Intensive Applications

The emergence of electronic medicine as a new discipline arises from the awareness that continuing to advance the medical knowledge base using traditional paper-based methods is impossible. Recent studies suggest that 85 to 90 percent of all healthcare information is stored on paper or film. A typical radiological X-ray takes 12 megabytes of storage. If a hospital performs 200 X-rays per bed per year, a 500-bed hospital will generate 100,000 X-rays per year, resulting in 1.2 terabytes of storage. Backing up this data doubles the storage requirement. A discharged patient's X-rays may seldom if ever be accessed again, but a lifelong archive of the data is still required. Other examples include CT scans, digital echocardiograms, lab reports, and brain scans.
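The hospital sizing estimate above is simple enough to check with a few lines of arithmetic. The following Python sketch reproduces it and, as a separate illustration, compounds the 60% to 100% annual growth range quoted at the start of the section over five years. The function names and the use of decimal units (1 terabyte = 1,000,000 megabytes) are assumptions made for illustration, not part of the original analysis.

```python
# Sizing sketch: reproduces the article's X-ray arithmetic.
# Assumption: decimal units, 1 TB = 1,000,000 MB.
MB_PER_TB = 1_000_000


def xray_storage_tb(beds, xrays_per_bed_per_year, mb_per_xray, backup_copies=0):
    """Return (x-rays per year, terabytes per year including backup copies)."""
    xrays = beds * xrays_per_bed_per_year
    primary_mb = xrays * mb_per_xray
    total_mb = primary_mb * (1 + backup_copies)
    return xrays, total_mb / MB_PER_TB


def growth_multiplier(annual_growth, years):
    """Compound an annual growth rate (0.60 means 60%) over a number of years."""
    return (1 + annual_growth) ** years


if __name__ == "__main__":
    xrays, primary_tb = xray_storage_tb(500, 200, 12)
    _, with_backup_tb = xray_storage_tb(500, 200, 12, backup_copies=1)
    print(f"{xrays:,} X-rays/year, {primary_tb:.1f} TB primary, "
          f"{with_backup_tb:.1f} TB with one backup copy")
    for rate in (0.60, 1.00):
        print(f"{rate:.0%} annual growth over 5 years: "
              f"x{growth_multiplier(rate, 5):.1f}")
```

Under these assumptions, five years of compounding multiplies capacity by roughly 10.5 at 60% annual growth and by 32 at 100%.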
Designing server, network, and data storage systems for new storage-intensive applications can quickly approach the limits of traditional file management and storage systems. These applications require careful planning or even outsourcing in order to meet service-level agreements.

The emergence of the World Wide Web represents a discontinuity in planning for storage growth. Here, the past cannot be used to predict the future. Internet applications that retrieve textual, e-business, audio, and video information are becoming almost impossible to control or plan for in terms of bandwidth and storage requirements. These applications regularly create and access files (as opposed to blocks) that range from 1 megabyte to several gigabytes or more in size. The Internet has already had, and will continue to have, a tremendous impact on bandwidth consumption, and it poses a new set of bandwidth management and storage management challenges with little help available in terms of measurement tools. Internet traffic jams now pose a bandwidth challenge, with an estimated 500 terabytes per month total volume in 1999.
This figure does not yet include the potential impact of the wireless Internet market. The eventual impact of the Internet on the data-storage industry has not been determined.

Generally unmanaged since its inception, e-mail was the first killer application for the Internet and now consumes several terabytes of disk storage at many customer locations. E-mail messages are often kept "forever" with little focus on deleting data that is no longer valuable. A recent study by the Midrange Performance Group (http://www.mpginc.com) indicated that the average size of an e-mail message, including any attachments, has now exceeded 50 kilobytes. In addition, the average e-mail message presently takes 17 hops to arrive at its destination, creating a growing latency problem for Internet performance. As e-mail growth explodes, storage management guidelines will become even more important, including the migration of infrequently referenced messages to lower, more cost-effective levels of storage, or their deletion. Soon, voice-mail and video-mail will join e-mail, pushing the requirements of storage and bandwidth far beyond current levels. What percentage of e-mail is junk mail? (Table 1).

Application Storage Profiles

As data continues to escalate in value, selecting the optimal storage system (disk or tape) that best matches the application's access characteristics is quickly becoming more critical. Applications demonstrate predominant access patterns over a period of time that characterize their normal profile. A read-intensive application may even demonstrate a high write level for shorter periods of time, though its predominant profile remains read intensive.
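The distinction between a short-lived burst and a predominant profile can be illustrated with a small Python sketch. It classifies each measurement interval and then the observation window as a whole. The `IoSample` type, the function names, the sample trace, and the 50% read-fraction cut-off are all hypothetical choices for illustration; the article does not define a specific threshold.

```python
from dataclasses import dataclass


@dataclass
class IoSample:
    """Read and write counts observed in one measurement interval."""
    reads: int
    writes: int


def classify(reads, writes, read_threshold=0.5):
    """Label a read/write mix; the 50% cut-off is an illustrative assumption."""
    total = reads + writes
    if total == 0:
        return "idle"
    return "read-intensive" if reads / total > read_threshold else "write-intensive"


def predominant_profile(samples, read_threshold=0.5):
    """Classify the whole observation window rather than any single interval."""
    return classify(sum(s.reads for s in samples),
                    sum(s.writes for s in samples),
                    read_threshold)


if __name__ == "__main__":
    # Hypothetical trace: the middle interval is a short write burst.
    trace = [IoSample(900, 100), IoSample(200, 800), IoSample(950, 50)]
    for s in trace:
        print(classify(s.reads, s.writes))
    print("predominant:", predominant_profile(trace))
```

In this made-up trace the middle interval is write intensive on its own, yet the window as a whole remains read intensive, which is the behavior the paragraph above describes.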
Typical application profiles include read/write intensity, I/O (transaction) or throughput intensity, and random or sequential access patterns. The requirement for a certain level of availability is implied for each application, and attaining higher availability comes at a higher price. Read-intensive applications such as OLTP issue large numbers of random read requests and exhibit good locality of reference, or a high hit ratio, making them well suited to disk caching. Multimedia and video require large file transfers involving long sequential reads and writes, and favor systems that offer very high streaming data transfer rates. Large sequential files with moderate or archival access levels may be best suited for removable tape storage libraries. As the storage networking industry evolves toward increased sharing of applications and data with different storage profiles on a single storage system, selecting the storage device that most effectively supports multiple profiles, such as read and write or random and sequential, becomes even more critical (Table 2).

Data Reference Patterns

Another aspect of effective storage management involves understanding data reference patterns. What happens to data as it ages?
The probability of reuse is the primary metric for understanding these effects. For almost all data, the number of references declines significantly as the file ages. This observation provides insight into more cost-effective storage management, as it enables the movement of less active data to lower-cost levels of storage. This has historically been the fundamental concept of HSM (Hierarchical Storage Management). In some cases, aged data can become more active, requiring it to be promoted to a higher level of the hierarchy. The ultimate HSM system should move data both up and down the storage hierarchy based on the activity profile.

The tremendous increases in the amount of digital storage continue to make storage management more difficult and, as a result, more data is being accumulated for longer periods of time. The ability to manage data is not keeping up with the growth rate of data. The percent of digital data that has lost its value and should be deleted is quickly declining. The probability of reusing data typically falls by 50 percent after the data is three days old. Thirty days after creation, the probability of reuse normally falls below a few percentage points. E-mail and medical imaging applications represent good examples of the data aging profile depicted here.
Keeping very low-activity or inactive data on spinning disks for long periods of time is not economical for environmental reasons (increasing electrical consumption), let alone the differential in storage acquisition costs between disk and tape. As the SAN evolves, optimal data placement among various storage technologies will begin to occur dynamically and outboard of the connected servers.
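The idea of moving data both down and up a storage hierarchy according to its activity can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a description of any particular HSM product: the three tier names (including the intermediate "nearline" tier), the function names, the sample catalog, and the use of the 3-day and 30-day aging figures as migration thresholds are all hypothetical.

```python
from datetime import date

# Hypothetical tiers, ordered from most to least expensive.
TIERS = ["disk", "nearline", "tape"]


def target_tier(last_referenced, today, hot_days=3, warm_days=30):
    """Pick a tier from the number of days since the data was last referenced."""
    idle_days = (today - last_referenced).days
    if idle_days <= hot_days:
        return "disk"
    if idle_days <= warm_days:
        return "nearline"
    return "tape"


def plan_moves(catalog, today):
    """Return (name, current tier, target tier, direction) for files that should move.

    catalog maps a file name to (current_tier, last_referenced_date).
    """
    moves = []
    for name, (current, last_ref) in catalog.items():
        target = target_tier(last_ref, today)
        if target != current:
            promote = TIERS.index(target) < TIERS.index(current)
            moves.append((name, current, target, "promote" if promote else "demote"))
    return moves


if __name__ == "__main__":
    today = date(2000, 6, 30)
    catalog = {
        "xray_0001.img": ("disk", date(2000, 1, 15)),    # long idle: demote
        "mail_archive.db": ("tape", date(2000, 6, 29)),  # active again: promote
        "orders.db": ("disk", date(2000, 6, 30)),        # active: stays put
    }
    for move in plan_moves(catalog, today):
        print(move)
```

A real policy would also weigh file size, access frequency, and recall cost; the sketch uses last-reference age alone to keep the promote and demote logic visible.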
Table 2
Application Storage Profiles

APPLICATION        Read       Write      I/O        Throughput  Random
                   Intensive  Intensive  Intensive  Intensive   Access
OLTP               * * * *
Data Warehouse     * * *
System (SCP)       * * *
File Serving       * * *
Medical Imaging    * * *
Web/Internet       * * *
Multimedia Video   * *
Document Imaging   * *
CAD/CAM            * * *
Backup/Recovery    * *

APPLICATION        Sequential
                   Access
OLTP
Data Warehouse
System (SCP)
File Serving
Medical Imaging
Web/Internet
Multimedia Video   *
Document Imaging   *
CAD/CAM
Backup/Recovery    *
[Graph omitted]
