Optimizing Your RAID Array Performance

When companies invest in expensive RAID (Redundant Array of Independent Disks) arrays, they generally have two primary concerns: high reliability, which translates into high data availability, and I/O (Input/Output) performance. RAID, in its various implementations, has been around for quite a while and has proven itself time and again on both counts. RAID hardware and software continue to evolve to match the rising bandwidth and availability requirements of our increasingly data-intensive world. Reliability of RAID today is rock solid, either through the use of RAID mirroring or with "hot standby" disk drives onto which the array can automatically reconstruct the data from a failed drive in a RAID 3 or 5 configuration. RAID speed and throughput, however, are not as cut and dried. Don't misunderstand; today's RAID arrays, spinning at 7200rpm or better, are extremely fast.
I/O throughput, however, is influenced by a number of factors outside the array, such as the operating system, database, applications, and the sizes of the files being read or written. How do you know you're getting optimal performance from your array? Many operations managers have been mesmerized by vendor promises of blazing disk performance, only to find out after implementation that real-world throughput falls far short of the hardware specification. RAID vendors, like just about every other hardware company, love to tout the speed of their equipment; it's only natural. Everywhere you turn these days, you're inundated with the speed mantra: UltraWide SCSI pumping data at 40MB/sec, Fibre Channel at 100MB/sec, 600MHz processors, FireWire running at 400Mbit/sec, T1 and T3 networks, Gigabit Ethernet. Whew!
Specification sheets from RAID vendors quote some dazzling numbers for I/O throughput in terms of MB/sec or I/Os per second. It's important to understand that most RAID array throughput specifications are theoretical maximums, attained only when the device is running "raw"; that is, not attached to a live production computer running a mix of software applications. That brings up the main point. In any implementation of RAID, your objective should always be to tune the array to your particular set of applications. While you still won't achieve the rated speed of the array, you'll maximize the data throughput for the specific blend of hardware and software you're running. Unfortunately, most RAID users either can't or don't take the time to tune their system. Here are a few things you'll want to know about your operating environment and RAID array that will help optimize I/O and throughput performance.

Application Environment

There are two prevalent types of application environments: those that execute a high volume of transactions, with each transaction containing only a small amount of data, and those that read and write large files, such as video, image, or other multimedia applications. Many companies have both kinds of environments. In such cases, they usually separate them on different machines and then allocate a specific RAID array to each kind of application environment. As a general rule, high volume transaction environments work best when the RAID array can execute the highest possible number of I/Os per second. Many shops will seek I/O rates in the range of 7,000-8,000 per second. For environments that read and write large files, the RAID array performs better with higher MB/sec transfer rates.
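The two yardsticks above are linked by the size of each I/O: throughput is roughly the I/O rate multiplied by the transfer size. The short Python sketch below illustrates that arithmetic only. The function names and the example transfer sizes (4 KB for a small transaction, 1 MB for a large-file read) are assumptions chosen for illustration, not figures from any vendor.

```python
def throughput_mb_per_sec(ios_per_sec: float, io_size_kb: float) -> float:
    """Return MB/sec moved by ios_per_sec operations of io_size_kb each (1 MB = 1024 KB)."""
    return ios_per_sec * io_size_kb / 1024.0


def ios_per_sec_for(mb_per_sec: float, io_size_kb: float) -> float:
    """Return the I/O rate needed to sustain mb_per_sec with transfers of io_size_kb each."""
    if io_size_kb <= 0:
        raise ValueError("io_size_kb must be positive")
    return mb_per_sec * 1024.0 / io_size_kb


if __name__ == "__main__":
    # Assumed workload shapes, for illustration only.
    small_kb = 4       # hypothetical small transaction
    large_kb = 1024    # hypothetical large-file transfer (1 MB)

    print("8,000 small I/Os per second in MB/sec:",
          throughput_mb_per_sec(8000, small_kb))
    print("I/Os per second needed for 40 MB/sec of large transfers:",
          ios_per_sec_for(40, large_kb))
```

Under these assumed sizes, a transaction workload can run thousands of I/Os per second while moving a modest number of megabytes, whereas a large-file workload reaches the same MB/sec with only a few dozen I/Os. That is why each environment is judged by a different figure.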
RAID Array Parameters

To tune an array properly to its specific environment, several key parameters need to be configured: RAID level (0, 1, 3, 5), cache size, and stripe or segment size. Another factor to consider is the total number of drives assigned to a group or logical unit. Again, in general terms, a high volume transaction environment will do better with a smaller stripe size and a smaller cache block, while large files will be better accommodated with a larger cache and larger stripe sizes. Better performance can be gained by increasing the number of drives, but only up to a point. Some RAID arrays are capable of configurations with 60 drives on one controller, but in reality adding drives much beyond 30 for a logical unit will not improve performance.

Performance can also be affected by the percentage of read versus write activity. Write performance is typically slower, so if your environment is skewed toward disk writes, you might consider changing the RAID level, say, from RAID 5 to RAID 1.

To do dynamic fine-tuning, you need access to two key functions. First, you need to capture data about your disk array while it is running in a live production environment. Second, you need to be able to manipulate RAID configuration parameters without taking the array offline. Until recently, the data center administrator's hands have been tied when it comes to dynamic tuning. Only a very few RAID vendors offer the capability to capture and analyze array performance data at the disk controller level. You must have data to identify bottlenecks, detect usage patterns, and monitor the impact of any changes you make.
Likewise, very few vendors offer live alteration of the array, such as changing cache size, stripe size, or even RAID level, without losing data or requiring the device to be offline. Because of this, operations managers have, to a certain extent, been at the mercy of the vendors. Vendors set default array parameters at the factory. Because they cannot know in advance what kind of application environment the array will service, they tend to choose middle-of-the-road defaults. At implementation time, RAID vendors can adjust parameters to attempt to match a high volume transaction or large file size environment, but once they leave, it is difficult for data center administrators to make improvements.

What kinds of performance data would be useful? You want to be able to monitor live production activity at the disk controller and logical unit level for such things as: total I/Os (reads and writes), read percentage (percentage of the total I/Os that are reads), cache hit percentage (percentage of reads found in the cache as opposed to requiring a disk access), current throughput in MB/sec, and current I/O rate per second. With this kind of information in hand, you can begin to manipulate configuration parameters and fine-tune your system.

Bottom Line

You owe it to your users to maximize the performance of your RAID array. When you're considering a RAID vendor, find out what kind of flexibility you have in dynamically tuning the array. Bear in mind that performance tuning requires a bit of trial and error.
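To make that trial and error more systematic, the monitoring figures listed earlier can be derived from two readings of cumulative controller counters, taken before and after a settings change. The Python sketch below is only an illustration: the `CounterSnapshot` layout and the `summarize` function are hypothetical, and a real array would expose its counters through the vendor's own tools.

```python
from dataclasses import dataclass


@dataclass
class CounterSnapshot:
    """Cumulative counters at one instant (hypothetical layout, not a vendor API)."""
    timestamp_sec: float
    reads: int
    writes: int
    read_cache_hits: int
    bytes_transferred: int


def summarize(before: CounterSnapshot, after: CounterSnapshot) -> dict:
    """Compute the article's five metrics for the interval between two snapshots."""
    elapsed = after.timestamp_sec - before.timestamp_sec
    if elapsed <= 0:
        raise ValueError("snapshots must be in chronological order")

    reads = after.reads - before.reads
    writes = after.writes - before.writes
    hits = after.read_cache_hits - before.read_cache_hits
    nbytes = after.bytes_transferred - before.bytes_transferred
    total = reads + writes

    return {
        "total_ios": total,
        # Share of all I/Os that were reads.
        "read_pct": 100.0 * reads / total if total else 0.0,
        # Share of reads served from cache rather than disk.
        "cache_hit_pct": 100.0 * hits / reads if reads else 0.0,
        "mb_per_sec": nbytes / (1024 * 1024) / elapsed,
        "ios_per_sec": total / elapsed,
    }


if __name__ == "__main__":
    # Made-up sample readings ten seconds apart, for illustration only.
    t0 = CounterSnapshot(0.0, 0, 0, 0, 0)
    t1 = CounterSnapshot(10.0, 60_000, 20_000, 45_000, 327_680_000)
    for name, value in summarize(t0, t1).items():
        print(f"{name}: {value}")
```

Running the same summary before and after each change, over comparable workload periods, shows whether a new cache or stripe setting actually helped.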
You don't want to have to call in your vendor every time you need to change a parameter. Ask your RAID vendor the following questions. What parameters can you monitor? What data can be collected? What key array settings can you adjust? Can you alter a live production array configuration without destroying the data? When you find a vendor that can answer these questions to your satisfaction, you can proceed with confidence that you will be able to optimize your RAID array and keep your end users happy.

Mike Rouch is a product manager at Open Systems Solutions, Inc. (Yardley, PA).