Measuring The Impact Of Fragmentation On NT System Performance.
"Disk fragmentation can cause performance problems," says the Microsoft Windows NT Server Resource Guide. "You should consider running a defragmentation program on a regular basis."
So what's behind this dramatic shift of position? A wealth of new research has made it clear that fragmentation exerts a serious toll on system performance. This article discusses these research findings, the latest defragmentation techniques, and why regular defragmentation is an integral part of a smoothly running NT network.
Fragmentation In Court
The U.S. District Court in Portland, Maine serves as a striking example of how fragmentation can cripple any NT system. This site originally relied on a Dual Pentium Pro 200 server with three 4GB drives to run ARCServe, Exchange, and several other applications. Yet after a few months of operation, NT performance significantly deteriorated. Some server operations took 25 times longer than normal and backup times lengthened considerably. In response, four NT servers were added over a three-year period.
Despite these hardware upgrades, however, user complaints about file access delays began to mount. Some directory listings and file opens were taking ten seconds or more, and server response again became sluggish.
"One server took 20 minutes to shut down during reboot," says senior automation manager Kevin Beaulieu. "System deterioration over time was the reality of the NT world as far as we were concerned."
The U.S. District Court then introduced a Dual Pentium II 450 with 52GB online storage (RAID 5) and 512MB of RAM. Yet after an initial period of exceptional performance, even this server began to labor noticeably.
"By this time, I'd realized that hardware was not the problem," reports Beaulieu. "Finally, I tracked down the true villain--disk fragmentation."
At this site, Executive Software's network defragmenter Diskeeper 5.0 was installed across every server and Windows 95, 98, and NT box on the network. Server shutdowns that had taken 20 minutes now took 45 seconds. File access times dropped from 10 seconds to one to three seconds, and many hours were saved through faster backups.
Seeing Is Believing: Real World Testing
For those who consider the experience of the U.S. District Court to be no more than a freak occurrence, try this experiment: Install NT and MS Office on a new machine, then run an analysis of fragmentation (Diskeeper Lite, which includes a fragmentation analysis utility, can be downloaded free from www.execsoft.com/dklite/download.htm). You'll be shocked at how much fragmentation exists--even before the system has had a chance to operate. "Don't assume you have no fragmentation just because you have a lot of free disk space," says software developer and NTFS expert Larry Seltzer. "Even a freshly installed NT or Windows 2000 system has a large number of fragmented files."
The results of that experiment are supported by a recent fragmentation study conducted on 100 companies (ranging from small to large) by La Crescenta, CA-based research firm BNI. This survey revealed that 56 percent of NT workstations had files fragmented between 1,050 and 8,162 pieces. One in four reported finding files with as many as 10,000 to 51,222 fragments. For servers, an even greater degree of fragmentation existed. Half of the respondents discovered 2,000 to 10,000 fragments and another 33 percent had files fragmented into 10,333 to 95,000 pieces. Although rare, it is possible to find the occasional file splintered into millions of pieces.
With such levels of fragmentation existing as the norm, valuable hours can be squandered during backups due to the time it takes to read the disk before relaying the entire file to the storage medium. In extreme cases, the backup device may even come to a halt while the host computer assembles the file.
Defragmentation Performance Testing
Surprisingly, defragmentation had never been independently benchmarked, an omission rectified earlier this year by National Software Testing Lab (NSTL) of Conshohocken, PA. NSTL conducted performance tests on two of the most common NT configurations, determined through a survey of 6,000 system managers running Windows NT. A Pentium II 266 workstation with 96MB of memory and a 2GB IDE hard drive, running Outlook and Excel, showed a performance leap of 74.5 percent with a defragmented drive. On a PII 400MHz workstation with 228MB RAM and a 4.2GB HD, the improvement rose to 80.6 percent.
Two popular server configurations also underwent testing. A dual Pentium Pro 200 with 128MB of memory and five 4GB SCSI hard drives, running RAID 5, Exchange Server, and SQL Server 7.0, recorded an increase of 19.6 percent on a defragmented drive. On a Pentium Pro 200 with 64MB of RAM and two 4GB SCSI HDs running Exchange and SQL 7.0, performance rose by a hefty 56.1 percent.
NSTL tested the effects of fragmentation on files of all types and sizes. The Microsoft Outlook tests, for example, included: opening 50 messages simultaneously; moving messages from the inbox to a separate folder; opening (and displaying to: from: subject: and date:) a large subfolder; a full text search of all messages in a folder for a specific string; and a filter that displayed all messages in a folder that contained an attachment. Each of these tests was executed on the system when the personal folder was fragmented and defragmented. The SQL, Exchange, and Excel tests were carried out in a similar manner.
The results of this testing might exceed what some would have believed possible from defragmenting, but even if the real-world improvements are only half as good as NSTL recorded under lab conditions, they still rival the gains of many hardware upgrades. (NSTL's "Final Report on Defragmentation Performance Testing" and white paper "System Performance and File Fragmentation in Windows NT" can be read in full at www.execsoft.com).
How Defragmenters Work
With fragmentation now proven to exact such a severe penalty on NT performance, safe and reliable methods of dealing with the problem are an absolute must. Fortunately, several recent advances have made it possible for system managers to eliminate file fragmentation as a cause of system slowdowns. Here is how a modern network defragmenter functions.
The program checks each file on a disk to determine which files need to be defragmented and which files should be moved to another location to provide more contiguous free space. If a file is to be moved online, then the defragmenter uses special APIs (known as IOCTLs--input output controls) that work in harmony with the file system to accomplish defragmentation safely.
The MoveFile API first makes a contiguous copy of the file at the location on the disk specified by the defragmenter. Next, it changes the file-system pointers to reference the new contiguous copy. Lastly, the disk space occupied by the original fragmented file is de-allocated, but only after the prior steps have completed successfully.
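The copy, repoint, de-allocate sequence can be sketched with a toy cluster allocator. This is an illustrative simulation of the ordering guarantees described above, not the actual Windows IOCTL interface; all names and cluster numbers are hypothetical.

```python
# Toy simulation of the MoveFile defragmentation sequence:
# 1) copy the file's clusters to a contiguous free run,
# 2) repoint the file's cluster map at the new copy,
# 3) de-allocate the old clusters -- only after steps 1-2 succeed.

def find_contiguous_run(free, length):
    """Return the start cluster of a run of `length` consecutive free clusters."""
    ordered = sorted(free)
    run = 1
    for i in range(1, len(ordered)):
        run = run + 1 if ordered[i] == ordered[i - 1] + 1 else 1
        if run == length:
            return ordered[i] - length + 1
    raise RuntimeError("no contiguous free run large enough")

def defragment_file(cluster_map, free, name):
    old = cluster_map[name]
    start = find_contiguous_run(free, len(old))
    new = list(range(start, start + len(old)))
    for c in new:                 # step 1: "copy" data into the new clusters
        free.remove(c)
    cluster_map[name] = new       # step 2: repoint the file at the copy
    free.update(old)              # step 3: free the old, fragmented clusters

# A file in three fragments; clusters 10-14 form a contiguous free run.
cluster_map = {"report.doc": [2, 7, 9]}
free = {3, 10, 11, 12, 13, 14}
defragment_file(cluster_map, free, "report.doc")
print(cluster_map["report.doc"])  # -> [10, 11, 12]
```

Because the old clusters are released only after the pointers are updated, a crash mid-operation leaves either the original file or the finished copy intact, which is why the real API can run safely online.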
All this is safely conducted online by current-day defragmenters. Being online, files can be defragmented in the background, usually at low priority so as not to tax system resources. As a result, the system doesn't have to be shut down to handle the disk.
However, there are two file types the O/S will not allow to be defragmented online using the special APIs or "hooks." These are the paging file and, on NTFS, the Master File Table (MFT). These critical NT files can be safely defragmented at boot time and should be, as they are especially vulnerable to fragmentation. Allowing them to become fragmented is guaranteed to decrease overall system performance. (Note: On NT 4.0, the directories also cannot be defragmented online. In Windows 2000, though, the APIs have been enhanced to allow the directories to be moved safely online during defragmentation.)
Preventing MFT Fragmentation
On NTFS, the MFT is the "map" of each and every file on the volume, and it is itself a file. Every time a new file is created, a new record is added to the MFT. As more files are added, the MFT expands. Unfortunately, constantly growing files such as the MFT are the most susceptible to extreme fragmentation.
Because the MFT is such an important file, Microsoft combats its tendency to fragment by reserving space on the disk immediately after the MFT, called the MFT Zone. Approximately one eighth of an NTFS volume is reserved for the MFT Zone. The theory is that the MFT expands into this reserved space, preventing or at least minimizing MFT fragmentation.
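The one-eighth reservation is easy to put in concrete terms; the volume sizes below are simply the ones mentioned earlier in this article, used for illustration.

```python
# NTFS reserves roughly one eighth of the volume for the MFT Zone.
def mft_zone_mb(volume_gb):
    """Approximate MFT Zone reservation, in megabytes."""
    return volume_gb * 1024 / 8

for gb in (4, 52):
    print(f"{gb}GB volume -> ~{mft_zone_mb(gb):.0f}MB reserved for the MFT Zone")
```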
Despite this precaution, the MFT does fragment, because under certain circumstances other files are written into the MFT Zone. For example, when a disk is full, files are stored in the MFT Zone. Now suppose some files are deleted from this volume--but not the ones stored in the MFT Zone. Despite free space elsewhere on the disk, files still occupy the MFT Zone. When more files are stored on the disk, the MFT must expand to record them, but the files in the MFT Zone block the way. As a result, the MFT becomes fragmented, adversely affecting both system performance and backup speed.
Several solutions to MFT fragmentation exist. One new method is to bypass the Windows NT O/S and the special APIs in order to defragment the MFT online (Microsoft technical authorities consider this approach dangerous). Alternatives include offline defragmentation during rebooting (less convenient, but safe) and the latest approach of Diskeeper 5.0--preventing the MFT from becoming fragmented in the first place.
Here's how it works. When the defragmentation software is first installed on a machine, the MFT, if fragmented, is quickly defragmented at boot time. From that point on, an online monitoring process ensures that the MFT does not become fragmented again, eliminating the need to defragment it. As this form of fragmentation greatly increases head movement, a fragment-free MFT can make a considerable difference to backup duration.
Preventing Pagefile Fragmentation
The purpose of the paging file is to store over-committed system memory. As an active paging file is held open for the exclusive use of the NT O/S, online defragmenters cannot access it. This problem, however, can be handled either online (by going underneath the operating system), offline, or in the same preventive fashion as MFT defragmentation. By monitoring the area on the disk at the end of the paging file and ensuring that there is enough space for it to expand into, fragmentation can usually be prevented.
The Smarter Course
Performance degradation is a complex topic. Any system slowdown is potentially attributable to a slow processor, not enough memory, or even an old device driver. Where performance has tended to deteriorate over time, however, fragmentation is the probable cause.
Yet waiting for a slump in I/O pace or sluggish backup performance before acting is asking for irate calls from dissatisfied users and customers. The smarter course is to take fragmentation out of the equation entirely by installing a defragmenter on every single NT, 95, and 98 box across the network.
Drew Robb is the president of Robb Editorial (Tujunga, CA).
Fragmentation Worsens RAID Performance
According to popular mythology, RAID devices do not need defragmenting. Yet in actual practice, fragmentation degrades stripe set performance more than it degrades single volumes.
"RAID systems, using both hardware RAID and Windows NT Server's support for software RAID, are also susceptible to file fragmentation and need defragmentation," states NSTL.
Let's take as an example a RAID array of four physical disks. When data is written to the array, it is written more or less equally to all four devices. This means writing and reading can be almost four times as fast as to a single disk because all four devices are active simultaneously. If it takes 1000ms (milliseconds) to write a file to a single disk, the same write will take about 250 to 300ms to write to this RAID array. Reads are similarly faster.
If the data on one of the disks is in two fragments, the read will not take twice as long, but it will take longer than usual. The read/write head will have to perform an additional seek (seek = movement of the read/write head from where it is to the track where the data is) to get to the second fragment and wait for the disk to turn until the data comes under the head. How long this takes depends on the hard disk, but it typically averages about 9ms. If this file contains one extra fragment, it will take about 3% longer to read (about 9ms extra time added to a 250ms read). If it has ten extra fragments (all on the same disk), it will take about 30% longer to read (90ms added to 250ms).
Now look at the same file on a plain, single-disk partition instead of a stripe set. It takes 1,000ms to read and each fragment adds 9ms. One fragment extends the read time only 0.9 percent; ten fragments extend the time only 9 percent. Since the same number of milliseconds is added whether the file is on a stripe set or not, the percentage slowdown is much worse on a stripe set.
Of course, the stripe set figures assume all of the fragmentation is on one disk, but in the real world, fragmentation will be spread across all of the disks. If it is spread equally (best case), the percentage of slowdown will be the same as for a single disk. Any other distribution will be worse. So, at best, the stripe set will be no better off than a single disk. In most cases, though, the effects of fragmentation will be greater.
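The arithmetic above can be worked through in a few lines. The 250ms stripe-set read, 1,000ms single-disk read, and 9ms average seek are the article's illustrative figures, and this worst-case sketch assumes all extra fragments land on one member disk.

```python
# Each extra fragment costs roughly one additional seek (~9ms). The
# absolute penalty is the same for a stripe set and a single disk, but
# the *percentage* slowdown is larger on the stripe set because its
# base read time is shorter.
SEEK_MS = 9

def slowdown_pct(base_read_ms, extra_fragments):
    """Percentage increase in read time from extra seeks."""
    return 100 * extra_fragments * SEEK_MS / base_read_ms

for frags in (1, 10):
    stripe = slowdown_pct(250, frags)    # 4-disk stripe set
    single = slowdown_pct(1000, frags)   # plain single-disk partition
    print(f"{frags:2d} fragment(s): stripe set +{stripe:.1f}%, "
          f"single disk +{single:.1f}%")
```

One fragment costs the stripe set 3.6 percent versus 0.9 percent for the single disk; ten fragments cost 36 percent versus 9 percent, matching the roughly fourfold gap described above.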
A properly designed online defragmenter defragments RAID arrays without defeating the striping effect. It does this by treating the segments of the file as separate files. That is, it consolidates all of the fragments on one member disk into a contiguous segment, but it never moves data from one member disk to another.
The History Of Fragmentation
Fragmentation first appeared about thirty years ago, right after that dark age when computers existed without disks or operating systems. By the late Sixties, disks began appearing, able to store thousands of bytes--a revolutionary concept at the time. One early computer, the PDP-11, had an O/S called RT-11 that introduced the concept of storing files in a formal file structure. The downside: all files had to be contiguous. Disks with plenty of space, but lacking a single free extent large enough to accommodate a new file, were "full." With frequent file deletions, it wasn't unusual to have a disk reach the point where no more files could be created, even though the disk was little over half full. The solution became the SQUEEZE command, which compacted files at the beginning of the disk.
When a new O/S (RSX-11) came along that allowed multiple simultaneous users of the same PDP-11 computer, SQUEEZE became a problem, as running it meant all users had to stop working. That led to the creation of a file structure that could locate different parts of a file in different places on the disk. Each file had a header that gave the location and size of each section of the file, so the file could be in pieces scattered around the disk. Thus, fragmentation became a feature, not a bug, in these early systems.
The RSX-11's fragmentation approach was carried over into the OpenVMS O/S and, when its principal designers moved over to Microsoft, they built the NT file systems on this same fragmentation model. The problem, though, is that as hard drive capacities and CPU speeds grew exponentially, disk speeds didn't keep pace.
In today's client/server world, where thousands of files are written to and deleted from disks repeatedly, the end product is files split into thousands of pieces that exert a significant toll on system I/O.
Title Annotation: Product Support; Microsoft's Windows NT operating system
Publication: Computer Technology Review
Date: Dec 1, 1999