Many Linux users, like computer users generally, are speed freaks. We buy the latest hardware (or the latest hardware we can afford) in an effort to trim a few seconds here and there. In the case of desktop users, this is done for personal benefit, but for server computers, the benefit is improved server performance, which can be very important for a busy server.
In general, the same optimization principles apply to both types of system, although some performance measures are more important for some applications. A server isn’t likely to need the latest and greatest video card, for instance, but a desktop system that’s used for gaming or other video-intensive tasks might.
This column focuses on just one class of performance optimization: hard disks. Hard disk performance affects many different aspects of system operation, including boot time, program launch time, file load and save times, program compile times, and swap efficiency. A poor configuration can produce very sluggish performance, making even a brand-new computer feel like one that’s several years behind the times. A good configuration, on the other hand, can help you get the most out of your computer, and perhaps even extend the useful life of a middle-aged computer.
You should know at the outset that some disk performance testing and tuning procedures can be risky or very time-consuming. Thus, many of the methods described here are best applied when you first install Linux on a computer or when you need to perform major disk maintenance for other reasons (say, when you install a new disk to expand your disk space) and are already planning to create full backups.
Testing Disk Performance
To improve disk performance, you must first be able to measure disk performance. Several approaches exist.
In many ways, the best approach is to time how long disk-intensive tasks take. For instance, you can pull out a stopwatch and measure how long the computer takes to boot. Additionally, you can time the operations that underlie a Linux command with the time utility. Type time followed by the command you want to run to obtain a report of the command’s run time:
$ time some-command
real    0m0.880s
user    0m0.005s
sys     0m0.008s
This output indicates that the command took 0.880 seconds to execute, with 0.005 seconds of user CPU time and 0.008 seconds of system CPU time. In measuring changes to disk speed, the real line is generally the most important one. Some changes to disk configuration, though, can affect CPU use.
When measuring the total time to complete an operation, be aware of disk caches. Linux uses memory that’s not otherwise occupied to cache disk accesses, so if you repeat a disk-intensive task, it’s likely to take less time than the first invocation. This can be important if you make a simple change, such as tuning your disk access mode, and then attempt to measure performance changes related to your tuning. You should run a command at least twice. If the times differ greatly, run the command again. If disk caches are resulting in quicker access times, you should use the longer time as a measure of disk performance.
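The repeat-and-compare procedure can be scripted. The following is a minimal sketch rather than a standard tool: the function name time_runs is made up for illustration, and the script assumes GNU date, whose %N format reports nanoseconds.

```shell
# Run a command several times and print the wall-clock time of each run.
# time_runs is a hypothetical helper name; GNU date is assumed for %N.
time_runs() {
    runs=$1
    shift
    i=1
    longest=0
    while [ "$i" -le "$runs" ]; do
        start=$(date +%s%N)
        "$@" > /dev/null 2>&1
        end=$(date +%s%N)
        ms=$(( (end - start) / 1000000 ))
        echo "run $i: $ms ms"
        if [ "$ms" -gt "$longest" ]; then
            longest=$ms
        fi
        i=$((i + 1))
    done
    # When caching speeds up later runs, the longest time is the one to use.
    echo "longest: $longest ms"
}
```

A call such as time_runs 3 some-command prints one line per run, followed by the longest time observed.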
If you want to measure disk access performance changes, your best bet is to unmount and then remount the filesystem you’re using between tests. This procedure flushes the disk caches, providing a good measure of actual disk performance for your new test. If the disk accesses necessarily involve a partition that can’t be easily unmounted (such as the root partition), you may need to reboot the computer between tests. (This can make performing tests a tedious activity.)
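The unmount-and-remount step might be wrapped up as shown here. This is only a sketch: the function names and the DRY_RUN variable are invented for illustration, the real commands must be run as root, and calling mount with only a mount point assumes the filesystem has an entry in /etc/fstab.

```shell
# Flush cached data for one filesystem by unmounting and remounting it.
# Set DRY_RUN=1 to print the commands instead of running them.
# (run, remount_fs, and DRY_RUN are hypothetical names for this sketch.)
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

remount_fs() {
    mountpoint=$1
    # Remount only if the unmount succeeded (it fails if files are in use).
    run umount "$mountpoint" && run mount "$mountpoint"
}
```

Running DRY_RUN=1 remount_fs /mnt/test (with /mnt/test standing in for your own mount point) shows the two commands without touching the filesystem.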
Factors other than actual disk speed, such as competing processes consuming CPU time and demanding disk access, can degrade performance. Thus, you should perform your tests on a lightly loaded system. If possible, run your tests in runlevel 1, a single-user mode that minimizes the number of running processes.
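One way to confirm that the system is quiet before a test is to check the load average that the kernel reports in /proc/loadavg. The sketch below is one possible check, not an established rule: the function name, the default threshold of 0.5, and the optional file argument are all arbitrary choices made for illustration.

```shell
# Succeed only if the 1-minute load average is below a threshold.
# The threshold default (0.5) is an illustrative value, not a standard.
lightly_loaded() {
    limit=${1:-0.5}
    file=${2:-/proc/loadavg}
    # The first field of /proc/loadavg is the 1-minute load average.
    awk -v limit="$limit" '{ exit !($1 < limit) }' "$file"
}
```

A test script could begin with lightly_loaded || echo "system is busy; results may be skewed".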
Some disk-tuning techniques involve changing the way the kernel interacts with the hardware, or changing the hardware altogether. Such low-level changes affect disk access below the filesystem level, and can be measured with the help of the hdparm command:
# hdparm -Tt /dev/hdb
Timing cached reads: 2520 MB in 2.00 seconds = 1258.93 MB/sec
Timing buffered disk reads: 170 MB in 3.03 seconds = 56.19 MB/sec
The -T parameter returns information on the performance of the disk cache (1258.93 MB/s in this example), which is effectively a measure of the computer’s memory subsystem. The -t parameter returns information on uncached disk reads (56.19 MB/s in this example), which is the raw read performance from the disk, without the intervention of the disk filesystem drivers. Modern disks normally return disk access performance measured in several tens of megabytes per second.
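If you run hdparm repeatedly while tuning, it can help to reduce its report to the two throughput figures. The filter below is a small sketch that assumes the output format shown in this column; hdparm_rates is a made-up name, not part of hdparm.

```shell
# Reduce "hdparm -Tt" output to the two throughput figures, in MB/sec.
# Assumes each timing line ends with "= <number> MB/sec".
hdparm_rates() {
    awk '/Timing cached reads/        { print "cached",   $(NF - 1) }
         /Timing buffered disk reads/ { print "buffered", $(NF - 1) }'
}
```

As root, hdparm -Tt /dev/hdb | hdparm_rates would then print one line for the cached rate and one for the buffered rate.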
You can find theoretical maximum disk performance figures on your hard disk manufacturer’s web site. You may need to dig into technical specification sheets, though, and you must keep a few caveats in mind. First, disk manufacturers sometimes list speeds in megabits per second rather than megabytes per second. If so, divide the megabits per second by eight to convert to megabytes per second. Second, disk manufacturers often emphasize their interface speeds rather than their disk hardware speeds; interface speeds are often much higher than disk hardware speeds. (The hardware speeds are often referred to as “internal” transfer rates.) Third, disk performance varies depending on the location on the platter. You can see this if you measure performance of partitions rather than disks (/dev/hda1 versus /dev/hda10, say). Finally, manufacturers provide optimistic performance measures; hdparm is likely to deliver results that are about 50 percent to 90 percent of a drive’s rated performance.
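The arithmetic involved is simple enough to script. In this sketch the function names are invented, and the figures used in the example afterward are hypothetical rather than taken from any real specification sheet.

```shell
# Convert a rate in megabits per second to megabytes per second.
mbit_to_mbyte() {
    awk -v mbit="$1" 'BEGIN { printf "%.2f\n", mbit / 8 }'
}

# Express a measured rate as a percentage of the rated figure.
percent_of_rated() {
    awk -v measured="$1" -v rated="$2" \
        'BEGIN { printf "%.0f\n", 100 * measured / rated }'
}
```

For a hypothetical drive rated at 800 megabits per second, mbit_to_mbyte 800 gives 100.00 megabytes per second. If hdparm measured 56.19 MB/s on a drive with a hypothetical rating of 80 MB/s, percent_of_rated 56.19 80 gives 70, which falls within the typical range.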
Setting the Disk Access Mode
If you discover that your hard disk’s raw access speed, as revealed by hdparm, is lower than you think it should be, you should investigate setting your low-level disk-access parameters. Such changes are possible only for ATA disks, not for SCSI disks. (Serial ATA, or SATA, disks are treated like SCSI disks by some SATA drivers, but other drivers treat them like ATA disks. Your disk device filename indicates which is the case. If your disk is accessed as /dev/sda, /dev/sdb, or so on, it’s being treated as a SCSI device and hdparm can’t tweak its low-level performance.)
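The filename rule can be expressed as a short test. This is an illustrative sketch only: disk_class is a made-up name, and the check looks at nothing but the device filename, as described above.

```shell
# Guess how the kernel is driving a disk from its device filename.
disk_class() {
    case "$1" in
        /dev/hd[a-z]*) echo "ata" ;;     # hdparm can adjust DMA settings
        /dev/sd[a-z]*) echo "scsi" ;;    # treated as SCSI; hdparm can only test
        *)             echo "unknown" ;;
    esac
}
```

Here disk_class /dev/hda reports ata, whereas disk_class /dev/sda reports scsi.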
The most common problem with ATA disk options is a system that uses the disk with improper direct memory access (DMA) options. At the extreme is a system that uses programmed input/output (PIO) mode rather than DMA mode. PIO mode can sometimes produce reasonably good disk throughput, but consumes a lot of CPU time in the process. You can set DMA mode by using the -d1 option to hdparm. You can further tweak the configuration with the -X option, which takes a DMA mode as an option: sdmaX, mdmaX, or udmaX, where X is a number referring to a simple DMA, multiword DMA, or UltraDMA mode, respectively.
Table One summarizes current ATA disk standards. This table, in conjunction with your hard disk’s documentation, should help you determine the maximum DMA mode your hardware supports.
||Maximum DMA Modes
||SDMA 0, 1, 2; MDMA 1
||UDMA 0, 1, 2
||UDMA 3, 4
Your hard disk and your hard disk controller must both support the DMA mode you use. If you’re using a new hard disk on an older motherboard, you might want to look into buying an add-on ATA controller to improve disk performance.
Modern hard drives and controllers typically support UltraDMA mode 5, but older drives support lesser standards. If in doubt, consult your drive’s documentation. When you know the features it supports, you can activate them with a command like this:
# hdparm -d1 -X udma5 /dev/hda
Before issuing such a command, though, be aware that setting the DMA mode incorrectly can render the drive inaccessible. This, in turn, effectively crashes the entire computer, so you may need to hit the computer’s reset button. It’s recommended that you save all of your work, log out of normal user accounts, and unmount as many filesystems as possible (including /home) before experimenting with this command.
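Given the risk, you may prefer a helper that checks the mode name and prints the command for review instead of running it. The sketch below is only one way to do that: dma_command is a hypothetical name, and the accepted number ranges (0 to 2 for sdma and mdma, 0 to 6 for udma) are an assumption that should be checked against your drive’s documentation.

```shell
# Validate a DMA mode name and print the hdparm command for review.
# Nothing is executed; run the printed command as root only after
# confirming that the drive and controller support the mode.
dma_command() {
    mode=$1
    device=$2
    case "$mode" in
        sdma[0-2] | mdma[0-2] | udma[0-6])
            echo "hdparm -d1 -X $mode $device"
            ;;
        *)
            echo "unrecognized DMA mode: $mode" >&2
            return 1
            ;;
    esac
}
```

Calling dma_command udma5 /dev/hda prints the command used in this column, whereas a mistyped mode name produces an error message and a nonzero exit status.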
Once you’ve found the proper settings to optimize your disk performance, you can enter the hdparm command into a startup script, such as /etc/rc.d/rc.local or /etc/conf.d/local.start, depending on your distribution.
Fortunately, most distributions manage to set the disk access options optimally when they boot, so tweaking this setting is usually not necessary. When it is needed, though, it can greatly improve your system’s disk performance.
SCSI and the Linux SCSI drivers were designed in such a way that tuning disk performance with hdparm is unnecessary; the disks should always operate at the maximum mode allowed by the disk and the SCSI host adapter. You can still use hdparm to test a SCSI disk’s performance, though. Similar comments apply to SATA disks driven by SCSI drivers.
Beyond adjusting the way the kernel talks to your hard disks, you can choose which filesystem or filesystems your system uses. Unfortunately, picking a filesystem for optimal performance isn’t an easy task. Too many variables exist that affect performance, such as disk throughput, disk head seek speeds, overall system load, how full the filesystem is, and whether you’re accessing large or small files. A few generalizations can be drawn, though:
* Journaling filesystems are preferable. Journaling filesystems (ext3fs, ReiserFS, JFS, and XFS) maintain a journal, or log of pending changes. Maintaining the journal slightly degrades performance, but greatly speeds recovery after a power failure or other uncontrolled shutdown. Most journaling filesystems include more advanced features than do non-journaling filesystems (such as ext2fs), which partly or completely offsets the extra work of maintaining the journal. Overall, a journaling filesystem is a big improvement, particularly if you want to minimize boot times after a problem shutdown. An exception is small filesystems (such as /boot, if it’s on a separate partition), on which the journal consumes too high a percentage of disk space to be worth its while.
* Small files work best with ReiserFS. If a partition holds many small files, look into ReiserFS. Although it might or might not perform any better than other filesystems, ReiserFS is more efficient at packing small files onto the disk. The result is that you can fit more small files on a disk.
* ext2fs and ext3fs are dependable on all platforms. Fortunately, all the major Linux filesystems (ext2fs, ext3fs, ReiserFS, XFS, and JFS) are reliable on x86 systems. On other platforms, one or more of these may be sluggish or unreliable. If in doubt, stick with ext2fs or ext3fs; these are the most likely to be speedy and reliable.
If getting the absolute best speed out of your filesystem is important, you may need to perform some tests, ideally using the hard disk and applications you intend to use. For a typical single-computer desktop or even small server installation, such tests are likely to consume so much time that they aren’t worthwhile. If you’re deploying hundreds of identical desktop systems, though, it might be worth running a few tests to see how different filesystems cope with the sorts of tasks your systems will be performing.
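A filesystem test need not be elaborate. As a rough sketch of the idea, the function below creates a batch of small files in a directory on the filesystem under test and reports how long that took. The function name, the default file count, and the file contents are arbitrary choices, GNU date is assumed for %N, and a realistic test should use your actual applications and data.

```shell
# Create many small files in a directory and report the elapsed time.
# small_file_test is a hypothetical helper; count defaults to 1000.
small_file_test() {
    dir=$1
    count=${2:-1000}
    start=$(date +%s%N)
    i=0
    while [ "$i" -lt "$count" ]; do
        echo "sample data" > "$dir/file$i"
        i=$((i + 1))
    done
    sync    # push the writes to disk before stopping the clock
    end=$(date +%s%N)
    echo "$count files in $(( (end - start) / 1000000 )) ms"
}
```

Running the same call against directories on partitions formatted with different filesystems gives a crude basis for comparison on a small-file workload.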
In addition to filesystem choice, filesystem layout can affect performance. Two factors are important to consider when designing a filesystem layout:
* Seek times. When Linux accesses data from different partitions on a single disk, the disk head must move (or seek) from one area of the disk to another. This action takes time, so if your disk layout is such that data from the start and end of the disk must be frequently accessed, performance will be degraded compared to a layout in which frequently accessed data lie close together.
* Disk throughput. As noted earlier, disk throughput varies from one part of a disk to another. As a general rule, earlier parts of the disk (partitions in low-numbered cylinders) perform better than do later parts of the disk. Thus, putting frequently accessed data at the start of the disk generally makes sense.
Typically, the best performance can be achieved by placing the most-used partitions, such as partitions for /usr, /home, and swap space, in the middle of the disk. Partitions that are seldom accessed, such as /boot or a partition holding an emergency Linux installation, are best placed in the peripheral regions of the disk.
Figure One illustrates a good single-disk configuration. The assumption is that most accesses involve /usr, /home, or swap space, with progressively less frequent accesses for partitions further from these. Such a layout will minimize disk seek times and therefore maximize performance. Of course, different systems might have different access patterns, so Figure One might be an excellent configuration for one system but poor for another.
In multi-disk systems, try to spread your access across disks. For instance, in a multi-boot configuration, don’t devote one disk entirely to Linux and the other disk entirely to the other OS. Splitting both OSs across both disks will improve performance for both OSs.
In a Linux-only configuration with multiple disks, put both commonly used and rarely used partitions on both disks. If /usr is on one disk and /home is on another, then a pattern of use that entails accessing files in both directories will require no head seeks to move between those two partitions (although of course there may be head seeks within each partition).
Advanced configurations take advantage of Linux’s support for Redundant Array of Independent Disks (RAID). This technology enables you to split a single virtual partition across two or more physical disks. This can be done to improve data security (in case one disk fails, a copy will exist on another disk), to improve performance (by spreading access across multiple disks), or both. Although RAID configuration is too complex to cover in this column, it’s well worth investigating if you need to get the most out of your disks.
Roderick W. Smith is the author or co-author of over a dozen books, including Linux in a Windows World and Linux Power Tools. He can be reached at