Size Can Matter: Would You Prefer the Hard Drive or the Ramdisk this Evening? Part 3

Over the past couple of weeks we ran the numbers on ext4 metadata performance with the journal on a ramdisk and on a separate hard drive. Now let's compare and contrast the two journal devices and see what trends emerge.

In part 1 of this series we looked at the metadata performance when your journal was on a separate disk. Part 2 explored the ramdisk option. Now comes the favorite part of every high school English class: Compare and contrast.

These tests are not intended as “benchmarks” per se. They are intended more as experiments or explorations to determine how we can influence storage performance by changing options in the file system. This means we are not looking for a “winner” in the comparison. Rather, we are looking for differences or the lack of differences in metadata performance as a function of journal size and journal device to perhaps tell us something about how we can improve performance.

Testing Review

Recall that four journal sizes were tested to understand the impact of journal size on metadata performance. The four journal sizes are:

  • 16MB (0.0032% of file system size)
  • 64MB (0.0128% of file system size)
  • 256MB (0.0512% of file system size)
  • 1GB (0.2% of file system size)
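
The percentages in the list above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes the file system spans a 500 GB drive and that sizes are in decimal units (1 GB = 1,000 MB); neither figure is stated in this article, but together they match the listed values.

```python
# Assumption: the file system spans a 500 GB drive and sizes use decimal
# units (1 GB = 1,000 MB), which reproduces the percentages listed above.
FILE_SYSTEM_MB = 500_000


def journal_percent(journal_mb, fs_mb=FILE_SYSTEM_MB):
    """Return the journal size as a percentage of the file system size."""
    return 100.0 * journal_mb / fs_mb


for label, size_mb in [("16MB", 16), ("64MB", 64), ("256MB", 256), ("1GB", 1000)]:
    print(f"{label:>5}: {journal_percent(size_mb):.4f}% of file system size")
```

Even the largest journal tested is only a small fraction of one percent of the file system.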

Both a separate hard drive partition and a ramdisk of the appropriate size were created and then utilized for the journal for an ext4 file system.

To understand the impact of both journal size and the type of device (disk or ramdisk), the fdtree benchmark was used to test metadata performance. This benchmark has been used in a number of previous articles to measure metadata performance because it is simple to use and offers a number of scenarios that can be matched to real use cases. For this examination, fdtree was used in four different scenarios to stress the metadata capability:

  • Small files (4 KiB)
    • Shallow directory structure
    • Deep directory structure
  • Medium files (4 MiB)
    • Shallow directory structure
    • Deep directory structure

The two file sizes, 4 KiB (1 block) and roughly 4 MiB (1,000 blocks), were used to get some feel for a range of performance as a function of the amount of data. The two directory structures were used to stress the metadata in different ways to discover whether there is any impact on metadata performance. The shallow directory structure has many directories at each level but only a few levels. The deep directory structure has few directories at any one level but many levels. Further details of the metadata testing can be found in the first article.
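
To make the difference between the two directory structures concrete, the sketch below counts directories and files in a uniform tree. The parameter values are hypothetical and chosen only for illustration; the actual fdtree settings are given in the first article, not here. It also assumes every directory holds the same number of subdirectories and files.

```python
def tree_counts(levels, dirs_per_level, files_per_dir):
    """Count directories and files in a uniform directory tree.

    The root is level 0, and every directory above the last level holds
    `dirs_per_level` subdirectories, so level k has dirs_per_level**k
    directories.
    """
    total_dirs = sum(dirs_per_level ** depth for depth in range(levels + 1))
    total_files = total_dirs * files_per_dir
    return total_dirs, total_files


# Hypothetical parameters, NOT the values used in the tests.
shallow = tree_counts(levels=2, dirs_per_level=20, files_per_dir=10)
deep = tree_counts(levels=8, dirs_per_level=2, files_per_dir=10)

print("shallow (2 levels, 20 dirs each):", shallow)
print("deep    (8 levels,  2 dirs each):", deep)
```

With these illustrative values both trees hold a few hundred directories, but the shallow tree gets there with wide fan-out and the deep tree gets there with long paths, which exercises the metadata code differently.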

Each test was run 10 times for the four journal sizes and for the two journal devices (hard disk and ramdisk). The test system used for these tests was a stock CentOS 5.3 distribution but with a 2.6.30 kernel. In addition, e2fsprogs was upgraded to 1.41.9. The tests were run on the following system:

  • GigaByte MAA78GM-US2H motherboard
  • An AMD Phenom II X4 920 CPU
  • 8GB of memory (DDR2-800)
  • Linux 2.6.30 kernel
  • The OS and boot drive are on an IBM DTLA-307020 (20GB drive at Ultra ATA/100)
  • /home is on a Seagate ST1360827AS drive
  • There are two drives for testing. They are both Seagate ST3500641AS-RK drives with a 16 MB cache each. These drives show up as devices, /dev/sdb and /dev/sdc.

The first Seagate drive, /dev/sdb, held the file system and was used exclusively for these tests. The second drive, /dev/sdc, held the journal for the hard drive based tests.

The details of creating an ext4 file system with a journal on a separate device are contained in a previous article. The basic steps are as follows. First, create the file system as if the journal were located with the file system on the same drive. Second, create a new journal on the chosen device (/dev/sdc1 or /dev/ram0). Finally, tell the file system that it no longer has a journal, and then tell it that its journal is on the chosen device (the hard drive or the ramdisk).
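
The steps above can be sketched as a short sequence of e2fsprogs commands. The sketch below only builds and prints the commands; it does not run them. The partition name /dev/sdb1 and the 4096-byte block size are assumptions for illustration (the journal device must use the same block size as the file system), and the exact commands used for these tests are in the earlier article.

```python
def external_journal_commands(fs_dev, journal_dev, block_size=4096):
    """Return the commands for moving an ext4 journal to another device.

    The commands are returned as argument lists and are NOT executed here.
    The file system must be unmounted before the tune2fs steps.
    """
    bs = str(block_size)
    return [
        # Step 1: create the ext4 file system with its default internal journal.
        ["mke2fs", "-t", "ext4", "-b", bs, fs_dev],
        # Step 2: format the separate device as an external journal.
        ["mke2fs", "-O", "journal_dev", "-b", bs, journal_dev],
        # Step 3a: remove the internal journal from the file system.
        ["tune2fs", "-O", "^has_journal", fs_dev],
        # Step 3b: attach the external journal device to the file system.
        ["tune2fs", "-J", f"device={journal_dev}", fs_dev],
    ]


# /dev/sdb1 is an assumed partition name; /dev/sdc1 comes from the text above.
for cmd in external_journal_commands("/dev/sdb1", "/dev/sdc1"):
    print(" ".join(cmd))
```

For the ramdisk case, the same sequence would apply with /dev/ram0 in place of /dev/sdc1.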

Benchmark Results

This section presents the comparison of the results for the four scenarios for both devices. The hard drive and ramdisk results are plotted side by side for the same journal size along with the error bars to allow easy comparison. The full results are available in tabular form in the previous two articles.
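
Each plotted bar is the average of the 10 runs for one configuration, and each error bar is the standard deviation of those runs. A minimal sketch of that reduction is shown below. The run values are made-up placeholders, not measurements from these tests, and the use of the sample standard deviation (rather than the population form) is an assumption.

```python
import statistics


def summarize(runs):
    """Return (mean, sample standard deviation) for a list of run results."""
    return statistics.mean(runs), statistics.stdev(runs)


# Placeholder values only: ten hypothetical results for one configuration.
runs = [100.0, 102.0, 98.0, 101.0, 99.0, 100.0, 103.0, 97.0, 100.0, 100.0]

mean, std = summarize(runs)
print(f"mean = {mean:.2f}, standard deviation = {std:.2f}")
```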

The first test is for the “small file, shallow structure” scenario for the four journal sizes. Figure 1 below plots the average file create performance in KiB per second for the four journal sizes for both the hard drive device and the ramdisk device. Also note that error bars representing the standard deviation are shown.
compare_small_shallow_file_creates_updated.png
Figure 1: Average File Create Performance (KiB per second) for the Small File, Shallow Structure Scenario for the Four Journal Sizes for the Hard Drive based Journal and the Ramdisk based Journal

Figure 2 below plots the average “File Remove” results in “File Removes per second” for the four journal sizes for the small file, shallow structure scenario for both devices. Again, there are error bars representing the standard deviation in the plot as well.
compare_small_shallow_file_removes_updated.png
Figure 2: Average File Remove Performance (File Removes per second) for the Small File, Shallow Structure Test for the Four Journal Sizes for the Hard Drive based Journal and the Ramdisk based Journal

The next scenario uses small files but with a deep directory structure. For this scenario all four tests had run times long enough for consideration. Figure 3 below plots the average “Directory Create” results in “creates per second” for both journal devices for the four journal sizes. Again, there are error bars representing the standard deviation in the plot as well.
compare_small_deep_dir_creates_updated.png
Figure 3: Average Directory Create Performance (creates per second) for the Small File, Deep Structure Test for the Four Journal Sizes for the Hard Drive based Journal and the Ramdisk based Journal

Figure 4 below plots the average “File Create” results in KiB per second for the four journal sizes for the small file, deep structure scenario for both journal devices. Again, there are error bars representing the standard deviation in the plot as well.

compare_small_deep_file_creates_updated.png
Figure 4: Average File Create Performance (KiB per second) for the Small File, Deep Structure Test for the Four Journal Sizes for the Hard Drive based Journal and the Ramdisk based Journal

Figure 5 below plots the average “File Remove” results in removes per second for the four journal sizes for the small file, deep structure test for both journal device types.

compare_small_deep_file_removes_updated.png
Figure 5: Average File Remove Performance (removes per second) for the Small File, Deep Structure Test for the Four Journal Sizes for the Hard Drive based Journal and the Ramdisk based Journal

Figure 6 below plots the average “Directory Remove” results in removes per second for the four journal sizes for the small file, deep structure test for both journal device types.

compare_small_deep_file_removes_updated.png
Figure 6: Average Directory Remove Performance (removes per second) for the Small File, Deep Structure Test for the Four Journal Sizes for the Hard Drive based Journal and the Ramdisk based Journal

The next test was the medium files, shallow directory structure scenario, where only the file create test had a meaningful run time. Figure 7 below plots the file create performance in KiB per second for the four journal sizes for both journal device types. The error bars are plotted as well.

compare_medium_shallow_file_creates_updated.png
Figure 7: Average File Create Performance (KiB per second) for the Medium File, Shallow Structure Test for the Four Journal Sizes for the Hard Drive based Journal and the Ramdisk based Journal

The final test was the medium files, deep directory structure scenario. The only result that had meaningful times was the file create test. Figure 8 below plots the file create performance in KiB per second for the four journal sizes for both journal device types. The error bars are plotted as well.

compare_medium_deep_file_creates_updated.png
Figure 8: Average File Create Performance (KiB per second) for the Medium File, Deep Structure Test for the Four Journal Sizes for the Hard Drive based Journal and the Ramdisk based Journal

Observations (Compare/Contrast)

The benchmark results are very interesting since we actually see some variation in the results, whereas in the first article we did not see much variation. A quick summary of the results is given below.

  • Small files, shallow directory structure:
    • From Figure 1, the average file create performance for the ramdisk journal is slightly slower than the hard disk based journal for the 16MB journal size. However, from 64MB on, the performance of the hard drive based journal is approximately the same as the ramdisk based journal. The average file create performance increased approximately 6% going from the 16MB journal size to the 256MB size. From 256MB to 1GB the performance didn’t increase appreciably.
    • From Figure 2, the average file remove performance for the hard drive based journal and the ramdisk based journal is approximately the same at each journal size.
    • The average file remove performance increased by about 25% from the 16MB journal size to the 256MB journal size. However, increasing the journal size to 1GB didn’t increase performance any appreciable amount.
  • Small files, deep directory structure:
    • At the 16MB journal size, the hard drive and ramdisk journal device average directory create performance is about the same (see Figure 3). However, at 64MB, the ramdisk has 38.2% better performance than the hard drive. At 256MB the average ramdisk performance is 47.3% better, and at 1GB the average ramdisk performance is 29.8% better than the hard drive.
    • For the ramdisk journal device, increasing the journal size from 16MB to 1GB increased the average directory create performance by 163%. For the hard drive journal device the same increase in journal size increased the average directory create performance by 115%.
    • The average file creation performance as seen in Figure 4 is also interesting. For a 16MB journal size the performance of both devices is about the same. But from 64MB to 1GB the performance of the ramdisk is much greater than the hard drive. With a journal size of 64MB, the ramdisk journal is 40.2% faster, for a 256MB journal size the ramdisk is 22.5% faster, and for a journal size of 1GB the ramdisk is 17.4% faster.
    • For both the ramdisk journal and the hard drive journal, the average file creation performance increased as the journal size increased for this scenario. The average file creation performance increased by 58% for the ramdisk based journal as the journal size was increased from 16MB to 1GB. For the hard drive based journal the performance increased by 56%.
    • The average file removal performance for both the ramdisk and the hard drive journals increased as the journal size increased (see Figure 5). The ramdisk performance increased by 144% and the hard drive performance increased by 126%.
    • In general the average file removal performance of the two devices was about the same for a given journal size. At 64MB, the hard drive based journal was slightly faster and at 1GB, the ramdisk based journal was slightly faster.
    • Figure 6 compared the average directory removal performance for both devices for the four journal sizes. The performance of the hard drive was better than the ramdisk when the journal size was 64MB and 1GB (although the standard deviation at 1GB is very large and the differences are well within the standard deviation). So overall, with the exception of the 64MB journal case, the performance of the two devices was about the same.
    • However, the improvement in the average directory remove performance for both devices as the size of the journal is increased is very dramatic. The average directory removal performance of the ramdisk based journal increased by 410% in going from a journal of 16MB to a journal of 1GB. For the hard drive based journal the performance increased by 367% for the same change in journal sizes.
  • Medium files, shallow directory structure
    • Comparing the ramdisk based journal to the disk based journal is more difficult for the average file create performance (Figure 7) because the average performance for the various journal sizes and device options are all within the standard deviation of the tests. This means that it is difficult to determine if one case has better performance than the others. Even throwing good statistics out the window, sometimes the ramdisk journal is slightly faster and sometimes the disk journal is faster. In addition, the performance doesn’t vary much as a function of the size of the journal.
  • Medium Files, deep directory structure
    • The average file creation performance was about the same for the ramdisk journal and the hard drive journal (see Figure 8). The performance for all journal sizes was within the standard deviation of the other results, making it difficult to observe any statistical difference between journal sizes or devices. But once again, even if we toss our use of good statistics there still isn’t much of a trend in the results. Sometimes the ramdisk journal is faster than the hard drive journal and sometimes not. Also, there doesn’t seem to be much variation in the performance for either journal device as a function of the journal size.
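
The observations above rest on two simple calculations: the percentage change between two averages, and a check of whether two averages differ by less than their standard deviations. The sketch below shows both. The numbers are illustrative only, and the "within one standard deviation" test is one reasonable reading of the comparison described above, not a formal significance test.

```python
def percent_change(baseline, new):
    """Return the change from `baseline` to `new` as a percentage."""
    return 100.0 * (new - baseline) / baseline


def within_one_std(mean_a, std_a, mean_b, std_b):
    """True when the two means differ by no more than the larger std dev."""
    return abs(mean_a - mean_b) <= max(std_a, std_b)


# Illustrative values only, not results from these tests.
# A rate that rises from 100 to 263 operations per second is a 163% increase.
print(f"{percent_change(100.0, 263.0):.0f}% increase")

# Means that differ by less than the spread of the runs cannot be ranked.
print(within_one_std(100.0, 5.0, 103.0, 4.0))
print(within_one_std(100.0, 2.0, 110.0, 3.0))
```

Note that an increase of more than 100% means the rate more than doubled, so a 410% increase corresponds to a rate a little over five times the baseline.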

Conclusion

One would have expected the ramdisk to be the runaway favorite for metadata performance because one would assume it has the better IOPS and throughput of the two devices. However, the comparison in this article showed that in several scenarios and several performance measures, the hard drive based journal had about the same performance or even better performance than the ramdisk based journal. At the same time, there are also scenarios and tests where a ramdisk based journal clearly had better performance than a disk based journal.

Knowing that there is a performance difference between the ramdisk journal and the disk journal is good information but does not go deep enough to allow us to truly understand what is driving the metadata performance. It is fairly safe to assume, even without testing, that the ramdisk journal has better IOPS, throughput, and latency than the hard drive. However, it is unclear which aspect of device performance, or which combination of aspects, is driving the metadata performance differences. But, in my opinion, there is enough of a performance difference between the ramdisk based journal and the hard drive based journal to warrant an examination of using SSDs for file system journals.

Why SSDs and not ramdisks? Arguably ramdisks will give you better performance than SSDs (for the most part), but using a ramdisk from the system memory has some issues that must be addressed (see the previous article). SSDs have gained performance compared to their first incarnations and have very good IOPS performance. More importantly, their contents survive a reboot of the system (a ramdisk's do not). Consequently, it is worthwhile to test an SSD as an external journal device for ext4.

“Batman, now that we’ve drawn to the end of comparing metadata performance for journal sizes and devices for ext4, what does the future hold?”

“Actually it holds more testing Robin. Commissioner Gordon has asked us to examine the impact of journal size on a separate hard drive on throughput performance measured by IOZone. So hike up your leotard Robin and get ready for our next round of tests – to the Batcave!”

And… scene! Apologies for slipping into character for a moment. I’ve been watching reruns of “The Big Bang Theory” and I’m waiting for Sheldon to appear in a Batman costume and I find I’m starting to identify with him a bit more than I should.
