Size Can Matter: Improving Metadata Performance with Ext4 Journal Sizing – Part I

Recently we saw that the journal device location, unfortunately, didn't make much of a difference to ext4 metadata performance. But can the size of the journal have an impact on metadata performance? This is the first in a series of articles examining journal size and performance.

An aspect of file system or storage performance that may be overlooked is metadata performance. This is the performance in creating and removing files and directories, updating metadata information, and other aspects of file systems that don’t involve actual data transfers. In a previous article we saw that metadata performance did not change appreciably regardless of whether the journal device was on a ramdisk, a separate disk, or on the same disk as the file system. However, the journal size in that article was only 16MB for a 500GB file system (0.0032% of file system size). This seems to be a fairly small size and could potentially have an impact on metadata performance.

In this article the journal is placed on a second hard drive and its size is varied to understand the impact on metadata performance measured by the metadata benchmark fdtree.

Testing Discussion

Four journal sizes are tested to understand the impact of journal size on metadata performance. The four journal sizes are:


  • 16MB (0.0032% of file system size)
  • 64MB (0.0128% of file system size)
  • 256MB (0.0512% of file system size)
  • 1GB (0.2% of file system size)
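
The percentages in the list follow from simple division. A minimal sketch, assuming decimal units (500GB taken as 500,000MB and 1GB as 1,000MB, which is what the rounded figures imply); the function name journal_pct is made up for illustration:

```bash
# journal_pct JOURNAL_MB FS_MB: journal size as a percentage of file system size.
# Assumes decimal units; journal_pct is an illustrative name, not a standard tool.
journal_pct() {
    awk -v journal="$1" -v total="$2" 'BEGIN { printf "%.4f\n", journal / total * 100 }'
}

# The four journal sizes tested, against a 500,000MB file system.
for size_mb in 16 64 256 1000; do
    echo "${size_mb}MB: $(journal_pct "$size_mb" 500000)%"
done
```

With binary units (1GB as 1,024MB) the last figure would be closer to 0.2048%, so the 0.2% in the list is best read as a rounded value.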

A partition of the appropriate size is created on a second drive and is then utilized for the journal for the ext4 file system.

This article will focus on metadata performance as measured by fdtree. This benchmark has been used before to examine the metadata performance of various Linux file systems. To read about fdtree and how it was used for benchmarking, please see the original article.

As a quick recap, the benchmark, fdtree, is a simple bash script that performs four different metadata tests:


  • Directory creation
  • File creation
  • File removal
  • Directory removal

It creates a specified number of files of a given size (in blocks) in a top-level directory. Then it creates a specified number of sub-directories and then in turn sub-directories are recursively created up to a specified number of levels and are populated with files.
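
The recursive layout described above can be sketched in a few lines of shell. This is a hypothetical illustration, not fdtree's actual code: the function names populate and build_tree and the file and directory naming are assumptions, and the sketch interleaves directory and file creation rather than timing them as separate phases the way the benchmark does.

```bash
# Hypothetical sketch of an fdtree-style tree build; not fdtree's actual code.

# populate DIR NFILES BLOCKS: write NFILES files of BLOCKS 4 KiB blocks each.
populate() {
    local dir="$1" nfiles="$2" blocks="$3" i=1
    while [ "$i" -le "$nfiles" ]; do
        dd if=/dev/zero of="$dir/file.$i" bs=4096 count="$blocks" 2>/dev/null
        i=$((i + 1))
    done
}

# build_tree DIR NDIRS NFILES BLOCKS LEVELS
# Creates DIR, fills it with files, then recurses into NDIRS sub-directories
# until LEVELS levels exist below the top-level directory.
build_tree() {
    local dir="$1" ndirs="$2" nfiles="$3" blocks="$4" levels="$5" i=1
    mkdir -p "$dir"
    populate "$dir" "$nfiles" "$blocks"
    [ "$levels" -gt 0 ] || return 0
    while [ "$i" -le "$ndirs" ]; do
        build_tree "$dir/dir.$i" "$ndirs" "$nfiles" "$blocks" $((levels - 1))
        i=$((i + 1))
    done
}

# Example (small tree): build_tree /tmp/fdtree_demo 2 3 1 2
```

The arguments map onto fdtree's options in the obvious way: NDIRS is “-d”, NFILES is “-f”, BLOCKS is “-s”, and LEVELS is “-l”.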

Fdtree was used in 4 different approaches to stressing the metadata capability:


  • Small files (4 KiB)
    • Shallow directory structure
    • Deep directory structure
  • Medium files (4 MiB)
    • Shallow directory structure
    • Deep directory structure

The two file sizes, 4 KiB (1 block) and roughly 4 MiB (1,000 blocks of 4 KiB each), were used to get a feel for how performance varies with the amount of data. The two directory structures stress the metadata in different ways, to discover whether the shape of the tree affects metadata performance. The shallow directory structure has many directories per level but few levels. The deep directory structure has few directories per level but many levels.

The command lines for the four combinations are:

Small Files – Shallow Directory Structure

./fdtree.bash -d 20 -f 40 -s 1 -l 3

This command creates 20 sub-directories from each upper level directory at each level (“-d 20”) and there are 3 levels (“-l 3”). It’s a basic tree structure. This is a total of 8,421 directories. In each directory there are 40 files (“-f 40”) each sized at 1 block (4 KiB) denoted by “-s 1”. This is a total of 336,840 files and 1,347,360 KiB total data.

Small Files – Deep Directory Structure

./fdtree.bash -d 3 -f 4 -s 1 -l 10

This command creates 3 sub-directories from each upper level directory at each level (“-d 3”) and there are 10 levels (“-l 10”). This is a total of 88,573 directories. In each directory there are 4 files each sized at 1 block (4 KiB). This is a total of 354,292 files and 1,417,168 KiB total data.

Medium Files – Shallow Directory Structure

./fdtree.bash -d 17 -f 10 -s 1000 -l 2

This command creates 17 sub-directories from each upper level directory at each level (“-d 17”) and there are 2 levels (“-l 2”). This is a total of 307 directories. In each directory there are 10 files each sized at 1,000 blocks (4 MiB). This is a total of 3,070 files and 12,280,000 KiB total data.

Medium Files – Deep Directory Structure

./fdtree.bash -d 2 -f 2 -s 1000 -l 10

This command creates 2 sub-directories from each upper level directory at each level (“-d 2”) and there are 10 levels (“-l 10”). This is a total of 2,047 directories. In each directory there are 2 files each sized at 1,000 blocks (4 MiB). This is a total of 4,094 files and 16,376,000 KiB total data.
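
The directory, file, and data totals quoted for the four command lines can be reproduced by summing the directories at each level (1 at the top, then d, d×d, and so on down to level l) and multiplying out. A minimal sketch, assuming 4 KiB blocks as stated above; tree_stats is an illustrative helper, not part of fdtree:

```bash
# tree_stats NDIRS NFILES BLOCKS LEVELS
# Prints "directories files total_KiB" for an fdtree-style tree, counting the
# top-level directory and assuming 4 KiB blocks.
tree_stats() {
    local ndirs="$1" nfiles="$2" blocks="$3" levels="$4"
    local total=0 at_level=1 lvl=0 files
    while [ "$lvl" -le "$levels" ]; do
        total=$((total + at_level))        # directories at this level
        at_level=$((at_level * ndirs))     # each one spawns NDIRS children
        lvl=$((lvl + 1))
    done
    files=$((total * nfiles))
    echo "$total $files $((files * blocks * 4))"
}

tree_stats 20 40 1 3      # small, shallow:  8421 336840 1347360
tree_stats 3 4 1 10       # small, deep:     88573 354292 1417168
tree_stats 17 10 1000 2   # medium, shallow: 307 3070 12280000
tree_stats 2 2 1000 10    # medium, deep:    2047 4094 16376000
```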

Each test was run 10 times for each of the four journal sizes, with the journal placed on a partition on a separate disk. The test system used for these tests was a stock CentOS 5.3 distribution but with a 2.6.30 kernel. In addition, e2fsprogs was upgraded to 1.41.9. The tests were run on the following system:


  • GigaByte MAA78GM-US2H motherboard
  • An AMD Phenom II X4 920 CPU
  • 8GB of memory
  • Linux 2.6.30 kernel
  • The OS and boot drive are on an IBM DTLA-307020 (20GB drive at Ultra ATA/100)
  • /home is on a Seagate ST1360827AS drive
  • There are two drives for testing. They are both Seagate ST3500641AS-RK drives with a 16 MB cache each. These drives show up as devices, /dev/sdb and /dev/sdc.

The first Seagate drive, /dev/sdb, was used for the file system and the second drive, /dev/sdc, was used for the journal.

The details of creating an ext4 file system with a journal on a separate device are contained in a previous article. The basic steps are to first create the file system assuming the journal is located with the file system on the drive. Second, a new journal is created on the separate partition. Finally, the file system is told that it no longer has a journal and then it is told that its journal is on the specific device (the second drive).
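
As a rough sketch of those steps, the helper below prints (but does not run) one plausible mke2fs and tune2fs sequence. The partition names /dev/sdb1 and /dev/sdc1 and the 4096-byte block size are assumptions for illustration, as is the function name journal_setup_cmds; the previous article remains the reference for the exact commands used in these tests.

```bash
# journal_setup_cmds FS_DEV JOURNAL_DEV [BLOCK_SIZE]
# Prints, without executing, commands that move an ext4 journal to an
# external device. Device names and block size are illustrative assumptions.
journal_setup_cmds() {
    local fs_dev="$1" journal_dev="$2" block_size="${3:-4096}"
    cat <<EOF
mke2fs -t ext4 -b $block_size $fs_dev
mke2fs -O journal_dev -b $block_size $journal_dev
tune2fs -O ^has_journal $fs_dev
tune2fs -J device=$journal_dev $fs_dev
EOF
}

journal_setup_cmds /dev/sdb1 /dev/sdc1
```

The commands are printed rather than executed because they destroy data on both devices. The journal device and the file system need to use the same block size, which is why the sketch passes it explicitly to both mke2fs calls. The journal's size is set by the size of the partition it is created on.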

Benchmark Results

In past articles, the results were always tabulated in the interest of full disclosure. But these tables can be a bit unwieldy to read and understand. So in this article, the results are presented in graphical form (the peasants, including me, rejoice). However, the full results are available in tabular form at the end of the article after the Summary.

The first test is for the “small file, shallow structure” scenario for the four journal sizes. As with all of the metadata tests before, only the “file create” test had a run time long enough to be considered a worthwhile result. The average file create time for each journal size was over 300 seconds. Figure 1 below plots the average file create performance in KiB per second for the four journal sizes. Also note that error bars representing the standard deviation are shown.

Figure 1: Average File Create Performance (KiB per second) for the Small File, Shallow Structure Test for the Four Journal Sizes
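
Each plotted point is the mean of the 10 runs, with the standard deviation as the error bar. A minimal sketch of that reduction, assuming the sample (n-1) form of the standard deviation, which the article does not specify; the helper name mean_stddev and the run values in the example are hypothetical, not measured results:

```bash
# mean_stddev VALUE...: prints "mean stddev" for the given run results.
# Uses the sample (n-1) standard deviation; that choice is an assumption.
mean_stddev() {
    printf '%s\n' "$@" | awk '
        { sum += $1; sumsq += $1 * $1; n++ }
        END {
            mean = sum / n
            var = (n > 1) ? (sumsq - n * mean * mean) / (n - 1) : 0
            if (var < 0) var = 0    # guard against rounding error
            printf "%.2f %.2f\n", mean, sqrt(var)
        }'
}

# Ten made-up file create rates in KiB/s, for illustration only.
mean_stddev 3880 3891 3875 3902 3869 3888 3895 3879 3884 3896
```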

The next test uses small files but with a deep directory structure. For this scenario three of the tests had run times long enough for consideration. The “Directory Create” test ran from an average of 127 seconds to 324.5 seconds depending upon the journal size. The “File Create” test ran an average of 406.2 seconds to 644.3 seconds. And finally the “Directory Remove” test ran from an average of 147.4 seconds to 330.6 seconds.

Figure 2 below plots the average “Directory Create” results in “creates per second” for the four journal sizes for the small file, deep structure scenario. Again, there are error bars representing the standard deviation in the plot as well.

Figure 2: Average Directory Create Performance (creates per second) for the Small File, Deep Structure Test for the Four Journal Sizes

Figure 3 below plots the average “File Create” results in KiB per second for the four journal sizes for the small file, deep structure scenario. Again, there are error bars representing the standard deviation in the plot as well.

Figure 3: Average File Create Performance (KiB per second) for the Small File, Deep Structure Test for the Four Journal Sizes

Figure 4 below plots the average “Directory Remove” results in removes per second for the four journal sizes for the small file, deep structure test.

Figure 4: Average Directory Remove Performance (removes per second) for the Small File, Deep Structure Test for the Four Journal Sizes

The next test was the medium files, shallow directory structure scenario. The only result that had a meaningful run time was the file create test (154.2 seconds to 154.4 seconds). Figure 5 below plots the file create performance in KiB per second for the four journal sizes. Also note that the error bars are plotted as well.

Figure 5: Average File Create Performance (KiB per second) for the Medium File, Shallow Structure Test for the Four Journal Sizes

The final test was the medium files, deep directory structure scenario. The only result that had meaningful times was the file create test (220.4 seconds to 225.9 seconds). Figure 6 below plots the file create performance in KiB per second for the four journal sizes. Also note that the error bars are plotted as well.

Figure 6: Average File Create Performance (KiB per second) for the Medium File, Deep Structure Test for the Four Journal Sizes
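
The file create rates in KiB per second are tied to the run times through the total data written by each scenario. A minimal sketch using the medium file, deep structure totals and the two average run times quoted above; throughput_kib is an illustrative helper. Dividing by an average time is not the same as averaging the per-run rates, so the output should only be expected to land close to the plotted values.

```bash
# throughput_kib TOTAL_KIB SECONDS: average rate in KiB per second.
throughput_kib() {
    awk -v kib="$1" -v secs="$2" 'BEGIN { printf "%.1f\n", kib / secs }'
}

# Medium files, deep structure: 16,376,000 KiB written per run.
throughput_kib 16376000 225.9   # slowest average file create time
throughput_kib 16376000 220.4   # fastest average file create time
```

The slower time works out to roughly 72,490 KiB/s and the faster one to roughly 74,300 KiB/s.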

Benchmark Observations

The benchmark results are very interesting since we actually see some variation in the results whereas in the previous article we did not see much variation. A quick summary of the results is given below.


  • Small files, shallow directory structure:

    • From Figure 1, increasing the journal size to 256 MB increased the average file creation by 5% (from 3,885.9 KiB/s to 4,072.5 KiB/s).
    • Increasing the journal size to 1GB did not increase the file create performance by any measurable amount.

  • Small files, deep directory structure:

    • The directory creation performance increased dramatically as the journal size increased (Figure 2). The average performance increased by 115% (from 324.5 creates per second at 16MB to 698.9 creates per second at 1GB).
    • The file creation performance also increased fairly remarkably as seen in Figure 3. It increased by 56% (2,237.7 KiB/s at 16MB to 3,491.1 KiB/s at 1GB).
    • The most remarkable change in performance was for the directory removal result as seen in Figure 4. The directory removal rate (removes per second) increased by 367% in going from a 16MB journal to a 1GB journal (412.4 removes/s to 1924 removes/s).

  • Medium files, shallow directory structure:

    • The file create performance did not appreciably change as the journal size was varied except at 256MB (Figure 5). For some reason, the performance of the 256MB case was lower than the other journal sizes including the 16MB and 1GB cases. The reason for this is unknown at this time.

  • Medium files, deep directory structure:

    • The file creation performance (in KiB/s) only really changed in going from a 16MB journal size to a 64MB journal size (see Figure 6). The performance with a 64MB journal size is about 3% better than with 16MB (74,388.2 KiB/s vs. 72,495.3 KiB/s).
    • After 64MB the file creation performance did not change appreciably.
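
The percentage gains in the list are relative changes between two averages. A minimal sketch that recomputes them from the values quoted above; pct_gain is an illustrative helper name:

```bash
# pct_gain BEFORE AFTER: percentage increase from BEFORE to AFTER.
pct_gain() {
    awk -v before="$1" -v after="$2" \
        'BEGIN { printf "%.1f\n", (after - before) / before * 100 }'
}

pct_gain 3885.9 4072.5     # small, shallow, file create:      4.8
pct_gain 324.5 698.9       # small, deep, directory create:    115.4
pct_gain 2237.7 3491.1     # small, deep, file create:         56.0
pct_gain 412.4 1924        # small, deep, directory remove:    366.5
pct_gain 72495.3 74388.2   # medium, deep, file create:        2.6
```

Rounded to whole numbers these match the 5%, 115%, 56%, 367%, and 3% figures in the list.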

Summary

As discussed in past articles, the file system journal is a very important aspect of a file system for many reasons. The ability to adjust attributes of the journal gives you the freedom to “tailor” the file system to your usage pattern(s).

Some previous results, surprisingly, showed very little metadata performance difference between a journal based on a disk device and a journal on a ramdisk. One would have expected the ramdisk to produce clearly better performance. A possible reason for the unexpected results was the small size of the journal, 16MB, relative to the size of the file system (approximately 500GB). So, this article examined the performance impact of varying the journal size on the metadata performance of ext4.

The metadata benchmark, fdtree, which has been used before in metadata testing, was used for testing the performance of four journal sizes: 16MB, 64MB, 256MB, and 1GB. For all four sizes, the journal was placed on a partition on a separate disk. As with previous metadata testing, four scenarios were tested: (1) small files (4 KiB) and a shallow directory structure, (2) small files (4 KiB) and a deeper directory structure, (3) medium sized files (4 MiB) and a shallow directory structure, and (4) medium sized files (4 MiB) and a deeper directory structure.

The results are very interesting for the four scenarios. Increasing the journal size improved the file create performance for all of the scenarios except for the medium files, shallow directory structure scenario. The improvement in performance ranged from 3% (medium files, deep structure) to 56% (small files, deep structure).

The only scenario in which tests other than the file creation test ran for 60 seconds or more was the small file, deep structure scenario. The results there were quite remarkable. Increasing the journal size from 16MB to 1GB improves the directory create performance by 115% and directory removal performance by 367%!

Based on these results it is difficult to say which journal size is the “best”. In general, a larger journal size is better but it really depends upon the usage pattern (the scenario). For these tests, if one were forced to pick a “winner” (perhaps with a proverbial “file system gun” to one’s head), then it is likely to be the 256MB journal size (0.0512% of the file system).

Be sure to stay tuned to this Bat-Channel for more articles about the impact of journal size on performance.

