Metadata Performance Exploration Part 2: XFS, JFS, ReiserFS, ext2, and Reiser4

More performance: We add five file systems to our previous benchmark results to create an "uber" article on metadata file system performance. We follow the "good" benchmarking guidelines presented in a previous article and examine the good, the bad and the interesting.

Last week we tested four Linux file systems (ext3, ext4, nilfs2, and btrfs) for metadata performance using a benchmark called fdtree. The point of the benchmarks was not really to compare the performance of the file systems per se, although comparisons are inevitable. Rather, the benchmarks were performed as part of an exploration into the metadata performance of Linux file systems.

We’re using the same benchmark from the last article and applying it to additional Linux file systems: xfs, jfs, reiserfs, ext2, and reiser4. As you are probably aware, there are a large number of file systems available in Linux, from some fairly old ones such as ext2 to some that are still considered “experimental” in the latest kernel (2.6.30 as of this writing), such as btrfs and nilfs2. But these other file systems (ext2, xfs, jfs, and reiserfs) are still in production use in a number of places. Consequently, this article performs the same benchmarks as the previous article on these additional file systems. It also adds reiser4 which, believe it or not, is still moving ahead thanks to the determination of some of the developers.

Quick Review of Benchmark

The benchmark used in the previous article is fdtree. It’s perhaps not the best known metadata benchmark for file systems but it is fairly common in HPC circles. It stresses the creation and removal of directories and files using a simple bash script and *nix utilities. It builds a tree structure of directories. A number of files of a given size is created in each directory. The number of directories (branches) at each level in the tree, the depth of the tree, the number of files at each point, and the size of the files are all under user control.
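To make the four timed functions concrete, here is a minimal Python sketch of the same idea: build a directory tree, fill it with files, then remove the files and the directories, timing each phase. It is not fdtree itself (which is a bash script), and the function names, directory and file naming scheme, and the way directories are counted are all assumptions made for illustration. It also does not flush caches, so it should not be used to reproduce the numbers in this article.

```python
import os
import tempfile
import time

BLOCK_SIZE = 4096  # bytes per block; assumes the 4 KiB block size quoted in the article


def create_dirs(root, dirs_per_level, depth):
    """Create dirs_per_level subdirectories under root, depth levels deep.

    Returns every directory in the tree, including root itself.
    """
    created = [root]
    if depth == 0:
        return created
    for i in range(dirs_per_level):
        child = os.path.join(root, f"level{depth}_dir{i}")
        os.mkdir(child)
        created.extend(create_dirs(child, dirs_per_level, depth - 1))
    return created


def create_files(dirs, files_per_dir, blocks_per_file):
    """Write files_per_dir files of blocks_per_file blocks into each directory."""
    payload = b"\0" * (BLOCK_SIZE * blocks_per_file)
    paths = []
    for d in dirs:
        for i in range(files_per_dir):
            path = os.path.join(d, f"file{i}")
            with open(path, "wb") as fh:
                fh.write(payload)
            paths.append(path)
    return paths


def remove_files(paths):
    for path in paths:
        os.remove(path)


def remove_dirs(dirs):
    # A child path is always longer than its parent path, so removing the
    # longest paths first guarantees each directory is empty when removed.
    for d in sorted(dirs, key=len, reverse=True):
        os.rmdir(d)


def timed(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def run_benchmark(base, dirs_per_level, depth, files_per_dir, blocks_per_file):
    """Run the four phases under base and return counts and timings."""
    root = os.path.join(base, "tree")
    os.mkdir(root)
    dirs, t_dir_create = timed(create_dirs, root, dirs_per_level, depth)
    files, t_file_create = timed(create_files, dirs, files_per_dir, blocks_per_file)
    _, t_file_remove = timed(remove_files, files)
    _, t_dir_remove = timed(remove_dirs, dirs)
    return {
        "dirs": len(dirs),
        "files": len(files),
        "seconds": {
            "directory create": t_dir_create,
            "file create": t_file_create,
            "file remove": t_file_remove,
            "directory remove": t_dir_remove,
        },
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        print(run_benchmark(base, dirs_per_level=2, depth=2,
                            files_per_dir=3, blocks_per_file=1))
```

With 2 directories per level and a depth of 2, this sketch builds 7 directories (counting the root) and 21 files. A shallow tree corresponds to a large `dirs_per_level` with a small `depth`, and a deep tree to the reverse.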

For this benchmark, two different directory tree structures were used: (1) a shallow tree (not much depth) with a larger number of directories at each level, and (2) a deep tree with only a few directories at each level but many levels. Two file sizes were also used: (1) a small size, 4 KiB (1 block), and (2) a medium size, 4 MiB (1,000 blocks). This makes a total of four benchmark sets.


  • Small files (4 KiB)

    • Shallow directory structure
    • Deep directory structure

  • Medium files (4 MiB)

    • Shallow directory structure
    • Deep directory structure

To create the specific parameters for fdtree used in the exploration, there were three overall goals:


  • Keep the total run time to approximately 10-12 minutes at a maximum

  • Keep the total data for the two directory structures approximately the same

  • Keep the run time for each of the four functions greater than 1 minute if possible

Not all four functions ran for at least 1 minute; some ran for only a few seconds. These cases are noted in the results.

In keeping with the good benchmarking behavior laid out in the previous article, each of the four combinations was run 10 times on each file system. The test system was a stock CentOS 5.3 distribution but with a 2.6.30 kernel, and e2fsprogs was upgraded to the latest version as of the writing of this article, 1.41.9. In addition, the following file system tools were used:


  • xfs: xfsprogs pulled via git as of 9/5/2009
  • jfs: jfsutils-1.1.14
  • reiserfs: reiserfsprogs-3.6.21
  • ext2: e2fsprogs 1.41.9
  • reiser4: reiser4progs-1.0.7

The defaults for all the file systems were used (tuning is a subject for a whole series of articles and tons of additional work).

The tests were run on the following system:


  • GigaByte MAA78GM-US2H motherboard
  • An AMD Phenom II X4 920 CPU
  • 8GB of memory
  • Linux 2.6.30 kernel
  • The OS and boot drive are on an IBM DTLA-307020 (20GB drive at Ultra ATA/100)
  • /home is on a Seagate ST1360827AS
  • There are two drives for testing. They are Seagate ST3500641AS-RK with 16 MB cache each. These are /dev/sdb and /dev/sdc.

Only the first Seagate drive was used, /dev/sdb, for all of the tests.

Further details of the specific parameters for fdtree that were run are in the previous article.

Results

The results for the new file systems are included in the tables with the results from the previous study. The new results are presented first followed by the previous data. There are tables for each of the four tests (making a total of 8 tables).

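The first table for each test lists the run time of each function. This data is used to help validate the test, since the desire is to have each function run for more than a few seconds; ideally, it should run for at least 60 seconds (1 minute). The second table presents the actual performance data, where larger values are better than smaller values. Each entry in the tables consists of two numbers: the value for that function, followed by what appears to be its standard deviation over the 10 runs.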
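As a rough illustration of how repeated runs can be reduced to a single reported value plus a spread, the following Python sketch averages a set of run times and the rates derived from them. The sample numbers are hypothetical and are not taken from this article, and the use of the sample standard deviation is an assumption; the article does not say which formula was used.

```python
import statistics


def summarize(samples):
    """Return (mean, sample standard deviation) of repeated measurements."""
    return statistics.mean(samples), statistics.stdev(samples)


def rates(item_count, run_times):
    """Convert run times in seconds into items-per-second rates."""
    return [item_count / t for t in run_times]


# Hypothetical numbers for illustration only; they are not from the article.
run_times = [10.0, 12.0, 11.0, 9.0, 13.0]  # seconds for five repeated runs
item_count = 1100                          # files handled in each run

mean_time, sd_time = summarize(run_times)
mean_rate, sd_rate = summarize(rates(item_count, run_times))

print(f"time: {mean_time:.2f} s, spread {sd_time:.2f} s")
print(f"rate: {mean_rate:.2f} files/sec, spread {sd_rate:.2f} files/sec")
```

Note that the mean of the per-run rates is generally not the same as the item count divided by the mean time, so it matters which of the two is being reported.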

Table 1 – Benchmark Times Small Files (4 KiB) – Shallow Directory Structure

| File System | Directory Create (secs.) | File Create (secs.) | File Remove (secs.) | Directory Remove (secs.) |
|---|---|---|---|---|
| xfs | 8.80 (0.40) | 858.40 (1.62) | 608.40 (6.45) | 7.10 (0.54) |
| jfs | 11.10 (0.54) | 418.50 (1.20) | 142.90 (12.64) | 1.60 (0.49) |
| reiserfs | 8.90 (0.30) | 891.90 (33.28) | 62.40 (1.74) | 1.50 (0.50) |
| ext2 | 9.30 (0.90) | 325.40 (1.02) | 52.60 (10.82) | 3.60 (5.54) |
| reiser4 | 8.90 (0.30) | 347.00 (3.10) | 63.50 (9.51) | 1.50 (0.50) |
| ext3 | 13.00 (3.61) | 342.90 (42.69) | 69.40 (6.92) | 1.30 (0.46) |
| ext4 | 10.60 (0.92) | 327.20 (4.89) | 58.10 (1.87) | 1.40 (0.92) |
| btrfs | 8.80 (0.40) | 335.00 (1.00) | 65.30 (0.78) | 1.40 (0.66) |
| nilfs2 | 9.10 (0.30) | 345.70 (8.14) | 51.60 (0.92) | 1.20 (0.40) |

Table 2 – Performance Results of Small Files (4 KiB) – Shallow Directory Structure

| File System | Directory Create (Dirs/sec) | File Create (Files/sec) | File Create (KiB/sec) | File Remove (Files/sec) | Directory Remove (Dirs/sec) |
|---|---|---|---|---|---|
| xfs | 958.40 (46.80) | 392.20 (0.75) | 1,569.20 (3.25) | 553.10 (6.01) | 1,192.80 (91.96) |
| jfs | 759.90 (37.23) | 804.20 (2.09) | 3,219.20 (9.30) | 2,373.70 (188.90) | 5,894.40 (2,062.96) |
| reiserfs | 946.70 (35.10) | 891.90 (33.28) | 3,568.50 (133.01) | 5,411.70 (173.06) | 6,315.50 (2,105.50) |
| ext2 | 912.70 (81.56) | 1,034.80 (3.06) | 4,140.20 (12.83) | 6,583.80 (878.44) | 5,922.40 (2,910.84) |
| reiser4 | 947.80 (34.89) | 970.30 (8.566) | 3,882.50 (34.83) | 5,400.30 (67.51) | 6,315.50 (2,105.50) |
| ext3 | 695.20 (177.37) | 993.70 (94.66) | 3,975.90 (378.91) | 4,900.30 (473.28) | 7,578.80 (1,684.40) |
| ext4 | 800.00 (69.88) | 1,029.10 (15.21) | 4,118.30 (60.59) | 5,803.40 (111.90) | 7,368.30 (2,157.37) |
| btrfs | 958.40 (46.80) | 1,005.00 (3.00) | 4,021.70 (12.01) | 5,167.10 (78.22) | 7,017.40 (2,174.42) |
| nilfs2 | 925.70 (27.90) | 974.70 (21.67) | 3,889.20 (88.54) | 6,529.20 (112.75) | 7,578.80 (1,684.40) |

Table 3 – Benchmark Times Small Files (4 KiB) – Deep Directory Structure

| File System | Directory Create (secs.) | File Create (secs.) | File Remove (secs.) | Directory Remove (secs.) |
|---|---|---|---|---|
| xfs | 125.00 (8.10) | 1,190.00 (14.46) | 708.70 (20.45) | 139.30 (7.44) |
| jfs | 104.00 (0.77) | 460.30 (1.90) | 200.50 (1.63) | 62.70 (0.78) |
| reiserfs | 103.70 (0.46) | 882.60 (14.25) | 189.70 (3.38) | 36.90 (0.30) |
| ext2 | 106.70 (13.11) | 389.40 (1.20) | 115.70 (0.64) | 35.30 (0.46) |
| reiser4 | 104.30 (1.68) | 419.00 (5.98) | 140.50 (22.82) | 37.70 (0.46) |
| ext3 | 46.20 (26.97) | 182.40 (72.55) | 53.70 (24.78) | 14.60 (7.55) |
| ext4 | 187.00 (11.22) | 443.20 (7.69) | 192.50 (12.51) | 73.30 (42.09) |
| btrfs | 102.40 (0.66) | 398.60 (1.91) | 132.50 (0.67) | 38.10 (0.70) |
| nilfs2 | 108.20 (2.68) | 417.30 (6.48) | 122.10 (3.39) | 37.20 (0.60) |

Table 4 – Performance Results of Small Files (4 KiB) – Deep Directory Structure

| File System | Directory Create (Dirs/sec) | File Create (Files/sec) | File Create (KiB/sec) | File Remove (Files/sec) | Directory Remove (Dirs/sec) |
|---|---|---|---|---|---|
| xfs | 711.30 (297.30) | 297.30 (3.52) | 1,190.60 (14.46) | 499.90 (14.69) | 641.90 (36.23) |
| jfs | 851.00 (6.20) | 771.10 (5.87) | 2,978.30 (301.71) | 1,766.70 (14.35) | 1,412.10 (18.02) |
| reiserfs | 853.40 (3.67) | 401.10 (6.24) | 1,605.70 (25.37) | 1,867.70 (33.04) | 2,399.70 (20.10) |
| ext2 | 839.10 (77.68) | 909.40 (2.73) | 3,638.90 (11.22) | 3,061.80 (16.65) | 2,509.00 (32.08) |
| reiser4 | 848.70 (13.13) | 855.00 (35.26) | 3,382.60 (47.90) | 2,576.30 (341.51) | 2,348.90 (28.87) |
| ext3 | 783.90 (39.08) | 927.90 (16.58) | 3,713.00 (65.88) | 3,180.70 (209.90) | 2,452.40 (207.90) |
| ext4 | 475.00 (29.05) | 799.10 (13.45) | 3,198.00 (53.73) | 1,848.00 (124.31) | 1,539.60 (201.76) |
| btrfs | 864.30 (5.76) | 888.10 (4.23) | 3,554.80 (16.92) | 2,673.60 (13.87) | 2,324.90 (42.57) |
| nilfs2 | 818.60 (19.11) | 848.50 (12.71) | 3,396.40 (51.73) | 2,903.60 (75.52) | 2,380.80 (36.60) |


Table 5 – Benchmark Times Medium Files (4 MiB) – Shallow Directory Structure

| File System | Directory Create (secs.) | File Create (secs.) | File Remove (secs.) | Directory Remove (secs.) |
|---|---|---|---|---|
| xfs | 0.10 (0.30) | 143.10 (0.94) | 18.10 (5.11) | 0.40 (0.49) |
| jfs | 0.40 (0.49) | 202.50 (6.00) | 62.20 (2.96) | 0.30 (0.46) |
| reiserfs | 0.10 (0.30) | 180.70 (4.38) | 13.50 (5.14) | 0.10 (0.30) |
| ext2 | 0.60 (0.49) | 194.80 (1.08) | 16.80 (3.94) | 0.00 (0.00) |
| reiser4 | 0.40 (0.49) | 155.20 (30.82) | 95.80 (138.74) | 0.00 (0.00) |
| ext3 | 0.30 (0.46) | 174.90 (17.46) | 17.40 (3.47) | 0.00 (0.00) |
| ext4 | 0.20 (0.40) | 156.80 (4.75) | 11.80 (2.99) | 0.20 (0.40) |
| btrfs | 0.50 (0.50) | 114.40 (1.11) | 15.60 (0.49) | 0.10 (0.30) |
| nilfs2 | 0.70 (0.78) | 196.30 (3.07) | 7.50 (2.87) | 0.20 (0.40) |

Table 6 – Performance Results of Medium Files (4 MiB) – Shallow Directory Structure

| File System | Directory Create (Dirs/sec) | File Create (Files/sec) | File Create (KiB/sec) | File Remove (Files/sec) | Directory Remove (Dirs/sec) |
|---|---|---|---|---|---|
| xfs | 30.70 (92.10) | 21.00 (0.00) | 85,817.60 (566.78) | 184.60 (55.60) | 122.80 (150.40) |
| jfs | 122.80 (150.40) | 14.90 (0.54) | 60,694.00 (1,770.60) | 48.90 (2.47) | 92.10 (140.69) |
| reiserfs | 30.70 (92.10) | 16.30 (0.46) | 67,998.20 (1,678.36) | 286.60 (168.05) | 20.70 (92.10) |
| ext2 | 184.20 (150.40) | 15.10 (0.30) | 63,040.40 (347.46) | 191.80 (41.74) | 0.00 (0.00) |
| reiser4 | 122.80 (150.40) | 20.80 (4.31) | 81,248.50 (10,741.87) | 61.30 (1.35) | 0.00 (0.00) |
| ext3 | 92.10 (140.69) | 17.30 (1.90) | 70,889.80 (6,798.06) | 182.30 (32.53) | 0.00 (0.00) |
| ext4 | 61.40 (122.80) | 18.90 (0.54) | 78,393.20 (2,252.90) | 278.30 (75.69) | 61.40 (122.80) |
| btrfs | 153.50 (153.50) | 26.20 (0.60) | 107,342.50 (1,063.70) | 196.20 (6.37) | 30.70 (92.10) |
| nilfs2 | 122.70 (133.80) | 15.00 (0.00) | 62,572.00 (968.91) | 442.50 (90.62) | 61.40 (122.80) |

Table 7 – Benchmark Times Medium Files (4 MiB) – Deep Directory Structure

| File System | Directory Create (secs.) | File Create (secs.) | File Remove (secs.) | Directory Remove (secs.) |
|---|---|---|---|---|
| xfs | 2.60 (0.49) | 201.30 (0.90) | 25.20 (0.98) | 1.10 (0.30) |
| jfs | 2.40 (0.49) | 255.80 (1.08) | 72.20 (1.08) | 2.10 (0.30) |
| reiserfs | 2.50 (0.50) | 292.40 (8.10) | 18.50 (7.97) | 1.10 (0.30) |
| ext2 | 2.70 (0.64) | 299.90 (9.39) | 21.50 (5.92) | 2.00 (1.61) |
| reiser4 | 2.60 (0.49) | 201.50 (3.96) | 60.60 (2.91) | 1.20 (0.40) |
| ext3 | 2.70 (0.78) | 248.30 (9.99) | 18.80 (4.07) | 1.80 (1.08) |
| ext4 | 3.20 (0.75) | 219.50 (1.12) | 13.40 (4.72) | 1.20 (0.40) |
| btrfs | 2.40 (0.49) | 159.30 (1.42) | 16.20 (1.17) | 1.10 (0.30) |
| nilfs2 | 2.50 (0.50) | 287.70 (10.67) | 11.50 (0.50) | 1.40 (0.49) |

Table 8 – Performance Results of Medium Files (4 MiB) – Deep Directory Structure

| File System | Directory Create (Dirs/sec) | File Create (Files/sec) | File Create (KiB/sec) | File Remove (Files/sec) | Directory Remove (Dirs/sec) |
|---|---|---|---|---|---|
| xfs | 818.40 (167.06) | 20.00 (0.00) | 81,352.40 (363.58) | 162.10 (6.25) | 1,944.60 (307.20) |
| jfs | 886.60 (167.06) | 15.50 (0.50) | 63,019.20 (2,945.03) | 56.20 (0.75) | 988.90 (102.30) |
| reiserfs | 852.50 (170.50) | 13.50 (0.50) | 56,048.30 (1,560.17) | 278.60 (142.84) | 1,842.60 (408.80) |
| ext2 | 709.20 (265.91) | 12.90 (0.30) | 54,654.10 (1,619.59) | 299.90 (9.39) | 1,518.00 (675.90) |
| reiser4 | 818.40 (167.06) | 19.80 (0.40) | 81,301.40 (1,597.82) | 67.10 (3.36) | 1,842.20 (409.60) |
| ext3 | 818.30 (213.10) | 16.20 (0.60) | 66,053.10 (2,515.72) | 225.60 (35.42) | 1,518.00 (658.48) |
| ext4 | 671.70 (147.98) | 18.10 (0.30) | 74,607.50 (380.54) | 331.50 (112.06) | 1,842.20 (409.60) |
| btrfs | 886.60 (167.06) | 25.20 (0.40) | 102,807.40 (917.56) | 253.20 (17.72) | 1,944.60 (307.20) |
| nilfs2 | 852.50 (170.50) | 13.70 (0.64) | 56,998.60 (2,122.26) | 356.50 (15.50) | 1,637.40 (501.66) |

Discussion of Results

In general, only the file create and file removal tests ran long enough to be useful. In the small file, deep directory test, the directory creation step ran long enough to produce meaningful results, but it is the only test where this happens. Consequently it won’t be discussed here.
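The run-time check behind this statement can be sketched in a few lines of Python. The example below uses the btrfs run times copied from Tables 1, 3, 5, and 7 and the 60 second target mentioned earlier; the dictionary layout and the function name are assumptions made for illustration.

```python
MIN_SECONDS = 60.0  # the "at least 1 minute" target from the article

FUNCTIONS = ("directory create", "file create", "file remove", "directory remove")

# btrfs run times in seconds, copied from Tables 1, 3, 5, and 7.
BTRFS_TIMES = {
    "small/shallow": (8.80, 335.00, 65.30, 1.40),
    "small/deep": (102.40, 398.60, 132.50, 38.10),
    "medium/shallow": (0.50, 114.40, 15.60, 0.10),
    "medium/deep": (2.40, 159.30, 16.20, 1.10),
}


def long_enough(times, threshold=MIN_SECONDS):
    """Return the names of the functions whose run time met the threshold."""
    return [name for name, secs in zip(FUNCTIONS, times) if secs >= threshold]


for test, times in BTRFS_TIMES.items():
    print(f"{test}: {long_enough(times)}")
```

For btrfs, file create meets the target in all four tests, file remove meets it only for small files, and directory create meets it only in the small file, deep directory test.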

The observations from the previous article are still valid with some accommodation for the new file systems. These observations are:


  • Small files put extreme pressure on metadata performance regardless of the file system. Compare the file create throughput (KiB/sec) for the small files versus the medium files: it is more than an order of magnitude lower for small files, even though many more files are created per second. However, this is to be expected because there are simply many more files.

  • For small files, a shallow or deep directory structure did not appreciably impact metadata performance. However, the deep directory structure did produce slower results in general.

  • For larger files, a shallow or deep directory structure also did not appreciably impact metadata performance. However, again, for deep directories, the performance was slightly slower than shallow directories.

  • There can be a great deal of variation in metadata performance for some of the file systems. The reason for this is unknown at this time.
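One simple way to put a number on this variation is the coefficient of variation, the spread divided by the value. The Python sketch below applies it to four file create entries (Files/sec) copied from Table 2. It assumes that the second number of each table entry is a standard deviation, which the article does not state explicitly.

```python
# File create entries (Files/sec) from Table 2, small files, shallow tree.
# Each tuple is (value, second number of the entry). Treating the second
# number as a standard deviation is an assumption.
FILE_CREATE = {
    "xfs": (392.20, 0.75),
    "reiserfs": (891.90, 33.28),
    "ext3": (993.70, 94.66),
    "btrfs": (1005.00, 3.00),
}


def coefficient_of_variation(value, spread):
    """Relative spread: the spread expressed as a fraction of the value."""
    return spread / value


cv = {fs: coefficient_of_variation(*entry) for fs, entry in FILE_CREATE.items()}
most_variable = max(cv, key=cv.get)

for fs, ratio in sorted(cv.items(), key=lambda item: item[1], reverse=True):
    print(f"{fs}: {ratio * 100:.1f}%")
print("most variable:", most_variable)
```

On these four entries the relative spread ranges from well under 1% (xfs, btrfs) to almost 10% (ext3).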

These are general observations. However, I’m sure most readers are comparing the file systems even before they reach this point in the article. In keeping with the crowd, let’s do a little contrasting of the file systems (and I mean a little).


  • Small Files:

    • A number of file systems had about the same performance on fdtree: btrfs, ext4, ext2, reiser4, and nilfs2.

    • Surprisingly, xfs did not perform well, falling badly behind the others.

    • Reiserfs did well on the shallow test but not so well on the deep test.


  • Medium Files:

    • btrfs had the top performance by a fairly wide margin: roughly 25% above second-place xfs in file create throughput (KiB/sec) in both Table 6 and Table 8.
    • In second place, xfs and reiser4 did very well. So xfs has redeemed itself after its small file performance.

    • ext4 is close behind xfs and reiser4.
    • The remaining file systems (jfs, reiserfs, ext3, and nilfs2) all drop down a bit below xfs, reiser4, and ext4.

    • As a secondary consideration, jfs and reiser4 actually improve in performance when going to a deep directory structure. They are the only file systems to do this.
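The shallow-versus-deep comparison for medium files can be checked directly from the file create throughput (KiB/sec) in Tables 6 and 8. The Python sketch below copies those values and reports the percentage change for each file system; the variable and function names are assumptions made for illustration.

```python
# Medium file (4 MiB) file create throughput in KiB/sec.
# Each tuple is (shallow value from Table 6, deep value from Table 8).
FILE_CREATE_KIB = {
    "xfs": (85817.60, 81352.40),
    "jfs": (60694.00, 63019.20),
    "reiserfs": (67998.20, 56048.30),
    "ext2": (63040.40, 54654.10),
    "reiser4": (81248.50, 81301.40),
    "ext3": (70889.80, 66053.10),
    "ext4": (78393.20, 74607.50),
    "btrfs": (107342.50, 102807.40),
    "nilfs2": (62572.00, 56998.60),
}


def percent_change(shallow, deep):
    """Change from the shallow to the deep result, in percent."""
    return (deep - shallow) / shallow * 100.0


changes = {fs: percent_change(*pair) for fs, pair in FILE_CREATE_KIB.items()}
improved = [fs for fs, change in changes.items() if change > 0]

for fs, change in changes.items():
    print(f"{fs}: {change:+.1f}%")
print("improved with the deep tree:", improved)
```

Only jfs and reiser4 come out positive. The jfs gain is a few percent, while the reiser4 gain is well under 1%, which is small compared with the second number in its Table 6 entry.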


And finally a few quick observations about the file systems in general:


  • Log-based file systems such as nilfs2 should work well with metadata tests. The developers are still evolving the garbage collection (gc) algorithm, which should improve performance.

  • Reiserfs is going through some changes to remove some locks. This should help performance.

  • The cause of the problems with xfs and small files is unknown at this time. If you have any suggestions as to options for improving performance, please let me know.

  • btrfs has really good performance at this stage, and it is still experimental.

One final word of caution. Do not pick your file system based on these results alone. However, let me know if this “type” of article is useful. And please, if you are upset that your file system isn’t included or if your file system didn’t do as well as you would like, please try out the benchmark yourself and post the results in the forums for everyone.
