Metadata Performance Exploration Part 2: XFS, JFS, ReiserFS, ext2, and Reiser4
More performance: We add five file systems to our previous benchmark results to create an “uber” article on metadata file system performance. We follow the “good” benchmarking guidelines presented in a previous article and examine the good, the bad, and the interesting.
Last week we tested four Linux file systems (ext3, ext4, nilfs2, and btrfs) for metadata performance using a benchmark called fdtree. The point of the benchmarks was not really to compare the performance of the file systems per se, although comparisons are inevitable. Rather, the benchmarks were performed as part of an exploration into the metadata performance of Linux file systems.
We’re using the same benchmark from the last article and applying it to additional Linux file systems: xfs, jfs, reiserfs, ext2, and reiser4. As you are probably aware, there are a large number of file systems available in Linux, from some fairly old ones such as ext2 to some that are still considered “experimental” in the latest kernel (2.6.30 as of this writing), such as btrfs and nilfs2. But these other file systems (ext2, xfs, jfs, and reiserfs) are still in production use in a number of places. Consequently, this article performs the same benchmarks as the previous article on these additional file systems. It also adds reiser4 which, believe it or not, is still moving ahead thanks to the determination of some of the developers.
Quick Review of Benchmark
The benchmark used in the previous article is fdtree. It’s perhaps not the best known metadata benchmark for file systems but it is fairly common in HPC circles. It stresses the creation and removal of directories and files using a simple bash script and *nix utilities. It builds a tree structure of directories. A number of files of a given size is created in each directory. The number of directories (branches) at each level in the tree, the depth of the tree, the number of files at each point, and the size of the files are all under user control.
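To make the four phases concrete, here is a minimal Python sketch of the same idea: build a directory tree, fill it with files, then remove the files and the directories, timing each phase. It is not fdtree itself (which is a bash script); the function names, the `d0`/`f0` naming scheme, and the choice to put files only in directories below the root are all assumptions made for illustration.

```python
import os
import tempfile
import time


def level_dirs(root, levels, dirs_per_level):
    """Return every directory path in the tree, parents before children."""
    paths, frontier = [], [root]
    for _ in range(levels):
        frontier = [os.path.join(parent, f"d{i}")
                    for parent in frontier
                    for i in range(dirs_per_level)]
        paths.extend(frontier)
    return paths


def run_phases(root, levels, dirs_per_level, files_per_dir, file_size):
    """Run the four phases and return (timings, directory count, file count)."""
    dirs = level_dirs(root, levels, dirs_per_level)
    files = [os.path.join(d, f"f{i}") for d in dirs for i in range(files_per_dir)]
    payload = b"\0" * file_size
    timings = {}

    def timed(name, action, items):
        start = time.perf_counter()
        for item in items:
            action(item)
        timings[name] = time.perf_counter() - start

    def write_file(path):
        with open(path, "wb") as handle:
            handle.write(payload)

    timed("directory create", os.mkdir, dirs)
    timed("file create", write_file, files)
    timed("file remove", os.remove, files)
    # Remove the deepest directories first so every rmdir sees an empty directory.
    timed("directory remove", os.rmdir, list(reversed(dirs)))
    return timings, len(dirs), len(files)


if __name__ == "__main__":
    # Hypothetical, tiny parameters so the sketch finishes instantly.
    with tempfile.TemporaryDirectory() as scratch:
        results, n_dirs, n_files = run_phases(
            scratch, levels=3, dirs_per_level=2, files_per_dir=4, file_size=4096)
        print(f"{n_dirs} directories, {n_files} files")
        for phase, seconds in results.items():
            print(f"{phase}: {seconds:.4f} s")
```

Note that a sketch like this writes through the page cache and never calls `sync`, so its timings are not comparable to the numbers in this article.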
For this benchmark, two different directory tree structures were used: (1) a shallow tree (not much depth) with a larger number of directories at each level, and (2) a deep tree with only a few directories at each level but many levels. There are also two file sizes used: (1) a small size, 4 KiB (1 block), and (2) a medium size, 4 MiB (roughly 1,000 blocks). This makes a total of four (4) benchmark sets:
Small files (4 KiB), shallow directory structure
Small files (4 KiB), deep directory structure
Medium files (4 MiB), shallow directory structure
Medium files (4 MiB), deep directory structure
To create the specific parameters for fdtree used in the exploration, there were three overall goals:
Keep the total run time to approximately 10-12 minutes at a maximum
Keep the total data for the two directory structures approximately the same
Keep the run time for each of the four functions greater than 1 minute if possible
Not all four functions ran for a full minute; some ran for only a few seconds. These cases are noted in the results.
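The goal of keeping the total data roughly equal between the shallow and deep trees comes down to simple arithmetic on the tree parameters. The sketch below assumes a uniform tree in which level i holds `dirs_per_level ** i` directories and every directory below the root holds the same number of files; fdtree's own accounting may differ, and the parameter values are hypothetical, not the ones used for this article.

```python
def tree_totals(levels, dirs_per_level, files_per_dir, file_kib):
    """Return (directories, files, total KiB) for a uniform tree.

    Assumes level i holds dirs_per_level ** i directories and that every
    directory below the root holds files_per_dir files.
    """
    directories = sum(dirs_per_level ** i for i in range(1, levels + 1))
    files = directories * files_per_dir
    return directories, files, files * file_kib


# Hypothetical parameter sets, chosen only so the totals come out close.
shallow = tree_totals(levels=2, dirs_per_level=20, files_per_dir=10, file_kib=4)
deep = tree_totals(levels=8, dirs_per_level=2, files_per_dir=8, file_kib=4)

for name, (dirs, files, kib) in (("shallow", shallow), ("deep", deep)):
    print(f"{name}: {dirs} dirs, {files} files, {kib} KiB")
```

With these made-up values the shallow tree holds 16,800 KiB and the deep tree 16,320 KiB, which is the kind of "approximately the same" balance the goal describes.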
In keeping with the good benchmark behavior laid out in a previous article, the test was run 10 times for each of the four combinations on each file system. The test system was a stock CentOS 5.3 distribution, but with a 2.6.30 kernel and with e2fsprogs upgraded to the latest version as of the writing of this article, 1.41.9. In addition, the following file system tools were used:
xfs: xfsprogs pulled via git as of 9/5/2009
jfs: jfsutils-1.1.14
reiserfs: reiserfsprogs-3.6.21
ext2: e2fsprogs 1.41.9
reiser4: reiser4progs-1.0.7
The defaults for all the file systems were used (tuning is a subject for a whole series of articles and tons of additional work).
The tests were run on the following system:
GigaByte MAA78GM-US2H motherboard
An AMD Phenom II X4 920 CPU
8GB of memory
Linux 2.6.30 kernel
The OS and boot drive are on an IBM DTLA-307020 (20GB drive at Ultra ATA/100)
/home is on a Seagate ST1360827AS
There are two drives for testing. They are Seagate ST3500641AS-RK with 16 MB cache each. These are /dev/sdb and /dev/sdc.
Only the first Seagate drive was used, /dev/sdb, for all of the tests.
Further details of the specific parameters for fdtree that were run are in the previous article.
Results
The results for the new file systems are included in the tables with the results from the previous study. The new results are presented first, followed by the previous data. There are two tables for each of the four tests (making a total of 8 tables).
The first table for each test gives the run time of each phase. This data is used to help validate the test, since the desire is to have each phase run for more than a few seconds. Ideally, it should run for at least 60 seconds (1 minute). The second table presents the actual performance data, where larger values are better than smaller values.
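As a rough illustration of how repeated runs can be turned into table entries, the following sketch converts per-run times into rates and summarizes them as a mean and a standard deviation. It assumes the second number in each table cell is a spread measure of this kind; whether the article's numbers use the sample or the population standard deviation is not stated, and the run times below are invented.

```python
import statistics


def summarize(samples):
    """Return (mean, sample standard deviation) of a list of measurements."""
    return statistics.mean(samples), statistics.stdev(samples)


def rates(item_count, seconds_per_run):
    """Convert the time of each run into items processed per second."""
    return [item_count / seconds for seconds in seconds_per_run]


if __name__ == "__main__":
    # Invented times (seconds) for 10 runs of one phase that handles 4,000 files.
    run_times = [61.2, 60.8, 62.0, 61.5, 60.9, 61.1, 63.4, 61.0, 60.7, 61.3]
    time_mean, time_sd = summarize(run_times)
    rate_mean, rate_sd = summarize(rates(4000, run_times))
    print(f"time: {time_mean:.2f} {time_sd:.2f} secs.")
    print(f"rate: {rate_mean:.2f} {rate_sd:.2f} files/sec")
```

Averaging the per-run rates is not the same as dividing the item count by the average time, so it matters which of the two a given table reports.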
Table 1 - Benchmark Times Small Files (4 KiB) - Shallow Directory Structure
File System | Directory Create (secs.) | File Create (secs.) | File Remove (secs.) | Directory Remove (secs.)
xfs | 8.80 0.40 | 858.40 1.62 | 608.40 6.45 | 7.10 0.54
jfs | 11.10 0.54 | 418.50 1.20 | 142.90 12.64 | 1.60 0.49
reiserfs | 8.90 0.30 | 891.90 33.28 | 62.40 1.74 | 1.50 0.50
ext2 | 9.30 0.90 | 325.40 1.02 | 52.60 10.82 | 3.60 5.54
reiser4 | 8.90 0.30 | 347.00 3.10 | 63.50 9.51 | 1.50 0.50
ext3 | 13.00 3.61 | 342.90 42.69 | 69.40 6.92 | 1.30 0.46
ext4 | 10.60 0.92 | 327.20 4.89 | 58.10 1.87 | 1.40 0.92
btrfs | 8.80 0.40 | 335.00 1.00 | 65.30 0.78 | 1.40 0.66
nilfs2 | 9.10 0.30 | 345.70 8.14 | 51.60 0.92 | 1.20 0.40
Table 2 - Performance Results of Small Files (4 KiB) - Shallow Directory Structure
File System | Directory Create (Dirs/sec) | File Create (Files/sec) | File Create (KiB/sec) | File Remove (Files/sec) | Directory Remove (Dirs/sec)
xfs | 958.40 46.80 | 392.20 0.75 | 1,569.20 3.25 | 553.10 6.01 | 1,192.80 91.96
jfs | 759.90 37.23 | 804.20 2.09 | 3,219.20 9.30 | 2,373.70 188.90 | 5,894.40 2,062.96
reiserfs | 946.70 35.10 | 891.90 33.28 | 3,568.50 133.01 | 5,411.70 173.06 | 6,315.50 2,105.50
ext2 | 912.70 81.56 | 1,034.80 3.06 | 4,140.20 12.83 | 6,583.80 878.44 | 5,922.40 2,910.84
reiser4 | 947.80 34.89 | 970.30 8.566 | 3,882.50 34.83 | 5,400.30 67.51 | 6,315.50 2,105.50
ext3 | 695.20 177.37 | 993.70 94.66 | 3,975.90 378.91 | 4,900.30 473.28 | 7,578.80 1,684.40
ext4 | 800.00 69.88 | 1,029.10 15.21 | 4,118.30 60.59 | 5,803.40 111.90 | 7,368.30 2,157.37
btrfs | 958.40 46.80 | 1,005.00 3.00 | 4,021.70 12.01 | 5,167.10 78.22 | 7,017.40 2,174.42
nilfs2 | 925.70 27.90 | 974.70 21.67 | 3,889.20 88.54 | 6,529.20 112.75 | 7,578.80 1,684.40
Table 3 - Benchmark Times Small Files (4 KiB) - Deep Directory Structure
File System | Directory Create (secs.) | File Create (secs.) | File Remove (secs.) | Directory Remove (secs.)
xfs | 125.00 8.10 | 1,190.00 14.46 | 708.70 20.45 | 139.30 7.44
jfs | 104.00 0.77 | 460.30 1.90 | 200.50 1.63 | 62.70 0.78
reiserfs | 103.70 0.46 | 882.60 14.25 | 189.70 3.38 | 36.90 0.30
ext2 | 106.70 13.11 | 389.40 1.20 | 115.70 0.64 | 35.30 0.46
reiser4 | 104.30 1.68 | 419.00 5.98 | 140.50 22.82 | 37.70 0.46
ext3 | 46.20 26.97 | 182.40 72.55 | 53.70 24.78 | 14.60 7.55
ext4 | 187.00 11.22 | 443.20 7.69 | 192.50 12.51 | 73.30 42.09
btrfs | 102.40 0.66 | 398.60 1.91 | 132.50 0.67 | 38.10 0.70
nilfs2 | 108.20 2.68 | 417.30 6.48 | 122.10 3.39 | 37.20 0.60
Table 4 - Performance Results of Small Files (4 KiB) - Deep Directory Structure
File System | Directory Create (Dirs/sec) | File Create (Files/sec) | File Create (KiB/sec) | File Remove (Files/sec) | Directory Remove (Dirs/sec)
xfs | 711.30 297.30 | 297.30 3.52 | 1,190.60 14.46 | 499.90 14.69 | 641.90 36.23
jfs | 851.00 6.20 | 771.10 5.87 | 2,978.30 301.71 | 1,766.70 14.35 | 1,412.10 18.02
reiserfs | 853.40 3.67 | 401.10 6.24 | 1,605.70 25.37 | 1,867.70 33.04 | 2,399.70 20.10
ext2 | 839.10 77.68 | 909.40 2.73 | 3,638.90 11.22 | 3,061.80 16.65 | 2,509.00 32.08
reiser4 | 848.70 13.13 | 855.00 35.26 | 3,382.60 47.90 | 2,576.30 341.51 | 2,348.90 28.87
ext3 | 783.90 39.08 | 927.90 16.58 | 3,713.00 65.88 | 3,180.70 209.90 | 2,452.40 207.90
ext4 | 475.00 29.05 | 799.10 13.45 | 3,198.00 53.73 | 1,848.00 124.31 | 1,539.60 201.76
btrfs | 864.30 5.76 | 888.10 4.23 | 3,554.80 16.92 | 2,673.60 13.87 | 2,324.90 42.57
nilfs2 | 818.60 19.11 | 848.50 12.71 | 3,396.40 51.73 | 2,903.60 75.52 | 2,380.80 36.60
Table 5 - Benchmark Times Medium Files (4 MiB) - Shallow Directory Structure
File System | Directory Create (secs.) | File Create (secs.) | File Remove (secs.) | Directory Remove (secs.)
xfs | 0.10 0.30 | 143.10 0.94 | 18.10 5.11 | 0.40 0.49
jfs | 0.40 0.49 | 202.50 6.00 | 62.20 2.96 | 0.30 0.46
reiserfs | 0.10 0.30 | 180.70 4.38 | 13.50 5.14 | 0.10 0.30
ext2 | 0.60 0.49 | 194.80 1.08 | 16.80 3.94 | 0.00 0.00
reiser4 | 0.40 0.49 | 155.20 30.82 | 95.80 138.74 | 0.00 0.00
ext3 | 0.30 0.46 | 174.90 17.46 | 17.40 3.47 | 0.00 0.00
ext4 | 0.20 0.40 | 156.80 4.75 | 11.80 2.99 | 0.20 0.40
btrfs | 0.50 0.50 | 114.40 1.11 | 15.60 0.49 | 0.10 0.30
nilfs2 | 0.70 0.78 | 196.30 3.07 | 7.50 2.87 | 0.20 0.40
Table 6 - Performance Results of Medium Files (4 MiB) - Shallow Directory Structure
File System | Directory Create (Dirs/sec) | File Create (Files/sec) | File Create (KiB/sec) | File Remove (Files/sec) | Directory Remove (Dirs/sec)
xfs | 30.70 92.10 | 21.00 0.00 | 85,817.60 566.78 | 184.60 55.60 | 122.80 150.40
jfs | 122.80 150.40 | 14.90 0.54 | 60,694.00 1,770.60 | 48.90 2.47 | 92.10 140.69
reiserfs | 30.70 92.10 | 16.30 0.46 | 67,998.20 1,678.36 | 286.60 168.05 | 20.70 92.10
ext2 | 184.20 150.40 | 15.10 0.30 | 63,040.40 347.46 | 191.80 41.74 | 0.00 0.00
reiser4 | 122.80 150.40 | 20.80 4.31 | 81,248.50 10,741.87 | 61.30 1.35 | 0.00 0.00
ext3 | 92.10 140.69 | 17.30 1.90 | 70,889.80 6,798.06 | 182.30 32.53 | 0.00 0.00
ext4 | 61.40 122.80 | 18.90 0.54 | 78,393.20 2,252.90 | 278.30 75.69 | 61.40 122.80
btrfs | 153.50 153.50 | 26.20 0.60 | 107,342.50 1,063.70 | 196.20 6.37 | 30.70 92.10
nilfs2 | 122.70 133.80 | 15.00 0.00 | 62,572.00 968.91 | 442.50 90.62 | 61.40 122.80
Table 7 - Benchmark Times Medium Files (4 MiB) - Deep Directory Structure
File System | Directory Create (secs.) | File Create (secs.) | File Remove (secs.) | Directory Remove (secs.)
xfs | 2.60 0.49 | 201.30 0.90 | 25.20 0.98 | 1.10 0.30
jfs | 2.40 0.49 | 255.80 1.08 | 72.20 1.08 | 2.10 0.30
reiserfs | 2.50 0.50 | 292.40 8.10 | 18.50 7.97 | 1.10 0.30
ext2 | 2.70 0.64 | 299.90 9.39 | 21.50 5.92 | 2.00 1.61
reiser4 | 2.60 0.49 | 201.50 3.96 | 60.60 2.91 | 1.20 0.40
ext3 | 2.70 0.78 | 248.30 9.99 | 18.80 4.07 | 1.80 1.08
ext4 | 3.20 0.75 | 219.50 1.12 | 13.40 4.72 | 1.20 0.40
btrfs | 2.40 0.49 | 159.30 1.42 | 16.20 1.17 | 1.10 0.30
nilfs2 | 2.50 0.50 | 287.70 10.67 | 11.50 0.50 | 1.40 0.49
Table 8 - Performance Results of Medium Files (4 MiB) - Deep Directory Structure
File System | Directory Create (Dirs/sec) | File Create (Files/sec) | File Create (KiB/sec) | File Remove (Files/sec) | Directory Remove (Dirs/sec)
xfs | 818.40 167.06 | 20.00 0.00 | 81,352.40 363.58 | 162.10 6.25 | 1,944.60 307.20
jfs | 886.60 167.06 | 15.50 0.50 | 63,019.20 2,945.03 | 56.20 0.75 | 988.90 102.30
reiserfs | 852.50 170.50 | 13.50 0.50 | 56,048.30 1,560.17 | 278.60 142.84 | 1,842.60 408.80
ext2 | 709.20 265.91 | 12.90 0.30 | 54,654.10 1,619.59 | 299.90 9.39 | 1,518.00 675.90
reiser4 | 818.40 167.06 | 19.80 0.40 | 81,301.40 1,597.82 | 67.10 3.36 | 1,842.20 409.60
ext3 | 818.30 213.10 | 16.20 0.60 | 66,053.10 2,515.72 | 225.60 35.42 | 1,518.00 658.48
ext4 | 671.70 147.98 | 18.10 0.30 | 74,607.50 380.54 | 331.50 112.06 | 1,842.20 409.60
btrfs | 886.60 167.06 | 25.20 0.40 | 102,807.40 917.56 | 253.20 17.72 | 1,944.60 307.20
nilfs2 | 852.50 170.50 | 13.70 0.64 | 56,998.60 2,122.26 | 356.50 15.50 | 1,637.40 501.66
Discussion of Results
In general, only the file create and file removal tests ran long enough to be useful. In the small file, deep directory test, the directory creation step ran long enough to produce meaningful results, but it is the only test where this happens. Consequently it won’t be discussed here.
The observations from the previous article are still valid with some accommodation for the new file systems. These observations are:
Small files put extreme pressure on metadata performance regardless of the file system. Compare the file create and removal rates for the small files versus the medium files. The file create throughput in KiB/sec is about an order of magnitude lower for small files. However, this is to be expected because there are simply many more files.
For small files, a shallow or deep directory structure did not appreciably impact metadata performance. However, the deep directory structure did produce slower results in general.
For larger files, a shallow or deep directory structure also did not appreciably impact metadata performance. However, again, for deep directories, the performance was slightly slower than shallow directories.
There can be a great deal of variation in metadata performance for some of the file systems. The reason for this is unknown at this time.
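One way to put a number on the run-to-run variation is the coefficient of variation: the spread divided by the mean. The sketch below applies it to a few file create times from Table 1, under the assumption that the second number in each table cell is a standard deviation over the 10 runs.

```python
def coefficient_of_variation(mean, spread):
    """Return the spread as a percentage of the mean."""
    return 100.0 * spread / mean


# File create times in seconds from Table 1 (small files, shallow tree).
# Each entry is (first number, second number); treating the second number
# as a standard deviation is an assumption.
file_create_secs = {
    "xfs": (858.40, 1.62),
    "reiserfs": (891.90, 33.28),
    "ext3": (342.90, 42.69),
    "btrfs": (335.00, 1.00),
}

for name, (mean, spread) in file_create_secs.items():
    print(f"{name}: {coefficient_of_variation(mean, spread):.1f}%")
```

On these figures ext3 varies by roughly 12% of its mean from run to run, while xfs and btrfs stay well under 1%.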
These are general observations. However, I’m sure most readers are comparing the file systems even before they reach this point in the article. In keeping with the crowd, let’s do a little contrasting of the file systems (and I mean a little).
Small Files:
A number of file systems had about the same performance on fdtree: btrfs, ext4, ext2, reiser4, and nilfs2.
Surprisingly, xfs did not perform well, falling well behind the others.
Reiserfs did well on the shallow test but not so well on the deep test.
Medium Files:
btrfs had the top file create performance by a fairly wide margin: roughly 25% more throughput (KiB/sec) than the second-place file system in Tables 6 and 8.
In second place, xfs and reiser4 did very well. So xfs has redeemed itself from its small file performance.
ext4 is close behind xfs and reiser4.
The remaining file systems, jfs, reiserfs, ext3, and nilfs2, all drop down a bit below xfs, reiser4, and ext4.
As a secondary consideration, jfs and reiser4 actually improve in performance when going to a deep directory structure. They are the only file systems to do this.
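For readers who want to check comparisons like these against the tables, the arithmetic is a plain percent change. The sketch below uses the mean file create throughput (KiB/sec) from Tables 6 and 8 and ignores the spread values, so small differences should be read with caution.

```python
def percent_change(new, old):
    """Return how much larger (positive) or smaller (negative) new is than old, in percent."""
    return 100.0 * (new - old) / old


# Mean file create throughput in KiB/sec for medium files,
# taken from Table 6 (shallow) and Table 8 (deep).
shallow = {"btrfs": 107342.50, "xfs": 85817.60, "reiser4": 81248.50, "jfs": 60694.00}
deep = {"btrfs": 102807.40, "xfs": 81352.40, "reiser4": 81301.40, "jfs": 63019.20}

# Lead of one file system over another within the same test.
print(f"btrfs vs xfs, shallow: {percent_change(shallow['btrfs'], shallow['xfs']):+.1f}%")

# Change in each file system when moving from the shallow to the deep tree.
for name in shallow:
    print(f"{name}, shallow to deep: {percent_change(deep[name], shallow[name]):+.1f}%")
```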
And finally a few quick observations about the file systems in general:
Log-based file systems such as nilfs2 should do well on metadata tests, and the developers are still evolving the garbage collection (gc) algorithm, which should improve performance.
Reiserfs is going through some changes to remove some locks. This should help performance.
The cause of the xfs problems with small files is unknown at this time. If you have any suggestions as to options for improving performance, please let me know.
btrfs has really good performance at this stage, and it is still experimental.
One final word of caution. Do not pick your file system based on these results alone. However, let me know if this “type” of article is useful. And please, if you are upset that your file system isn’t included or if your file system didn’t do as well as you would like, please try out the benchmark yourself and post the results in the forums for everyone.
Jeff Layton is an Enterprise Technologist for HPC at Dell. He can be found lounging around at a nearby Frys enjoying the coffee and waiting for sales (but never during working hours).