Bcache Testing: Metadata

Our two prior articles have detailed the performance results from a new patch, bcache, that uses SSDs to cache hard drives. We've looked at the throughput and IOPS performance of bcache and -- while it is still very new and under heavy development -- have found that in some cases it can help performance. This article examines the metadata performance of bcache, hoping to find further areas where it can boost performance.

Summary

Just as a reminder, bcache is a brand new patch still undergoing heavy development and testing, so don’t expect great performance from it (yet). However, for metadata performance, bcache seemed to improve the average performance by a notable amount.

For example, the file create/close performance increased by a reasonable amount when using either the CFQ IO Scheduler or the NOOP IO Scheduler, but this was limited to a single process (NP=1). The file create/close performance of bcache with the CFQ IO Scheduler was about 10% better than the plain disk, while the performance with the NOOP IO Scheduler was about 17% better than the plain disk. However, as the number of processes increases, the performance gain from bcache diminishes to the point where, at NP=4, bcache actually makes performance worse than the plain disk: bcache with the CFQ IO Scheduler was 11% worse than the plain disk, and bcache with the NOOP IO Scheduler was about 4% worse. However, note that these differences are, for the most part, within the standard deviation, making it difficult to draw any firm conclusions.
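The benchmark itself isn't reproduced in the article, but the file create/close pattern it times can be sketched as a small loop: each iteration allocates an inode, adds a directory entry, and releases the file descriptor. This is only an illustrative sketch; the function name and file count below are not from the original tests.

```python
import os
import tempfile
import time

def create_close_ops_per_sec(num_files=1000):
    """Create and immediately close num_files empty files in a fresh
    directory, returning the achieved operations per second."""
    with tempfile.TemporaryDirectory() as workdir:
        start = time.perf_counter()
        for i in range(num_files):
            # One create/close metadata operation: no data is written,
            # only the inode and directory entry are created.
            fd = os.open(os.path.join(workdir, f"file{i}"),
                         os.O_CREAT | os.O_WRONLY, 0o644)
            os.close(fd)
        elapsed = time.perf_counter() - start
    return num_files / elapsed

print(f"create/close: {create_close_ops_per_sec():,.0f} ops/sec")
```

Because the loop never writes file data, a result like Table 1's shows how well the storage stack handles pure metadata updates rather than raw bandwidth.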

For file stat performance, bcache made performance worse across all of the tests conducted. For NP=1, bcache with CFQ is about 9% worse than the disk-alone test, and bcache with NOOP is about 0.75% worse (a fairly small difference). One possible reason that file stat performance is worse for bcache than for the plain disk is that bcache has to pull the data from the disk back into the cache and then perform the operation (assuming the data has already been sent to disk). This extra copy could slow things down compared to the disk-alone configuration, depending upon how much data is copied back to the SSD.
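A stat operation only reads inode metadata (size, timestamps, permissions); no file data is touched. A minimal sketch of this kind of measurement, with illustrative names and counts, might look like:

```python
import os
import tempfile
import time

def stat_ops_per_sec(num_calls=10000):
    """Repeatedly stat() a single file and return operations per
    second. stat reads inode metadata only -- no file data."""
    with tempfile.NamedTemporaryFile() as f:
        start = time.perf_counter()
        for _ in range(num_calls):
            os.stat(f.name)
        elapsed = time.perf_counter() - start
    return num_calls / elapsed

print(f"stat: {stat_ops_per_sec():,.0f} ops/sec")
```

Note that repeatedly stat-ing one file will largely hit the kernel's inode cache; a real metadata benchmark spreads the calls across many files to force the storage layer to do the work.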

Since file stat operations were slower with bcache than with the plain disk, one would expect the file utime performance with bcache to be worse as well. The utime operation actually reads less data than the file stat operation, which might suggest that bcache performance would suffer. However, if you examine Figures 5 and 6 you will see that this is not the case.

For NP=1 (one process), the file utime performance for bcache with the CFQ IO Scheduler is about 8% better than the plain disk, while the file utime performance of bcache with the NOOP IO Scheduler is about 12% better. But as the number of processes increases to NP=4 (4 processes), the benefits of bcache diminish. The file utime performance of bcache with the NOOP IO Scheduler is actually worse for NP=4 than the plain disk (about 2% worse), and the file utime performance of bcache with the CFQ IO Scheduler is about 3.6% better than the plain disk (see Table 3 in the Appendix), although these differences fall within the standard deviations.
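Unlike stat, a utime call dirties the inode: it updates the access and modification timestamps, so it generates a metadata write as well as a read. A hedged sketch of this operation (illustrative names and counts, not the article's actual benchmark):

```python
import os
import tempfile
import time

def utime_ops_per_sec(num_calls=10000):
    """Repeatedly update one file's access/modification timestamps
    and return operations per second. Each call dirties the inode,
    so this measures metadata writes as well as reads."""
    with tempfile.NamedTemporaryFile() as f:
        start = time.perf_counter()
        for _ in range(num_calls):
            os.utime(f.name)  # set atime/mtime to the current time
        elapsed = time.perf_counter() - start
    return num_calls / elapsed

print(f"utime: {utime_ops_per_sec():,.0f} ops/sec")
```

The write component may explain why utime behaves differently from stat under bcache: writes can be absorbed by the cache rather than requiring data to be pulled back from the disk.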

If you step back and examine all of the results, it appears that in aggregate bcache helps performance versus the plain disk. While the performance is nowhere near that of the SSD by itself, overall bcache helps. In addition, it also looks as though the NOOP IO Scheduler helps performance more than the CFQ IO Scheduler. This was not the case for throughput or IOPS performance, where the CFQ IO Scheduler was probably better for overall performance than the NOOP IO Scheduler.

Next Time

I have one more exploration with bcache that I want to discuss before concluding the bcache testing series. Up to this point, all tests created a file that fit entirely within the SSD. In the next article I’ll examine the throughput performance when the file is larger than the SSD.

Appendix

This section contains the data from the plots in tabular form in case you need or want exact values from the figures.

Table 1 below contains the File Create/Close performance values (operations per second).

Table 1 – File Create/Close performance (operations per second) for the four configurations and three process counts (NP=1, NP=2, NP=4). Each cell lists the mean with the standard deviation in parentheses.

Configuration                NP=1                   NP=2                   NP=4
Disk Alone                   28,140.98 (3,087.20)   18,681.38 (1,203.08)   13,247.63 (1,225.22)
SSD Alone                    45,606.92 (782.58)     32,488.60 (225.89)     12,495.27 (4,205.56)
Bcache – CFQ IO Scheduler    30,887.51 (5,085.27)   19,997.25 (1,757.82)   11,738.53 (1,423.36)
Bcache – NOOP IO Scheduler   32,927.65 (937.76)     19,724.05 (2,267.71)   12,848.41 (1,533.17)

Table 2 below contains the File Stat performance values (operations per second).

Table 2 – File Stat performance (operations per second) for the four configurations and three process counts (NP=1, NP=2, NP=4). Each cell lists the mean with the standard deviation in parentheses.

Configuration                NP=1                     NP=2                    NP=4
Disk Alone                   679,263.72 (3,205.00)    496,762.64 (4,885.55)   356,555.98 (3,282.35)
SSD Alone                    684,607.20 (3,520.90)    493,191.51 (4,642.36)   343,179.09 (3,889.73)
Bcache – CFQ IO Scheduler    617,553.08 (196,727.24)  492,516.60 (6,087.22)   343,388.36 (15,371.81)
Bcache – NOOP IO Scheduler   674,373.79 (3,897.56)    495,005.07 (5,683.31)   339,834.92 (18,863.83)

Table 3 below contains the File Utime performance values (operations per second).

Table 3 – File Utime performance (operations per second) for the four configurations and three process counts (NP=1, NP=2, NP=4). Each cell lists the mean with the standard deviation in parentheses.

Configuration                NP=1                     NP=2                    NP=4
Disk Alone                   46,786.07 (15,860.44)    31,012.00 (796.30)      18,171.31 (2,578.24)
SSD Alone                    199,522.54 (11,372.63)   114,712.98 (2,087.59)   63,887.66 (20,497.79)
Bcache – CFQ IO Scheduler    50,390.74 (31,943.68)    30,708.82 (1,601.22)    18,817.60 (2,527.60)
Bcache – NOOP IO Scheduler   52,487.40 (31,943.68)    30,071.81 (2,523.91)    17,874.83 (3,000.56)