
Bcache Testing: Metadata

Our two prior articles detailed the performance results from a new patch, bcache, that uses SSDs to cache hard drives. We looked at the throughput and IOPS performance of bcache and, while it is still very new and under heavy development, found that in some cases it can help performance. This article examines the metadata performance of bcache in the hope of finding further areas where it can boost performance.

Metadata Performance Testing Results

The results are plotted using bar charts to make them easier to compare. However, be sure to carefully examine the y-axis since the major and minor divisions are not the same for every graph. The plots show the average values, with error bars representing the standard deviation based on 10 runs of the same test. Each plot has four groups of three bars. Each group represents a specific test configuration (Disk-only, SSD-only, bcache with the CFQ IO Scheduler, and bcache with the NOOP IO Scheduler). Within a group, each bar is the per-core performance when running with 1 core (NP=1), 2 cores (NP=2), or 4 cores (NP=4). The legend tells you which color corresponds to which number of processes.
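As a minimal sketch of how each bar and its error bar could be derived from repeated runs, the following computes the average and standard deviation of a list of results. The article does not say whether the sample or population standard deviation was used, so the sample form is an assumption here, and the run values are made up for illustration rather than taken from the tests.

```python
import statistics


def summarize(runs):
    """Return (mean, sample standard deviation) for a list of per-run results."""
    # Assumption: sample standard deviation (n - 1 in the denominator).
    return statistics.mean(runs), statistics.stdev(runs)


# Hypothetical operations-per-second values for 10 runs (not measured data).
example_runs = [980.0, 1010.0, 995.0, 1005.0, 990.0,
                1000.0, 1015.0, 985.0, 1020.0, 1000.0]

mean_ops, std_ops = summarize(example_runs)
print(f"mean = {mean_ops:.1f} ops/s, std dev = {std_ops:.1f} ops/s")
```

The mean gives the bar height and the standard deviation gives the half-length of the error bar.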

Figure 1 below has the results for the File Create and Close test for the disk alone, the SSD alone, bcache using the CFQ scheduler, and bcache using the NOOP IO Scheduler configurations, for all 3 numbers of cores (NP=1, NP=2, NP=4). The data is plotted as the average value from the 10 tests and the error bars represent the standard deviation.

metarates_file_create_close.png
Figure 1: Average File Create/Close Performance (operations per second) with standard deviations for the three numbers of cores (1, 2, 4), and for the disk, SSD, bcache with CFQ, and bcache with NOOP configurations

It is pretty obvious that the SSD is faster at file creates and closes than either the disk alone or bcache for all three process counts, but the difference isn’t as large as it was for the throughput and IOPS tests. The big question is how much bcache improves performance over the plain disk. Figure 2 below is a plot of the percent difference between the disk performance and the two bcache options: (1) bcache with the CFQ IO Scheduler, and (2) bcache with the NOOP IO Scheduler. If the percent difference is positive, bcache is faster. If it’s negative, the plain disk is faster.

metarates_file_create_close_compare.png
Figure 2: Percent Difference of the Averages for File Create/Close (%) for bcache with CFQ and bcache with NOOP vs. just the disk

Both the CFQ and NOOP IO Schedulers help bcache improve file create/close performance relative to the uncached disk. For a single process (NP=1), CFQ improves performance by almost 10% and NOOP improves performance by almost 17%.
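The percent difference used in these comparison plots can be sketched as follows. The article only states the sign convention (positive means bcache is faster), so taking the uncached disk average as the denominator is an assumption, and the function name and input values are hypothetical.

```python
def percent_difference(disk_ops, bcache_ops):
    """Percent difference of bcache relative to the uncached disk.

    Positive means bcache is faster, negative means the plain disk is faster.
    Assumption: the disk average is the baseline (denominator).
    """
    if disk_ops <= 0:
        raise ValueError("disk_ops must be positive")
    return 100.0 * (bcache_ops - disk_ops) / disk_ops


# Hypothetical averages in operations per second (not measured data).
print(percent_difference(1000.0, 1100.0))  # bcache faster, positive result
print(percent_difference(1000.0, 950.0))   # disk faster, negative result
```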

Figure 3 below has the results for the File Stat test for the disk alone, the SSD alone, bcache using the CFQ scheduler, and bcache using the NOOP IO Scheduler configurations for all 3 numbers of cores (NP=1, NP=2, NP=4). The data is plotted as the average value from the 10 tests and the error bars represent the standard deviation.

metarates_file_stat.png
Figure 3: Average File Stat Performance (operations per second) with standard deviations for the three numbers of cores (1, 2, 4), and for the disk, SSD, bcache with CFQ, and bcache with NOOP configurations

For this test, the performance of the SSD is about the same as the other configurations including the uncached disk. This could indicate that the performance is limited by the file system and not the storage devices.

Figure 4 below is a plot of the percent difference between the disk performance and the two bcache options: (1) bcache with the CFQ IO Scheduler, and (2) bcache with the NOOP IO Scheduler. If the percent difference is positive, bcache is faster. If it’s negative, the plain disk is faster.

metarates_file_stat_compare.png
Figure 4: Percent Difference of the Averages for File Stat Performance (%) for bcache with CFQ and bcache with NOOP vs. just the disk

Both the CFQ and NOOP IO Schedulers hurt the file stat performance for bcache relative to the uncached disk. The difference is negative for both configurations, although its size varies considerably with the number of processes. For NP=1, the CFQ IO Scheduler with bcache impacts file stat performance more severely than the NOOP IO Scheduler: CFQ reduces file stat performance by about 9% relative to the uncached disk, while the reduction is only about 0.4% for NOOP. However, for the NP=4 case, the CFQ IO Scheduler reduces file stat performance by only about 3.8% and the NOOP IO Scheduler reduces it by about 4.4%.

Figure 5 below has the results for the File Utime test for the disk alone, the SSD alone, bcache using the CFQ scheduler, and bcache using the NOOP IO Scheduler configurations for all 3 numbers of cores (NP=1, NP=2, NP=4). The data is plotted as the average value from the 10 tests and the error bars represent the standard deviation.

metarates_file_utime.png
Figure 5: Average File Utime Performance (operations per second) with standard deviations for the three numbers of cores (1, 2, 4), and for the disk, SSD, bcache with CFQ, and bcache with NOOP configurations

The SSD file utime performance is about 4 times greater than the uncached disk and bcache performance. This is true for all three process counts (1, 2, and 4). It also looks like the average bcache performance could be a bit faster than the uncached disk, although the differences are within the standard deviation of the tests. Figure 6 below is a plot of the percent difference between the disk performance and the two bcache options: (1) bcache with the CFQ IO Scheduler, and (2) bcache with the NOOP IO Scheduler. If the percent difference is positive, bcache is faster. If it’s negative, the plain disk is faster.

metarates_file_utime_compare.png
Figure 6: Percent Difference of the Averages for File Utime (%) for bcache with CFQ and bcache with NOOP vs. just the disk

Both the CFQ and NOOP IO Schedulers help bcache improve file utime performance relative to the uncached disk for a single process (NP=1). CFQ improves performance by almost 8% and NOOP improves performance by almost 12%, but the differences are within the standard deviation of the tests, which limits how much can be concluded from the comparison.
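One simple way to express "within the standard deviation" is to compare the gap between two averages against the larger of the two standard deviations. This is only a rough sketch under that assumed criterion, not a formal significance test and not necessarily the check used for this article; the function name and all numbers are hypothetical.

```python
def within_one_std(mean_a, std_a, mean_b, std_b):
    """Return True when the gap between two averages is no larger than
    the bigger of the two standard deviations (assumed criterion)."""
    return abs(mean_a - mean_b) <= max(std_a, std_b)


# Hypothetical values in operations per second (not measured data):
# an 8% higher bcache average that still sits inside the run-to-run spread.
disk_mean, disk_std = 500.0, 60.0
bcache_mean, bcache_std = 540.0, 55.0

print(within_one_std(disk_mean, disk_std, bcache_mean, bcache_std))
```

When the check returns True, a gain of several percent in the averages may simply reflect run-to-run variation.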

For the case of two processes (NP=2), bcache with either the CFQ or NOOP IO Scheduler produces slightly worse file utime performance compared to the uncached disk, but the differences are fairly small. The same is true for 4 processes (NP=4), except that the CFQ IO Scheduler actually improves the average utime performance by about 4% (still within the standard deviation). Recall that the CFQ IO Scheduler is designed to be fair to all IO requests and creates separate queues for processes. This could possibly be the reason that bcache with the CFQ IO Scheduler performs better than the NOOP IO Scheduler on this test.
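Because the results depend on which IO scheduler is active, it can help to confirm the setting before a run. On Linux the active scheduler for a block device is shown in brackets in `/sys/block/<device>/queue/scheduler`. The sketch below parses that format; the helper names are hypothetical, and the device name passed to `read_active_scheduler` (for example `sda`) depends on the system and is not taken from the article.

```python
from pathlib import Path


def parse_active_scheduler(text):
    """Return the scheduler name shown in brackets, or None if none is marked."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def read_active_scheduler(device):
    """Read the active scheduler for a block device name such as 'sda'.

    The device name is system-specific; this call fails if it does not exist.
    """
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_active_scheduler(path.read_text())


# Parsing a sample line in the sysfs format, without touching real hardware.
print(parse_active_scheduler("noop deadline [cfq]"))
```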
