FS-Cache & CacheFS: Caching for Network File Systems

FS-Cache along with CacheFS is now in the 2.6.30 kernel and can be used for local caching of AFS and NFS.

Using CacheFS or CacheFiles

Using FS-Cache and CacheFS/CacheFiles is fairly straightforward. The first thing is to make sure you are using a kernel that supports both; you can check the .config file in the root of the kernel source tree. Make sure that both FS-Cache and CacheFS are activated, and that NFS client caching support is selected as well.
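As a quick sketch of what to look for, the relevant option names (taken from a 2.6.30-era kernel; verify against your own tree) can be grepped out of the .config. The sample file below stands in for your real .config, which typically lives at /boot/config-$(uname -r) or in the root of the kernel source tree:

```shell
# Create a stand-in .config for illustration; with a real kernel you
# would grep /boot/config-$(uname -r) or the source tree's .config.
cat > /tmp/sample.config <<'EOF'
CONFIG_FSCACHE=m
CONFIG_FSCACHE_STATS=y
CONFIG_CACHEFILES=m
CONFIG_NFS_FSCACHE=y
EOF

# All three features should show up as =y or =m.
grep -E 'FSCACHE|CACHEFILES' /tmp/sample.config
```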

The second step is to make sure that nfs-utils is up to date. It is recommended that you download, build, and install the latest version.

The third step is to build and install the latest cachefilesd. As of this writing, the latest version is 0.9, dated 14 Feb. 2009.

Installing cachefilesd should create a file, /etc/cachefilesd.conf. This file controls the behavior of FS-Cache and CacheFS (or CacheFiles). It’s fairly easy to edit. There is a HOWTO that explains some of the options within the configuration file. There is also a man page covering cachefilesd.conf so you can use that as a resource (i.e. man cachefilesd.conf).
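For reference, a minimal /etc/cachefilesd.conf might look like the sketch below. The dir path shown is the usual default, and the tag value is an arbitrary label of your choosing; consult the man page for the full option list:

```
# Where the cache lives (must be on a file system with user_xattr enabled)
dir /var/fscache
# An arbitrary label for this cache
tag mycache
```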

The first step in configuring things is to define where you want the cache to reside. For a simple laptop you may want the cache on the same file system as the main OS installation (i.e. /), since it may be hard to add a second drive to a laptop just for caching; you could use an SD card or USB flash drive instead, but the performance is not likely to be good. Alternatively, for a desktop client, you might want a disk partition dedicated to the cache. You could even use a small, inexpensive SSD, but be sure its performance is better than a hard drive's – otherwise just buy a small hard drive.

For the case of using a partition, you will need to format it with a file system that supports extended attributes (xattr). Most file systems in Linux support xattr, including ext2/3/4, xfs, reiserfs, and jfs. For this example, ext3 will be used.

Next, make an ext3 file system on the designated partition (see mkfs.ext3). Before proceeding any further, you need to make sure that the file system has extended attributes turned on. For ext3 this is fairly easy:

% tune2fs -o user_xattr /dev/sda1

where sda1 is the partition that will be used for the cache and has been formatted as ext3. Alternatively, you can add xattr support when the file system is mounted via the user_xattr option. Before proceeding, be sure that the partition is mounted as /var/fscache (assuming the defaults in cachefilesd.conf). In the /etc/fstab file you would have something like the following:

/dev/sda1   /var/fscache   ext3   defaults,user_xattr  0   0

Make sure the mount point /var/fscache exists. Also notice that the file system is mounted with extended attributes enabled (user_xattr). If xattr support was already turned on with tune2fs, this mount option is not strictly necessary, but it never hurts to be doubly sure.
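A quick sanity check, assuming the default /var/fscache mount point, is to look for the mount (and the user_xattr option) in /proc/mounts:

```shell
# Print the cache partition's mount line, which should include
# user_xattr; warn if the partition has not been mounted yet.
grep /var/fscache /proc/mounts || echo "/var/fscache is not mounted"
```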

At this point the steps for both CacheFS and CacheFiles are the same. The next step is to start the cachefilesd daemon. If you are using CacheFS, make sure the partition is mounted on /var/fscache. Starting cachefilesd is fairly simple, but just in case:

% service cachefilesd start

To make sure that cachefilesd is started on subsequent reboots, just use chkconfig:

% chkconfig cachefilesd on

If everything was successful you should see two directories under /var/fscache:


  • cache

  • graveyard

If you look in /var/fscache/cache you will see files with very strange, cryptic names; this means everything is working. But before these files appear, you need to enable a netfs to use the cache.

As an example, NFS will be enabled to use FS-Cache and CacheFS/CacheFiles. This is very easy to do with a specific mount option, fsc. From the command line, this is simply:

% mount -o fsc bigserver:/group-data /mnt/group-data

where “bigserver” is the name of the NFS server exporting the directory /group-data, which is mounted on /mnt/group-data on the client. The key option is -o fsc, which tells FS-Cache to use the defined backing cache mechanism.

Mounting the netfs does not automatically populate the cache; files first need to be read or written on the netfs. To find out whether this has happened, check the /var/fscache/cache directory. If you see files there, the cache is active.

It doesn’t matter if you mount the file system with NFS v2, v3, or v4 – it should work for all of these versions, except for the cases of Direct IO and writing. For all three versions of NFS, Direct IO is not supported, as explained previously. For files opened for writing, v2 and v3 will not use the cache because those protocols don’t provide enough coherency management information for the client to detect a write from another client that overlaps with its own. This is a common limitation of NFS v2 and v3. NFS v4, however, provides the needed coherency, so be sure to use NFS v4 if you want write caching.
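If you want the cached mount to come up automatically, the fsc option can go in /etc/fstab as well. A sketch, reusing the hypothetical server and paths from the example above, and using NFS v4 since only v4 caches writes:

```
bigserver:/group-data   /mnt/group-data   nfs4   fsc,defaults   0   0
```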

If you use the root partition for caching files (i.e. CacheFiles), you run the risk of the root file system filling up with cache data. This is not a good situation and fortunately cachefilesd has a solution. In the /etc/cachefilesd.conf file you can specify how much of the file system to keep free. The HOWTO explains how to set three limits that define the behavior. Complementary to the limits on the amount of space used for cache files are limits on the number of files. Again, consult the HOWTO for an explanation.
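As a sketch, the three space limits (and their three file-count counterparts) appear in /etc/cachefilesd.conf as percentage thresholds of free space on the cache file system; the values below are the commonly shipped defaults:

```
# Space limits, as percentages of free blocks on the cache file system
brun  10%   # culling turns off when free space rises above 10%
bcull  7%   # culling begins when free space falls below 7%
bstop  3%   # no new cache files are created below 3% free
# The same three thresholds for free file slots (inodes)
frun  10%
fcull  7%
fstop  3%
```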

Quick example

The first question many people ask is “how do I tell if CacheFS is working?” First, make sure that files have been read or written on the netfs, causing the cache to be used. In addition, FS-Cache puts lots of statistics into the /proc file system. To get the maximum amount of information (statistics), make sure the following options are set in the kernel:


  • CONFIG_FSCACHE_STATS=y

  • CONFIG_FSCACHE_HISTOGRAM=y

This results in a long list of stats that are written to two primary locations:


  • /proc/fs/fscache/stats

  • /proc/fs/fscache/histograms

The existence of these files is also a good indication that CacheFiles is working correctly. The stats file in particular gives you a great deal of information, listing counters for the many events that take place. A description of these events can be found in the kernel source documentation (Documentation/filesystems/caching/).
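A simple way to probe for the statistics file from a script (just a sketch; the path comes from the list above, and the file only exists when statistics support is compiled in):

```shell
# Print the first few FS-Cache counters if statistics support is
# available on this kernel, otherwise report that it is not.
if [ -e /proc/fs/fscache/stats ]; then
    head -n 5 /proc/fs/fscache/stats
else
    echo "FS-Cache statistics are not available on this kernel"
fi
```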

The Gentoo documentation for CacheFS has a reasonable example, in which a 350MB file is copied over NFS. The initial copy performance on the client is (from the wiki):

% time cp /nfsmount/oneBig.file .
  real    6m10.907s
  user    0m0.172s
  sys     0m12.161s

Then the file is read again, but this time it comes from the cache:

% time cp /nfsmount/oneBig.file /dev/null
  real    1m42.042s
  user    0m0.144s
  sys     0m52.467s

Finally, the file is read from cache and written one more time.

% time cp /nfsmount/oneBig.file .
  real    3m33.246s
  user    0m0.176s
  sys     1m1.348s

While not a great example, it does show what FS-Cache and CacheFS can do. Be careful, though: this example uses a single large file, and recall that large numbers of small files can be problematic. Your mileage may vary (YMMV).
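From the timings above, the cached re-read is roughly 3.6 times faster than the initial copy; a quick back-of-the-envelope check:

```shell
# Uncached copy: 6m10.907s = 370.907 s; cached re-read: 1m42.042s = 102.042 s
awk 'BEGIN { printf "%.1fx speedup\n", 370.907 / 102.042 }'
```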

Summary

This article is a quick introduction to FS-Cache and CacheFS. The goal of FS-Cache is to provide a central point for local caching of data, primarily from network-based file systems, while keeping the originating file system agnostic to the caching mechanism. CacheFS is the actual caching file system that FS-Cache uses for caching; it uses a partition of a block device for storing cache data. There is also a complementary caching mechanism called CacheFiles that can be used.

CacheFS has made its way into the official 2.6.30 kernel, which is loaded with new file systems. Both NFS and AFS are ready to use FS-Cache with either CacheFS or CacheFiles. While all versions of NFS can use FS-Cache, none of them will cache any data if a file is opened for Direct IO. In addition, only NFS v4 will cache data for files opened for writing, because NFS v2 and v3 lack sufficient coherency management.

One area where FS-Cache could prove useful in the future is caching for local file systems. Currently, file systems rely on the kernel for caching data and for scheduling reads and writes to and from storage. This caching is not directly under your control, but if a local file system were modified to use FS-Cache, you could use a small but very fast SSD, or even a RAM disk, for caching data.

Since both options have tremendous read performance, using a very large cache (much larger than a drive's onboard cache) could prove useful. Moreover, coupling a very fast write-oriented file system such as NILFS with FS-Cache and CacheFS could also yield tremendous read performance. However, the local system is likely going to need some sort of battery backup with an automatic shutdown to make sure the data is flushed from the cache to the real file system (remember TANSTAAFL).

An additional area where FS-Cache and CacheFS/CacheFiles could make an impact is with a compressed file system such as SquashFS. Recall that SquashFS is a compressed, read-only file system. Since FS-Cache has the potential to really help read performance, coupling it with SquashFS could prove useful. However, SquashFS would need to be modified to use FS-Cache. Phillip – are you reading this?

If you have NFS mounts on your desktops, laptops, or clients, it behooves you to try FS-Cache and CacheFS/CacheFiles. It may give you a performance boost that you need, but be warned that it also may not give you any more performance.
