There is a new distributed file system in the staging area of the 2.6.30 kernel called POHMELFS. Sporting better performance than classic NFS, it's definitely worth a look.
There is a very good blog that discusses how to build and install POHMELFS, including the user-space tools. You can also read the kernel documentation for POHMELFS to learn about the mount options.
Assuming you have POHMELFS installed and configured, the first thing to do is run some tests. On the POHMELFS website there is a blog post that presents some benchmark comparisons against NFS. The configuration is a dual-socket server node with only 1GB of memory connected via a GigE network to a dual-socket client node. The benchmarks run were IOzone, bonnie++, and dbench.
The IOzone runs showed that POHMELFS has very good read, write, reread, and rewrite performance, but its random read performance is not as good. In the Bonnie++ runs, POHMELFS performed better than NFS, although Bonnie++ had some trouble computing the object creation and removal times because the local writeback cache of POHMELFS is very fast compared to NFS. dbench did not run very well at that time because the rename operation is synchronous and ran rather slowly, but Evgeniy was going to work on that aspect of POHMELFS.
These simple benchmarks show that POHMELFS performance is compelling for the workloads tested. It is a very interesting distributed file system that should be tested if more performance than NFS is needed. The choice of replacing a classic NFS configuration with a POHMELFS configuration is up to you, but it does warrant investigation because it can deliver good performance and can solve some problems with NFS.
POHMELFS Roadmap – The Elliptics Network
At this time POHMELFS is very much an NFS-like file system. Evgeniy has said that he doesn’t believe POHMELFS, in its present form, will replace NFS, and he argues that it is not a distributed file system. He has also stated that he will leave it in the staging directory while he ports it to use the Elliptics Network.
The Elliptics Network is a fault-tolerant, distributed hash table object storage system (lots of concepts in there). Using it as the basis for a parallel distributed file system can make that file system resilient, and the key ingredient is the distributed hash table design. Recall that a hash table is a data structure that uses a hash function to map keys, or identifiers, to their associated values. For example, you can use an identifier such as a person’s name (the key) to get the person’s address (the value). It is a very popular data structure that can be used for a huge number of tasks and is an essential part of many languages such as Perl and Python. Below, in Figure 2, is an image from Wikipedia that illustrates a distributed hash table.
Figure 2 – Distributed Hash Table (from Wikipedia)
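To make the plain (non-distributed) hash table concrete, here is a minimal Python sketch of the name-to-address example. It is a toy written for illustration only: the HashTable class, the bucket count, and the names and addresses are all hypothetical, and nothing here is taken from POHMELFS or the Elliptics Network.

```python
class HashTable:
    """Toy hash table using separate chaining (illustrative only)."""

    def __init__(self, num_buckets=8):
        # Each bucket is a list of (key, value) pairs.
        self.buckets = [[] for _ in range(num_buckets)]

    def _index(self, key):
        # The hash function maps a key to a bucket number.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for i, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[i] = (key, value)  # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        raise KeyError(key)


# Hypothetical data: a person's name is the key, the address is the value.
addresses = HashTable()
addresses.put("Alice Example", "12 Hypothetical Lane")
addresses.put("Bob Example", "34 Placeholder Road")
print(addresses.get("Alice Example"))
```

In everyday Python you would simply use the built-in dict, which is itself a hash table; the class above only spells out what happens underneath.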
Distributed hash tables grew out of research motivated by peer-to-peer network systems such as Napster and Gnutella, and they are now used in many peer-to-peer networks. For file systems they provide some immediate benefits.
A distributed hash table takes the basic hash table and decentralizes it. Any node (server) can perform a lookup on a key and retrieve the associated value. All of the participating nodes share the task of maintaining the key and value mappings, and the design allows nodes to join, leave, and fail with minimal disruption. For large distributed file systems, distributed hash tables can be a good starting point for a file system design. This is exactly what Evgeniy has planned for POHMELFS: the ability to add or remove servers that are part of the file system, or have them fail, with little or no impact on the file system. In addition, using a distributed hash table means that there doesn’t have to be a single metadata server (one of the limitations of NFS).
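One common way to get the "nodes can leave with minimal disruption" property is consistent hashing, where both servers and keys are hashed onto the same ring and each key belongs to the next server clockwise. The Python sketch below illustrates that general technique only. It is an assumption made for illustration, not a description of how the Elliptics Network actually places data, and the Ring class, the server names, and the object names are all hypothetical.

```python
import bisect
import hashlib


def ring_position(name):
    """Map a string to a position on the ring (SHA-1 is an arbitrary choice)."""
    return int(hashlib.sha1(name.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy consistent-hashing ring; one position per node, no replication."""

    def __init__(self, nodes=()):
        self._positions = []  # sorted ring positions of the nodes
        self._owners = {}     # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node):
        pos = ring_position(node)
        bisect.insort(self._positions, pos)
        self._owners[pos] = node

    def remove_node(self, node):
        pos = ring_position(node)
        self._positions.remove(pos)
        del self._owners[pos]

    def lookup(self, key):
        """Return the node responsible for key: the first node clockwise."""
        i = bisect.bisect_right(self._positions, ring_position(key))
        # Wrap around to the first node when the key hashes past the last one.
        return self._owners[self._positions[i % len(self._positions)]]


keys = ["object-%d" % i for i in range(1000)]
ring = Ring(["server-a", "server-b", "server-c"])
before = {k: ring.lookup(k) for k in keys}

ring.remove_node("server-b")  # simulate a server leaving or failing
after = {k: ring.lookup(k) for k in keys}

# Only keys that lived on the departed server need a new home.
moved = [k for k in keys if before[k] != after[k]]
print(len(moved), "of", len(keys), "keys moved")
```

Keys that were on the surviving servers stay exactly where they were, which is the behavior a file system wants when a storage server drops out. A real system would also replicate each object on more than one node so that a failure does not lose data; that part is left out of this sketch.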
POHMELFS will be ported to use the Elliptics Network as the underlying data structure while keeping the client interface the same. Work is just beginning on this port.
POHMELFS is an object-based file system that has been developed to reduce or eliminate many of the NFS bottlenecks (with the exception of NFS v4.1). It is a parallel and distributed file system that allows data servers to be removed, added, or even fail while the file system is functioning, without much of an impact on performance. Moreover, it distributes data across the IO servers and balances read operations across them. Coupled with this is a local cache capability for both metadata and data that maintains coherency across the servers.
Initial performance results of POHMELFS versus NFS show some promise. The results are a bit old, but they do indicate that POHMELFS can have very good performance.
But POHMELFS isn’t done. While the current version is usable and ready for testing, it is still in the “staging” area of the kernel (i.e., not ready for prime time). In addition, the developers are porting it to use a distributed hash table system called the Elliptics Network. As previously discussed, the Elliptics Network has lots of features that work well for a distributed file system.
While POHMELFS might not be ready to replace NFS, it does have some benefits that can prove very useful in some situations. However, be warned that POHMELFS is very much a work in progress. It is in the “staging” area of kernel drivers, and the developer has announced that POHMELFS will be ported to use a distributed hash table.
Jeff Layton is an Enterprise Technologist for HPC at Dell. He can be found lounging around at a nearby Fry's enjoying the coffee and waiting for sales (but never during working hours).