File System Evangelist and Thought Leader: An Interview with Valerie Aurora

Jeff Layton talks to Valerie Aurora, file system developer and open source evangelist, about a wide range of subjects including her background in file systems, ChunkFS, the Union file system and how the developer ecosystem can chip in.

There are many influential people in Linux. You are already aware of many of them, but there are some very influential people you might not be aware of. Valerie Aurora is a very influential thought leader in the Linux and FOSS community. She is responsible for the ChunkFS concepts and for much of the current union file system work within Linux.

For many years, file systems on Linux was stalled. The community at large was happy with ext2 and ext3 and that’s where things stayed. There were some efforts with ReiserFS to develop something new, but, overall, the community had stopped FS development. Valerie Aurora pointed this out, started cajoling the community, and organized the first-ever Linux File Systems Workshop to help jump-start development. In addition, she was on the program committee for Fast09.

Valerie has helped raise awareness of the need for file system development and progress. Not to embarrass Valerie, but take a look at her consulting site and you will see the depth of her experience. (Author’s note: The links in the interview are from the author and are intended as pointers for starting your own searches.)

Jeff Layton Please introduce yourself and tell us a bit about your background and what you are doing these days.

Valerie Aurora I grew up in New Mexico (the state between Texas and Arizona), raising goats and training horses. My education was irregular, including a stint as the least religious wacky home-schooled kid in the neighborhood, but good enough to get me into the not-very-exclusive New Mexico Tech state university. I graduated with a degree in computer science and mathematics and got my first programming job by reading the “Help Wanted” ads in the Albuquerque Journal.

Today I live in a managed apartment complex in downtown San Francisco with two cats and a Roomba. Being a special unique snowflake, I work part-time for Red Hat as a file systems developer and part-time as a science writer and Linux consultant. I love having more than one job; boredom is my greatest enemy and switching gears every week keeps me interested and entertained.

JL You mention your work in file systems. Could you go into some detail about your background in file systems? In particular, what file systems have you worked on? Which ones do you prefer and why? (It is always good to find out what professionals like to use.)

VA Perhaps a touch long, but here ya go.

I had no particular interest in file systems until I got a job working on one – ZFS, in 2002. Since I didn’t know much about file systems, I read and reread file systems research papers and went to the USENIX file systems conferences (JL: see Fast09 for the latest one). The few good operating systems books talked a little about file systems, but usually only described the Sun-style VFS architecture and FFS-style on-disk format.

After Sun, I went to work at IBM with Theodore Y. Ts’o’s group. We were basically level infinity customer support for Linux, which could be quite fun, especially since I didn’t get to talk to customers at all when I was at Sun. In our spare time, Ted and I came up with crazy ideas for adding file systems features to ext2 and ext3 that didn’t require 20-30 staff years to implement. At the time, only Namesys was making any significant investment in Linux file systems development, and their business model didn’t seem sustainable.

I moved to Intel and had time to implement some of our crazy ideas and come up with new ones. One was the ext2 dirty bit, which let you skip fsck after a crash if the file system was idle at the time of the crash. Another was relative atime, which, 3 years later, is now the default atime setting. While discussing why the dirty bit idea didn’t work at the block group level, Arjan van de Ven and I came up with the idea for ChunkFS, which I began work on.
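The relative atime idea mentioned above can be illustrated with a short sketch. This is not kernel code: the `InodeTimes` class and the function names are hypothetical, and the 24-hour refresh rule is an assumption based on how later kernels are commonly described as behaving, not something stated in the interview. The point is only that a read rewrites the access time when the stored atime is no newer than the last modification or change, so repeated reads of an unchanged file cause no extra writes.

```python
from dataclasses import dataclass

DAY_SECONDS = 24 * 60 * 60  # assumed refresh interval, not from the interview


@dataclass
class InodeTimes:
    """Hypothetical stand-in for the three timestamps kept per inode."""
    atime: float  # last access
    mtime: float  # last data modification
    ctime: float  # last inode (metadata) change


def relatime_needs_update(times: InodeTimes, now: float) -> bool:
    """Decide whether a read at time `now` should write a new atime."""
    if times.mtime >= times.atime:
        return True   # file was modified since it was last read
    if times.ctime >= times.atime:
        return True   # inode was changed since it was last read
    if now - times.atime >= DAY_SECONDS:
        return True   # stored atime is stale (assumed 24-hour rule)
    return False


def read_file(times: InodeTimes, now: float) -> bool:
    """Simulate a read; return True if it dirtied the inode with a new atime."""
    if relatime_needs_update(times, now):
        times.atime = now
        return True
    return False
```

In this sketch, the first read after a write updates the atime, and further reads within the same day leave the inode untouched, which is where the savings over strict atime updates come from.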

All of this seemed pointless, though – how can Linux stay competitive in file systems without Linux developers being paid to work on file systems full-time? So, with Zach Brown and Arjan, I cobbled together the first Linux File Systems Workshop out of shoestring and duct tape, hoping that if the existing Linux file systems developers could just talk to each other, we could come up with a way to improve funding for Linux file systems development. I don’t know that it helped, but if it did, I consider this to be my most important contribution to Linux file systems.

I started a consulting business and did more chunkfs and ext2/3/4 work (parallelizing fsck), and now work for Red Hat. Currently I’m working with Jan Blunck on union mounts, a VFS-based approach to unioning file systems.

On my own systems, I always run ext3 with noatime or relative atime if it’s available. I also disable the paranoia file system checks, with “tune2fs -i 0” and “tune2fs -c 0”. And I turn off SELinux, which uses extended attributes heavily as well as generally being a pain. For my recent work on 64-bit ext4 file systems, I found it easiest to create sparse 16TB+ files on an XFS partition and mount them loopback to fake up a 16TB+ device. (You can do this on ext4 using md, but it’s an enormous pain.) In general, I use ext3 by default and XFS if ext3 can’t handle what I’m doing. When I move to a new file system, it will be btrfs.

After 2.6.30, I started adding data=ordered too, since the new default for ext3 is data=writeback, which can result in nasty file data corruption after a crash.
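The sparse-file trick described in the answer above can be sketched roughly as follows. This is a minimal illustration, not the procedure Valerie used: the function names and the file name `fake-device.img` are hypothetical, the demo uses a small size rather than 16TB, and it assumes a Unix-like system where `st_blocks` counts 512-byte units. Attaching the file to a loop device (for example with `losetup`) needs root privileges and is left out.

```python
import os
import tempfile


def create_sparse_file(path: str, size_bytes: int) -> None:
    """Create a file whose apparent size is size_bytes without writing data."""
    with open(path, "wb") as f:
        # Extending a file with truncate leaves a hole on file systems that
        # support sparse files, so no data blocks are allocated yet.
        f.truncate(size_bytes)


def allocated_bytes(path: str) -> int:
    """Bytes actually allocated on disk (assumes 512-byte st_blocks units)."""
    return os.stat(path).st_blocks * 512


def demo(size_bytes: int = 1 << 30) -> tuple:
    """Return (apparent size, allocated size) for a freshly made sparse file."""
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "fake-device.img")  # hypothetical name
        create_sparse_file(path, size_bytes)
        return os.path.getsize(path), allocated_bytes(path)


if __name__ == "__main__":
    apparent, allocated = demo()
    print(f"apparent size: {apparent} bytes, allocated: {allocated} bytes")
```

The host file system has to allow files of the requested apparent size, which is one reason a file system with a large maximum file size is a natural place to keep such an image.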

JL You mentioned switching to btrfs. People are always asking for a comparison between btrfs and ZFS, and the discussion usually gets a little heated, but can you tell us your opinion(s) on the two file systems? Do you see any file system on the horizon potentially being “better” than btrfs or ZFS (choose any definition of “better” that you want)?

VA For some reason, file systems bring out the zealot in all of us. In my old age, I have come to find zealotry tiring, so I’ll try to stick to useful facts.

ZFS and btrfs are similar in structure and goals in a lot of ways. Features they share are copy-on-write everything, checksums, writable snapshots, storage pooling, simple administration, ad nauseam. More interesting are the major differences in architecture. At the highest level, ZFS uses plain ol’ trees of pointers to blocks, FFS-style, and variable block sizes, inspired by the SLAB kernel memory allocator. Btrfs uses a specialized, COW-friendly form of b-trees (as presented by Ohad Rodeh at LSF ’07) and extents. Btrfs is actually slightly more exciting than this: every single piece of data or metadata in btrfs is an item in a b-tree, and items are packed together indiscriminately, without regard to their types. ZFS reduced all file system metadata and data to objects and related operations; btrfs reduced them all to items in a b-tree. Now all the interesting decisions are about how to assign keys to items and order them inside the b-tree.
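To make the “everything is an item” point concrete, here is a toy sketch in which a sorted list stands in for the b-tree. It is not btrfs code: the `ItemStore` class is hypothetical, the item type numbers are placeholders, and the (object id, item type, offset) key shape is an assumption about how such keys can be laid out. It shows only that once items of every type share one ordered key space, the choice of key decides which items end up next to each other.

```python
import bisect
from typing import Any, List, Tuple

# Assumed key shape: (object id, item type, offset). Tuples sort field by field.
Key = Tuple[int, int, int]

# Placeholder item types for illustration; not real on-disk constants.
INODE = 1
DIR_ENTRY = 2
FILE_EXTENT = 3


class ItemStore:
    """Toy ordered store: a sorted list standing in for a b-tree."""

    def __init__(self) -> None:
        self._keys: List[Key] = []
        self._values: List[Any] = []

    def insert(self, key: Key, value: Any) -> None:
        """Insert or overwrite an item, keeping keys in sorted order."""
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            self._values[i] = value
        else:
            self._keys.insert(i, key)
            self._values.insert(i, value)

    def items_for_object(self, objectid: int) -> List[Tuple[Key, Any]]:
        """Return every item of one object, whatever its type, as one range."""
        lo = bisect.bisect_left(self._keys, (objectid, 0, 0))
        hi = bisect.bisect_left(self._keys, (objectid + 1, 0, 0))
        return list(zip(self._keys[lo:hi], self._values[lo:hi]))


if __name__ == "__main__":
    store = ItemStore()
    store.insert((257, FILE_EXTENT, 4096), "second extent of file 257")
    store.insert((256, DIR_ENTRY, 0), "directory entry in dir 256")
    store.insert((257, INODE, 0), "inode of file 257")
    store.insert((257, FILE_EXTENT, 0), "first extent of file 257")
    for key, value in store.items_for_object(257):
        print(key, value)
```

Because the object id is the first key field, the inode and extents of one file sit in a single contiguous key range even though they are different kinds of items; a different key order would group the items differently.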

Now for personal opinion/zealotry time. Initially, I thought the ZFS approach was the simpler and cleaner – at the time, the code to manage b-trees and extents was complicated enough, and adding copy-on-write and checksums to that made your brain want to explode. It took Ohad Rodeh’s simplified, robust b-tree algorithms and Chris Mason’s everything-is-an-item insight to change my mind. I’m a convert.

Valerie Aurora, as usual, makes some interesting points about filesystems development. However, her comment that “For many years, file systems on Linux was stalled” is slightly misleading in that, perhaps inadvertently, it implies this is a Linux-only phenomenon. Filesystems development everywhere has been stalled for many years. This is why operating systems textbooks still only describe the “Sun-style VFS architecture and FFS-style on-disk format”. It is amazing that the Unix FFS, a 20+ year old filesystem, is still the default (or only) filesystem available on many Unixes.

Why is this? In my opinion it is because the Unix FFS and FFS-like filesystems were for many years always “good enough” for companies in a position to fund filesystems development not to bother designing anything new. It may surprise some people, but the research community even in the 80s and 90s published many papers which showed that file sizes and file usage/access patterns were changing, which challenged many of the design assumptions in FFS-like filesystems. For instance, it should still be possible to find these papers somewhere on the internet:

Ousterhout J.K. et al, “A Trace Driven Analysis of the Unix 4.2 BSD File System”, Proceedings of the 10th Symposium on Operating Systems Principles, pp. 15-24, December 1985.

Ousterhout J.K., “Why Aren’t Operating Systems Getting Faster as Fast as Hardware”, Proceedings of the Summer 1990 Usenix Conference, Anaheim, CA, June 11-15, pp. 247-256.

These and other papers led to research prototypes like the Bullet Fileserver (does anybody remember that?) and log structured filesystems (again proposed by Ousterhout in 1989), concepts which are only now appearing in development filesystems.

I have to declare a vested disappointment in the lack of funding for filesystems development, as I did filesystems research in the late 80s/early 90s in the new area of multimedia filesystems. Disks and processors in those days were inadequate to properly support the demands of multimedia file access which led to new filesystem layouts and concepts, many of which I proposed. Again this paper should still be somewhere on the internet (shameless plug):

Lougher, P. et al, “The Design of a Storage Server for Continuous Media”, The Computer Journal, 36(1), 32-43, January 1993.

Like now, in those days it was impossible to get research funding for filesystems (in the UK at least), but by piggy-backing onto the new area of multimedia I managed to get funding for a PhD and post-doc research in distributed multimedia filesystems. However, the failure of VOD (Video on Demand) trials in the 90s to deliver sufficient ROI caused interest in this area to disappear.

Since then I have written and released SquashFS as an unpaid filesystem project for Linux. This is successful (much more than I anticipated), but being an unpaid niche project, it is regularly ignored as a source of filesystem innovation.

The lack of interest in filesystems development over the last 20 years has been IMHO an industry and a personal career disaster.


What about support for non-Linux file systems?
In particular I am worried about exFAT (both read and write) because it is going to be the format of the next generation of memory cards (SDXC at least) and of course the gadgets that use them! It would also be valuable for partitions (which may be on portable drives) shared between OSes, with support for 4GiB+ files.

