ext4 File System: Introduction and Benchmarks

Destined to become the default file system for the more popular Linux distributions, ext4 is out of experimental mode and gearing up for production environments. Here's what you need to know.

If you have spent enough time around Linux, it’s almost certain you know about the file systems ext2 and ext3, and have probably heard of ext4. Get ready to hear some more.

On October 11, 2008, the “experimental” label for ext4 was removed. While this doesn’t necessarily mean that you should change all of your file systems over to ext4 immediately, it does mean that you should consider using ext4 moving forward. With the “experimental” label gone and openSUSE (among others) considering it for the default file system in a late-2009 release, it’s a good time to review ext4 so you have a solid working knowledge of what it is and what features it brings to the table.


A Bit of Linux History

If you go back to the Linux days of yore, you might remember that early distributions used the Minix File System. While MinixFS (my abbreviation) allowed Linux to get up and functioning quickly because a new file system did not have to be developed, it had a few limitations. It used 16-bit offsets internally, limiting file systems (and therefore files) to 64 Megabytes (MB), and only allowed file names of 14 characters. It’s pretty obvious that this wasn’t an ideal file system, so work quickly began on the Extended File System (ext) by Remy Card and others.

ext was added to Linux 0.96c. It was able to handle file systems up to 2GB with file names up to 255 characters. But there were still some issues with the file system, so work on the second-generation version of ext, called ext2, was begun. It quickly became the most popular file system in Linux with a 4TB file system limit, a 2GB maximum file size, file names of up to 255 characters, and 10^18 files. To this day, you can still use ext2 and it’s likely to be used for many years to come.

But just like everything in Linux, the ext2 file system was not standing still. Stephen Tweedie was evolving ext2 by adding, among other things, a journaling capability. Journaling improves the reliability of the file system and largely removes the need to check the file system after an unclean shutdown. In addition to journaling, the ability to resize the file system while it was on-line was added. Also, since 64-bit computing was coming quickly, h-tree directory indexing was added in place of the linear directory lists, allowing a larger number of files in a single directory.

Ext3 quickly became, arguably, the most popular file system in Linux. One attribute that contributed to its popularity is that you could upgrade from ext2 to ext3 very easily (basically it just added a journal to the existing ext2 file system). So you didn’t lose any data in the upgrade process from ext2 to ext3. By adding a journal to ext2 you increased its reliability and also significantly reduced the need for a file system check (fsck) in the event of an unclean unmount.

However, ext3 still had limitations that people were not happy with. The biggest complaints were that the file system size was limited to 16TB and that the performance was not on par with other file systems such as XFS and JFS. The first complaint, the limited size of the file system, is perhaps the biggest, given the fact that you can buy 1.5TB SATA drives and soon will be able to buy 2TB drives. It’s pretty easy to create a simple RAID system at home that hits the 16TB limit. But there are other disadvantages as well.

Enter ext4, Stage Left

In 2006, the uber Linux developer Theodore Ts’o, who was, at the time, the ext3 maintainer, began work on ext4. Unlike ext3, which just added some features to ext2 while keeping the on-disk format and approach of ext2, ext4 is a fork of ext3 with deep code changes affecting the data structures, all aimed at making it a better file system: faster, more reliable, more features, better code, etc. Ext4 brought ext3 into the world of 64 bits, allowing individual files of 16TB (assuming 4KB blocks) as well as file systems of 1 Exabyte (EB) by using 48-bit block numbers. One EB is the same as 1,048,576 Terabytes (TB).
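If you want to check those numbers yourself, here is a quick back-of-the-envelope sketch. It assumes binary units (1 TB = 2^40 bytes) and, for the file size limit, a 32-bit block index within each file; that second assumption is mine and is not stated above.

```python
# Rough check of the ext4 limits quoted in the text.
# Assumptions (for illustration only): binary units, 4KB blocks,
# 48-bit block numbers for the file system, and a 32-bit block
# index within a single file.
BLOCK_SIZE = 4 * 1024
TB = 2**40
EB = 2**60

max_fs_bytes = 2**48 * BLOCK_SIZE     # largest file system
max_file_bytes = 2**32 * BLOCK_SIZE   # largest single file

print(max_fs_bytes // EB)     # file system limit, in EB
print(max_file_bytes // TB)   # file size limit, in TB
print(EB // TB)               # number of TB in one EB
```

With 4KB blocks this works out to 1 EB for the file system, 16TB for a single file, and 1,048,576 TB per EB, matching the figures above.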

While past predictions have been wrong about the amount of memory we would need (640KB) as well as storage, it is likely that our home machines won’t get to 1 EB for a long time. But just in case, ext4 is set to go to full 64-bit block numbers, although the surgery to get there is likely to be deep enough to require some fundamental changes in the file system.

From the perspective of many, one of the most positive features of ext4 is that it is backward compatible with ext2 and ext3, allowing you to take the ext2 or ext3 file systems, change a few options, and mount them as ext4 file systems. The existing data is not lost and ext4 will use the new data structures only on new data (pretty nifty feature if you ask me).

Additionally, there is a nice upgrade capability that will allow you to take an ext2 or ext3 file system and upgrade it to ext4 without a loss of data (but, as always, back up your data just in case). However, ext4 has limited forward compatibility with ext3. That is, you can’t always take an ext4 file system and mount it using ext3, because the data structures are completely different.

The hard work that went into ext4 added new features such as extents, journal checksumming, multi-block allocation, delayed allocation, faster fsck, on-line defragmentation, and larger directory sizes (up to 64,000 subdirectories). Let’s look at a few of these:

Extents:
Extents are a feature that describes how the blocks storing a file’s data are laid out on the drive.

Ext3 (and ext2) use an indirect method of keeping track of the blocks used by a particular file. This means they have to keep track of every single block. For example, for a 100MB file, you have 25,600 4KB blocks. So for that file, ext3 has to keep track of all 25,600 blocks and how they are ordered.

Ext4 allows the blocks for a particular file to be stored as an extent. An extent is just a contiguous set of blocks. So the file system only has to store two pieces of information: the starting block and how many contiguous blocks are in the extent. Extents also help prevent file fragmentation, which improves performance because you are storing the data in contiguous blocks. Extents also help with file deletion because you have much less metadata to change.
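To make the difference concrete, here is a minimal Python sketch of the idea. It is not ext4’s actual on-disk format; the function name blocks_to_extents and the starting block number 5000 are made up for illustration, and the sketch ignores any cap on how long a single extent can be.

```python
def blocks_to_extents(blocks):
    """Collapse an ordered list of block numbers into (start, length) runs."""
    extents = []
    for b in blocks:
        if extents and extents[-1][0] + extents[-1][1] == b:
            # Block continues the previous run: just extend its length.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Block starts a new run.
            extents.append((b, 1))
    return extents

BLOCK_SIZE = 4 * 1024
file_size = 100 * 1024 * 1024          # the 100MB file from the text
n_blocks = file_size // BLOCK_SIZE     # 25,600 blocks

# Per-block tracking: one entry for every block.
contiguous = list(range(5000, 5000 + n_blocks))

# Extent tracking: one (start, length) pair if the file is contiguous.
print(n_blocks, blocks_to_extents(contiguous))

# A file split into two pieces needs two extents.
print(blocks_to_extents([100, 101, 102, 200, 201]))
```

For the contiguous 100MB file, the 25,600 per-block entries collapse into the single pair (5000, 25600); the fragmented example yields (100, 3) and (200, 2).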

Journal Checksumming:
One of the big developments in ext3 was the implementation of a journal. A journal is just a list of the changes that need to be made to a file system (e.g. writes, deletes, etc.). So a file system just “plays” this journal to commit the changes to the file system. If there is a crash, the journal, which is stored on disk, is just “replayed” and the file system is brought into a consistent state. But don’t forget that the journal is stored on the disk and is subject to disk failures.

Journal checksumming creates a checksum of the journal data so that ext4 can tell if the area of the disk where the journal is kept is failing or becoming corrupt. This improves reliability but can also improve performance because it allows faster commits of the journal compared to ext3.
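The general technique can be sketched in a few lines of Python. This is only an illustration of checksumming a record, not ext4’s journal format: the record layout, the helper names write_record and read_record, and the choice of zlib.crc32 are all my assumptions.

```python
import zlib

def write_record(payload: bytes) -> bytes:
    """Prefix a journal record with a CRC32 of its payload (4 bytes, big-endian)."""
    return zlib.crc32(payload).to_bytes(4, "big") + payload

def read_record(record: bytes):
    """Return the payload if the stored checksum matches, else None."""
    stored = int.from_bytes(record[:4], "big")
    payload = record[4:]
    return payload if zlib.crc32(payload) == stored else None

# Hypothetical journal entry, for illustration only.
record = write_record(b"update inode 12: size=4096")

# Simulate a failing disk by flipping one bit in the stored record.
corrupted = record[:-1] + bytes([record[-1] ^ 0x01])

print(read_record(record))      # payload comes back intact
print(read_record(corrupted))   # checksum mismatch is detected
```

During replay, a record that fails the check can be skipped instead of being applied to the file system, which is the reliability benefit described above.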

Multi-block Allocation:
Ext3 allocates blocks for a file one at a time (typically using 4KB blocks). For very large files, the associated function that does the allocation will have to be called thousands of times. ext4 uses “multi-block allocation” which allows multiple blocks (hence the name) to be allocated during one function call. This can greatly improve the performance of ext4 relative to ext3, particularly for large files.
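Here is a toy model of why that matters, counting allocator calls for the 100MB file used earlier. The Allocator class is hypothetical and bears no relation to the kernel’s real allocator; it only hands out consecutive block numbers and counts how often it is called.

```python
class Allocator:
    """Toy allocator that hands out consecutive block numbers."""

    def __init__(self):
        self.next_free = 0
        self.calls = 0

    def allocate(self, count=1):
        """Reserve `count` consecutive blocks in a single call."""
        self.calls += 1
        start = self.next_free
        self.next_free += count
        return range(start, start + count)

n_blocks = 25_600   # 100MB file with 4KB blocks

# One block per call, as the text describes for ext3.
one_at_a_time = Allocator()
for _ in range(n_blocks):
    one_at_a_time.allocate()

# Many blocks per call, as with multi-block allocation.
multi = Allocator()
multi.allocate(n_blocks)

print(one_at_a_time.calls, multi.calls)
```

The single-block approach makes 25,600 calls for this file, while the multi-block approach makes one.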

Delayed Allocation:
In ext3 and other traditional file systems, blocks are allocated as soon as they are needed by a write function. But in reality, they may not be needed right away because the data may sit in cache for some time. So delayed allocation allows blocks to be allocated only when they are actually needed to write the data to disk. This can improve the performance of ext4 because, by the time the data is written, the allocator knows how much data there is and can choose blocks that minimize fragmentation. It also has great benefits when coupled with extents and multi-block allocation.
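A small sketch of the idea, again as a toy model rather than anything resembling the kernel code. The DelayedFile class and its fields are invented for illustration: writes only land in an in-memory buffer, and the block count is decided once, at flush time.

```python
class DelayedFile:
    """Toy file: writes are buffered, blocks are allocated only at flush."""

    BLOCK_SIZE = 4096

    def __init__(self):
        self.pending = bytearray()
        self.allocations = []   # block count of each allocation request

    def write(self, data: bytes):
        # Data stays in the "cache"; no blocks are allocated yet.
        self.pending += data

    def flush(self):
        if not self.pending:
            return
        # Ceiling division: blocks needed to hold everything buffered so far.
        n_blocks = -(-len(self.pending) // self.BLOCK_SIZE)
        self.allocations.append(n_blocks)   # one request for the whole run
        self.pending.clear()

f = DelayedFile()
for _ in range(100):
    f.write(b"x" * 1024)   # 100 small writes of 1KB each
f.flush()

print(f.allocations)
```

The hundred 1KB writes end up as a single request for 25 blocks, which is exactly the kind of request that multi-block allocation and extents can handle well.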

Comments on "ext4 File System: Introduction and Benchmarks"

cwtryon

Thanks for the introduction! While I realize that file systems can be an even more hotly debated religious topic than KDE vs. Gnome, do you have any indications on how ext4 performs compared to some of the other “new” file systems, such as JFS? One of the other big problems with ext3 has been handling massive numbers of really small files in a single directory. I actually worked a job where we were worried that a 2TB partition was “too small” for the amount of data we had to store, but we were stuck with older versions of RH EL, and not sure where to go. This may be the new way forward, as file system requirements continue to explode.

laytonjb

cwtryon,

I agree with your thoughts about ext3 and ext4. In the past ext3 has felt kind of limiting and with RHEL that was the primary file system of choice. I’ve never tried to use JFS or Reiser on RHEL. Typically what I do is install RHEL using ext3 for / (I use ext2 for /boot), and once I’m happy, I build a new kernel with XFS and JFS enabled. I install all of the support tools and then build a new file system using XFS or JFS on other drives. This works well enough for me :)

I don’t know about performance of JFS vs. ext3 or ext4. One of my goals is to do some extensive benchmarking to get a feel for the relative performance differences. Drop a note to the editor about a file system benchmarking article and maybe he’ll ask me to do one :)

BTW – I’m working on a similar article for btrfs. While it’s still experimental I want to get a feel for it as well. Ted Ts’o thinks the combination of ext4 and btrfs is the future and I couldn’t agree more. There are some other up and coming file systems that show some promise as well.

Also – thanks for the compliment! Glad it helped.

Jeff

ndatta

This is a great article and a very good introduction to ext4. Thanks!

rkoski

Actually, the ext3 file system size limit is only 8 TB. Tried to make 9 TB ext3, but the only block size option was 4 KB, which resulted in said max 8 TB file system. Used CentOS 5.2. Maybe there is some combination of kernel, e2fsprogs, etc. which can use 8 KB blocks, perhaps kernel 2.4.x ;)

dbindner

There’s always something to complain about with benchmarks, so naturally I have a complaint. Given that this was a very introductory article, I think it would have made sense to test the two filesystems in the configurations that are most common, i.e. their default configurations.

When more technical articles follow, you can delve into the options that a careful sysadmin would tune. But most people (and many admins) are going to use the default settings and will want to know about performance as well as reliability. It may be a bit hard to say about reliability for the moment, but at least that “typical” performance can be measured.

graemeharrison

Great article… but I too have an issue about the benchmarking needing to compare ‘out of the box’ default configurations. You disabled “barriers” which are by-default ON with Ext4 (for cited reason of compatibility) but really, if Ext4 will have data-security features, at least one column in results should have been the default use of Ext4.

sdean7855

Great article. Jeff, since you are an Enterprise Technologist for Dell, I have a question about the inter-relation/robustness/capability of ext4 and btrfs and PC hardware…or Dell server hardware. Ts’o wrote a telling piece back in ’04, entitled ‘reiserfs’ but it might as well have been entitled crappy PC hardware. It was his take back then that SGI and Sun had built their hardware to deal somewhat gracefully and sequentially with a power cord yank but that crappy PC hardware died a thrashing chaotic death that would write crap…even upon the metadata. Have things improved, at least with Dell PowerEdge server hardware? Do ext4 and btrfs do physical journaling?

gigo6000

Nice article. Just yesterday while installing Ubuntu 9.04 I noticed this new filesystem and didn’t know if it was ok to use it, since ext3 works fine for my needs.

rosbif

Thank you for a nice article.

One little nit-pick:
The maximum file size for ext2 is much greater than the 2GB stated.
It depends on the block size but seems to be 2TiB with a 4KiB block size.

Of course it is true that some 32-bit applications may be limited to a 2GiB maximum file size but this is a limitation of the application and not of the file system.
