Jeffrey B. Layton Archive

Jeff Layton is an Enterprise Technologist for HPC at Dell. He can be found lounging around at a nearby Fry's enjoying the coffee and waiting for sales (but never during working hours).
Extended File Attributes Rock!
Worldwide, data is growing at a tremendous rate. However, one recent study has pointed out that the size of files is not necessarily growing at the same rate, meaning the number of files is growing rapidly. How do we manage all of this data and all of these files? While the answer to that question is complex, one place we can start is with Extended File Attributes.
Checksumming Files to Find Bit-Rot
In a previous article extended file attributes were presented. These are additional bits of metadata that are tied to the file and can be used in a variety of ways. One of these ways is to add checksums to the file so that corrupted data can be detected. Let's take a look at how we can do this including some simple Python examples.
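As a rough illustration of the idea in this teaser (not the code from the article itself), the sketch below computes a SHA-256 digest of a file, stores it in an extended attribute, and later compares the stored value against a fresh digest. The attribute name `user.checksum.sha256` and the function names are assumptions made for this example; `os.setxattr` and `os.getxattr` are Linux-only and require a file system mounted with user extended attribute support.

```python
import hashlib
import os

# Hypothetical attribute name chosen for this sketch; any "user." name works.
XATTR_NAME = "user.checksum.sha256"


def file_sha256(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def store_checksum(path):
    """Compute the digest and save it in an extended attribute (Linux only)."""
    checksum = file_sha256(path)
    os.setxattr(path, XATTR_NAME, checksum.encode("ascii"))
    return checksum


def verify_checksum(path):
    """Return True if the stored digest still matches the file contents."""
    stored = os.getxattr(path, XATTR_NAME).decode("ascii")
    return stored == file_sha256(path)
```

Running `store_checksum()` once after a file is written and `verify_checksum()` on a schedule gives a simple bit-rot check. A mismatch only says the data or the attribute changed, not which one, and any legitimate edit to the file requires storing a new checksum.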
What’s an inode?
As you might have noticed, we love talking about file systems. In these discussions the term "inode" is often thrown about. But what is an inode and how does it relate to a file system? Glad you asked.
SandForce 1222 SSD Testing, Part 5: Detailed Throughput and IOPS Analysis with a 2.6.38.2 Kernel
In this series we've been working over the 2.6.32 kernel, which is a bit old. Let's kick the tires on a 2.6.38.2 kernel to see if it helps or hurts performance of the SandForce SSD.
iotop: Per Process I/O Usage
Based on a reader comment, we take iotop for a spin to see if it can be used for monitoring the IO usage of individual processes on a system. The result? It has some interesting capability that we haven't found in other tools.
SandForce 1222 SSD Testing, Part 4: Detailed IOPS Analysis
In this installment of our series examining the performance of SandForce-based consumer SSDs we dig into the IOPS performance of the drive. How does it stand up to an enterprise-class SSD? Let's find out.
SandForce 1222 SSD Testing, Part 3: Detailed Throughput Analysis
Our last two articles have presented an initial performance examination of a consumer SandForce-based SSD from a throughput and IOPS perspective. In this article we dive deeper into the throughput performance of the drive, along with a comparison to an Intel X25-E SSD. I think you will be surprised at what is discovered.
SandForce 1222 SSD Testing – Part 2: Initial IOPS Results
SandForce has developed a very interesting and unique SSD controller that uses real-time data compression. This affects data throughput and SSD longevity. In this article, we perform an initial examination of the IOPS performance of a SandForce 1222-based SSD. The results can be pretty amazing.
Storage Highlights in 2.6.38
We look into some of the new features/additions/changes in the 2.6.38 kernel. In a nutshell: think performance enhancements, additional capability, and additional management options.
SandForce 1222 SSD Testing, Part 1: Initial Throughput Results
SandForce has developed a very interesting and unique SSD controller that uses real-time data compression. This can improve performance (throughput) and extend the life of the SSD but it hinges upon the compressibility of your data. This article is the first part in a series that examines the performance of a SandForce 1222-based SSD and the impact of data compressibility.
Software RAID on Linux with mdadm
Now that we've completed our initial examination of the basics of RAID levels (including Nested RAID) it's time to turn our attention to RAID functionality on Linux using software. In this article we will be discussing mdadm -- the software RAID administration tool for Linux. It comes with virtually every Linux distribution and has some unique features that many hardware RAID cards don't.
Aligning SSD Partitions
Do you have a brand new SSD? Do you plan to partition it? Let's talk about the best way to set up your SSD so partitions -- and the resulting file systems -- align on page boundaries, thus improving performance and minimizing the number of rewrite cycles.
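The alignment check behind this teaser comes down to simple arithmetic: a partition is aligned when its starting byte offset is a whole multiple of the boundary you care about. The sketch below assumes 512-byte logical sectors and a 4096-byte boundary as defaults; both values and the function names are assumptions for illustration, and the right boundary depends on the particular drive.

```python
def is_aligned(start_sector, boundary_bytes=4096, sector_size=512):
    """Return True if the partition's byte offset is a multiple of the boundary."""
    return (start_sector * sector_size) % boundary_bytes == 0


def next_aligned_sector(start_sector, boundary_bytes=4096, sector_size=512):
    """Return the smallest sector >= start_sector that sits on the boundary."""
    if boundary_bytes % sector_size != 0:
        raise ValueError("boundary must be a whole number of sectors")
    offset = start_sector * sector_size
    # Round the byte offset up to the next multiple of the boundary.
    aligned_offset = -(-offset // boundary_bytes) * boundary_bytes
    return aligned_offset // sector_size


if __name__ == "__main__":
    for sector in (63, 64, 2048):
        print(sector, is_aligned(sector), next_aligned_sector(sector))
```

With these defaults, the traditional starting sector of 63 is not aligned (63 x 512 = 32256 bytes, which is not a multiple of 4096), while sector 2048 (1 MiB) is.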
Nested-RAID: The Triple Lindy
Thus far we have talked about single-level RAID configurations and Nested RAID configurations. But we've artificially restricted ourselves to only two levels in Nested RAID. Couldn't we have three RAID levels or more? The answer is yes, and in this article we'll talk about three levels (the proverbial "Triple Lindy") and have some fun with a couple of examples.
Tools for Storage Monitoring: nfsiostat
In our never-ending quest for reasonable storage management and monitoring tools, we examine a simple tool in the sysstat collection: nfsiostat. Coupled with iostat, the combination creates a nice set of tools for monitoring NFS.
On-the-fly Data Compression for SSDs
The key to good SSD performance is the controller. One SSD controller that has received good reviews is the SandForce SF-1200. However, a recent test of an SF-1200 SSD reveals some interesting things about what this controller does and just how it does it. Depending upon your point of view and, radically, your data, performance can be amazing.
3TB Drives are Here
In real estate it's about location. In storage it's about capacity. The next crop of high-density drives is available but there are some gotchas related to some 3TB drives that you need to know before making a land grab.
Short Stroking Hard Disks for Performance
Less is more. More performance, that is. Learn how to use less of your hard drive (even though you paid for all of it) to get an I/O boost.
Linux 2.6.37: Scalability Improvements Abound
While 2.6.37 might be considered a quiet release, there are some very nice scalability improvements for file systems and one cool new feature that warrant a review.
Should We Abolish User Access to rm?
Lately, I've been hearing system administrators and managers ask about solutions to keep people from accidentally removing their data. These are very smart and dedicated people asking for a solution so that data isn't lost either by accident or on purpose. A wild idea I've heard to solve the problem is getting rid of user access to the rm command. Is this truly a crazy idea?
Nested-RAID: RAID-5 and RAID-6 Based Configurations
A Nested RAID configuration is built on top of a standard single-level RAID configuration in order to address performance and data redundancy limitations. Digging deeper into Nested RAID, we check out RAID-5 and RAID-6, which have some truly amazing features (if you have enough drives).
Intro to Nested-RAID: RAID-01 and RAID-10
In the last article we reviewed the most basic RAID levels. In this article, we take things to the next level and discuss what is called "Nested RAID Levels". These concepts can provide both performance and redundancy for a data-rich world. We then look at RAID-01 and RAID-10; two of the most common Nested RAID configurations.
Introduction to RAID
RAID is one of those technologies that has really revolutionized storage. In this article, we'll review the six most common single RAID levels and describe how each works and what issues surround them.
Tools for Storage Monitoring: iostat
The world of Linux storage tools for both monitoring and management just stinks. But that doesn't mean that there is absolutely nothing to help with monitoring. One such tool, iostat, can be used to watch what your storage devices are up to.
Kernels 2.6.35 and 2.6.36: Storage Updates
Two kernel releases have gone by since we last checked in with the check-ins. While the storage related changes are seemingly minimal, it's always good to review what changed; you might be surprised.
2010 Gift Guide for Storage Geeks
If that someone special in your life has storage on their mind come the holidays, we may be able to help with gift ideas. With ideas ranging from the very affordable (free) to very expensive (skipping a few mortgage payments), we've combed the world of storage procurement so you don't have to.
SuperComputing 2010: Faster, Denser Storage Technologies
The SuperComputing Conference is THE international conference and expo for all things HPC (High Performance Computing). The astute attendee of this year's conference could see that storage is a big part of this year's show. Two major storage trends from this year's conference: really fast storage and really dense storage.
Data Replication Using rsync
Having just discussed replication in Linux -- what it is, how it can be used and how it's not the same as a backup -- it's time to tackle a simple example of one of the replication tools: rsync. You will be surprised how easy it is to use rsync to replicate data to a second storage pool.
Saving Yourself with Data Replication
Data can be the currency, Intellectual Property, and life blood of many a company. One technique to make sure that your data is readily available is data replication. It is not quite the same as data backup but can be equally important.
One Billion Dollars! Wait… I Mean One Billion Files!!!
The world is awash in data. This fact is putting more and more pressure on file systems to efficiently scale to handle increasingly large amounts of data. Recently, Ric Wheeler from Red Hat experimented with putting 1 Billion files in a single file system to understand what problems/issues the Linux community might face in the future. Let's see what happened...
Bcache Testing: Large Files and a Wrap-Up
This month we have been testing a new kernel patch named bcache that takes SSDs and uses them as a cache for block devices (with the typical device being hard drives). This article wraps up the testing with an investigation of the throughput of large files and summarizes all the testing to date (and there's a lot of that).
Storage Monitoring via SystemTap
With storage becoming increasingly complex, being able to monitor what's happening with your servers has taken on a critical role. To truly understand what is happening with your storage you may need to monitor what is happening within the kernel.
Bcache Testing: Metadata
Our two prior articles have detailed the performance results from a new patch, bcache, that uses SSDs to cache hard drives. We've looked at the throughput and IOPS performance of bcache and -- while it is still very new and under heavy development -- have found that in some cases it can help performance. This article examines the metadata performance of bcache hoping to also find areas where it can further boost performance.
Bcache Testing: IOPS
Previously we looked at the throughput performance of bcache by running IOzone on a common SATA disk, an Intel X25-E SSD, and Bcache using the SSD to cache a single drive. This article explores the IOPS performance of the same configuration hoping to find areas where bcache might shine.
Bcache Testing: Throughput
Get your wetsuit on, we're going data diving. Throughput benchmarks using IOzone on a common SATA disk, an Intel X25-E SSD, and Bcache, using the SSD to cache a single drive.
Hard Drive Caching with SSDs
Caching is a concept used throughout computing. CPUs have several levels of cache; disk drives have cache; and the list goes on. Adding a small amount of high-speed data storage relative to a large amount of slower-speed storage can make huge improvements to performance. Enter two new kernel patches -- bcache and flashcache -- that leverage the power of SSDs.
Cool User File Systems: GlusterFS
One of the coolest file systems in User Space has got to be GlusterFS. It has a very unique architecture that allows it to be configured for specific storage requirements and scenarios. It can be used as a high-performance parallel file system, or a cloud based file system, or even a simple NFS server. All of this in user-space. Could GlusterFS represent the future of file system development for Linux?
Cool User File Systems: ArchiveMount
Have you ever wanted to look inside a tar.gz file but without expanding it? Have you ever wanted to just dump files in a .tar.gz file without having to organize it and periodically tar and gzip this data? This article presents another REALLY useful user-space file system, archivemount. It allows you to mount archives such as .tar.gz files as a file system and interact with it using normal file/directory tools.
Cool User File Systems, Part 1: SSHFS
Userspace file systems are one of the coolest storage options in Linux. They allow really creative file systems to be developed without having to go through the kernel gauntlet. This article presents one of them, SSHFS, that allows you to remotely mount a file system using ssh (sftp).
Storage Management with an LVM GUI
Have you been looking for open-source storage management tools that are easy to use and provide a graphical representation of your storage? Alas, there are no comprehensive tools but there are graphical tools that you can pair with command-line wizardry, particularly LVM.
OCFS2: Unappreciated Linux File System
It's common knowledge that Linux has a fair number of file systems. Some of these are unappreciated and can be very useful outside their "comfort zone". OCFS2 is a clustered file system initially contributed by Oracle and can be a great back-end file system for general, shared storage needs.
User Space File Systems
Having file systems in the kernel has its pros and cons. Being able to write file systems in user-space also has some pros and cons, but FUSE (File System in Userspace) allows you to create some pretty amazing results. This article takes a very brief look at user-space file systems and FUSE.
Creating a NAS Box Using OpenFiler
In a recent walkthrough we outlined the steps for taking an existing server and converting it into a NAS box. That article assumed that you had already installed Linux on the server and would maintain that installation (i.e. updates, security, etc.). This article examines an alternative: a dedicated NAS distribution called OpenFiler that allows you to very simply create a stand-alone NAS box that can be administered over the web.
2.6.34 is Out; Let’s Review
If you blinked you might have missed the announcement of the new 2.6.34 kernel. Things have been happening very quickly around file systems and storage in the recent kernels so it's probably a good idea to review the kernels from 2.6.30 to 2.6.34 and see what developments have transpired.
Creating a NAS Box with an Existing System
Standalone Network Attached Storage (NAS) servers provide file level storage to heterogeneous clients, enabling shared storage. This article presents the basics of NAS units (NFS servers) and how you can create one from an existing system.
Saving Your Data Bacon with Write Barriers and Journal Check Summing
Mmmm.... bacon. This article examines two mechanisms to prevent data loss -- write barriers and check summing. Both can be particularly important for drives with larger and larger caches. Pay attention: This can save your data bacon.
Smartmontools: Ya Mon!
Last article we introduced the SMART capabilities of hard drives (who knew your drives were SMART?). In this article we examine smartmontools, an application for examining SMART attributes and triggering self-tests.
Introduction to SMART
Did you know your drive was SMART? Actually: Self-Monitoring, Analysis, and Reporting Technology. It can be used to gather information about your hard drives and offers some additional information about the status of your storage devices. It can also be used with other tools to help predict drive failure.
Storage Technology for the Home User
Sometimes you just have to get excited about what you can buy, hold in your hand, and use in your home machines. Let's look at some cool storage technology that the average desktop user can tackle.
Ceph: The Distributed File System Creature from the Object Lagoon
Did you ever see one of those terrible Sci-Fi movies involving a killer Octopus? Ceph, while named after just such an animal, is not a creature about to eat an unlucky Spring Breaker, but a new parallel distributed file system. The client portion of Ceph just went into the 2.6.34 kernel so let's learn a bit more about it.
Harping on Metadata Performance: New Benchmarks
Metadata performance is perhaps the most neglected facet of storage performance. In previous articles we've looked into how best to improve metadata performance without too much luck. Could that be a function of the benchmark? Hmmm...
IO Profiling of Applications: strace_analyzer
In the last couple of articles we have talked about using strace to help examine the IO profile of applications (including MPI applications; think HPC). But strace output can contain hundreds of thousands of lines. In this article we talk about using a tool called strace_analyzer to help sift through the strace output.
Intro to IO Profiling of Applications
One of the sorely missing aspects of storage is analyzing and understanding the IO patterns of applications. This article will examine some techniques for performing IO profiling of an application to illustrate what information you can gain.
2.6.33 is Out! Say Good Bye to the Anticipatory Scheduler
It's been a few days but the latest kernel, 2.6.33 is out. There are some changes that affect the storage world that you probably need to check out.
POSIX IO Must Die!
POSIX IO is becoming a serious impediment to IO performance and scaling. POSIX is one of the standards that enabled portable programs and POSIX IO is the portion of the standard surrounding IO. But as the world of storage evolves with greatly increasing capacities and greatly increasing performance, it is time for POSIX IO to evolve or die.
Geeking Out on SSD Hardware Developments
When you're hot, you're hot. And SSDs are hot right now. Let's review recent developments in SSD hardware and see where the technology is headed. Prepare to drool over new hardware!
Size Can Matter: Throughput Performance with a Disk-Based Journal – Part 4
Turning from metadata performance to throughput performance, we examine the impact of journal size on ext4 when the journal is disk-based. Dig into the numbers and see what you can do to improve throughput performance.
Size Can Matter: Would You Prefer the Hard Drive or the Ramdisk this Evening? Part 3
The past couple of weeks we ran the numbers on metadata performance for ramdisks and hard drive-based journals for ext4. Now let's compare/contrast the two journal devices and see what trends emerge.
Size Can Matter: Ramdisk Journal Metadata Performance – Part 2
Previously, we examined the impact of journal size using a separate disk on metadata performance as measured by fdtree. In this follow-up we repeat the same test but use a ramdisk for the journal, thereby boosting the best performance. Or does it?
Size Can Matter: Improving Metadata Performance with Ext4 Journal Sizing – Part I
Recently we saw that the journal device location, unfortunately, didn't make much of a difference on ext4 metadata performance. But can the size of the journal have an impact on metadata performance? This is the first in a series of articles examining journal size and performance.
And the Sign of the Beast is 6 (Gbps that is)
In the quest for more performance there are two new standards for SATA and SAS focused on doubling current throughput to 6 Gbps. While the standards may sound like a nice potential boost don't expect individual hard drives to increase in performance.
Improving MetaData Performance of the Ext4 Journaling Device
In the never-ending quest for more performance, we examine three different journaling device options for ext4 with an eye toward improving metadata performance. Who doesn't like speed?
Storage Highlights of 2009
It's the end of the year and that means it's time to either make predictions for the coming year or review the highlights from the past year. This article takes a look at the cool things that happened around storage in the past year and perhaps hints at some things in the coming year.
2.6.32 is Out! But a Word of Caution Around CFQ
Everyone loves a shiny new kernel. The latest one, 2.6.32, was released on Dec. 3 and there are some nice updates/fixes for file systems and IO in general. But there is a very important change for the CFQ IO scheduler that you need to understand.
Two Storage Trends From SuperComputing 2009
The SuperComputing Conference/Exhibition is always a great conference for learning about storage trends in the HPC world. This year the alert attendee could spot two emerging trends: smaller companies developing innovative storage solutions and the rise of flash storage units.
Cloud Storage Concepts and Challenges
Cloud Storage -- while perhaps not the best label ever invented -- holds promise for the massive future storage requirements looming on the horizon. And does it at a very good price/performance ratio. This article takes a quick look at the concepts and the challenges of Cloud Storage.
Introduction to iSCSI
iSCSI is one of the hottest topics in Storage because it allows you to create centralized SANs using TCP networks rather than Fibre Channel (FC) networks. Get a handle on the main iSCSI concepts and terminology.
Helping Out SSDs
The last article talked about the anatomy of SSDs and the origins of some of their characteristics. In this article, we break down tuning storage and file systems for SSDs with an eye toward improving performance and helping overcome some of the platform's limitations.
Anatomy of SSDs
SSDs (Solid-State Drives) are a hot topic right now for a number of reasons; not the least of which being their power to performance ratio. But to better understand SSDs you should first get a grip on how they are constructed and the features/limitations of these drives.
Pick Your Pleasure: RAID-0 mdadm Striping or LVM Striping?
A fairly common Linux storage question: Which is better for data striping, RAID-0 or LVM? Let's take a look at these two tools and see how they perform data striping tasks.
Tuning CFQ – What Station is That?
The last article was a quick overview of the 4 schedulers in the Linux kernel. This article takes a closer look at the Completely Fair Queuing (CFQ) scheduler and how you can tune it.
I Have a Schedule to Keep – IO Schedulers
The Linux kernel has several different IO schedulers. This article provides an introduction to the concept of schedulers and what options exist for Linux.
IOzone Performance Exploration, Part 2: The Rest of the Crowd (Almost)
We finish off our IOzone performance exploration of the major Linux file systems. This time adding ext2, jfs, xfs, btrfs, and reiserfs. Let's take a look at the numbers.
Deduping Storage Deduplication
One of the hottest topics in the enterprise storage world is deduplication. We take a look at the technology behind the concept and discuss where it is best applicable in your storage strategy.
I Feel the Need for Speed: Linux File System Throughput Performance, Part 1
While metadata performance is important, another critical metric for measuring file systems is throughput. We put three Linux file systems through their paces with IOzone.
Metadata Performance Exploration Part 2: XFS, JFS, ReiserFS, ext2, and Reiser4
More performance: We add five file systems to our previous benchmark results to create an "uber" article on metadata file system performance. We follow the "good" benchmarking guidelines presented in a previous article and examine the good, the bad and the interesting.
Metadata Performance of Four Linux File Systems
Using the principles of good benchmarking, we explore the metadata performance of four Linux file systems using a simple benchmark, fdtree.
On-line Backups: Flexible Enough for Home & the Office
Backups are a technology or process that everyone -- everyone! -- needs to consider. This article looks at some on-line backup options for Linux that can apply to the spectrum of home to enterprise-class users.
Linux Software RAID – A Belt and a Pair of Suspenders
Linux comes with software-based RAID that can be used to provide either a performance boost or add a degree of data protection. This article gives a quick introduction to Linux software RAID and walks through how to create a simple RAID-1 array.
Lies, Damn Lies and File System Benchmarks
Benchmarking has become synonymous with marketeering to the point it is almost useless. This article takes a look at a very important paper that can demonstrate how bad it has become and makes recommendations on how to improve the situation.
Storage Pools and Snapshots with Logical Volume Management
Logical Volume Management (LVM) on Linux: A great tool for creating pools of storage hardware that can be divided, resized, or used for snapshots.
#!*A5%&j9 – How to Encrypt Your File System
Protecting your data has become more important than ever. Let's look at some options for encrypting Linux file systems.
I Like My File Systems Chunky: UnionFS and ChunkFS
Diving deeper into UnionFS: walking through how to create and manage large file systems using the principles of ChunkFS and UnionFS.
File System Evangelist and Thought Leader: An Interview with Valerie Aurora
Jeff Layton talks to Valerie Aurora, file system developer and open source evangelist, about a wide range of subjects including her background in file systems, ChunkFS, the Union file system and how the developer ecosystem can chip in.
Read/Write Compression: Combining UnionFS and SquashFS
Need to have write capability on your SquashFS compressed filesystem? UnionFS to the rescue!
From Russia with Love: POHMELFS – A New Distributed Storage Solution
There is a new distributed file system in the staging area of the 2.6.30 kernel called POHMELFS. Sporting better performance than classic NFS, it's definitely worth a look.
Ramdisks – Now We Are Talking Hyperspace!
Ramdisks can offer a level of performance that is simply amazing. More than just a tool for benchmarking, there are new devices that utilize ramdisks for a bit of ultra-performance.
FS-Cache & CacheFS: Caching for Network File Systems
FS-Cache along with CacheFS is now in the 2.6.30 kernel and can be used for local caching of AFS and NFS.
SquashFS: Not Just for Embedded Systems
Who knew that compression could be so useful in file systems? SquashFS, typically used for embedded systems, can be a great fit for laptops, desktops and, yes, even servers.
NILFS: A File System to Make SSDs Scream
The 2.6.30 kernel is chock full of next-gen file systems. One such example is NILFS, a new log-structured file system that dramatically improves write performance.
FS_scan: Getting Detailed with Your Data
Need details on your file system's data? FS_scan allows you to dig deep into your storage, giving you the ability to perform trend analysis on the results.
How Old is that Data on the Hard Drive?
The vast amount of data being stored in this day and age naturally leads to files sitting unused for longer and longer periods of time. A new app, agedu, can quickly tell you what data on your filesystem is lying fallow.
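To make the idea of "fallow" data concrete, here is a minimal sketch that walks a directory tree and reports files whose last access time is older than a given number of days. It is not how agedu is implemented; the function name and parameters are assumptions for this example. Access times are only as reliable as the mount options allow, since file systems mounted with `noatime` or `relatime` update them rarely or never.

```python
import os
import time

SECONDS_PER_DAY = 86400


def stale_files(root, min_age_days, now=None):
    """Yield (path, age_in_days) for files last accessed at least min_age_days ago."""
    now = time.time() if now is None else now
    cutoff_seconds = min_age_days * SECONDS_PER_DAY
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Do not follow symlinks, so each directory entry is judged on its own.
                atime = os.stat(path, follow_symlinks=False).st_atime
            except OSError:
                continue  # the file vanished or is unreadable; skip it
            age_seconds = now - atime
            if age_seconds >= cutoff_seconds:
                yield path, age_seconds / SECONDS_PER_DAY
```

For example, `for path, age in stale_files("/home", 365): print(f"{age:8.0f} days  {path}")` would list files untouched for a year or more under `/home`.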
Churning Butter(FS): An Interview with Chris Mason
The founder of btrfs talks about features, terabyte raid arrays and comparisons with ZFS.
Linux Don’t Need No Stinkin’ ZFS: BTRFS Intro & Benchmarks
ZFS may be locked into the Solaris operating system but "Butter FS" is on the horizon and it's boasting more features and better performance.
From ext3 to ext4: An Interview with Theodore Ts’o
Jeff Layton talks with Theodore Ts'o about getting the best performance out of your file system, painless migration and the work still to do.
ext4 File System: Introduction and Benchmarks
Destined to become the default file system for the more popular Linux distributions, ext4 is out of experimental mode and gearing up for production environments. Here's what you need to know.
Caos NSA and Perceus: All-in-one Cluster Software Stack
Silence the struggle around cluster software stack configuration. Caos NSA is a distribution that focuses on making things simple, easy to install and upgrade, and easy to manage.
NFS with Native Infiniband
NFS frees you from proprietary file systems and, coupled with Infiniband, is the only standard file system that can be used for high-performance distributed processing.
strace: The Friend You Never Knew You Had
While strace is often used for troubleshooting and debugging, you can also use strace to get started on examining the I/O pattern of your serial codes.
Parallel Platters: File Systems for HPC Clusters Part Three
In the last installment of our Parallel Platters series, Jeff Layton looks at the next generation of parallel file systems: Object Based File Systems.
Life, The Universe, and Your Cluster
Getting the most out of your cluster is always important. But how exactly is that done? Do you really need to dissect your code and analyze every instruction to get optimal performance? Do you need to build custom kernels? Not necessarily. By testing some basic assumptions, you may be able to eke ten-node performance out of an eight-node cluster. Here’s how.