Ramdisks can offer a level of performance that is simply amazing. They are more than just a tool for benchmarking: there are also storage devices that use RAM as their media to reach ultra-high performance.
Ramdisks are usually spoken of in hushed tones when sysadmins get together. Some view them with awe because of their performance, while others view them with skepticism and say they are only useful for generating benchmark numbers. Almost all agree that using them on a regular basis must be done with care, because losing power immediately erases the contents of the ramdisk. However, there are uses for ramdisks beyond running benchmarks. Moreover, there are storage devices that use RAM as the storage media and can reach rarefied levels of performance: millions of IOPS.
This article takes a quick break from the 2.6.30 kernel bonanza of file systems to discuss ramdisks and file systems that use them, such as tmpfs. It will also discuss possible uses for ramdisks. Finally it will present some storage devices that use RAM as the storage media. These devices range from very expensive enterprise class storage to devices that can fit into your desktop.
Introduction to Ramdisks, RamFS, and tmpfs
The concept of a ramdisk is fairly simple: take a piece of memory and make it appear as a hard drive (block device) to the OS, which can then use it for storing data. The memory used for the ramdisk is unavailable to the OS for anything else, and if the power is removed, the data on the ramdisk is lost. This can be a good or a bad thing depending upon your perspective. It's bad because you lose the data if you haven't copied it to permanent media (hard drives or flash). It's good from a privacy perspective, because when the power is lost all data is gone and can't be reconstructed. In addition, a ramdisk is a good option for tasks such as installing an OS, where the contents don't need to survive a power loss or a reboot.
Linux has two primary ramdisk file systems available for use. The first, and earliest, is called RamFS; it was developed in the 2.4 kernel time frame and is still available in the 2.6.30 kernel. There is also tmpfs. Each has its own pluses and minuses, summarized here:
| Feature | RamFS | tmpfs |
| --- | --- | --- |
| Dynamically grow the file system? | Yes | No |
| Uses swap space? | No | Yes |
| Fill the maximum space and continue writing? | Will continue writing | Generates an error |
Recall that in both cases, RamFS and tmpfs, the storage is volatile, and losing power will erase all of the data in the file system. Which ramdisk file system you select depends upon your requirements. RamFS is attractive in that you can keep filling the file system as needed. But there is a danger: you can fill it until you run out of memory and starve the OS (the VM can't free the memory used by RamFS because it thinks the files should be written to a backing store, a block device, but RamFS doesn't have one). On the other hand, tmpfs has a fixed size, so you need to know a priori how large a file system you need and create it at that size. In exchange, tmpfs pages can be pushed out to swap space when memory gets tight. This could be good or bad depending upon your use case and your perspective.
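If you want to see the difference summarized in the table for yourself, a quick (and, in the RamFS case, slightly dangerous) experiment is to deliberately write past the mount size. The sketch below is only illustrative; the mount points and sizes are arbitrary:

# tmpfs enforces its size limit and eventually returns "No space left on device"
mount -t tmpfs -o size=20m tmpfs /mnt/tmpfs
dd if=/dev/zero of=/mnt/tmpfs/fill bs=1M count=30

# RamFS accepts the same write and simply keeps consuming memory
mount -t ramfs ramfs /mnt/ramfs
dd if=/dev/zero of=/mnt/ramfs/fill bs=1M count=30

The first dd stops at roughly 20 MB with an error; the second completes "successfully," and the extra memory it consumed can only be reclaimed by deleting the file or unmounting the RamFS.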
With both RamFS and tmpfs you are basically mounting the disk cache as a file system. Normally, all files are cached in memory by Linux. For a read, data is read from what is called the backing store (basically a block device, such as a hard drive, that holds the file system) into memory. These pages are kept around in memory in case they are needed again, but they are marked as clean, which allows the VM to reuse them for something else. For writes, the pages associated with the data are marked as clean as soon as they are written to the backing store and can then be reused by the VM for its own needs.
With RamFS the files are written into the cache as usual, but there is no backing store. Consequently, the pages are never marked as clean, so the VM cannot reuse them. Unless the data is erased, it stays in the cache, and every write to the file system consumes more memory. Since it's entirely possible to run out of memory and lock up the system, it is recommended to allow only root to write to a RamFS file system.
Tmpfs was derived from RamFS but adds size limits and the ability to write data out to swap space. With these limits in place, it is fairly safe to allow users to write to a tmpfs mount point (don't forget the permissions!). Tmpfs lives in the page cache and on swap, and all of its in-memory pages show up as cache. In addition, you can use familiar *nix tools such as df and du to check on a tmpfs file system; you can't do this with RamFS.
There is a good article that explains how to create a RamFS or tmpfs file system, and it's very easy. You don't really "make" a file system per se; rather, the mount command creates it. There are essentially no options for a RamFS file system. A quick example for RamFS is below:
[root@test64 laytonjb]# mount -t ramfs -o size=20m ramfs /mnt/ramfs
[root@test64 laytonjb]# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /home type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
ramfs on /mnt/ramfs type ramfs (rw,size=20m)
[root@test64 laytonjb]# umount /mnt/ramfs
[root@test64 laytonjb]# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /home type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
This simple example is for a 20 MB file system. Note that you can't use df to examine a RamFS mount: since the file system can grow without bound, df has nothing meaningful to report.
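If you need a rough idea of how much memory a RamFS mount is consuming, du still works because it simply walks the directory tree. A quick sketch, assuming the mount point from the example above:

# df has nothing useful to say about RamFS, but du reports the space used by its files
du -sh /mnt/ramfs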
Tmpfs is more likely to be used by regular users since it has a size limit. Creating a tmpfs file system is just as easy. It has only four options (from the man pages):
size=nbytes Override default maximum size of the filesystem. The size is given in bytes, and rounded down to entire pages. The default is half of the memory.
nr_blocks= Set number of blocks.
nr_inodes= Set number of inodes.
mode= Set initial permissions of the root directory.
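The options can be combined on a single mount line. For instance, a purely illustrative sketch that sets a size limit, an inode limit, and /tmp-style permissions might look like this (the values are arbitrary):

# 512 MB limit, at most ~10,000 inodes, world-writable with the sticky bit (like /tmp)
mount -t tmpfs -o size=512m,nr_inodes=10k,mode=1777 tmpfs /mnt/tmpfs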
Below is a simple example of creating a 20 MB tmpfs file system.
[root@test64 laytonjb]# mount -t tmpfs -o size=20m tmpfs /mnt/tmpfs
[root@test64 laytonjb]# df -k
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
17330592 13649208 2786820 84% /
/dev/hda1 99043 48021 45908 52% /boot
tmpfs 3957596 0 3957596 0% /dev/shm
/dev/sda1 153834852 1890064 144130372 2% /home
tmpfs 20480 0 20480 0% /mnt/tmpfs
[root@test64 laytonjb]# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /home type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
tmpfs on /mnt/tmpfs type tmpfs (rw,size=20m)
[root@test64 laytonjb]# umount /mnt/tmpfs
[root@test64 laytonjb]# df -k
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
17330592 13649208 2786820 84% /
/dev/hda1 99043 48021 45908 52% /boot
tmpfs 3957596 0 3957596 0% /dev/shm
/dev/sda1 153834852 1890064 144130372 2% /home
[root@test64 laytonjb]# mount
/dev/mapper/VolGroup00-LogVol00 on / type ext3 (rw)
proc on /proc type proc (rw)
sysfs on /sys type sysfs (rw)
devpts on /dev/pts type devpts (rw,gid=5,mode=620)
/dev/hda1 on /boot type ext3 (rw)
tmpfs on /dev/shm type tmpfs (rw)
/dev/sda1 on /home type ext3 (rw)
none on /proc/sys/fs/binfmt_misc type binfmt_misc (rw)
sunrpc on /var/lib/nfs/rpc_pipefs type rpc_pipefs (rw)
nfsd on /proc/fs/nfsd type nfsd (rw)
Notice that once the file system is mounted you can see how large it is by using the df command.
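One more handy property of tmpfs, which also comes up in the comments below, is that the size limit is not set in stone; it can be changed with a remount without unmounting and losing the contents. A small sketch, reusing the 20 MB example:

# grow the existing tmpfs from 20 MB to 64 MB in place
mount -o remount,size=64m /mnt/tmpfs
df -k /mnt/tmpfs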
To give you an idea of the speed of a ramdisk, a 1.1 GB tmpfs file system was created on an AMD Phenom II X4 920 system (2.8 GHz) with 8GB of DDR2-1066 memory. Then iozone version 3.2.1 was run on the file system. The commands used were:
mount -t tmpfs -o size=1100m tmpfs /mnt/tmpfs
[laytonjb@test64 ~]$ df -k
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/VolGroup00-LogVol00
17330592 13649432 2786596 84% /
/dev/hda1 99043 48021 45908 52% /boot
tmpfs 3957596 0 3957596 0% /dev/shm
/dev/sda1 153834852 1913512 144106924 2% /home
tmpfs 1126400 0 1126400 0% /mnt/tmpfs
[root@test64 ~]# mkdir /mnt/tmpfs/laytonjb
[root@test64 ~]# chown laytonjb:laytonjb /mnt/tmpfs/laytonjb
[root@test64 ~]# su laytonjb
[laytonjb@test64 root]$ cd /mnt/tmpfs/laytonjb
./iozone -Rab /home/laytonjb/tmpfs.wks -i 0 -i 1 -g 1G
The results below are for a 16,384 byte record size for a 1GB file:
- Write = 1.59 GB/s
- Re-write = 1.95 GB/s
- Read = 1.99 GB/s
- Re-read = 2.01 GB/s
Contrast these with the throughput of a typical single hard drive, which may be around 100 MB/s with the same set of parameters. Now you know why people like ramdisks.
Next: Uses for Ramdisks
Comments on "Ramdisks – Now We Are Talking Hyperspace!"
Good article, Jeffrey. Timely, too, in that I’ll be presenting a webinar on Thursday (June 25) titled “What Makes A Database System ‘In-Memory’?” In it, I’ll contrast in-memory database systems to:
1. using a database in a RAM disk
2. using DBMS cache to cache 100% of the database
3. using the “memory tables” feature offered by some DBMS
4. using solid-state disk
All the approaches have pluses and minuses that we’ll explore. Folks that found this article interesting/informative might also find the content in the webinar useful.
Registration for the webinar is here:
http://www.mcobject.com/in-memory-database-webinar-june-25
Just another thumbs up for ramdisks. We recently had a situation where one of our ganglia collection servers was hitting resource consumption issues. Specifically, it was running at a steady 55% iowait because of all the ganglia data being written to the database (to the point where we were seriously considering a redesign of our ganglia system and adding more hardware). As a solution we created a small tmpfs and moved the database to that location. We have a script that copies the database off to hard disk every 10 minutes (and of course one that loads the data at startup and unloads it at shutdown). That way, if we have a major catastrophe on the server, the most we could lose is ten minutes' worth of performance data (which is acceptable for our purposes).
Adding this tmpfs for the ganglia database completely eliminated the iowait issue. Now I'm a big fan of ramdisks!
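For anyone who wants to try something similar, the core of the approach described above is just a cron-driven copy from the tmpfs back to disk. The paths and interval below are hypothetical, not the commenter's actual setup:

#!/bin/bash
# flush-rrds.sh: copy the ganglia round-robin databases from tmpfs back to disk.
# Run from cron, e.g.:  */10 * * * * root /usr/local/sbin/flush-rrds.sh
rsync -a --delete /mnt/tmpfs/ganglia/rrds/ /var/lib/ganglia/rrds.disk/

A companion script run at boot would copy the data back the other way before the ganglia daemon starts.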
I use a ramdisk. I would like to set the scheduler elevator=noop for the ramdisk, but I want to use another scheduler for the hard disks.
I can't figure out whether this is possible, and how to do it.
Thank you
I don’t think you can have different schedulers for different file systems. The scheduler has to be defined at boot (and has to exist in the kernel or as a module).
Did Google reveal anything?
Jeff
(please correct my English, I’m Italian…)
With Google I understood that I can use different schedulers for different hard disks by setting fstab, but not for a ramdisk, and this is what I can’t understand.
The Amiga operating system back in the late ’80s came with a default RAM disk, which was very useful in those days of limited storage (my first Amiga had no hard drive). But better yet, before long the upgraded OS offered a “Recoverable RAM disk” (RRD): the contents of this would survive a soft reboot or a system crash.
Of course the RRD used volatile system RAM and would not survive a total loss of power, but given the system’s high overall reliability, setting things up so that frequently accessed data went to the recoverable disk was efficient and reasonably reliable. It was a good place to tuck downloads before checking them and consigning them to permanent storage, as well. And you could run programs from such a disk, so it could be used as a sort of virtual sandbox, too.
You could also copy OS boot-up routines to RAM if you needed, for some reason, to reboot frequently, e.g., when testing a new program or upgrade (part of the boot routine was in ROM, the rest on disk; you could also copy the BIOS portion to the RRD). I tended to use the RRD as a fast cache for heavy-duty application data, such as image files during editing, and also to temporarily store fetches of BBS or web pages: in short, a scratch pad as mentioned in the article, only this was two decades ago and didn’t require special hardware.
Add an uninterruptible power supply to such a recoverable RAM disk, and there would be little you could not do with it.
Well, tmpfs can’t dynamically grow, but you can always expand it with mount’s “remount” flag.
@robzane: the I/O schedulers are an attempt to solve the latency problems of mechanical devices. They’re useless with RAM; they would just be a waste of CPU cycles.
Bye,
Gelma
@andrea: I want to use noop for the RAM disk and other schedulers for the mechanical devices! I think you haven’t understood the question.
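A note that may help untangle this thread: on 2.6 kernels the I/O scheduler is a per-block-device setting that can be read and changed through sysfs, not a single boot-wide choice. Tmpfs and RamFS mounts are not block devices at all, so no elevator is involved for them, which is essentially Gelma’s point. A hedged sketch for a real disk (device names will vary):

cat /sys/block/sda/queue/scheduler            # list available elevators; the current one is in brackets
echo noop > /sys/block/sda/queue/scheduler    # switch just this disk to noop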