
Hard Drive Caching with SSDs

Caching is a concept used throughout computing. CPUs have several levels of cache; disk drives have cache; and the list goes on. Adding a small amount of high-speed data storage in front of a large amount of slower storage can make huge improvements to performance. Enter two new kernel patches, bcache and flashcache, that leverage the power of SSDs.

Introduction

The hard drives we use today are very complex beasts. Cramming that much data into a very small space, while ensuring that the data will be there when we need it, is neither easy nor simple. The pace of data density increase in hard drives has been nothing short of amazing. At the same time, we all want faster and faster performance from our drives. To help improve hard drive performance, manufacturers introduced a small amount of RAM to the drive. It can be used to hold data for a period of time before writing it to disk, allowing the drive to tell the operating system that the data is on the disk before it has actually been written, which improves application performance. It can also be used to pre-fetch data for reading. In either case, a small amount of RAM can make a very large improvement in performance. If you don’t believe me, run some benchmarks with the disk cache turned on and then off and compare the two results.

SSDs (Solid State Disks) are a competitor to hard drives that has become much more mainstream in the last few years. SSDs use a different storage technology, floating-gate transistors, to store data. They can be very, very fast relative to hard drives but are much smaller in capacity and much more expensive. At the same time, even SSD manufacturers have added RAM caches to SSDs to improve performance.

If we rank general storage media by performance, from fastest to slowest, we usually end up with something like the following.

  1. RAM
  2. SSDs with RAM caching
  3. SSDs with no RAM caching
  4. Hard drive with RAM cache
  5. Hard drive with no RAM cache

At the same time, if you ranked storage media from the most expensive to the cheapest, the ranking would look exactly the same as the performance ranking. That is, RAM is the most expensive for the same capacity, SSDs with RAM caching are next, and so on. So while we would like to use RAM to store all of our data, it would just be too expensive (never mind the headache of keeping the power on all of the time). So then we move down to SSDs. It would be lovely to store all of our data on SSDs, but they have limited capacity and are very expensive relative to hard drives. So then we move to hard drives and we arrive at today’s situation: very large capacity hard drives but with low performance.

What people have realized is that using small amounts of RAM on drives as a caching mechanism has led to much better overall drive performance (once again, try running a hard drive with caching turned off to understand the impact of the cache on performance). Even the largest drives today (usually 500GB and up) have only about 64MB of RAM cache, so the amount of RAM is still very small compared to the capacity of the drive.

But with the advent of mainstream SSDs, people have realized that SSDs have much better performance than disk drives while having more capacity than RAM for an equivalent price. So perhaps SSDs can also be used in some caching role to help performance. The performance difference between SSDs and hard drives can be very large, as can the price difference, but if you could take a 32GB or a 64GB SSD and use it to cache a 2TB or many-TB disk array, perhaps you could get a performance improvement that would justify the price of the SSD.
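To get a rough feel for when a small SSD cache might pay off, here is a back-of-the-envelope sketch of the average access time as a function of the cache hit ratio. The latencies (0.1 ms for the SSD, 10 ms for the hard drive) are illustrative assumptions, not measurements, and the function name is made up for this example.

```python
def effective_latency_ms(hit_ratio, ssd_ms, hdd_ms):
    """Average access time when a fraction `hit_ratio` of requests hit the SSD cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Hits are served at SSD speed, misses fall through to the hard drive.
    return hit_ratio * ssd_ms + (1.0 - hit_ratio) * hdd_ms


# Assumed, illustrative latencies: 0.1 ms for the SSD, 10 ms for the hard drive.
for hit_ratio in (0.0, 0.5, 0.9, 0.99):
    average = effective_latency_ms(hit_ratio, ssd_ms=0.1, hdd_ms=10.0)
    print(f"hit ratio {hit_ratio:4.2f}: about {average:.3f} ms per access")
```

In this simple model the average stays dominated by hard drive latency until the hit ratio gets quite high, which is why the application's IO pattern matters so much.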

This article talks about two new kernel patches that allow you to use SSDs as disk caches. Generically, these approaches are block caching, since they cache block devices, which are typically hard drives (but don’t have to be). The first patch is called bcache, and the second patch is called flashcache.

bcache

Bcache takes one block device, preferably an SSD-based device, and uses it to cache another block device, typically a hard drive. Hence the name, “block cache” or bcache. It implicitly assumes that the caching device is SSD based, whether that is a single device or a RAID array of SSDs (e.g. a RAID-0 array). This assumption affects how bcache works.

Recall from past articles that SSDs erase data in blocks. Even if only a single cell in the block needs to be erased, the entire block is erased: the data in the block that is not being erased is first copied to some other storage, then the entire block is erased, and finally the saved data is copied back to the original block. Current SSDs, for the most part, have controllers that use techniques to avoid this dramatic sequence of events and keep performance as high as possible. But fundamentally, erasing always takes place at the block level.

Bcache is designed to work with buckets of data, as it refers to them, that are block sized. Even better, it fills them sequentially, so that random writes never really happen. Bcache will start using these buckets in sequence to store data. If the buckets don’t cache data any more, they are marked for deletion. Then a “lazy” garbage collection process will erase the marked buckets. Remember that since the buckets are block sized the erasing is very efficient (if a complete bucket is marked for erasing).
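To make the bucket idea a little more concrete, here is a toy model of buckets that are filled in order, marked stale, and reclaimed lazily. It is only a sketch of the concept described above, not bcache's actual code; the class, its methods, and the tiny sizes are all made up for illustration.

```python
class BucketCache:
    """Toy model: fixed-size buckets filled in order and reclaimed lazily."""

    def __init__(self, num_buckets, sectors_per_bucket):
        self.sectors_per_bucket = sectors_per_bucket
        self.buckets = [[] for _ in range(num_buckets)]  # sectors held by each bucket
        self.stale = set()                    # buckets marked for a later erase
        self.free = list(range(num_buckets))  # empty buckets ready to be filled
        self.current = self.free.pop(0)       # bucket currently being filled

    def write(self, sector):
        """Append a sector to the current bucket and return that bucket's index."""
        if len(self.buckets[self.current]) == self.sectors_per_bucket:
            if not self.free:
                self.garbage_collect()  # lazy: only runs when space is needed
            if not self.free:
                raise RuntimeError("cache full: no stale buckets to reclaim")
            self.current = self.free.pop(0)
        self.buckets[self.current].append(sector)
        return self.current

    def invalidate(self, bucket):
        """Mark a whole bucket as no longer caching useful data."""
        if bucket not in self.free:
            self.stale.add(bucket)

    def garbage_collect(self):
        """Erase each stale bucket as a unit and put it back on the free list."""
        reclaimed = [b for b in sorted(self.stale) if b != self.current]
        for bucket in reclaimed:
            self.buckets[bucket] = []
            self.free.append(bucket)
        self.stale.difference_update(reclaimed)


cache = BucketCache(num_buckets=2, sectors_per_bucket=2)
placements = [cache.write(sector) for sector in (10, 11, 12, 13)]
cache.invalidate(0)       # bucket 0 no longer holds current data
reused = cache.write(14)  # both buckets are full, so the stale one is reclaimed
print(placements, reused)
```

Notice that writes only ever append to the current bucket, and that erasing happens one whole bucket at a time, which is the property that lines up with how SSDs erase.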

Even though bcache is a “cache”, it has to keep track of data in much the same manner as a file system. Bcache uses a btree to track the cached data. The data structure is also designed to use the previously mentioned garbage collection to clean up stale pointers in the data structure, as well as freed buckets that no longer cache the latest data on the cached block device.

flashcache

There is another disk or block device caching patch in the wild. This patch, called flashcache, was developed by Facebook to help scale the performance of its InnoDB/MySQL databases. Flashcache is somewhat similar to bcache in that it is a write-back cache for block devices. It also assumes that the caching device is an SSD or a RAID array of SSDs, such as a RAID-0 array.

As with bcache, flashcache assumes that data is cached on block boundaries to line up with the block size of SSDs. It too refers to these as “buckets of data”. But flashcache is different from bcache in how it implements the caching.

flashcache is built using the Linux kernel Device Mapper (DM). Even if you’ve never heard of DM before, if you are using LVM, or software RAID set up through dmraid, then you are using DM. Basically, DM is a way of mapping one block device onto another: it takes data passed to a virtual block device that DM provides and passes it on to another block device. For example, if you use LVM, DM provides, among other things, the mapping between the VGs and PVs, and ultimately the block devices.
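If the idea of mapping one block device onto another sounds abstract, the following toy sketch may help. It models a virtual device as a table of segments, each pointing at a range of sectors on an underlying device, and resolves a virtual sector to its real location. This is only an illustration of the concept; the class, the table layout, and the device names are hypothetical and are not the DM interface.

```python
from bisect import bisect_right


class LinearMap:
    """Toy virtual block device: maps virtual sectors onto underlying devices."""

    def __init__(self, segments):
        # Each segment is (virtual_start, length, device_name, device_start).
        self.segments = sorted(segments)
        self.starts = [segment[0] for segment in self.segments]

    def resolve(self, sector):
        """Return (device_name, device_sector) for a sector of the virtual device."""
        index = bisect_right(self.starts, sector) - 1
        if index < 0:
            raise ValueError(f"sector {sector} is not mapped")
        start, length, device, device_start = self.segments[index]
        if sector >= start + length:
            raise ValueError(f"sector {sector} is not mapped")
        return device, device_start + (sector - start)


# Hypothetical layout: one virtual volume stitched together from two partitions.
volume = LinearMap([
    (0, 1000, "sda1", 2048),
    (1000, 500, "sdb1", 0),
])
print(volume.resolve(10))    # lands on the first partition
print(volume.resolve(1200))  # lands on the second partition
```

A caching target sits in the same position as this table: it sees every request to the virtual device and decides which underlying device should serve it.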

Then flashcache uses a “set associative hash” to cache data for the drives it is associated with. Without diving into details, the README associated with the patch says that this approach allows very simple invalidation of cache blocks to help improve performance. That is, it helps identify and flag buckets that are no longer needed and can be erased by garbage collection.
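For readers who have not run into set-associative structures before, here is a minimal sketch of the general idea: a disk block number hashes to one small set, and lookups, insertions, and invalidations only ever touch that set. The modulo hash, the FIFO replacement, and the tiny sizes are my assumptions for illustration; this is not flashcache's actual implementation.

```python
class SetAssociativeCache:
    """Toy set-associative index of cached disk block numbers."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        # Each set holds up to `ways` disk block numbers, oldest first.
        self.sets = [[] for _ in range(num_sets)]

    def _set_for(self, block):
        # Assumed hash for illustration: a plain modulo on the block number.
        return block % self.num_sets

    def lookup(self, block):
        """Return True if the block is cached; only one set is searched."""
        return block in self.sets[self._set_for(block)]

    def insert(self, block):
        """Cache a block; return the block evicted to make room, or None."""
        entries = self.sets[self._set_for(block)]
        if block in entries:
            return None
        evicted = entries.pop(0) if len(entries) == self.ways else None
        entries.append(block)
        return evicted

    def invalidate(self, block):
        """Drop a block from its set; return True if it was cached."""
        entries = self.sets[self._set_for(block)]
        if block in entries:
            entries.remove(block)
            return True
        return False


cache = SetAssociativeCache(num_sets=4, ways=2)
cache.insert(0)
cache.insert(4)            # blocks 0 and 4 share a set (both are 0 modulo 4)
evicted = cache.insert(8)  # the set is full, so its oldest entry is evicted
print(evicted, cache.lookup(4), cache.invalidate(4), cache.lookup(4))
```

Because a block can only live in one small set, invalidating it never requires scanning the whole cache, which is the simplicity the README is pointing at.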

Summary and Next Steps

This article is really an introduction to the concept of using SSDs for caching hard drives. In particular, it mentions two current patches that enable SSD caches for disks: bcache and flashcache. These are still patches that have not been incorporated into the kernel and are under development and testing. Given the popularity of SSDs and the mad desire for more storage performance (who doesn’t want more performance after all), and despite their limitations in terms of price and capacity, it makes sense for SSDs to be used as caches for spinning media. These two patches represent what appears to be a push for using SSDs to cache block devices (really it’s using block devices to cache block devices).

However, just a word of caution. If you expect to use these patches with an SSD and a disk and immediately get almost SSD-like performance, you may be disappointed. The effectiveness of caching hard disks using SSDs depends upon a number of factors, but the IO pattern of the application is the biggest driver. If the application is doing mostly random data access (read or write), then it is unlikely that these two patches will help. They are not really designed for that. The good news is that not many applications exhibit truly random IO behaviour. There is always some underlying pattern in the IO of the application. The question becomes, does this application have an IO pattern that can take advantage of the SSD caching?
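You can see the effect of the IO pattern with a small simulation. The sketch below runs two synthetic access patterns through a simple LRU cache and reports the hit rate for each. Everything here is an assumption made for illustration: the LRU policy, the sizes, and the two patterns are made up, and neither patch necessarily behaves this way.

```python
import random
from collections import OrderedDict


def lru_hit_rate(accesses, cache_blocks):
    """Fraction of accesses served from an LRU cache holding `cache_blocks` blocks."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for block in accesses:
        total += 1
        if block in cache:
            hits += 1
            cache.move_to_end(block)       # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / total if total else 0.0


# Assumed sizes: the cache holds 1% of the blocks on the "disk".
DISK_BLOCKS = 100_000
CACHE_BLOCKS = 1_000
HOT_BLOCKS = 500
ACCESSES = 50_000

rng = random.Random(42)  # fixed seed so the run is repeatable

# Pattern 1: every access goes to a uniformly random block.
uniform = [rng.randrange(DISK_BLOCKS) for _ in range(ACCESSES)]

# Pattern 2: 90% of accesses go to a small hot set, the rest are random.
skewed = [
    rng.randrange(HOT_BLOCKS) if rng.random() < 0.9 else rng.randrange(DISK_BLOCKS)
    for _ in range(ACCESSES)
]

uniform_rate = lru_hit_rate(uniform, CACHE_BLOCKS)
skewed_rate = lru_hit_rate(skewed, CACHE_BLOCKS)
print(f"uniform random: {uniform_rate:.1%} hits, hot set: {skewed_rate:.1%} hits")
```

With a cache this small relative to the disk, the uniformly random pattern almost never hits, while the pattern with a hot set hits most of the time. Real applications usually fall somewhere in between.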

Also remember that these are patches under heavy development and testing, so your performance may vary and there is even the possibility of data loss. For example, the current version of flashcache, as described in the README, has a “torn page problem”.

“It is important to note that in the first cut, cache writes are non-atomic, ie, the “Torn Page Problem” exists. In the event of a power failure or a failed write, part of the block could be written, resulting in a partial write. We have ideas on how to fix this and provide atomic cache writes (see the Futures section).”

This isn’t to say that flashcache is irrevocably broken, just that it is in development and has known problems.

But one of the coolest things is that if you try these patches, you have the opportunity to influence a patch that goes into the kernel. By running applications, benchmarks, etc., using these patches, and providing good constructive feedback to the developers, you can help them tweak/change/alter the patch for better performance for applications that you care about. That is not to be underestimated by any stretch.

So in future articles I will be testing both bcache and flashcache using the benchmarks that I have been using throughout my articles. I will be using IOzone for testing both throughput and IOPS, and I will be using metarates for testing metadata performance. Plus, don’t forget our good testing skills.

Keep an eye out for results in future articles. I think the results will be interesting and fun.
