Consistent Hashing is a useful technique for horizontal scaling while also protecting yourself against future growth as well as server failures.
A while ago I read about consistent hashing as way of helping to mitigate some of the problems associated with horizontal scaling of services, especially when you need to grow or shrink clusters. But at the time I didn’t really have a reason to use consistent hashing, so I didn’t dig too deeply into how it works or how I might use it. Recently I turned to consistent hashing to the deployment of a new service based on Redis (see Redis: Lightweight key/value Store that Goes the Extra Mile) and thought it would be worth writing about consistent hashing in a bit more detail.
Before we jump into consistent hashing, let’s first consider the problem it tries to solve. When you want to scale beyond the capabilities of a single server service, be it MySQL, an in-memory cache, or anything else, you need a mechanism for randomly but evenly spreading data across all the servers (or clusters). You’d typically accomplish this by running the “key” (the thing that identifies the piece of data) through a hash function and using that value to determine which server owns the data.
For example, you want to fetch object 4355623 from one of your 10 servers. So you run that value through a hash function which maps it to server #8. A good hash function will have consistently fast performance and a fairly even distribution of hashed keys–even similar ones. In other words, keys like
firstname.lastname@example.org, and so on are not going to all end up on the same server.
The implementation of such a hash function is often pretty straightforward. Take the given key, convert it to a string, run that through a well-known hash function, and map the resulting number into one of the available slots. In this case there are 10 slots. Consider the following Perl code which is a simple script you can run from the command-line to try out hashing various values and various numbers of servers:
use Digest::MD5 qw(md5);
my $key = shift;
my $num_servers = shift;
my $val = unpack "N", md5($key);
my $num = $val % $num_servers;
print "$key maps to server $num\n";
Given a key as the first argument and the number of servers as the second, it will tell you which server the key maps to. So let’s do a few trial runs to see where those two email addresses map:
$ ./test_hash.pl email@example.com 10
firstname.lastname@example.org maps to server 6
$: ./test_hash.pl email@example.com 10
firstname.lastname@example.org maps to server 9
Excellent. Those two addresses land on different servers. And if you run enough addresses through that script, you’d find a pretty even distribution. That means you get roughly 10x capacity for your system by going from a single server (or single cluster) to 10. That’s what we call linear scaling, and it beats the pants off a lot of other solutions.
You could run quite well for a long time with a solution like this, happily making use of all the excess capacity to soak up growth. And, even better, if a single server goes offline for a bit, only 10% of your data is affected. But sooner or later you may need to add more servers to the service. Maybe growth has continued and you want to increase the number of servers from 10 to 15. Or maybe you’re buying new (bigger) hardware and can comfortable go from 10 servers to 5.
Unfortunately, that completely changes which server nearly every key lands on.
$ ./test_hash.pl email@example.com 5
firstname.lastname@example.org maps to server 1
$ ./test_hash.pl email@example.com 5
firstname.lastname@example.org maps to server 4
$ ./test_hash.pl email@example.com 15
firstname.lastname@example.org maps to server 1
$ ./test_hash.pl email@example.com 15
firstname.lastname@example.org maps to server 9
Well, things definitely shift around. Sure some keys will land on the same server when there are 15 as when there were 10, but that’s not the common case at all. If you had millions of keys, you’d find the bulk of them shifting around after changing the number of servers!
In other words, you end up re-hashing all the keys. That’s incredibly painful.
If you’re not an algorithms geek, reading the Wikipedia summary of consistent hashing probably leaves you with a lot of questions:
Consistent hashing is a scheme that provides hash table functionality in a way that the addition or removal of one slot does not significantly change the mapping of keys to slots. In contrast, in most traditional hash tables, a change in the number of array slots causes nearly all keys to be remapped. By using consistent hashing, only K/n keys need to be remapped on average, where K is the number of keys, and n is the number of slots.
So let’s simplify that.
The basic idea is to avoid re-hashing most keys to a new server when a server is removed or added. We do this by trying to make hash keys consistently map to the same servers in most (but obviously not all) cases. The code required to accomplish this is a bit longer than is worth pasting here, but the theory behind it is fairly easy to describe. Instead of having a one-to-one relationship between hash “slots” and servers, in consistent hashing, each server is given a larger number of slots-say 1,000 of them. So if you want to add “serverA” to the list, behind the scenes, you end up with “serverA-0001″ … “serverA-1000″ on the list. Each of those names can be mapped to a 64bit number (using a hash function, of course) and stored in a sorted list (or similar data structure). The values will be semi-randomly and semi-evenly spread out in the 64bit number space.
That means instead of 10 servers and 10 slots in a hash, we actually have 10 servers mapping to total of 10,000 slots. To map a key to a server, you run that key through the hash function, just as before, but this time time you don’t simply “mod $num_servers” to find the server. Instead, you take the resulting 64bit value and consult the sorted list. The server immediately after that number in the list is the one that is responsible for the data. So if your 64bit value was closest to the hashed value for “serverD-0532″, the server-D is your answer.
Now, here’s where the real magic happens. If you add a server to the list, that adds another 1,000 slots to the list. But instead of having to rehash all the keys, you end up with only a percentage of them moving to the new server. Similarly, if you need to remove a server from the list, you remove its 1,000 points and only a fraction of the keys and up having to map to other servers (those that lived on the removed server). Understanding Consistent Hashing provides a more visual version of this technique.
Consistent Hashing has been around as a useful technique for over a dozen years but is still underutilized in many architectures. If you’re interested in experimenting with consistent hashing in Perl, Brad Fitzpatrick’s Set::ConsistentHash module is available on the CPAN.
Have you used consistent hashing or similar techniques to scale?