Like most production web sites these days, we use memcached to cache all sorts of data to increase responsiveness. It’s great, partly because it’s really simple. One of its simplicities, though, is that the available client libraries use a very simple hashing mechanism to assign a resource to a server. If you have N memcached servers, the server id is just (hash(key) mod N).
This isn’t great if your servers come and go, because only 1/N resources get mapped to the same server when adding or removing a server. Also, if some servers have much larger capacity than the rest, either the big servers won’t fill up, or the small servers will have a lower hit rate as they flush more frequently. Consistent Hashing is a technique for assigning resources to servers much more flexibly. When adding a new server, only 1/N resources change their server, and different servers can be assigned varying fractions of the resources.
Now we just need a Python memcached implementation...
Comments
Ned, check out
svn://svn.audioscrobbler.net/misc/ketama/
John, thanks, but that's a PHP library. Good to know people are getting close though...
In the svn repos there is a python wrapper library for a c based consistent hashing library. Its not pure python, but it looks to be a simple compile away. ;-)
You may want to check out: http://amix.dk/blog/viewEntry/19367 - Pure Python implementation of consistent hashing.
Add a comment: