Though, to quibble with the article's title: while a straightforward addition of more memcached nodes did not solve their capacity problem, the strategy of replicating nodes and load-balancing read requests between them does appear to solve this particular capacity problem; so more machines = more capacity, so long as they are organized appropriately.
Being a little smarter on the client side can help a lot here.
If you store and retrieve an object only on server # (object.id % numservers), you don't face this issue. (Obviously you could use a better hash than %.)
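A minimal sketch of that client-side sharding, with a made-up server list and a plain modulo for illustration (not anything from the article):

    servers = ["cache1:11211", "cache2:11211", "cache3:11211"]

    def server_for(object_id, servers):
        # Every client computes the same id-to-server mapping, so reads and
        # writes for a given object always hit the same node.
        return servers[object_id % len(servers)]

    print(server_for(42, servers))  # "cache1:11211", since 42 % 3 == 0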
This would help to distribute requests for individual objects, but doesn't help for multi-get requests. The problem is that adding more nodes increases the number of actual requests a multi-get call needs to make (assuming you're asking for enough objects that they're likely to be distributed across all nodes). This decreases the number of keys requested per node but increases the total number of requests to the cluster. Because the bound was on request throughput (CPU-bound), minimizing the number of keys per request to a node doesn't help.
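A rough sketch of why: a sharding client groups the requested keys by the server they hash to and issues one request per server touched, so each multi-get costs as many requests as there are servers, even as each request carries fewer keys. (The modulo scheme here is the same toy example as above.)

    from collections import defaultdict

    def multiget_fanout(object_ids, num_servers):
        # Group requested ids by the server they hash to; the client then
        # issues one network request per server in this mapping.
        by_server = defaultdict(list)
        for oid in object_ids:
            by_server[oid % num_servers].append(oid)
        return by_server

    friend_ids = list(range(100))          # one multi-get for 100 friends
    for n in (2, 3, 10):
        fanout = multiget_fanout(friend_ids, n)
        # With enough keys every server gets touched: the multi-get now costs
        # n requests, each carrying roughly 100 / n keys.
        print(n, len(fanout), [len(keys) for keys in fanout.values()])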
The proposed solution is to instead replicate nodes and load balance read requests. With two replicas this doubles your read capacity, though you must write twice (or N times, depending on your replication level).
It's worth noting that this solution only works if you have far fewer writes than reads, but that's probably true for most people. A really simple way of doing this that all memcached clients should be able to handle is to set up two separate memcached clusters, write to both, and randomly read from either of them. That way you don't replicate nodes, you replicate the entire cluster. :-)
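A minimal sketch of that two-cluster approach, assuming two ordinary memcached client objects; the method names (set, get_multi) stand in for whatever your client library actually exposes:

    import random

    class MirroredCache(object):
        """Write to both clusters, read from a randomly chosen one."""

        def __init__(self, cluster_a, cluster_b):
            self.clusters = [cluster_a, cluster_b]

        def set(self, key, value):
            # Every write is duplicated, so write cost doubles...
            for cluster in self.clusters:
                cluster.set(key, value)

        def get_multi(self, keys):
            # ...but each read lands on only one mirror, so read capacity doubles.
            return random.choice(self.clusters).get_multi(keys)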
A better solution would be to consolidate your items, i.e. make sure that items that often are fetched together with multi-get end up on the same server, however this requires a lot of extra knowledge about your data, and it's not very likely that this knowledge is available.
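If that knowledge were available, a hedged sketch of what consolidation could look like is to shard on some grouping attribute (a hypothetical group_id, e.g. the user whose friends these are) instead of on the item's own id:

    import zlib

    def server_for_group(group_id, servers):
        # Items sharing a group_id land on one server, so a multi-get for that
        # group becomes a single request to a single node. crc32 is just a
        # stable hash for illustration; any consistent hash would do.
        return servers[zlib.crc32(str(group_id).encode()) % len(servers)]

The trade-off is that a hot group concentrates its entire load on one node.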
> It's worth noting that this solution only works if you have far fewer writes than reads, but that's probably true for most people.
Is it true of tweets? How about e-mail? What Facebook content works that way? (IIRC, in the discussion of their new image store, one of the motivations was that most pictures were never viewed.)
Well, this is indeed the underlying logic. What the author describes is seeing 50 requests for 100 friends each, where these friends might be divided across 3 servers. The keys are hashed to servers, i.e., if there is only 1 multi-get request, each server sees only 1 request. However, if there are 50 multi-get requests, each server sees 50 requests irrespective of the number of keys (friends) it caters to.
This is the part of the article I really didn't understand.
The way multiget works is that you send a list of keys to a server, and get back a list of items corresponding to those keys. So if you have 100 items on two servers, you would send 50 keys to Server #1 and 50 keys to Server #2 (assuming they're evenly distributed) and get back 50 items from each.
And then the article suddenly talks about adding another server and having 33 items on each, and for some reason sending 50 requests to the two original servers. This makes no sense; of course 50 requests consume more resources than the one request in the previous example?
However, if we back up a bit, add another server and redistribute the items, then there will be 33 on each server. If we do a multi-get for these items, we will make three requests; one for each server, put 33 keys in each request, and get back 33 items in each response.
The point I think he should have made is that in this case you've not reduced the load on each server, only the amount of data sent in each request. Since your two original servers were CPU-bound, you haven't gained any performance; they're still choked, still seeing the same number of requests.
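Putting the thread's numbers on that (the 50 multi-gets of 100 keys are the figures quoted above, not measurements):

    multigets = 50    # concurrent multi-get requests
    keys_each = 100   # friends fetched per multi-get

    for num_servers in (2, 3):
        requests_per_server = multigets              # each multi-get touches every server
        keys_per_request = keys_each // num_servers  # only the payload shrinks
        print(num_servers, requests_per_server, keys_per_request)
    # 2 servers: 50 requests per server, ~50 keys per request
    # 3 servers: still 50 requests per server, ~33 keys per request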
Then again, you need a crazy number of requests, and a crazy amount of bandwidth, before you'll start seeing memcached servers become CPU-bound, so for the vast majority of us this problem is not something we'll ever see in real life. :-)
No no, I think he speaks of 50 requests from the get-go, i.e., 50 requests for 100 friends each, irrespective of the number of servers. The figure of 50 requests is unrelated to the number of friends; it's just some large number of requests. He claims that no matter how you divide the data, if a certain number of requests creates a CPU bottleneck, it will continue to do so no matter how much memory you add.