How does memcached handle data consistency in a clustered environment?

Karim Tawfik

Oct 17, 2014, 2:23:45 PM
to memc...@googlegroups.com
Hi,

I am new to memcached and am trying to introduce it at our company as a caching layer, but a question came to mind: how does memcached keep the data consistent across all clusters?

For example:
Say I have 2 clusters, each with memcached installed, and clients start sending requests (e.g. updating some data). How would the other memcached server know about such an update if it is already caching an old version from before the update?

I am asking this question because I read 2 contradicting statements on the website:
  1. Under ==> https://code.google.com/p/memcached/wiki/NewOverview, in the section:
    • Servers are Disconnected From Each Other : Memcached servers are generally unaware of each other. There is no crosstalk, no synchronization, no broadcasting.

  2. Under ==> https://code.google.com/p/memcached/wiki/TutorialCachingStory
    • So again, he takes keys that the Programmer uses and looks for them on his memcached servers. 'get this_key' 'get that_key' But each time he does this, he only finds each key on one memcached! Now WHY would you do this, he thinks? And he puzzles all night. That's silly! Don't you want the keys to be on all memcacheds?

      "But wait", he thinks "I gave each memcached 1 gigabyte of memory, and that means, in total, I can cache three gigabytes of my database, instead of just ONE! Oh man, this is great," he thinks. "This'll save me a ton of cash. Brad Fitzpatrick, I love your ass!"

Could you please point me in the right direction if my understanding is incorrect?

One last thing: is memcached affected by any means of replication between servers?

Thanks,
Karim

Les Mikesell

Oct 17, 2014, 2:50:56 PM
to memc...@googlegroups.com
What is it that you think is contradicting here? The client is
configured for a set of servers to use, computes a specific one of
them from a hash of the key, and writes an item to exactly one server.
When any client with the same configuration looks up that same key it
will do the same computation and thus target the same server. Other
keys may go to other servers.
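
A minimal sketch of that computation, for illustration only. The server names and the choice of MD5 with a modulo are assumptions here, not the algorithm of any particular client library:

```python
import hashlib
from typing import Sequence

# Hypothetical server list; every client must be configured with the same one.
SERVERS = ["cache1:11211", "cache2:11211", "cache3:11211"]

def pick_server(key: str, servers: Sequence[str]) -> str:
    """Map a key to exactly one server using a hash of the key."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(servers)
    return servers[index]

# Two independent "clients" with the same configuration pick the same server,
# without the servers ever talking to each other.
client_a = pick_server("user:42", SERVERS)
client_b = pick_server("user:42", list(SERVERS))
print(client_a == client_b)
```

The servers never see this logic; it lives entirely in the client, which is why the server list (and its order) has to match on every client.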

> One last thing: is memcached affected by any means of replication
> between servers?

No, there is only one copy. If that server instance is down, the
client must get the data from the backing persistent storage - and
depending on the client's hashing strategy it can either continue to
fail for whatever percentage of the cache that server handles until it
comes back up, or it can rebalance the storage over the remaining
servers.
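
The two strategies can be compared with a rough sketch. The server names, the key set, and the plain MD5-modulo hashing are all assumptions for illustration; real clients differ in how they hash and rebalance:

```python
import hashlib

def bucket(key: str, n: int) -> int:
    """Hash a key into one of n buckets (illustrative only)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % n

servers = ["cache1", "cache2", "cache3"]  # hypothetical names
down = "cache2"
keys = [f"item:{i}" for i in range(3000)]

original = {k: servers[bucket(k, len(servers))] for k in keys}

# Strategy 1: keep the original mapping. Keys owned by the dead server
# miss until it returns; everything else is unaffected.
failing = [k for k in keys if original[k] == down]

# Strategy 2: rebalance over the surviving servers.
alive = [s for s in servers if s != down]
rebalanced = {k: alive[bucket(k, len(alive))] for k in keys}

# With plain modulo hashing, rebalancing also moves keys whose original
# server is still healthy, so those become one-time misses as well.
moved_from_healthy = [
    k for k in keys if original[k] != down and rebalanced[k] != original[k]
]

print(f"{len(failing) / len(keys):.0%} of keys miss while {down} is down")
print(f"{len(moved_from_healthy)} keys on healthy servers moved after rebalancing")
```

The second number is the reason many clients offer consistent hashing, which aims to move only the keys that belonged to the failed server.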

--
Les Mikesell
lesmi...@gmail.com

Karim Tawfik

Oct 17, 2014, 7:37:57 PM
to memc...@googlegroups.com
Thanks a lot for your quick reply.

What I am confused about, and think is contradicting, is the first point, which says that memcached servers are separate and don't know anything about each other. That means that if data is replicated on 2 servers, and each has its own memcached, nothing would keep the data consistent between them. Am I right?

If yes, is this good practice, or should I have one and only one memcached for all my clusters?

Thanks,
Karim

Les Mikesell

Oct 17, 2014, 8:14:26 PM
to memc...@googlegroups.com
On Fri, Oct 17, 2014 at 6:37 PM, Karim Tawfik <karim.t...@gmail.com> wrote:
> Thanks a lot for your quick reply.
>
> What I am confused about, and think is contradicting, is the first point,
> which says that memcached servers are separate and don't know anything
> about each other. That means that if data is replicated on 2 servers, and
> each has its own memcached, nothing would keep the data consistent
> between them. Am I right?

No, I think you are still missing the concept. The servers don't
know/care about each other. It is the clients that know about the
number of servers and split the keys across them. So nothing is
replicated. One key/value goes to one server only, and the hashing
math makes all the clients pick the same one.

> If yes, is this good practice, or should I have one and only one
> memcached for all my clusters?

First, remember that it is just a cache, so the client needs to be
prepared to get the data from a persistent store if it isn't in the
cache. Then think about the percentage of misses that the backend
storage can handle, timing wise. The more memcache servers you have,
the smaller the percentage of misses you'll have if one goes offline.
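
As a rough sketch of both points: `FakeCache` below is an in-memory stand-in rather than a real memcached client, `load_from_db` is a hypothetical loader, and the miss estimate assumes keys are spread evenly across servers:

```python
from typing import Callable, Dict, List, Optional

class FakeCache:
    """In-memory stand-in for a memcached client (not a real client API)."""
    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value

def get_with_fallback(cache: FakeCache, key: str,
                      load: Callable[[str], str]) -> str:
    """Try the cache first; on a miss, read the persistent store and refill."""
    value = cache.get(key)
    if value is None:          # evicted, expired, or its server was down
        value = load(key)      # the persistent store stays the source of truth
        cache.set(key, value)
    return value

def extra_miss_fraction(num_servers: int, servers_down: int = 1) -> float:
    """Approximate share of lookups that miss while some servers are offline."""
    return servers_down / num_servers

db_calls: List[str] = []

def load_from_db(key: str) -> str:
    """Hypothetical backend lookup that records each call."""
    db_calls.append(key)
    return f"row-for-{key}"

cache = FakeCache()
first = get_with_fallback(cache, "user:42", load_from_db)   # miss, hits the backend
second = get_with_fallback(cache, "user:42", load_from_db)  # served from the cache

for n in (2, 4, 10):
    print(n, "servers ->", f"{extra_miss_fraction(n):.0%} extra misses if one is down")
```

Whether the backend can absorb that extra share of reads is the number to check against its capacity.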

--
Les Mikesell
lesmi...@gmail.com

Karim Tawfik

Oct 17, 2014, 9:06:37 PM
to memc...@googlegroups.com
Got this point. The other point is how often I should update the cache. In other words, if the client has updated data that I've already added to the cache, what are the best practices for keeping the cache up to date with the latest changes?

Thank you for your patience and clarification.

Les Mikesell

Oct 17, 2014, 10:03:48 PM
to memc...@googlegroups.com
On Fri, Oct 17, 2014 at 8:06 PM, Karim Tawfik <karim.t...@gmail.com> wrote:
> Got this point. The other point is how often I should update the cache.
> In other words, if the client has updated data that I've already added
> to the cache, what are the best practices for keeping the cache up to
> date with the latest changes?

That depends very much on the data itself. If it doesn't change often
and/or doesn't hurt much to use old values you can set a long expire
time. For other things you might use a short expire or push to the
cache as you have the new values. The main value comes when you
store items that can be reused many times without bothering the
backend database.
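
The three options (long expire, short expire, push on update) might look like the sketch below. `FakeTTLCache` is an in-memory stand-in with an injectable clock, not a real memcached client, and the keys, values, and expiry times are made up for illustration:

```python
import time
from typing import Callable, Dict, Optional, Tuple

class FakeTTLCache:
    """In-memory stand-in for a cache with per-item expiry (not a real client API)."""
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}

    def set(self, key: str, value: str, expire_seconds: float) -> None:
        self._data[key] = (value, self._clock() + expire_seconds)

    def get(self, key: str) -> Optional[str]:
        entry = self._data.get(key)
        if entry is None:
            return None
        value, deadline = entry
        if self._clock() >= deadline:   # item has expired
            del self._data[key]
            return None
        return value

now = [0.0]                              # fake clock so the example runs instantly
cache = FakeTTLCache(clock=lambda: now[0])

# Rarely changing data where stale values do little harm: long expire.
cache.set("site:footer", "v1", expire_seconds=3600)
# Fast-changing data: short expire, so stale values age out quickly.
cache.set("stock:ACME", "10.00", expire_seconds=5)

now[0] = 10.0                            # ten seconds later
footer_after_10s = cache.get("site:footer")   # still cached
stock_after_10s = cache.get("stock:ACME")     # expired, caller must reload

def update_price(db: Dict[str, str], cache: FakeTTLCache,
                 symbol: str, price: str) -> None:
    """Write to the backing store, then push the new value to the cache."""
    db[symbol] = price
    cache.set(f"stock:{symbol}", price, expire_seconds=5)

db: Dict[str, str] = {}
update_price(db, cache, "ACME", "10.25")
stock_after_push = cache.get("stock:ACME")    # fresh value, no backend read needed
```

Pushing on update keeps readers from seeing old values at the cost of extra writes; a plain expire is simpler but tolerates staleness for up to the expiry window.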

--
Les Mikesell
lesmi...@gmail.com

Denis Samoylov

Oct 18, 2014, 1:49:37 AM
to memc...@googlegroups.com
memcached is not a clustered solution, so each server is completely independent. If you need to keep consistency, you need to do it on the client side. A great example is Facebook's mcrouter: https://code.facebook.com/posts/296442737213493/introducing-mcrouter-a-memcached-protocol-router-for-scaling-memcached-deployments/
It allows you to set up a shadow copy.

Also, keep in mind that if you do not use "leases", memcached is pretty much an "eventual" consistency solution.

Another option is to use Redis instead of memcached; it has replication (and some other advanced features). But replication brings tons of other problems (especially in the way it is implemented in Redis).

Roberto Spadim

Oct 18, 2014, 2:21:11 AM
to memc...@googlegroups.com
Don't forget the MySQL InnoDB/NDB memcached interface: it speaks the memcached protocol with a MySQL database as storage.

--
Roberto Spadim

Karim Tawfik

Oct 18, 2014, 6:43:08 AM
to memc...@googlegroups.com
Great article, Denis, I learned a lot from it. So the final answer is that memcached can't handle consistency between clusters, as you said.

But that only applies if I use multiple memcached servers; with only one memcached server, I would not need to care about consistency, right?

Les Mikesell

Oct 18, 2014, 11:38:11 AM
to memc...@googlegroups.com
On Sat, Oct 18, 2014 at 5:43 AM, Karim Tawfik <karim.t...@gmail.com> wrote:
> Great article, Denis, I learned a lot from it. So the final answer is that
> memcached can't handle consistency between clusters, as you said.
>
> But that only applies if I use multiple memcached servers; with only one
> memcached server, I would not need to care about consistency, right?
>

What do you mean by 'between clusters'? In a typical setup where
you have multiple servers there is still only one copy of any value
stored, so as far as single key/value pairs go it has to be consistent.

--
Les Mikesell
lesmi...@gmail.com