Memcache in multiple servers


Margalit Silver

Feb 3, 2011, 5:32:25 AM
to memc...@googlegroups.com
Our system has 4 live servers on a load balancer in an Amazon Cloud.

We are using memcached but don't understand it very well; it was implemented by previous programmers.

Any user who visits our site is routed to one of the 4 live servers, and all four are supposed to be in sync with the same data.  We have a bug where, after a write, the other servers are not updated for several hours, so the data is inconsistent between them.  All 4 use the same database.

We see in our code that the memcache entry is deleted when a write is done, but it seems this happens only on that one server; we want that piece of data cleared on all servers.

Any ideas how we can fix this?

Thank you for your help.

Roberto Spadim

Feb 3, 2011, 10:59:06 AM
to memc...@googlegroups.com
Try repcached:
http://repcached.sourceforge.net/

It does server-based replication (not client-based).

2011/2/3 Margalit Silver <margali...@gmail.com>:

--
Roberto Spadim
Spadim Technology / SPAEmpresarial

Dustin

Feb 3, 2011, 12:10:00 PM
to memcached

On Feb 3, 2:32 am, Margalit Silver <margalit.sil...@gmail.com> wrote:
> Our system has 4 live servers on a load balancer in an Amazon Cloud.

> We see in our code that the memcache entry is deleted when a write is
> done, but it seems this happens only on that one server; we want that
> piece of data cleared on all servers.

If you are using memcached effectively, a given piece of data will
only exist on one server. That's how you achieve massive scale in a
caching layer.

You can get a quick overview here: http://memcached.org/about
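To make the point concrete, here is a minimal standalone sketch (not code from the thread; the server list and key name are made up) of how a client deterministically maps each key to a single server. Real clients usually use consistent hashing (ketama) rather than plain modulo, but the principle is the same: every web server that hashes the same key reaches the same memcached node, so a delete issued from any one of them removes the only copy.

```php
<?php
// Hypothetical illustration: one server per key.
$servers = ['10.0.0.1:11211', '10.0.0.2:11211', '10.0.0.3:11211', '10.0.0.4:11211'];

function serverForKey($key, array $servers) {
    // abs() guards against a negative crc32 result on 32-bit PHP builds.
    $index = abs(crc32($key)) % count($servers);
    return $servers[$index];
}

// Any client with the same server list gets the same answer for this key.
echo serverForKey('activity:42', $servers), "\n";
```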

Roberto Spadim

Feb 3, 2011, 12:30:15 PM
to memc...@googlegroups.com
>> We see in our code that the memcache entry is deleted when a write is
>> done, but it seems this happens only on that one server; we want that
>> piece of data cleared on all servers.

I think they are doing client-side replication; maybe your client-side code is wrong.

One solution: fix your client-side replication code.
Another: use repcached and just do client-side load balancing.


2011/2/3 Dustin <dsal...@gmail.com>:

--

Jason Sirota

Feb 3, 2011, 3:02:32 PM
to memc...@googlegroups.com
Margalit,

Can you give us some more information about how your architecture is set up? You say you have 4 live servers in the Amazon cloud.

What client are you using to access memcached?
Can you share your memcache configuration section?
Can you share a snippet of code that accesses memcached?

As Dustin says, data is not supposed to exist on more than one server, so something else may be going on.

Jason

Margalit Silver

Feb 7, 2011, 2:14:27 AM
to memc...@googlegroups.com
Thank you for your responses.

In the last several days we have gotten a much better understanding of memcache in general and how we use it.  

A little more background to help you understand the current state:

We have 4 live servers that are all supposed to be identical.  Our code is in PHP.  We have high user traffic, and any user logging onto our site at any given moment could be sent to any one of the 4 servers; it shouldn't matter which one, since they should all look the same.  Based on my understanding of the benefits of memcache as a distributed caching system, we are not using it that way.  The current implementation sets up memcache like this:

class MyMemcache extends Memcache {
    public function MyMemcache($environment = null) {
        if (empty($environment)) {
            $environment = ENVIRONMENT_CODE;
        }
        switch ($environment) {
            case ENV_DEV:
                $this->addServer('localhost', 11211);
                break;
            case ENV_PROD:
                // base
                $this->addServer('localhost', 11211);
                break;
            default:
                throw new Exception("Unknown environment $environment");
        }
    }
}

I saw in previous code that more servers were added in the ENV_PROD case, but they were removed at some point.  So we are effectively using memcache as a local cache on each machine.  This is how we use memcache:

    public function getCacheableActivityByID($id) {
        $key = ActivityManager::getActivityCacheKey($id);
        $cached = $this->memcache->get($key);
        if (!empty($cached)) {
            return $this->refreshDynamicData($cached);
        }
        $activity = $this->getNonDeletedActivityByID($id);  //fetches from DB
        $this->memcache->set($key, $activity, 0, ACTIVITY_EXPIRY_SECS);

        return $activity;
    }

    public function invalidateCachedActivity($activityID) {
        $key = ActivityManager::getActivityCacheKey($activityID);
        $this->memcache->delete($key);
    }

We set the key to expire after 6 hours, so currently the maximum time a key can hold stale data is 6 hours, but we would like it invalidated as soon as there is a DB write.  We know this inconsistency problem would be solved if we used memcache as it is supposed to be used, adding all the servers with the addServer function.  However, we are hesitant to do so because of the time lag that would be caused by a client having to get data from another server; the whole reason we run 4 identical servers is quick response for many client machines.
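For concreteness, the pooled ENV_PROD setup we are weighing would look roughly like this (a sketch only; the internal IPs are placeholders, not our real addresses):

```php
case ENV_PROD:
    // Hypothetical pooled configuration: add every memcached
    // instance, and the client hashes each key to exactly one of them.
    $this->addServer('10.0.0.1', 11211);
    $this->addServer('10.0.0.2', 11211);
    $this->addServer('10.0.0.3', 11211);
    $this->addServer('10.0.0.4', 11211);
    break;
```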

Based on all of this, we are leaning towards notifying all servers of an update.  To avoid impacting response time and bogging the servers down in notifications, the best solution might be a master server that tells the other servers to invalidate the cached item on a DB write.

Please let me know if we went wrong in our understanding somewhere.  Any tips or thoughts are greatly appreciated.

Thanks.

Dustin

Feb 7, 2011, 2:42:24 AM
to memcached

On Feb 6, 11:14 pm, Margalit Silver <margalitatw...@gmail.com> wrote:

> We set the key to expire after 6 hours, so currently the maximum time a
> key can hold stale data is 6 hours, but we would like it invalidated as
> soon as there is a DB write.  We know this inconsistency problem would
> be solved if we used memcache as it is supposed to be used, adding all
> the servers with the addServer function.  However, we are hesitant to
> do so because of the time lag that would be caused by a client having
> to get data from another server; the whole reason we run 4 identical
> servers is quick response for many client machines.
>
> Based on all of this, we are leaning towards notifying all servers of
> an update.  To avoid impacting response time and bogging the servers
> down in notifications, the best solution might be a master server that
> tells the other servers to invalidate the cached item on a DB write.

  It sounds like fear of a potential bottleneck you'd have by using
memcached the normal way is leading you down the path of something that
won't scale well.

When you have 40 (10x) servers, how much of the time spent on any
one of the frontends will be running work from your distributed cache
invalidation tool? What will be the effect of having your cache size
not grow by 10x when your traffic and servers do? What's the cost of
developing this distributed cache invalidation tool (assuming you'd be
building on something like spread, that's still just a starting
point)? How does that cost compare to just doing the simple thing and
seeing how well it works for you?

Margalit Silver

Feb 7, 2011, 4:06:56 AM
to memc...@googlegroups.com
1) I'm guessing your comment about "fear of a potential bottleneck" was referring to my comment "not to have these servers bogged down in notifications".  That isn't really our main concern.  The main reason we are hesitant to go to a full and proper memcache implementation is "the time lag that would be caused by a client having to get data from another server".  We only want to send a notification on a DB write, which is (relatively) very infrequent; the vast majority of activity is DB reads.

2) Can you please tell me what is "spread"?

3) What are the implications of adding a server with addServer for a brief moment and then removing it, assuming there is a way to do that?  Does it invalidate the entire cache?

Thanks again for your help.

Henrik Schröder

Feb 7, 2011, 6:39:23 AM
to memc...@googlegroups.com
Seriously, this is not a problem you should be concerned about, and you should not spend time creating some weird over-engineered solution that solves a problem you didn't have in the first place. Use memcached the way it's supposed to be used. Try it out. Measure response times. I'm pretty sure you will find that the network I/O is insignificant.

Changing the server config during runtime may or may not invalidate the entire cache depending on which server selection algorithm you are using. Your client probably uses libketama, which means that if you have n servers, you invalidate 1/n of the cache by adding or removing a server.
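To illustrate why the selection algorithm matters, here is a standalone simulation (not code from the thread; the key names are made up). With naive modulo hashing, shrinking a 4-server pool to 3 remaps roughly three quarters of all keys, whereas a ketama-style consistent hash would remap only about 1 in 4:

```php
<?php
// Simulate how many keys change servers when a 4-server pool
// shrinks to 3 under plain modulo hashing.
function moduloServer($key, $n) {
    // abs() guards against a negative crc32 result on 32-bit PHP builds.
    return abs(crc32($key)) % $n;
}

$total = 10000;
$moved = 0;
for ($i = 0; $i < $total; $i++) {
    $key = "activity:$i";
    if (moduloServer($key, 4) !== moduloServer($key, 3)) {
        $moved++;
    }
}
// Roughly 75% of keys land on a different server under plain modulo.
printf("%.0f%% of keys moved\n", 100 * $moved / $total);
```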


/Henrik Schröder

Margalit Silver

Feb 7, 2011, 8:50:21 AM
to memc...@googlegroups.com
A few more questions we have:
1) How does memcached deal with a change in IP of a server?  Does that invalidate the whole cache in the same way as if I added or removed a server?

2) We are also concerned about the number of open Apache connections on a machine at a given time.  Currently we have about 25 regularly (and are planning for significant growth).  Once, when too many error logs were being written, the number jumped to over 100 per machine and the servers froze.  We don't want this implementation to cause a huge increase in open connections from key lookups and create problems that way.

Please give as many details as possible in response to this to help us choose the right solution.

Thanks for all the help.  

Roberto Spadim

Feb 7, 2011, 10:11:33 AM
to memc...@googlegroups.com
I think you want RAID-1 of memcached?!
For writes, make sure you write to all memcached instances; for reads,
select any one (that's online).
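A hypothetical sketch of that "RAID-1" idea: write every value to all caches, and read from the first cache that has it. Plain arrays stand in for real Memcache connections here so the logic is visible without a live server; the function names are made up for the example.

```php
<?php
// Client-side replication sketch: every cache gets every write.
function replicatedSet(array &$caches, $key, $value) {
    foreach ($caches as &$cache) {
        $cache[$key] = $value;     // real code: $client->set($key, $value, ...)
    }
    unset($cache);
}

// Read from the first cache holding the key.
function replicatedGet(array $caches, $key) {
    foreach ($caches as $cache) {  // real code: skip servers that are down
        if (array_key_exists($key, $cache)) {
            return $cache[$key];
        }
    }
    return false;                  // Memcache::get() convention on a miss
}

$caches = [[], [], [], []];        // one cache per web server
replicatedSet($caches, 'activity:42', 'payload');
echo replicatedGet($caches, 'activity:42'), "\n"; // prints "payload"
```

The trade-off versus the standard pooled deployment: every write costs n network round-trips, and total cache capacity stays 1x instead of growing with the number of servers.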

2011/2/7 Margalit Silver <margali...@gmail.com>:

--

Dustin

Feb 7, 2011, 1:47:47 PM
to memcached

On Feb 7, 1:06 am, Margalit Silver <margalitatw...@gmail.com> wrote:
> 1) I'm guessing your comment about "fear of a potential bottleneck" was
> referring to my comment "not to have these servers bogged down in
> notifications".  That isn't really our main concern.  The main reason
> we are hesitant to go to a full and proper memcache implementation is
> "the time lag that would be caused by a client having to get data from
> another server".  We only want to send a notification on a DB write,
> which is (relatively) very infrequent; the vast majority of activity is
> DB reads.

I was referring to "we are hesitant to do so because of the time lag
that would be caused by a client having to get data from another
server"

  You are concerned about something you haven't measured but think will
negatively affect you.  This is a completely standard deployment of
memcached.  This is why we wrote this page:
http://memcached.org/about -- having 4x more memory is going to make a
big difference to you. You should use it.

If it actually doesn't work for you, then you should consider what
you might need to change. It may very well not be memcached.

> 2) Can you please tell me what is "spread"?

Spread is what you'd be reinventing when trying to build a reliable
distributed cache invalidation mechanism to avoid a typical deployment
of memcached. http://www.spread.org/

> 3) What are the implications of adding a server with addServer for a
> brief moment and then removing it, assuming there is a way to do that?
> Does it invalidate the entire cache?

  This is also not a normal practice.  I don't know exactly what you'd
be trying to accomplish by doing this, but I think the behavior would be
hard to reason about.