distributed locking via memcache ??

Nathan Nobbe

Jul 24, 2008, 3:00:25 AM
to memc...@googlegroups.com
hello all,

I'm somewhat of a newbie with memcached, so please bear with me on this one.

Is there a memcached feature that could be used to implement a quick distributed mutex? We have a lot of web servers where I work, and we are looking into building a distributed lock of some kind. Currently we are considering MySQL, but that sounds kind of rough to me.

Thanks in advance,

-nathan

Simone Busoli

Jul 24, 2008, 3:54:07 AM
to memc...@googlegroups.com
Well, memcached behaves like a distributed hashtable, so I guess you might use it for distributed locking. Just give your mutex a key and let your client check the value of that entry in the hashtable.

Dustin

Jul 24, 2008, 4:03:22 AM
to memcached
memcached is not well suited for that due to being extremely
volatile. It might work, but that's merely a coincidence.

I wrote a really simple lock server for what you're describing. The
protocol allows for lock consistency even when servers fail, but I
haven't actually made an mnesia backend for it yet:

http://github.com/dustin/elock

Works well enough for me. :)

Karoly Negyesi

Jul 24, 2008, 4:05:29 AM
to memc...@googlegroups.com
http://www.eu.socialtext.net/memcached/index.cgi?faq#emulating_locking_with_the_add_command

I figured this out before finding the FAQ and it works great. I hear
CPAN has a package for this too if you happen to work in Perl.
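The FAQ trick hinges on memcached's add command, which stores a value only if the key does not already exist, so at most one client can win a given lock key. As a rough sketch (using the python-memcached client; the key names, expiry, and retry loop here are just illustrative, not lifted from the FAQ):

import time
import memcache

mc = memcache.Client(['127.0.0.1:11211'])

def acquire(name, ttl=30, wait=10):
    # add() succeeds only if the key is absent, so at most one caller wins.
    # The expiry keeps a crashed holder from wedging the lock forever.
    key = 'lock:' + name
    deadline = time.time() + wait
    while time.time() < deadline:
        if mc.add(key, 'locked', time=ttl):
            return True
        time.sleep(0.1)
    return False

def release(name):
    mc.delete('lock:' + name)

if acquire('rebuild-stats'):
    try:
        pass  # do the work that should not (usually) run in parallel
    finally:
        release('rebuild-stats')

As the rest of this thread discusses, the key can still be evicted or the daemon restarted, so treat this as advisory rather than guaranteed.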

Boris Partensky

Jul 24, 2008, 10:35:51 AM
to memc...@googlegroups.com
Karoly, what happens if memcached chooses to evict your lock after you have successfully called add()?
--
--Boris

Karoly Negyesi

Jul 24, 2008, 10:45:06 AM
to memc...@googlegroups.com
Very good question that I had never thought of. This lock is not
fundamentally important; it's just a performance boost, so if it fails,
ah well.

Brian Moon

Jul 24, 2008, 10:51:54 AM
to memc...@googlegroups.com
Boris Partensky wrote:
> Karoly, what happens if memcached chooses to evict your lock after you
> successfully called add() ?

Then you are hosed. How long will you be keeping locks open?

--

Brian Moon
Senior Web Engineer
------------------------------
When you care enough to spend the very least.
http://dealnews.com/

Boris Partensky

Jul 24, 2008, 11:00:00 AM
to memc...@googlegroups.com
Hi Brian, I am just saying that this approach is unreliable, no matter how long the lock is held. I think this is kind of what Dustin meant above.
--
--Boris

Josef Finsel

Jul 24, 2008, 11:02:33 AM
to memc...@googlegroups.com
Boris,

Good point. Memcached is a cache! It is not:
  • Guaranteed to be available
  • A persistent data store
When considering using memcached in your application, ask yourself one key question:
  • Will my application fail to run without memcached?
If the answer to that question is yes, then you are not using memcached for what it was designed for. That doesn't mean you can't use it in the manner you are trying to, but it does mean that it won't work the way you want without a great deal of pain, suffering, and gnashing of teeth.

Josef

"If you see a whole thing - it seems that it's always beautiful. Planets, lives... But up close a world's all dirt and rocks. And day to day, life's a hard job, you get tired, you lose the pattern."
Ursula K. Le Guin

Brian Moon

Jul 24, 2008, 11:03:49 AM
to memc...@googlegroups.com
Boris Partensky wrote:
> Hi Brian, I am just saying that this approach is unreliable, doesn't
> matter for how long the lock is held. I think this is kind of what
> Dustin meant above.

Sure, that is what happens when you use a tool (memcached) outside of
its intended scope. So if your locks are mission-critical, memcached
is not a good solution to your problem. If locks are just a luxury, as
someone else said, then memcached may fill the hole.

Jehiah Czebotar

Jul 24, 2008, 11:34:32 AM
to memc...@googlegroups.com
I think it's quite a stretch to say it's "unreliable". In fact this
setup can be quite reliable (I've used it for months without any
problems). The time it takes a key to get evicted is weeks in my
system (and I'm sure most setups are similar), so locks that last on
the order of minutes or hours are not a problem, and I have never had
memcached crash on me, so it's OK there.

Now I'm not saying that it's perfect, and I've designed my locking to
be OK if it loses a lock or if the cache is flushed or ..., but that
doesn't mean it's "unreliable".

In my case my process that has a lock continually checks to confirm
that it has the lock, and as soon as it notices that it's lost the
lock (because of a cache flush or otherwise) it goes back to the step
of acquiring it again. (This also keeps the lock from being evicted,
along with other processes checking it.)

So yes, there are special constraints in dealing with this type of
lock, but I think those of us on this list should be careful about
trying to scare others away from using memcached like this by saying
it's "unreliable" or that you'll be "hosed". It's a very useful way to
use memcached even if it isn't how we all expect it to be used.
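
Roughly, the check-and-reacquire loop described above could look like this (a sketch only, again using the python-memcached client; the unique per-process token is one assumption about how a process might recognize its own lock, it isn't spelled out in this thread):

import time
import uuid
import memcache

mc = memcache.Client(['127.0.0.1:11211'])
KEY = 'lock:batch-job'
token = uuid.uuid4().hex  # unique per process, so we can tell the lock is ours

def have_lock():
    # Reading the key also counts as an access, which helps keep it from
    # being evicted by the LRU.
    return mc.get(KEY) == token

def acquire():
    # Spin until we own the lock; add() succeeds only if the key is absent.
    while not mc.add(KEY, token, time=3600):
        if have_lock():
            return
        time.sleep(1)

acquire()
while True:
    if not have_lock():   # evicted, flushed, or grabbed by someone else
        acquire()         # go back to the step of acquiring it again
    # ... do a slice of the work that should mostly not run in parallel ...
    time.sleep(5)

There is still a window between losing the lock and noticing it, so, as noted above, the guarded work has to tolerate the occasional overlap.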

--
Jehiah

Kevin Jones

Jul 24, 2008, 2:11:41 PM
to memcached
"In my case my process that has a lock continually checks to confirm
that it has the lock, and as soon as it notices that it's lost the
lock (because of a cache flush or otherwise) it goes back to the step
of acquiring it again. "

Is it just me, or does this sound like a race condition?

Dustin

Jul 24, 2008, 2:18:34 PM
to memcached

On Jul 24, 8:34 am, "Jehiah Czebotar" <jeh...@gmail.com> wrote:
> I think it's quite a stretch to say it's "unreliable".

It's not a stretch at all. memcached is designed specifically not to be
reliable for a given key. It is reliable as a network service, but
assuming a value will be where you left it will just lead to pain when
details change.

It's just not a good road to go down. I see it working like this:

1) Our cache isn't very busy, so the LRU doesn't matter too much.
2) Let's start locking for non-critical happy performance stuff.
3) Wow, memcached is making our application a lot faster, we should
use it more.
4) These locks have been pretty reliable, let's start using them in
slightly more critical areas.
5) goto 3

Besides, the semantics are just wrong. Advisory locks are cool and
all, but I wrote elockd because I needed something a little better. I
want blocking locks and locks that auto-free on process crash and
potentially locks that don't go away just because a lock server does.

> In fact this
> setup can be quite reliable (I've used it for months without any
> problems.)

Are you sure you're set up to correctly detect a problem? Sometimes
two supposedly mutually exclusive things running at once won't cause a
catastrophic or long-term failure, but if you're using a tool that is
designed to be lossy for something that can't afford to be lossy, it'll
eventually come back and bite you.

> So yes, there are special constraints in dealing with this type of
> lock, but I think those of us on this list should be careful about
> trying to scare others away from using memcached like this by saying
> it's "unreliable" or that you'll be "hosed". It's a very useful way to
> use memcached even if it isn't how we all expect it to be used.

It's not about how it's expected to be used, it's about how it's
designed to be used. This isn't fear-mongering for the sake of it.
There are plenty of proper lock servers to choose from (Google has
Chubby, Yahoo has ZooKeeper, Amazon has Dynamo -- at least one of
these is open source). Most of these are platforms on which you can
build locks, but I built elockd as a dedicated lock server, so it's far
simpler to work with.

Of course, you don't have to use any of them, but if you're using
memcached as anything other than a volatile optimistic cache, it will
eventually fail you.

RJ Lalumiere

Jul 25, 2008, 9:06:11 PM
to memc...@googlegroups.com

Jehiah Czebotar wrote:
> and I have never had memcached crash on me, so it's OK there.

Wow, you're lucky. Even since we've been on 1.2.5 we still see 2-3
daemon crashes a day. We used to have each daemon crash multiple times a
day while we were running 1.2.0.

Speaking of which, does anyone have some tips on how I might debug what is
causing memcached to fail? I really hate having vital production daemons
go down :-( Ideally something "after the fact", as this is happening in
production with very high volumes that I could not replicate in a test
environment; and even if I did try in a test environment, I haven't the
slightest clue yet as to what the culprit may be, and hence what to test.

--
RJ Lalumiere
Linux System Administrator

dormando

Jul 25, 2008, 9:12:14 PM
to memc...@googlegroups.com
Try 1.2.6-rc1 on one of your servers. It's mostly bug/crash fixes.

Have you previously reported these crashes to the list anywhere?

-Dormando

dormando

Jul 25, 2008, 9:15:52 PM
to memcached
I think all the major points in this thread were covered by Dustin, but to
be clear from my point of view:

I do advertise (in tutorials, conversation, FAQ, etc.) the "ghetto lock",
but it should *not* be taken as a guaranteed mutually exclusive lock. It's
more of an advisory, or "make this mostly not run in parallel" lock. So
anything I've used it with in production would still survive just fine if
two processes executed the code at the same time.

The benefit was a very fast lock to cut down the number of processes
actually activating that code. At Six Apart we don't use the ghetto lock
since we have ddlockd - are there other standard usable distributed
locking daemons aside from what Dustin wrote? We can add links to all of
them in the FAQ.
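
For illustration, that kind of advisory, best-effort use might look something like the sketch below (same assumptions as the earlier examples: the python-memcached client and made-up key names; this is not how ddlockd works):

import memcache

mc = memcache.Client(['127.0.0.1:11211'])

def rebuild_if_first(key, build):
    # Best-effort "ghetto lock": add() lets roughly one process through at a
    # time, and the short expiry frees the lock even if that process dies.
    # Correctness must NOT depend on this; two processes occasionally
    # rebuilding at once has to be harmless.
    if mc.add('rebuild-lock:' + key, 1, time=60):
        try:
            mc.set(key, build(), time=300)
        finally:
            mc.delete('rebuild-lock:' + key)
    # If add() failed, someone else is probably rebuilding; serve whatever
    # is already cached (possibly stale) and move on.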

-Dormando

Jehiah Czebotar

Jul 25, 2008, 9:36:36 PM
to memc...@googlegroups.com
> Jehiah Czebotar wrote:
>> and I have never had memcached crash on me, so it's OK there.
>
> Wow, you're lucky. Even since we've been on 1.2.5 we still see 2-3
> daemon crashes a day. We used to have each daemon crash multiple times a
> day while we were running 1.2.0.
>

Well... being on this list gives me enough information to pick and
choose which releases I upgrade to when I don't want to be on the
bleeding edge. I am still happily running 1.2.1, and was looking to
upgrade to 1.2.5 but held off because of some of the reported bugs.
1.2.6 is looking promising though.

--
Jehiah

RJ Lalumiere

Jul 25, 2008, 9:40:05 PM
to memc...@googlegroups.com

dormando wrote:
> Try 1.2.6-rc1 on one of your servers. It's mostly bug/crash fixes.

Yeah, I considered doing so on one of the more problematic servers, but
was kind of reticent to put something not yet up to a "stable" release
into production. Hearing that 1.2.6 is mostly bug fixes certainly gives
me some incentive to test it out, however.

I'll try it out next week on one server and see if crash frequency
reduces. The funny thing is that it definitely affects some of our
servers while others are happy as clams, which makes me wonder if some of
our specific keys are triggering a bug.

> Have you previously reported these crashes to the list anywhere?

Nope, 1st and 2nd posts :P

--
RJ Lalumiere
Linux System Administrator

dormando

Jul 25, 2008, 9:44:38 PM
to memc...@googlegroups.com
> Yeah, I considered doing so on one of the more problematic servers, but
> was kind of reticent to put something not yet up to a "stable" release
> into production. Hearing that 1.2.6 is mostly bug fixes certainly gives
> me some incentive to test it out, however.
>
> I'll try it out next week on one server and see if crash frequency
> reduces. The funny thing is that it definitely affects some of our
> servers while others are happy as clams, which makes me wonder if some of
> our specific keys are triggering a bug.

OK, this is nice to hear. I can be clearer about testing in the
future. In general, if you're having crashes like that, you should try the
latest stable tree on one of them, report bugs, etc. 1.2.6-rc1's tree has
been almost unchanged for a while now.

One of the things that helps us stamp -rc and -dev trees as final
releases is people who are actively having problems trying out the code
and letting us know whether it fixes their issues.

>> Have you previously reported these crashes to the list anywhere?
>
> Nope, 1st and 2nd posts :P

Be noisy. It helps fuel our guilt.

-Dormando

Istvan Szukacs

Jul 26, 2008, 12:30:31 PM
to memc...@googlegroups.com
Reporting bugs without a description of the environment is pointless.

What kind of OS/kernel do you have?
What were the compile parameters for memcached? (-O666 can crash holy
applications.)
Are you familiar with debuggers, so you can help the developers fix the
issue?

http://www.ibm.com/developerworks/linux/library/l-debug/

etc.

regards,
lix
