Redistribute failure mode in case of ArrayModNodeLocator


Boris Partensky

Mar 23, 2010, 10:24:25 AM
to spymem...@googlegroups.com
I am trying to understand how the key redistribution algorithm works in the case of ArrayModNodeLocator hashing.
It looks like getSequence just returns the same node over and over again?


public Iterator<MemcachedNode> getSequence(String k) {
    return new NodeIterator(getServerForKey(k));
}

private int getServerForKey(String key) {
    int rv = (int) (hashAlg.hash(key) % nodes.length);
    assert rv >= 0 : "Returned negative key for key " + key;
    assert rv < nodes.length
        : "Invalid server number " + rv + " for key " + key;
    return rv;
}


Thanks
--Boris

Matt Ingenthron

Mar 31, 2010, 7:11:24 PM
to spymem...@googlegroups.com
Hi Boris,

Boris Partensky wrote:
> I am trying to understand how key redistribution algorithm works in
> case of ArrayModNodeLocator hashing.
> Looks like getSequence just returns same node over and over again?

I believe this is correct. The classic modulus hashing always hashes to
the same node, given the same server list. That means it really doesn't
do anything interesting when redistributing, unless you've changed the
nodes, which you can't really do.

I think the reason this is here is mainly because it's part of the
interface and there could be another way to determine primary/backup if
someone were to implement their own NodeLocator. The only thing that
makes sense with the ArrayModNodeLocator is to return the same node.
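
For comparison, here is a minimal sketch of what a custom locator's getSequence could do instead (hypothetical class and names; this is not spymemcached's ArrayModNodeLocator, and a real implementation would also need to satisfy the rest of the NodeLocator interface):

import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import net.spy.memcached.HashAlgorithm;
import net.spy.memcached.MemcachedNode;

public class ModWithBackupsLocator {

    private final MemcachedNode[] nodes;   // fixed server list
    private final HashAlgorithm hashAlg;

    public ModWithBackupsLocator(MemcachedNode[] nodes, HashAlgorithm hashAlg) {
        this.nodes = nodes;
        this.hashAlg = hashAlg;
    }

    public MemcachedNode getPrimary(String key) {
        return nodes[getServerForKey(key)];
    }

    // Primary first, then the remaining nodes in array order, wrapping
    // around, so a caller walking the sequence gets real backups instead
    // of the same node over and over.
    public Iterator<MemcachedNode> getSequence(String key) {
        int start = getServerForKey(key);
        List<MemcachedNode> seq = new ArrayList<MemcachedNode>(nodes.length);
        for (int i = 0; i < nodes.length; i++) {
            seq.add(nodes[(start + i) % nodes.length]);
        }
        return seq.iterator();
    }

    private int getServerForKey(String key) {
        return (int) (hashAlg.hash(key) % nodes.length);
    }
}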

Perhaps Dustin will chime in if there's more to this and I need
correcting.

- Matt

Boris Partensky

Apr 1, 2010, 9:34:10 AM
to spymem...@googlegroups.com
<<That means it really doesn't do anything interesting when redistributing, unless you've changed the nodes, which you can't really do.

The way I saw redistribution implemented in the whalin client is as follows: the key is hashed, mod the node array size -> primary node. If the primary node is not active, a retry counter++ is appended to the key, and the key is re-hashed to a new server until an active node is found. I am afraid that the way it is implemented now in spy only exacerbates the problem: a node is having trouble, it was de-listed, and yet we queue up more and more ops to it. I changed our failure mode to Cancel, but that will cancel write ops as well, which may lead to a stale cache.
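
Roughly, the scheme looks like this (illustrative names and retry bound, not the whalin client's actual code; it assumes MemcachedNode.isActive() as the liveness check):

import net.spy.memcached.HashAlgorithm;
import net.spy.memcached.MemcachedNode;

public class RehashOnFailureLocator {

    private static final int MAX_RETRIES = 20;  // illustrative bound

    private final MemcachedNode[] nodes;
    private final HashAlgorithm hashAlg;

    public RehashOnFailureLocator(MemcachedNode[] nodes, HashAlgorithm hashAlg) {
        this.nodes = nodes;
        this.hashAlg = hashAlg;
    }

    public MemcachedNode locate(String key) {
        // hash mod array size -> primary node
        int idx = (int) (hashAlg.hash(key) % nodes.length);
        // if the primary is down, append a retry counter to the key and
        // re-hash until an active node turns up (or we give up)
        for (int retry = 0; !nodes[idx].isActive() && retry < MAX_RETRIES; retry++) {
            idx = (int) (hashAlg.hash(key + retry) % nodes.length);
        }
        return nodes[idx];
    }
}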

--
--Boris

Matt Ingenthron

Apr 1, 2010, 2:42:24 PM
to spymem...@googlegroups.com
Boris Partensky wrote:
> << That means it really doesn't do anything interesting when
> redistributing, unless you've changed the nodes, which you can't
> really do.
>
> The way I saw redistribution implemented in whalin client is as
> follows: key is hashed, mod node array size -> primary node. If
> primary node is not active, there is a retry counter++ appended to the
> key, and the key is re-hashed to a new server until the new node is
> active. I am afraid that the way it is implemented now in spy only
> exacerbates the problem: node is having troubles, it was de-listed,
> but yet we queue up more and more ops to it. I changed our failure
> mode to Cancel, but that will cancel write ops as well, which may lead
> to stale cache.

So, it would rehash to the subset of "active" nodes? This could be
implemented with a Whalin-compatible NodeLocator, but that behavior may
differ from other clients.

I just checked with dormando on this, and he says the behavior with
modulus hashing differs across a number of clients. The defaults for
what to do when a node is down have even changed in some clients over
time.

According to dormando, clients didn't start getting more consistent
until consistent hashing.

It seems to me that if you're not using consistent hashing, it may be
safer to just cancel operations and back off, retrying until the node
is brought back.
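
For reference, switching the failure mode is just a connection-factory setting; something like the following (hypothetical host names, and assuming a spymemcached build that has ConnectionFactoryBuilder):

import java.io.IOException;

import net.spy.memcached.AddrUtil;
import net.spy.memcached.ConnectionFactoryBuilder;
import net.spy.memcached.FailureMode;
import net.spy.memcached.MemcachedClient;

public class CancelModeExample {
    public static void main(String[] args) throws IOException {
        // Cancel ops addressed to a dead node instead of redistributing
        // them to other nodes or queueing them up for retry.
        MemcachedClient client = new MemcachedClient(
            new ConnectionFactoryBuilder()
                .setFailureMode(FailureMode.Cancel)
                .build(),
            AddrUtil.getAddresses("host1:11211 host2:11211"));
        client.shutdown();
    }
}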

I believe Ash has a patch which keeps delete ops around for retry. Let
me ping him to see if that can be published. It may help you, and it
seems 'safer' than cancelling delete operations outright, though an
argument could be made that the whalin client behavior in this case is
similar to what you'd have with consistent hashing.

Note that 2.5rc3 includes a fix for consistent hashing compatibility
with libketama-based clients. From what I saw on github, it's probably
in your tree as well.
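
If you do want to move to consistent hashing, a minimal setup would be something like this (hypothetical host names):

import java.io.IOException;

import net.spy.memcached.AddrUtil;
import net.spy.memcached.KetamaConnectionFactory;
import net.spy.memcached.MemcachedClient;

public class KetamaExample {
    public static void main(String[] args) throws IOException {
        // With ketama-style consistent hashing, a node dropping out only
        // remaps the keys that lived on it, rather than reshuffling most
        // keys the way modulus hashing does.
        MemcachedClient client = new MemcachedClient(
            new KetamaConnectionFactory(),
            AddrUtil.getAddresses("host1:11211 host2:11211 host3:11211"));
        client.shutdown();
    }
}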

- Matt

Dustin

Apr 1, 2010, 3:52:20 PM
to spymemcached

On Apr 1, 11:42 am, Matt Ingenthron <ingen...@cep.net> wrote:
> It seems to me that if you're not using consistent hashing, it may be
> safer to just cancel operations and backoff with retry until the node is
> brought back.

Eric and I talked about cancellation a bit recently, and we may be able
to help with that some, too. I think we are being too conservative in
cancellations and could likely reduce the number of operations that get
cancelled on such a failure.

Matt Ingenthron

Apr 1, 2010, 4:26:38 PM
to spymem...@googlegroups.com

Except that in this case I'm thinking of the situation where the server
is down for a period of time.

But maybe you're talking about moving read/write operations that are
deletes back to the front of the inputQueue?

- Matt
