Failure rehash behavior?

Srivathsava Rangarajan

Apr 18, 2013, 7:27:56 PM
to spymem...@googlegroups.com
I have been trying to research and reproduce spymemcached's behavior when a node fails.

My research turned up nothing, so I tried the experiment below.

Here are the relevant parts of my configuration:
* servers = "host1:port1 host2:port2..."
* protocol = ASCII
* hashAlg = KETAMA_HASH
* locatorType = KETAMA_NODE_LOCATOR
* failureMode = Redistribute

What I do:

a) Create a client with 2 servers: (server1:port1, server2:port2)
b) Wait 10 seconds while I manually kill server 1.
c) Issue a set request on a key that mapped to server 1 on previous runs.
d) The key still seems to map to server 1, which is now down, and spymemcached complains about it with:

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Reconnecting due to exception on {QA sa=server1:port1, #Rops=2, #Wops=0, #iq=0, topRop=net.spy.memcached.protocol.ascii.StoreOperationImpl@8bbb40, topWop=null, toWrite=0, interested=1}
java.io.IOException: Disconnected unexpected, will reconnect.

Closing, and reopening {QA sa=server1:port1, #Rops=2, #Wops=0, #iq=0, topRop=net.spy.memcached.protocol.ascii.StoreOperationImpl@8bbb40, topWop=null, toWrite=0, interested=1}, attempt 0.

Discarding partially completed op: net.spy.memcached.protocol.ascii.StoreOperationImpl@8bbb40
Discarding partially completed op: net.spy.memcached.protocol.ascii.StatsOperationImpl@1488560

Reconnecting {QA sa=server1:port1, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0}
Reconnecting due to failure to connect to {QA sa=server1:port1, #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0}

java.net.ConnectException: Connection refused: no further information

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

At this point I am guessing the daemon itself isn't listening on server1:port1, so we get a connection refused.

But why isn't spymemcached trying to redistribute to server2:port2?
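
To make the expectation behind this question concrete, here is a minimal sketch of how I picture the three failure modes choosing a node. It is not spymemcached code: the class `FailureModeSketch`, the method `chooseNode`, and the fallback to the primary node when nothing is alive are all hypothetical. It also assumes the client has already marked server 1 as down, which may not be true at step (c) and is part of what I am asking.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical illustration only; names and fallback rules are assumptions.
class FailureModeSketch {
    enum Mode { REDISTRIBUTE, RETRY, CANCEL }

    /**
     * @param preferenceOrder nodes in the order a locator would try them for one key, primary first
     * @param down            nodes the client currently considers unreachable
     * @return the node the operation would be queued on, or empty if the operation is cancelled
     */
    static Optional<String> chooseNode(List<String> preferenceOrder, Set<String> down, Mode mode) {
        String primary = preferenceOrder.get(0);
        if (!down.contains(primary)) {
            return Optional.of(primary); // healthy primary: the failure mode is irrelevant
        }
        switch (mode) {
            case REDISTRIBUTE:
                // Move on to the next node that is still reachable.
                for (String node : preferenceOrder) {
                    if (!down.contains(node)) {
                        return Optional.of(node);
                    }
                }
                // Assumption: with nothing alive, queue on the primary and wait for a reconnect.
                return Optional.of(primary);
            case RETRY:
                // Keep the operation on the failed node until it comes back.
                return Optional.of(primary);
            default:
                // CANCEL: give up on the operation immediately.
                return Optional.empty();
        }
    }

    public static void main(String[] args) {
        List<String> order = Arrays.asList("server1:port1", "server2:port2");
        Set<String> down = Collections.singleton("server1:port1");
        for (Mode mode : Mode.values()) {
            System.out.println(mode + " -> " + chooseNode(order, down, mode));
        }
    }
}
```

Under this mental model, a set issued after server 1 is known to be down would land on server2:port2 in Redistribute mode, which is not what the log above shows.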

The next question is about the rehashing behavior itself (which I haven't even seen yet):

What is the Ketama Hash's expected behavior on a node failing?


Disclaimer: I have already read the post at https://groups.google.com/forum/?fromgroups=#!topic/spymemcached/zFW6cvZOc1Q, and while the questions seem quite similar, I didn't quite comprehend the answers. I thought presenting my exact problem might trigger a paraphrased answer that leads to a deeper understanding.
  • Does it rehash the keys onto {set of servers} minus the failing server node?
  • If so, when? That is, will get requests for a key keep trying to fetch the data from the failing node, while a set request causes the rehash so that future get requests contact the new node?
  • If the failed node then comes back up and the client reconnects to it, will future set and get operations on this rehashed set of keys be directed to the node they were rehashed to, or to the failed-and-now-restored node?
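
For reference, this is the behavior I would expect from a generic consistent-hash ring, sketched below. It is not spymemcached's locator: the class `HashRingSketch`, the MD5-based `hash`, the `node-i` point naming, and 160 points per node are assumptions chosen only for illustration. In this model, removing a node moves only that node's keys, and restoring the node sends those keys back to it.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of consistent hashing; not spymemcached's implementation.
class HashRingSketch {
    private static final int POINTS_PER_NODE = 160; // assumption: virtual points per server
    private final TreeMap<Long, String> ring = new TreeMap<>();

    // Map a string to an unsigned 32-bit position using the first four MD5 bytes.
    static long hash(String s) {
        try {
            byte[] d = MessageDigest.getInstance("MD5").digest(s.getBytes(StandardCharsets.UTF_8));
            return ((long) (d[3] & 0xFF) << 24) | ((long) (d[2] & 0xFF) << 16)
                    | ((long) (d[1] & 0xFF) << 8) | (d[0] & 0xFF);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is required by the Java platform", e);
        }
    }

    // Place the node's virtual points on the ring.
    void addNode(String node) {
        for (int i = 0; i < POINTS_PER_NODE; i++) {
            ring.putIfAbsent(hash(node + "-" + i), node);
        }
    }

    // Remove every point owned by the node; other nodes' points are untouched.
    void removeNode(String node) {
        ring.values().removeIf(node::equals);
    }

    // A key belongs to the first point at or after its hash, wrapping around the ring.
    String nodeFor(String key) {
        Map.Entry<Long, String> e = ring.ceilingEntry(hash(key));
        return (e != null ? e : ring.firstEntry()).getValue();
    }

    public static void main(String[] args) {
        HashRingSketch ring = new HashRingSketch();
        ring.addNode("server1:port1");
        ring.addNode("server2:port2");
        String key = "some-key";
        System.out.println("both up:          " + ring.nodeFor(key));
        ring.removeNode("server1:port1");
        System.out.println("server1 removed:  " + ring.nodeFor(key));
        ring.addNode("server1:port1");
        System.out.println("server1 restored: " + ring.nodeFor(key));
    }
}
```

If spymemcached follows this model, data written to server 2 while server 1 was down would become unreachable for those keys once server 1 returns. Whether it actually does so is what I would like confirmed.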

Finally, is there any authoritative, detailed documentation of spymemcached available that can answer these questions in the future? It's unbelievably tedious trying to step through decompiled code to try and theorize what is happening in the client library.

Many thanks for a lovely research product, and regards,

Srivathsava Rangarajan

Graduate Student & Assistant, Purdue University, West Lafayette





