Failed node causes 502 Bad Gateway

rhinmass

Apr 14, 2010, 3:48:17 PM
to spymemcached
I have a web application running in tomcat that uses 2 memcached
nodes.

I create my client as follows:
ConnectionFactoryBuilder builder = new ConnectionFactoryBuilder();
builder.setHashAlg(HashAlgorithm.NATIVE_HASH);
builder.setLocatorType(ConnectionFactoryBuilder.Locator.CONSISTENT);
builder.setProtocol(ConnectionFactoryBuilder.Protocol.BINARY);
s_cacheClient = new MemcachedClient(builder.build(), s_config.getMemcachedServers());

I have also tried using all defaults:
ConnectionFactoryBuilder builder = new ConnectionFactoryBuilder();
s_cacheClient = new MemcachedClient(builder.build(), s_config.getMemcachedServers());


Everything works great until I kill one of the 2 nodes. Then the
entire webapp hangs on every request, eventually returning a 502 Bad
Gateway error.

In the catalina logs, it appears to be spinning in
MemcachedConnection.handleIO.

I have tried both the 2.4.2 and 2.5rc3 jars.

With 2.5rc3, here is what I get in catalina.out:

2010-04-14 15:42:45.002 INFO net.spy.memcached.MemcachedConnection: Reconnecting {QA sa=ramp-cache-02/192.168.20.101:11211, #Rops=0, #Wops=5329, #iq=0, topRop=null, topWop=net.spy.memcached.protocol.ascii.StatsOperationImpl@71045457, toWrite=0, interested=0}
2010-04-14 15:42:45.003 INFO net.spy.memcached.MemcachedConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@7f85bc0b
2010-04-14 15:42:45.003 INFO net.spy.memcached.MemcachedConnection: Reconnecting due to failure to connect to {QA sa=ramp-cache-02/192.168.20.101:11211, #Rops=0, #Wops=5329, #iq=0, topRop=null, topWop=net.spy.memcached.protocol.ascii.StatsOperationImpl@71045457, toWrite=0, interested=0}
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:321)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:218)
    at net.spy.memcached.MemcachedClient.run(MemcachedClient.java:1589)
2010-04-14 15:42:45.003 WARN net.spy.memcached.MemcachedConnection: Closing, and reopening {QA sa=ramp-cache-02/192.168.20.101:11211, #Rops=0, #Wops=5329, #iq=0, topRop=null, topWop=net.spy.memcached.protocol.ascii.StatsOperationImpl@71045457, toWrite=0, interested=0}, attempt 23.
2010-04-14 15:43:15.004 INFO net.spy.memcached.MemcachedConnection: Reconnecting {QA sa=ramp-cache-02/192.168.20.101:11211, #Rops=0, #Wops=5569, #iq=0, topRop=null, topWop=net.spy.memcached.protocol.ascii.StatsOperationImpl@71045457, toWrite=0, interested=0}
2010-04-14 15:43:15.004 INFO net.spy.memcached.MemcachedConnection: Connection state changed for sun.nio.ch.SelectionKeyImpl@24fcb795
2010-04-14 15:43:15.004 INFO net.spy.memcached.MemcachedConnection: Reconnecting due to failure to connect to {QA sa=ramp-cache-02/192.168.20.101:11211, #Rops=0, #Wops=5569, #iq=0, topRop=null, topWop=net.spy.memcached.protocol.ascii.StatsOperationImpl@71045457, toWrite=0, interested=0}
java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at sun.nio.ch.SocketChannelImpl.finishConnect(Unknown Source)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:321)
    at net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:218)
    at net.spy.memcached.MemcachedClient.run(MemcachedClient.java:1589)
2010-04-14 15:43:15.005 WARN net.spy.memcached.MemcachedConnection: Closing, and reopening {QA sa=ramp-cache-02/192.168.20.101:11211, #Rops=0, #Wops=5569, #iq=0, topRop=null, topWop=net.spy.memcached.protocol.ascii.StatsOperationImpl@71045457, toWrite=0, interested=0}, attempt 24.

If I restart the failed node, things start working again.

This seems similar to other comments in this forum, but I'm not having
any luck with rc3. Also, bdk only gets the error when the last node
goes down. I'd take that behavior if I could get it. Is there a way
to configure things for more fault-tolerant behavior?
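
To make the desired behavior concrete: a lookup that stalls on the dead node would ideally be treated as a cache miss instead of blocking the request. Below is a minimal sketch of that idea using only java.util.concurrent. The class and the getOrMiss helper are hypothetical names for illustration, not spymemcached APIs, and it assumes the pending lookup is available as a Future (for example, from an asynchronous get).

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class BoundedCacheLookup {

    // Hypothetical helper, not part of spymemcached: wait at most timeoutMs
    // for a pending lookup and report a cache miss (null) if it stalls or fails.
    static <T> T getOrMiss(Future<T> pending, long timeoutMs) {
        try {
            return pending.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            pending.cancel(false); // stop waiting on the unresponsive node
            return null;
        } catch (ExecutionException e) {
            return null; // the lookup itself failed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve interrupt status
            return null;
        }
    }

    public static void main(String[] args) {
        // Stand-ins for the Future an asynchronous cache get would return.
        Future<Object> healthy = CompletableFuture.completedFuture("cached-value");
        Future<Object> stalled = new CompletableFuture<>(); // never completes

        System.out.println("healthy node: " + getOrMiss(healthy, 200));
        System.out.println("stalled node: " + getOrMiss(stalled, 200));
    }
}
```

With something like this, a request that hits the failed node would fall through to the backing store after the timeout instead of hanging. It does not address the reconnect loop in the logs, only the request-side blocking.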

Thanks for any help!!


