net.spy.memcached.OperationTimeoutException or java.lang.RuntimeException under high load

Erik Meier

Oct 4, 2012, 8:51:18 PM
to spymem...@googlegroups.com
We are seeing timeouts under high load using spymemcached 2.7.3 and memcached 1.4.4:

net.spy.memcached.OperationTimeoutException: Timeout waiting for value
    at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:1185)
    at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:1200)

java.lang.RuntimeException: Exception waiting for value
    at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:1183)
    at net.spy.memcached.MemcachedClient.get(MemcachedClient.java:1200)

Here is my connection factory:

"Failure Mode: Redistribute, Hash Algorithm: NATIVE_HASH Max Reconnect Delay: 30, Max Op Timeout: 2500, Op Queue Length: 16384, Op Max Queue Block Time10000, Max Timeout Exception Threshold: 998, Read Buffer Size: 16384, Transcoder: net.spy.memcached.transcoders.SerializingTranscoder@90dae16, Operation Factory: net.spy.memcached.protocol.binary.BinaryOperationFactory@61202afe isDaemon: false, Optimized: true, Using Nagle: false, ConnectionFactory: DefaultConnectionFactory"

Everything is at the defaults, except that I am using the binary protocol and consistent hashing.

Out of several million requests only a dozen or so time out, but we store a cache "version" key in memcached and fetch it before we fetch the requested key. When the fetch of the version key times out, it effectively clears our entire cache, and at peak times, no less.

Is there any tuning I can do to avoid seeing these timeouts?
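
One way to soften the version-key problem described above is to remember the last version that was fetched successfully and fall back to it when a fetch times out, instead of treating the timeout as a miss. The following is only a minimal sketch under assumptions: VersionKeyReader is a hypothetical helper, not part of spymemcached, and the Function passed in stands in for a call such as key -> client.asyncGet(key), which returns a Future.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;
import java.util.function.Function;

/** Hypothetical helper: remembers the last cache version that was fetched successfully. */
final class VersionKeyReader {
    // Assumption: stands in for something like key -> client.asyncGet(key).
    private final Function<String, Future<Object>> asyncGet;
    private final long timeoutMillis;
    // null until the first successful fetch
    private volatile Object lastKnownVersion;

    VersionKeyReader(Function<String, Future<Object>> asyncGet, long timeoutMillis) {
        this.asyncGet = asyncGet;
        this.timeoutMillis = timeoutMillis;
    }

    /**
     * Returns the fetched version, or the last known version if the fetch
     * times out or fails. A null result from the server (key really missing)
     * is passed through unchanged.
     */
    Object currentVersion(String versionKey) {
        Future<Object> pending = asyncGet.apply(versionKey);
        try {
            Object version = pending.get(timeoutMillis, TimeUnit.MILLISECONDS);
            if (version != null) {
                lastKnownVersion = version;
            }
            return version;
        } catch (TimeoutException e) {
            pending.cancel(false); // stop waiting; do not treat this as a miss
            return lastKnownVersion;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt flag
            pending.cancel(false);
            return lastKnownVersion;
        } catch (ExecutionException e) {
            return lastKnownVersion;
        }
    }
}
```

The trade-off is that a stale version can be served briefly after the version key changes, so this only fits if a short window of stale cache entries is acceptable.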

George Cao

Nov 8, 2012, 12:27:39 PM
to spymem...@googlegroups.com
--
You received this message because you are subscribed to the Google Groups "spymemcached" group.
To view this discussion on the web visit https://groups.google.com/d/msg/spymemcached/-/lJAjH4NCC_EJ.
To post to this group, send email to spymem...@googlegroups.com.
To unsubscribe from this group, send email to spymemcached...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/spymemcached?hl=en.
Pay attention to long GC pause times. They may be one of the reasons the timeouts happen.
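
As a rough way to check this from inside the JVM, the standard java.lang.management API exposes the accumulated collection time per collector. The sketch below is an illustration only: GcTimeProbe is a hypothetical helper name, and the reported time is approximate and may not equal the actual stop-the-world pause time, so the GC logs remain the better source.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Hypothetical helper: reports how much GC time accumulated around an operation. */
final class GcTimeProbe {
    private GcTimeProbe() {}

    /** Sums the approximate accumulated collection time (ms) over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Runs the operation and returns the GC time (ms) that accumulated while it ran. */
    static long gcMillisDuring(Runnable operation) {
        long before = totalGcMillis();
        operation.run();
        return totalGcMillis() - before;
    }
}
```

Wrapping a cache read in gcMillisDuring and logging the result whenever a timeout is caught would show whether the timeouts line up with collector activity.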

Erik Meier

Nov 12, 2012, 6:31:16 PM
to spymem...@googlegroups.com
George, thanks, that makes a lot of sense. I've refactored our cache abstraction so this shouldn't cause us many issues anymore. I'll cross-check the timeouts against our GC logs.

~Erik

Matt Ingenthron

Nov 13, 2012, 12:14:14 AM
to spymem...@googlegroups.com
We recently added some GC tuning guidance for Couchbase's Java client, which is derived from spymemcached. It might be useful. See: http://www.couchbase.com/docs/couchbase-sdk-java-1.0/java-gc-tuning.html


Erik Meier

Nov 30, 2012, 1:52:37 AM
to spymem...@googlegroups.com
Matt,

Thanks, that is very useful information. I have been finding some helpful hints in the Couchbase client docs that also apply to spymemcached.

We are considering Couchbase as a next step, since we have been leaning more on memcached and its lack of persistence keeps us from using it even more. I'll explore the documentation further, since we are seeing another odd issue: traffic to one node (in a 4-node cluster) is much lower than to the other nodes (connections are the same, though).