Can you post actual log output and describe what you'd prefer?
Cheers,
Martin
On 08/08/2011 06:07 PM, Sean Laurent wrote:
> For some reason I can't reply on Google Groups directly, so I
> apologize if this email messes up the thread.
No problem, I'm pushing it back to the thread :-)
And I just reactivated that posting from the web shall be
possible. Sorry for being too restrictive.
> I have two dedicated memcached servers I'm using for testing:
> memcache1.dev.example.ec2 and memcache2.dev.example.ec2. Tomcat is setup
> to use both, with memcache1 listed first. I logged into my website and
> established a session. Then I terminated memcached on memcache1 and
> refreshed the page. At that point, I start seeing the following in the
> logs a handful of times:
>
> ---snip---
> 2011-08-06 16:35:19.037 INFO net.spy.memcached.MemcachedConnection:
> Reconnecting {QA sa=memcache1.dev.example.ec2/10.36.35.2:11211
> <http://10.36.35.2:11211>, #Rops=0, #Wops=0, #iq=0, topRop=null,
> topWop=null, toWrite=0, interested=0}
> 2011-08-06 16:35:19.038 INFO net.spy.memcached.MemcachedConnection:
> Connection state changed for sun.nio.ch.SelectionKeyImpl@43649854
> 2011-08-06 16:35:19.039 INFO net.spy.memcached.MemcachedConnection:
> Reconnecting due to failure to connect to {QA
> sa=memcache1.dev.example.ec2/10.36.35.2:11211 <http://10.36.35.2:11211>,
> #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0, interested=0}
> java.net.ConnectException: Connection refused
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:567)
> at
> net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:322)
> at
> net.spy.memcached.MemcachedConnection.handleIO(MemcachedConnection.java:219)
> at net.spy.memcached.MemcachedClient.run(MemcachedClient.java:1591)
> 2011-08-06 16:35:19.039 WARN net.spy.memcached.MemcachedConnection:
> Closing, and reopening {QA
> sa=memcache1.dev.example.ec2/10.36.35.2:11211 <http://10.36.35.2:11211>,
> #Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0,
> interested=0}, attempt 2.
> ---snip---
To make spymemcached more quit you can add the following to bin/catalina.sh:
CATALINA_OPTS="-Dnet.spy.log.LoggerImpl=net.spy.memcached.compat.log.SunLogger"
And then add this to conf/logging.properties
net.spy.memcached.MemcachedConnection.level = WARNING
Then the output from above should be reduced to
Aug 9, 2011 12:17:22 AM net.spy.memcached.MemcachedConnection queueReconnect
WARNING: Closing, and reopening {QA sa=localhost/127.0.0.1:11212,
#Rops=0, #Wops=0, #iq=0, topRop=null, topWop=null, toWrite=0,
interested=0}, attempt 2.
If you want it to be even more silent you can change WARNING to SEVERE.
I also just updated the msm docs with this information:
http://code.google.com/p/memcached-session-manager/wiki/SetupAndConfiguration#Logging
If you have suggestions for improvements please let me know!
Btw, in your initial posting you mentioned WARN - this has to be WARNING
actually, the next is SEVERE (see
http://tomcat.apache.org/tomcat-6.0-doc/logging.html#Using_java.util.logging_%28default%29).
> After that, I start seeing the following at least every couple seconds,
> if not more frequently, until I restart the first memcached server:
>
> ---snip---
> Aug 6, 2011 4:35:39 PM
> de.javakaffee.web.msm.LockingStrategy$OnBackupWithoutLoadedSessionTask
> pingSessionBackup
> WARNING: An exception occurred when trying to ping session
> 155A08552A63C7ED9A0676D774F57DE9-memcache2
> java.util.concurrent.ExecutionException: java.lang.RuntimeException:
> Cancelled
> at
> net.spy.memcached.internal.OperationFuture.get(OperationFuture.java:75)
> at
> de.javakaffee.web.msm.LockingStrategy$OnBackupWithoutLoadedSessionTask.pingSessionBackup(LockingStrategy.java:618)
> at
> de.javakaffee.web.msm.LockingStrategy$OnBackupWithoutLoadedSessionTask.call(LockingStrategy.java:595)
> at
> de.javakaffee.web.msm.LockingStrategy$OnBackupWithoutLoadedSessionTask.call(LockingStrategy.java:563)
> at
> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> at java.util.concurrent.FutureTask.run(FutureTask.java:138)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> at java.lang.Thread.run(Thread.java:662)
> Caused by: java.lang.RuntimeException: Cancelled
> ... 9 more
> ---snip---
Ok, let's see what's going on: when OnBackupWithoutLoadedSessionTask is
run it means that the user made a request that didn't change the
session. In this case msm saves the time to update the session in
memcached (both serialization time and transferring many bytes through
the wire can be skipped). As memcached works somehow like a LRU-cache
the session still should be pushed up in the list of used/active
sessions in memcached, as the user obviously is still active. So a
little bit of works has to be done - the session is "touched" for this
(in memcached it's an add).
When this fails, it might happen that the session will get lost in
memcached as it's considered to be too old and no longer of any
interest. So this warning should not be ignored.
Still, the question is why this does happen. At the start of the
request, msm checks if the primary memcached of the session is still
alive (check is done by
MemcachedSessionService.changeSessionIdOnMemcachedFailover).
When the primary memcached is not alive the session is loaded from the
secondary/backup memcached, gets assigned a new memcached and the
session id is changed accordingly. The logs of this process are of level
INFO.
So if you see this WARNING I'd assume that something before has gone
wrong. Can you post more log output or put together some sample that I
can check out for reproducing this?
Cheers,
Martin
I'm cc'ing the mailing list, it would be nice if you could also do this
(as long as you can, e.g. you don't post numbers that are somehow
confidential)
On 08/25/2011 12:39 AM, VegHead wrote:
> Sorry for the slow response. We're definitely noticing that there
> might be a problem with lost sessions and it looks like it's tied to
> the WARNING message you pointed out. Here's the typical log output:
Do you need to session lock or could you also try without session locks?
In the logs it would also be useful to see which statements are from the
same requests (e.g. by adding the thread name in logs).
Did you already check memcached stats, e.g. regarding evictions?
How many concurrent sessions do you have (per memcached), how big are
sessions in average (in kb), and how is memcached configured
memory-wise? Did you already check how this all fits together?
Cheers,
Martin
> That said, we really only touch the session in two cases: once when
> the user logs into our site and then whenever they perform one
> specific activity on the site. With that in mind, in the typical use
> case, it's not possible for the user to perform that specific activity
> multiple times simultaneously. So I may be worrying about something
> that's not actually a problem.
In this case I'd suggest that you configure msm with sticky sessions
also. This simplifies things and improves performance: for non-sticky
sessions the session is loaded from memcached if it's accessed in some
way (mostly a session is accessed more often than one thinks), while in
sticky setup the session can be loaded from memory.
Moving sessions between tomcats and memcacheds is supported by msm and
should work well.
>> In the logs it would also be useful to see which statements are from the
>> same requests (e.g. by adding the thread name in logs).
>
> That's going to catalina.out and I can't figure out how to change the
> output format. :(
AFAIK you had to switch tomcat logging to log4j, as java.util.logging
does not know/provide the thread name but only thread id (so that even
with a custom formatter the thread name is not available). But in the
current situation it's not worth it.
>> Did you already check memcached stats, e.g. regarding evictions?
>> How many concurrent sessions do you have (per memcached), how big are
>> sessions in average (in kb), and how is memcached configured
>> memory-wise?
>
> 6 concurrent sessions
> 0 evictions
> 2K session data typically
> 1GB of RAM for memcache
According to this sessions should not be dropped by memcached.
>> Did you already check how this all fits together?
>
> Yes, I think so. That's the weird part. I setup two memcached servers
> (memcache1.dev and memcache2.dev) and two web servers (web1.dev and
> web2.dev) using the configurations I mentioned earlier (also included
> at the bottom of this message). Both web servers pointed at both
> memcache servers, with memcache1.dev listed first. I logged into our
> site and watched the session data get stored in memcache1.dev. I
> stopped memcached on memcache1.dev, refreshed my browser and saw that
> I still had my session data. I stopped web1.dev, refreshed the browser
> and saw that I failed over to web2.dev successfully while retaining my
> session.
>
> That suggested to me that everything was working beautifully.
Yes, the setup seems to be fine.
>
> Is there something else I should look into?
The question would be what's the difference between the requests where
sessions seem to get lost and your own tests. The logs showed that there
were session locks existing that could not be acquired by another
request for that session, so that would be s.th. to check out.
But as I said before, I'd suggest switching to a sticky session
configuration for msm and see how this works instead of investing more
time analyzing the situation with the non-sticky setup.
Cheers,
Martin