Hi Dennis,
Okay, I was using builder.build(). It seems like it occasionally
loses connections (as I've mentioned, outside of Eclipse the healing
doesn't work for me on either my development Mac or my production
Linux box). So I'm going to detail a bit of what I'm seeing.

Before I dig into that, I want to say I have a ton of respect for the
time it takes to put out an open source project; I have two myself.
Your support has also been fantastic, and it's very helpful to get
feedback like this. So as I ask questions, I hope they come across as
'I want to learn' and not as me flaming or being a jerk. I totally
understand that this could be poor setup or usage on my part and not a
reflection of the project in general.
Okay, so: observations.

I've created a connection pool that holds 25 MemcachedClients (to
support 300 threads), running against a localhost memcached instance.
I now get fewer connection timeouts, but in general accessing data
from memcached is an order of magnitude or two slower than just
fetching the data from its origin source. That is, I'm not deriving
any benefit from caching the data at all. The cached data is mostly
less than 20k in size; some items are closer to 300k, but those are in
the minority.
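For reference, the pool is essentially just a blocking queue of clients,
roughly like this (a simplified sketch of my own code, not anything from
XMemcached itself; the server string and pool size come from my config):

import java.io.IOException;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

import net.rubyeye.xmemcached.MemcachedClient;
import net.rubyeye.xmemcached.XMemcachedClientBuilder;
import net.rubyeye.xmemcached.utils.AddrUtil;

// Simplified sketch of my pool: 25 clients shared by ~300 request threads.
// The pool class is my own code; only MemcachedClient, XMemcachedClientBuilder
// and AddrUtil come from the library.
public class MemcachedClientPool {

    private final BlockingQueue<MemcachedClient> pool;

    public MemcachedClientPool(String servers, int size) throws IOException {
        pool = new ArrayBlockingQueue<MemcachedClient>(size);
        for (int i = 0; i < size; i++) {
            XMemcachedClientBuilder builder =
                    new XMemcachedClientBuilder(AddrUtil.getAddresses(servers));
            pool.add(builder.build()); // build() blocks until connected (or fails)
        }
    }

    // Each request thread does acquire() / get or set / release() around a cache access.
    public MemcachedClient acquire() throws InterruptedException {
        return pool.take();
    }

    public void release(MemcachedClient client) {
        pool.add(client);
    }
}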
When I profile my production system, I find that roughly 80% of my
server's time is locked up trying to access memcached...
40.942%  1517.000s  sun.misc.Unsafe.park()
  at sun.misc.Unsafe.park()
  at java.util.concurrent.locks.LockSupport.park()
  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await()
  at java.util.concurrent.ArrayBlockingQueue.take()
  at java.util.concurrent.ThreadPoolExecutor.getTask()
  at java.util.concurrent.ThreadPoolExecutor$Worker.run()
  at java.lang.Thread.run()

20.485%  759.000s  sun.misc.Unsafe.park()
  at sun.misc.Unsafe.park()
  at java.util.concurrent.locks.LockSupport.park()
  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await()
  at java.util.concurrent.LinkedBlockingQueue.take()
  at net.rubyeye.xmemcached.impl.MemcachedConnector$SessionMonitor.run()

20.460%  758.100s  sun.nio.ch.EPollArrayWrapper.epollWait()
  at sun.nio.ch.EPollArrayWrapper.epollWait()
  at sun.nio.ch.EPollArrayWrapper.poll()
  at sun.nio.ch.EPollSelectorImpl.doSelect()
  at sun.nio.ch.SelectorImpl.lockAndDoSelect()
  at sun.nio.ch.SelectorImpl.select()
  at com.google.code.yanf4j.nio.impl.Reactor.run()
Even if I disable memcached caching (I've written my site so that I
can toggle memcached usage on and off in production), I still see a
huge percentage of time being consumed by the idle XMemcachedClients.
In fact, it's roughly the same as above...
45.463%  972.400s  sun.misc.Unsafe.park()
  at sun.misc.Unsafe.park()
  at java.util.concurrent.locks.LockSupport.park()
  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await()
  at java.util.concurrent.ArrayBlockingQueue.take()
  at java.util.concurrent.ThreadPoolExecutor.getTask()
  at java.util.concurrent.ThreadPoolExecutor$Worker.run()
  at java.lang.Thread.run()

26.228%  561.000s  sun.nio.ch.EPollArrayWrapper.epollWait()
  at sun.nio.ch.EPollArrayWrapper.epollWait()
  at sun.nio.ch.EPollArrayWrapper.poll()
  at sun.nio.ch.EPollSelectorImpl.doSelect()
  at sun.nio.ch.SelectorImpl.lockAndDoSelect()
  at sun.nio.ch.SelectorImpl.select()
  at com.google.code.yanf4j.nio.impl.Reactor.run()

26.228%  561.000s  sun.misc.Unsafe.park()
  at sun.misc.Unsafe.park()
  at java.util.concurrent.locks.LockSupport.park()
  at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await()
  at java.util.concurrent.LinkedBlockingQueue.take()
  at net.rubyeye.xmemcached.impl.MemcachedConnector$SessionMonitor.run()
So it appears to me that the clients function in a busy-wait loop
and, rather than being woken only when work exists, are constantly
polling. Is this the case? I would have assumed idle connections such
as these would be in a wait state.

Also, since these percentages are nearly the same regardless of
whether or not the clients are actually used, I suspect that most of
the delay is unrelated to accessing memcached and is more about the
CPU cycles being consumed by the clients themselves.
Are you aware of anyone using your client in a high-scale,
multithreaded environment? Again, please don't take this as me being
mean; I just need to understand what I should expect in terms of
access times. To me, anything more than 3/10 or 4/10 of a second and
I'm better off returning to the origin data and not caching it. I'd
prefer it to be milliseconds.
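For context, this is roughly how I'm measuring an individual cache read (a
minimal sketch of my own timing code; the 300ms threshold is the break-even
point I mentioned above):

import java.util.concurrent.TimeoutException;

import net.rubyeye.xmemcached.MemcachedClient;
import net.rubyeye.xmemcached.exception.MemcachedException;

// Sketch of how I time a single memcached read. "client" is one of the pooled
// MemcachedClient instances; 300ms is my break-even point versus the origin.
public final class CacheTiming {

    public static Object timedGet(MemcachedClient client, String key)
            throws TimeoutException, InterruptedException, MemcachedException {
        long start = System.nanoTime();
        Object value = client.get(key);
        long elapsedMs = (System.nanoTime() - start) / 1000000L;
        if (elapsedMs > 300) {
            System.out.println("slow memcached get: " + elapsedMs + "ms for " + key);
        }
        return value;
    }
}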
In your performance benchmark, were you pooling MemcachedClients? If
so, how big was the pool? Were multiple threads allowed to share
clients, or did each thread have to acquire/release a connection?
A few other thoughts.
Have you thought about logging events in XMemcached at INFO level for
things like connecting, adding servers, etc.? Everything seems to be
logged at WARNING level. I can of course filter the warning log events
out, but when an unexpected error occurs it would be nice for it to be
handled separately from the informational events that are also logged
as warnings.
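To be concrete, this is the kind of filter I end up having to write today
(a sketch only, assuming java.util.logging since that's what the output below
looks like; the message patterns are guessed from the log lines further down):

import java.util.logging.Filter;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Rough sketch: drop the "informational" warnings (add/remove session, reconnect
// attempts) so only genuinely unexpected problems surface at WARNING and above.
// The message patterns are guessed from the log lines further down this mail.
public class XMemcachedNoiseFilter implements Filter {

    public boolean isLoggable(LogRecord record) {
        String name = record.getLoggerName();
        String msg = record.getMessage();
        if (record.getLevel() == Level.WARNING
                && name != null && name.startsWith("com.google.code.yanf4j")
                && msg != null
                && (msg.contains("add session")
                        || msg.contains("remove session")
                        || msg.contains("Try to reconnect"))) {
            return false; // treat these as info-level noise
        }
        return true;
    }

    // Install on the root handlers so records from any yanf4j logger are caught.
    public static void install() {
        for (Handler h : Logger.getLogger("").getHandlers()) {
            h.setFilter(new XMemcachedNoiseFilter());
        }
    }
}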
Also, I get a fair number of these exceptions. I'm not sure what they
mean, but they seem to accompany sessions being removed...
[2009-08-12 18:21:56.913] {Thread-516} WARNING com.google.code.yanf4j.nio.impl.AbstractController remove session xxxx:1624
[2009-08-12 18:21:56.914] {Thread-516} SEVERE com.google.code.yanf4j.nio.impl.Reactor
java.lang.UnsupportedOperationException
[2009-08-12 18:21:56.914] {Thread-516}   at net.rubyeye.xmemcached.command.Command.setWriteBuffer(Command.java)
[2009-08-12 18:21:56.914] {Thread-516}   at com.google.code.yanf4j.nio.impl.DefaultTCPSession.writeToChannel(DefaultTCPSession.java)
[2009-08-12 18:21:56.914] {Thread-516}   at com.google.code.yanf4j.nio.impl.AbstractSession.onWrite(AbstractSession.java)
[2009-08-12 18:21:56.914] {Thread-516}   at com.google.code.yanf4j.nio.impl.AbstractSession.onEvent(AbstractSession.java)
[2009-08-12 18:21:56.914] {Thread-516}   at com.google.code.yanf4j.nio.impl.SocketChannelController.dispatchWriteEvent(SocketChannelController.java)
[2009-08-12 18:21:56.914] {Thread-516}   at com.google.code.yanf4j.nio.impl.AbstractController.onWrite(AbstractController.java)
[2009-08-12 18:21:56.914] {Thread-516}   at com.google.code.yanf4j.nio.impl.Reactor.dispatchEvent(Reactor.java)
[2009-08-12 18:21:56.914] {Thread-516}   at com.google.code.yanf4j.nio.impl.Reactor.run(Reactor.java)
[2009-08-12 18:21:56.914] {Thread-516}
[2009-08-12 18:21:56.914] {Thread-515} WARNING com.google.code.yanf4j.nio.impl.AbstractController Try to reconnect to xxx.com:1624 for 1 times
[2009-08-12 18:21:56.915] {Thread-516} WARNING com.google.code.yanf4j.nio.impl.AbstractController add session xxx.com:1624
So, the questions again:
1. Do the clients function in a busy-wait loop, consuming cycles even
when not being used?
2. Do you know of a production site using your client in a
multithreaded environment? I'd like to ask them about their setup and
what sort of times they see.
3. Does your benchmark share concurrent clients across threads, or
does it use a pool with exclusive access to a client per thread?
Thanks,
Bryant
On Aug 11, 6:48 pm, dennis zhuang <killme2...@gmail.com> wrote:
> If you talk with xmemcached before the client has connected to the memcached
> server, it will throw a MemcachedException with the message
> "There is no avriable session at this moment".
>
> I think your program may be invoking xmemcached's methods before it has
> connected.
>
> builder.build();
>
> This method blocks until the memcached servers have been connected
> successfully or the attempt fails. If it fails, xmemcached will try to heal
> the connection.
>
> 2009/8/12 Bryant <publish...@myrete.com>