uneven distribution

ktechie

Nov 25, 2011, 7:54:31 AM

to xmemcached

I have two memcached servers. We have a multithreaded program that
eagerly loads data into the memcached servers before the actual
application uses it.

It works fine; however, in one of the environments I noticed that the
data is filled unevenly. One of the memcached servers holds much
more data than the other.

The memcached server that held less data logged the following errors
continuously.

Failed to read, and not due to blocking:
errno: 104 Connection reset by peer
rcurr=a88009f ritem=2aaab59a7601 rbuf=a87f780 rlbytes=3209 rsize=4096
Failed to read, and not due to blocking:
errno: 104 Connection reset by peer
rcurr=a5a283b ritem=2aaab5b09fad rbuf=a5a24f0 rlbytes=1311 rsize=2048
Failed to write, and not due to blocking: Broken pipe.

Ping is disabled in the environment, so I couldn't get the ping times.
What could be the potential problem, and is there any way to avoid it?
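
One possible link between the two symptoms, offered as a sketch rather than a diagnosis: if the client picks a server by hashing the key modulo the list of servers it currently considers connected (an assumption about the client's locator, not something confirmed in this thread), then every key written while one server is disconnected lands on the other. The class name, key format, and window sizes below are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

public class DistributionSketch {
    // Picks a server for a key from the servers currently considered alive.
    // ASSUMPTION: simple "hash modulo live servers" mapping, for illustration only.
    static String locate(String key, List<String> liveServers) {
        int h = key.hashCode() & 0x7fffffff; // force a non-negative hash
        return liveServers.get(h % liveServers.size());
    }

    // Simulates loading `total` keys into servers A and B.
    // Server B is treated as disconnected for key indexes in [downFrom, downTo).
    // Returns {keys stored on A, keys stored on B}.
    static int[] load(int total, int downFrom, int downTo) {
        int[] counts = new int[2];
        for (int i = 0; i < total; i++) {
            List<String> live = new ArrayList<>();
            live.add("A");
            if (i < downFrom || i >= downTo) {
                live.add("B");
            }
            String server = locate("key-" + i, live);
            counts[server.equals("A") ? 0 : 1]++;
        }
        return counts;
    }

    public static void main(String[] args) {
        int[] healthy = load(100_000, 0, 0);          // B never disconnects
        int[] flaky = load(100_000, 20_000, 60_000);  // B is down for part of the load
        System.out.println("healthy: A=" + healthy[0] + " B=" + healthy[1]);
        System.out.println("flaky:   A=" + flaky[0] + " B=" + flaky[1]);
    }
}
```

In the second run every key loaded during the disconnect window goes to A, so A ends up holding noticeably more than B even though no key is lost.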

ktechie

Nov 30, 2011, 4:58:02 AM

to xmemcached

My configuration is VMware on Red Hat 64-bit Linux, with Memcached 1.4.5
and Xmemcached 1.3.5.
I have a program that eagerly loads about 3GB of data through a
multithreaded application with 10 threads.
I have also created a connection pool of 50 connections to memcached.
There is also a listener that reports when a connection is lost or
healed.

Sometimes the program works fine without errors, but at other times I
see connections being dropped and re-established many times,
and the memcached server logs the following errors.

Failed to write, and not due to blocking: Connection reset by peer

Failed to read, and not due to blocking:
errno: 104 Connection reset by peer

rcurr=127980b8 ritem=1442bd8a rbuf=12797c00 rlbytes=550 rsize=2048

When this happens the data is not fully loaded.

At times this issue is resolved after the VM is rebooted, but that is
not always the case.
I tried reducing the connection pool size to 1 and the thread pool to 1;
even then the disconnections could happen.

What could be the possible reasons for this?

dennis zhuang

Nov 30, 2011, 5:06:22 AM

to xmemc...@googlegroups.com

Hi, are you sure that your network was all right when the disconnections occurred?
The log says:
Connection reset by peer.

It means a TCP RST segment was received and the connection was dropped. Since this message is in the memcached server's log, the reset arrived at the server, either from the client machine or from something on the network path.

I think you could write a simple socket program that connects to memcached from the app machine and then does nothing, and check whether it gets disconnected just like your app's memcached client. If it does, I think you will have to track down the network problem.
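
A minimal sketch of such a do-nothing probe, using only the JDK so that xmemcached is out of the picture. The class name `IdleProbe`, the 5-second connect timeout, and the 1-second polling interval are illustrative choices, not anything taken from this thread.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class IdleProbe {
    /**
     * Connects to host:port, sends nothing, and waits until the peer closes or
     * resets the connection, or until maxMillis has elapsed.
     * Returns a short description of what happened.
     */
    static String watch(String host, int port, long maxMillis) throws IOException {
        long start = System.currentTimeMillis();
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), 5000);
            socket.setSoTimeout(1000); // wake up once a second to check the deadline
            InputStream in = socket.getInputStream();
            while (System.currentTimeMillis() - start < maxMillis) {
                try {
                    if (in.read() == -1) {
                        // Orderly close (FIN) from the other side.
                        return "CLOSED after " + (System.currentTimeMillis() - start) + " ms";
                    }
                    // Unexpected data from the server; ignore it and keep waiting.
                } catch (SocketTimeoutException idle) {
                    // Nothing arrived within a second: still connected as far as we can tell.
                } catch (IOException e) {
                    // Typically "Connection reset" when an RST arrives.
                    return "ERROR after " + (System.currentTimeMillis() - start)
                            + " ms: " + e.getMessage();
                }
            }
            return "STILL_CONNECTED";
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.out.println("usage: IdleProbe <host> <port> <maxMillis>");
            return;
        }
        String result = watch(args[0], Integer.parseInt(args[1]), Long.parseLong(args[2]));
        System.out.println(new java.util.Date() + " " + result);
    }
}
```

Run it from the application machine against the memcached host and port. Note that a purely idle socket only notices a FIN or RST that actually reaches it; a path that silently drops packets would not show up here.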


--
庄晓丹
Email: killm...@gmail.com
Boyan (nickname) bo...@taobao.com
Site: http://fnil.net

Taobao (China) Software Co., Ltd. / Product Technology Department / Java Middleware

ktechie

Dec 7, 2011, 5:47:39 AM

to xmemcached

Hi,
I created a simple program (NetworkTest) that opens a pool of
connections to the memcached server using the xmemcached client. Any
disconnect is logged. No data is inserted or fetched.
Memcached is running in verbose mode.
In the environment where I was not facing any issues, it was able to
hold the connections for 3 days without any disconnects.
Then I tested in the environment where I was facing issues, and within
less than 2 hours I could see disconnects.
The client is able to heal the connections again, but there are a
number of disconnections and reconnections.

According to the network team, the VM running the memcached server and
the VM running the NetworkTest program are on the same VLAN, so the
network should not be a problem.
The memcached VM has enough free memory. The NetworkTest VM also has
enough memory.

The memcached logs are :

Failed to write, and not due to blocking: Connection reset by peer
Failed to write, and not due to blocking: Connection reset by peer
Failed to read, and not due to blocking:
errno: 104 Connection reset by peer

rcurr=2aaab0050dfc ritem=2aaab53fc91e rbuf=2aaab0050c70 rlbytes=1757 rsize=8192

The XMemcached logs are :

2011-12-06 19:05:08,493 ERROR [com.google.code.yanf4j.core.impl.AbstractController] Reconnect to 10.10.48.31:11211 fail
2011-12-06 19:05:08,496 ERROR [com.google.code.yanf4j.core.impl.AbstractController] Exception occured in controller
java.io.IOException: Connect to 10.10.48.31:11211 fail,No route to host
    at net.rubyeye.xmemcached.impl.MemcachedConnector.onConnect(MemcachedConnector.java:402)
    at com.google.code.yanf4j.nio.impl.Reactor.dispatchEvent(Reactor.java:302)
    at com.google.code.yanf4j.nio.impl.Reactor.run(Reactor.java:141)
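
To see whether "No route to host" also shows up outside xmemcached, a plain JDK connect check can separate that case from a refused connection or a timeout. This is only a sketch: the class name `ConnectCheck`, the result labels, and the 10-second polling interval are made up for illustration.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.net.InetSocketAddress;
import java.net.NoRouteToHostException;
import java.net.Socket;
import java.net.SocketTimeoutException;

public class ConnectCheck {
    // Attempts one TCP connect and reports which kind of failure occurred, if any.
    static String classify(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return "OK";
        } catch (NoRouteToHostException e) {
            // The host could not be reached at the network level.
            return "NO_ROUTE: " + e.getMessage();
        } catch (ConnectException e) {
            // Usually "Connection refused": host reachable, nothing accepting on the port.
            return "CONNECT_FAILED: " + e.getMessage();
        } catch (SocketTimeoutException e) {
            // No answer within timeoutMillis.
            return "TIMEOUT";
        } catch (IOException e) {
            return "OTHER: " + e;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        if (args.length < 2) {
            System.out.println("usage: ConnectCheck <host> <port>");
            return;
        }
        int port = Integer.parseInt(args[1]);
        while (true) {
            // One timestamped line per attempt, so gaps can be matched against other logs.
            System.out.println(new java.util.Date() + " " + classify(args[0], port, 3000));
            Thread.sleep(10_000);
        }
    }
}
```

Leaving it running from the NetworkTest VM against 10.10.48.31:11211 gives timestamps that can be lined up with the memcached and XMemcached logs above.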

