Slow connection and slow response time from memcached server

head

Nov 16, 2009, 3:15:26 PM
to memcached
I run memcached on my server. After a certain number of
connections, new connections to it become very slow (PHP waits
about 5s to connect), and the response time appears to be very
slow as well. Memory usage is small, about 1.5GB of the 8GB of
available RAM, and load average and CPU usage are also very low,
so the server is in no way overloaded (load average never shows more than
0.5). As far as I can tell, no OS limits are causing the
problem: I checked the available sockets, connections, and
file descriptors.

I start the memcached this way:

/usr/local/bin/memcached -d -u nobody -m 7168 -t 2 -P /var/run/memcached.pid -c 10000 -v >> /var/log/memcached.log 2>&1

The server is a dual Xeon 5130 2GHz with 8GB RAM and a SCSI disk, running Red Hat
Enterprise Linux 5.4 64-bit.

memcached is the newest version.
I log in to the memcached server using telnet and issue the "stats"
command. I noticed the poor performance starts when the value of
curr_connections grows to around 3800. Numerous servers
connect to this memcached server, each of them using about 200
webserver threads.

Any idea what might be causing the performance problem?

dormando

Nov 16, 2009, 4:04:01 PM
to memcached

What version of memcached are you using?

If you do a quick strace of the process, is it calling epoll_wait (and
similar) or select()/poll()?

Can you verify that your host is not using any swap, and is not actively
swapping? (watch vmstat 1 for a minute or two, the si/so columns).

Are you using large multigets at all?

Finally, can you pastebin the output of "stats" and "stats settings"
somewhere? (assuming you're on 1.4)

Thanks,
-Dormando
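
A minimal sketch of the swap check suggested above, assuming Python is available on the box and that the output of `vmstat 1` has been captured as text. The helper names and the sample rows are made up for illustration; only the si/so column names come from vmstat itself.

```python
def swap_activity(vmstat_output):
    """Return a list of (si, so) pairs, one per sample row of vmstat output."""
    rows = [line.split() for line in vmstat_output.strip().splitlines()]
    # The column-name header is the row that contains both "si" and "so".
    header_index = next(
        i for i, cols in enumerate(rows) if "si" in cols and "so" in cols
    )
    header = rows[header_index]
    si_col, so_col = header.index("si"), header.index("so")
    samples = []
    for cols in rows[header_index + 1:]:
        # Skip repeated headers and anything that is not a full numeric row.
        if len(cols) == len(header) and cols[0].isdigit():
            samples.append((int(cols[si_col]), int(cols[so_col])))
    return samples


def is_swapping(vmstat_output):
    """True if any sample shows pages swapped in (si) or out (so)."""
    return any(si > 0 or so > 0 for si, so in swap_activity(vmstat_output))


# Invented sample, shaped like `vmstat 1` output on an idle machine.
SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 6291456 102400 1048576    0    0     1     2  900 1500  2  5 93  0  0
 0  0      0 6291200 102400 1048600    0    0     0     8  950 1600  3  6 91  0  0
"""

print(is_swapping(SAMPLE))
```

Non-zero values in either column during the slow period would point at swapping; all zeros, as in the sample, rule it out for that window.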

Trond Norbye

Nov 16, 2009, 4:03:49 PM
to memc...@googlegroups.com
And finally: what is the connect time when you telnet to the port to run
the stats command... if that's faster than the 5 sec your php clients
are waiting, I would start looking at the client boxes first..

Cheers,

Trond
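
One way to put a number on the connect time asked about above is to time a bare TCP connect from a client box, outside of PHP. This is only a sketch: the function name is hypothetical, and the host passed to it would be the memcached server's LAN address (port 11211 is the one shown in the thread's settings output).

```python
import socket
import time


def time_connect(host, port, timeout=10.0):
    """Return the seconds taken to complete a TCP connect to host:port."""
    start = time.monotonic()
    # create_connection raises an OSError subclass on refusal or timeout.
    with socket.create_connection((host, port), timeout=timeout):
        return time.monotonic() - start
```

Calling `time_connect(server_address, 11211)` a few times from a web server while the site is slow, and comparing against the same call made on the memcached host itself, would show whether the delay is in accepting connections or somewhere on the client side.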

head

Nov 16, 2009, 7:46:21 PM
to memcached
>
> > any idea what might be causing the performance problem?
>
> What version of memcached are you using?

I am running the newest version of memcached available, that is
actually 1.4.1

>
> If you do a quick strace of the process, is it calling epoll_wait (and
> similar) or select()/poll()?

I am something of a beginner admin, so I am not really sure how to do the
strace. Also, at the moment I am unable to reproduce the problem
because we stopped using memcached after this problem showed up.

When I add memcached to my script, the entire *website* I run goes down
within 5 minutes. If you really need this info to find my
problem, I will run memcached again to find out.

> Can you verify that your host is not using any swap, and is not actively
> swapping? (watch vmstat 1 for a minute or two, the si/so columns).

The host is not using any swap (zero), I am 100% positive. There is
more than 6GB of RAM free at the moment of the problem and CPU usage is
minimal.

> Are you using large multigets at all?

I have no idea what "multigets" are, care to explain?


> Finally, can you pastebin the output of "stats" and "stats settings"
> somewhere? (assuming you're on 1.4)

I am guessing you need this for a server at the moment of performance
decline, right?

head

Nov 16, 2009, 7:49:27 PM
to memcached
> And finally: what is the connect time when you telnet to the port to run
> the stats command... if that's faster than the 5 sec your php clients
> are waiting, I would start looking at the client boxes first..
>


The connection time is minimal. All the servers are on a local gigabit
network and use local LAN IPs to connect to other machines and
to this memcached server.
I am running several heavily loaded databases on the same LAN that
make the memcached server look like a joke in terms of
hardware specs and number of connections.
All the database servers are running great, and the clients (the same
clients that were connecting to memcached) have no problems
connecting to them.

The client boxes go down at the moment the memcached server
responds slowly, but this is caused by the memcached server only. We
added some timing points in the PHP code to measure the connection time
to the memcached server.

head

Nov 16, 2009, 7:54:16 PM
to memcached
guys I found this log from stats
I am not sure when I copied this but probably at the moment of poor
performance

STAT pid 2709
STAT uptime 396
STAT time 1258045190
STAT version 1.4.1
STAT pointer_size 64
STAT rusage_user 55.767522
STAT rusage_system 144.081096
STAT curr_connections 2989
STAT total_connections 75315
STAT connection_structures 3623
STAT cmd_get 8222208
STAT cmd_set 317102
STAT cmd_flush 0
STAT get_hits 7905029
STAT get_misses 317179
STAT delete_misses 0
STAT delete_hits 0
STAT incr_misses 0
STAT incr_hits 6768762
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT bytes_read 635907741
STAT bytes_written 561528865
STAT limit_maxbytes 7516192768
STAT accepting_conns 1
STAT listen_disabled_num 0
STAT threads 3
STAT conn_yields 0
STAT bytes 90608790
STAT curr_items 254808
STAT total_items 317155
STAT evictions 0

head

Nov 16, 2009, 7:46:33 PM
to memcached
I will try to answer all your questions in one message

Patrick Galbraith

Nov 16, 2009, 8:32:41 PM
to memc...@googlegroups.com
I have questions (and answers too)

head wrote:
> > > any idea what might be causing the performance problem?
> >
> > What version of memcached are you using?
>
> I am running the newest version of memcached available, that is
> actually 1.4.1

What type of application/client are you using (PHP, Perl, Ruby, etc...) ?

> > If you do a quick strace of the process, is it calling epoll_wait (and
> > similar) or select()/poll()?
>
> I am kind of beginner admin so I am not really sure on how to do the
> strace,

All you need to do is start memcached with "strace /usr/local/bin/memcached...". It spits out all the low level calls a binary is making.

> also at this moment I am unable to replicate the problem
> because we stopped using memcached after this problem showed up
>
> When I add memcached to my script the entire *website* I run go down
> in a matter of 5 minutes, if you really need this info to find my
> problem I will run memcached again to find out

I'd have to see your script to see what the deal is :)

> > Can you verify that your host is not using any swap, and is not actively
> > swapping? (watch vmstat 1 for a minute or two, the si/so columns).
>
> The host is not using any swap (zero) I am 100% positive, there is
> more than 6GB of ram free at the moment of a problem and cpu usage is
> minimal
>
> > Are you using large multigets at all?
>
> I have no idea what is "multigets", care to explain?

Depending on your client, you will have a "get" and a "multi get" call. "get" fetches a single item whereas "multi get" call returns multiple items (given a list of keys). For instance, Perl's Cache::Memcached client has an "mget()" and a "get_multi()" function.
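
At the wire level, the difference between the two calls is small: memcached's text protocol accepts one or more keys after the `get` command, so a multi get is a single request line rather than one request per key. The sketch below only builds the request bytes; the function name and the example keys are hypothetical, and real clients add hashing across servers on top of this.

```python
def build_get(keys):
    """Return one text-protocol request that asks for every key in `keys`."""
    if not keys:
        raise ValueError("at least one key is required")
    encoded = []
    for key in keys:
        raw = key.encode("utf-8")
        # Protocol limits: at most 250 bytes, no whitespace or control characters.
        if not raw or len(raw) > 250 or any(b <= 32 or b == 127 for b in raw):
            raise ValueError(f"invalid memcached key: {key!r}")
        encoded.append(raw)
    return b"get " + b" ".join(encoded) + b"\r\n"


keys = ["user:1", "user:2", "user:3"]  # hypothetical keys

# Plain "get": one request (and one round trip) per key.
single_requests = [build_get([k]) for k in keys]

# "multi get": all keys in one request, answered in one response.
multi_request = build_get(keys)
print(multi_request)
```

A very large key list makes that single response correspondingly large, which is presumably why the question about large multigets was asked.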

head

Nov 16, 2009, 8:43:56 PM
to memcached

What type of application/client are you using (PHP, Perl, Ruby,
etc...) ?

it's php application

> have to see your script to see what the deal is :)

it's a very simple script, and I am pretty sure the problem is not in the
script, because below 3000 connections to memcached it works fine

> Depending on your client, you will have a "get" and a "multi get" call. "get" fetches a single item whereas "multi get" call returns multiple items (given a list of keys). For instance, Perl's Cache::Memcached client has an "mget()" and a "get_multi()" function.

I grepped all the scripts and I see only get is used, not get_multi

Henrik Schröder

Nov 17, 2009, 4:47:44 AM
to memc...@googlegroups.com
On Tue, Nov 17, 2009 at 01:54, head <stere...@gmail.com> wrote:
> STAT curr_connections 2989
> STAT total_connections 75315

Switch to pooled connections instead of making new ones all the time.


/Henrik Schröder
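
The two counters quoted above become easier to read once they are divided by the uptime from the same stats snapshot (396 seconds). The sketch below does that arithmetic; the function names are hypothetical and the three STAT lines are copied from the snapshot posted earlier in the thread.

```python
def parse_stats(text):
    """Turn 'STAT name value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def new_connections_per_second(stats):
    """Average rate at which connections were opened since startup."""
    return int(stats["total_connections"]) / int(stats["uptime"])


SNAPSHOT = """\
STAT uptime 396
STAT curr_connections 2989
STAT total_connections 75315
"""

rate = new_connections_per_second(parse_stats(SNAPSHOT))
print(f"{rate:.0f} new connections per second")
```

That works out to roughly 190 new connections per second on average, with about 25 times as many connections opened as are currently held, which is the churn the advice above is reacting to.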

head

Nov 17, 2009, 7:27:04 PM
to memcached
> > STAT curr_connections 2989
> > STAT total_connections 75315
>
> Switch to pooled connections instead of making new ones all the time.
>
> /Henrik Schröder

do you mean pconnect? of course we are using pconnect :)

Trond Norbye

Nov 18, 2009, 2:30:08 AM
to memc...@googlegroups.com
On 11/18/2009 01:27 AM, head wrote:
>>> STAT curr_connections 2989
>>> STAT total_connections 75315
>>>
>> Switch to pooled connections instead of making new ones all the time.
>>
>> /Henrik Schröder
>>
> do you mean pconnect? of course we are using pconnect :)
>

No. you should reuse the connection to the memcached server.

Cheers,

Trond

Trond Norbye

Nov 18, 2009, 3:08:41 AM
to memc...@googlegroups.com
Haha.. I just learned that pconnect is a persistent connection...

Sorry for that

Cheers,

Trond

head

Nov 18, 2009, 7:30:40 AM
to memcached
yes, we have always used pconnect, and it is not the solution to
the problem
any other ideas? it seems I will really have to do that strace?

Carlos Alvarez

Nov 18, 2009, 11:29:00 AM
to memc...@googlegroups.com
On Mon, Nov 16, 2009 at 6:04 PM, dormando <dorm...@rydia.net> wrote:
> Are you using large multigets at all?

Please forgive me if I am wrong, I am just a newbie.

Looking at the code (libmemcached), I understand that large multigets
of small items would put pressure on the connections. As far as I can
tell from the code, if two keys of a multiget go to the same
server, they would use two separate connections. So a large number of
multigets would exhaust the available connections.

for (x= 0; x < number_of_keys; x++)
{
...
rc= memcached_connect(&ptr->hosts[server_key]);
...
}




Carlos.

Trond Norbye

Nov 18, 2009, 11:49:29 AM
to memc...@googlegroups.com
No, they will reuse the same connection. The call to
memcached_connect there is to verify that the connection is open so
that we can send data over the wire.

Cheers,

Trond

Carlos Alvarez

Nov 18, 2009, 12:33:17 PM
to memc...@googlegroups.com
I didn't dig deep enough. Now I see that in the function
network_connect() the connection is opened only if the socket fd is -1.
Great.

Thank you very much.


Carlos.
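
The "connect only if the fd is -1" behaviour described in the last two messages is a common lazy-connection pattern. The sketch below illustrates the idea in Python; it is not libmemcached's code, and the class, its `opens` counter, and the injectable `connect` argument are all hypothetical.

```python
import socket


class LazyConnection:
    """Open the socket on first use and reuse it on every later call."""

    def __init__(self, host, port, connect=socket.create_connection):
        self.address = (host, port)
        self._connect = connect   # injectable so the sketch can be tested offline
        self._sock = None         # plays the role of "fd == -1"
        self.opens = 0            # how many times a real connect happened

    def ensure_connected(self):
        # Safe to call before every request: it connects only when needed.
        if self._sock is None:
            self._sock = self._connect(self.address)
            self.opens += 1
        return self._sock
```

Calling `ensure_connected()` once per key inside a loop therefore costs one connect in total, not one per key.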

Patrick Galbraith

Nov 18, 2009, 12:42:38 PM
to memc...@googlegroups.com
Hi!

I meant to let you know - if you don't already know, here are the docs for get multi (if you are using PECL/memcached): http://www.php.net/manual/en/memcached.getmulti.php

--Patrick

head

Nov 18, 2009, 9:23:21 PM
to memcached
is this answer for me or for Carlos?

On Nov 18, 6:42 pm, Patrick Galbraith <p...@patg.net> wrote:
> Hi!
> I meant to let you know - if you don't already know, here are the docs for get multi (if you are using PECL/memcached): http://www.php.net/manual/en/memcached.getmulti.php
> --Patrick

Dustin

Nov 18, 2009, 9:43:07 PM
to memcached

On Nov 18, 6:23 pm, head <stereoj...@gmail.com> wrote:
> is this answer for me or for Carlos?

Messages sent to the list are intended to be for whomever may find
them useful. You don't need to request permission to apply an answer
to your situation if you find it useful. If you do not find it
useful, you are not obligated to care about it.

head

Dec 2, 2009, 3:46:33 PM
to memcached
ok guys, so I am still having this problem

we are using the php client with persistent connections (pooled
connections, in other words) via $memcache_obj->pconnect, which means
that each php thread has its own connection to memcached
there are 20 http servers, each with a minimum of 200 php threads, so this
is a total of 4000 clients (and that is the minimum)

however, from about 3700 clients we see a performance decrease:
a reply from memcached for a select can take up to 1 second, which is
a lot, am I right?

by my calculations the memcached server is getting about 5000
requests per second, is this a lot? this is a high performance machine,
there is absolutely no swapping *and never was*!!! the load average is
0.17!!

Maybe I just need to install more memcached servers? but this one
seems to be doing nothing anyway

below are the stats and stats settings, the problem is visible right
now


stats settings
STAT maxbytes 3221225472
STAT maxconns 10000
STAT tcpport 11211
STAT udpport 11211
STAT inter NULL
STAT verbosity 1
STAT oldest 0
STAT evictions on
STAT domain_socket NULL
STAT umask 700
STAT growth_factor 1.25
STAT chunk_size 48
STAT num_threads 3
STAT stat_key_prefix :
STAT detail_enabled no
STAT reqs_per_event 20
STAT cas_enabled yes
STAT tcp_backlog 1024
STAT binding_protocol auto-negotiate
END

stats
STAT pid 2709
STAT uptime 1727263
STAT time 1259772057
STAT version 1.4.1
STAT pointer_size 64
STAT rusage_user 29672.975022
STAT rusage_system 51701.090239
STAT curr_connections 3188
STAT total_connections 20452452
STAT connection_structures 3623
STAT cmd_get 5977832958
STAT cmd_set 665620469
STAT cmd_flush 0
STAT get_hits 5355401281
STAT get_misses 622431677
STAT delete_misses 0
STAT delete_hits 0
STAT incr_misses 0
STAT incr_hits 24302362
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT bytes_read 403401321019
STAT bytes_written 971168343526
STAT limit_maxbytes 7516192768
STAT accepting_conns 1
STAT listen_disabled_num 8554
STAT threads 3
STAT conn_yields 0
STAT bytes 198980367
STAT curr_items 1077436
STAT total_items 665620605
STAT evictions 0
END

dormando

Dec 2, 2009, 4:08:17 PM
to memcached
Did you ever get that (idle) strace or anything from the server?

I might've missed something in the thread, but I'm not sure...

The easiest thing to point to though:

> STAT accepting_conns 1
> STAT listen_disabled_num 8554

... you've hit a max connections condition 8,554 times since the instance
was restarted 20 days ago.

Try setting maxconns to 20,000. Monitor that stat for increases and
alert/note when it happens, if it's still increasing after that.

-Dormando
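
A minimal sketch of the monitoring suggested above: fetch `stats` over the text protocol and compare `listen_disabled_num` between two polls. The function names are hypothetical, and how the previous snapshot is stored and how an alert is raised are left to whatever scheduler or monitoring system is in use.

```python
import socket


def parse_stats(text):
    """Turn 'STAT name value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats


def fetch_stats(host, port=11211, timeout=5.0):
    """Send 'stats' to a memcached server and return the parsed reply."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stats\r\n")
        buf = b""
        # The reply is a series of STAT lines terminated by "END".
        while not buf.endswith(b"END\r\n"):
            chunk = sock.recv(4096)
            if not chunk:
                break
            buf += chunk
    return parse_stats(buf.decode("ascii", "replace"))


def listen_disabled_increase(previous, current):
    """Return how much listen_disabled_num grew between two snapshots."""
    before = int(previous.get("listen_disabled_num", 0))
    after = int(current.get("listen_disabled_num", 0))
    # The counter starts again from zero when memcached is restarted.
    return after if after < before else after - before
```

Any positive result from `listen_disabled_increase` between two polls means the connection limit was hit again in that interval and is worth noting alongside `curr_connections` at the time.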

dormando

Dec 2, 2009, 4:09:04 PM
to memcached
Just to be clear: when you hit max connections, new connections to
memcached can/will lag until connections are accepted again. Existing
connections won't be slow.

head

Dec 2, 2009, 6:43:29 PM
to memcached
I didn't get the strace, but I will do that shortly

I noticed the listen_disabled_num value, so I restarted memcached today to
reset it.
Since the restart I am already experiencing slow answers from the memcached
server, but the listen_disabled_num value is still zero:


STAT pid 21077
STAT uptime 9508
STAT time 1259782740
STAT version 1.4.1
STAT pointer_size 64
STAT rusage_user 601.808511
STAT rusage_system 1067.900654
STAT curr_connections 3192
STAT total_connections 80267
STAT connection_structures 3200
STAT cmd_get 128892811
STAT cmd_set 11342241
STAT cmd_flush 0
STAT get_hits 117550894
STAT get_misses 11341917
STAT delete_misses 0
STAT delete_hits 0
STAT incr_misses 0
STAT incr_hits 0
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT bytes_read 7470287539
STAT bytes_written 19952006747
STAT limit_maxbytes 7516192768
STAT accepting_conns 1
STAT listen_disabled_num 0
STAT threads 3
STAT conn_yields 0
STAT bytes 105147226
STAT curr_items 918731
STAT total_items 11342241
STAT evictions 0

head

Dec 2, 2009, 6:55:09 PM
to memcached
hmm, I started memcached with strace, but the output file is only 6k
in size and is not increasing, I mean no data is written to the file
maybe I did something wrong, I started strace this way:

strace -t -o /strace.log /usr/local/bin/memcached -d -u nobody -m 7168 -t 2 -P /var/run/memcached.pid -c 10000 -v

any ideas?

dormando

Dec 2, 2009, 6:57:53 PM
to memcached
It's forking and you're not following children.

Just attach to a running process via strace -p, run it for 5 seconds, ^C,
gzip, put it somewhere
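
The attach-for-five-seconds recipe can be scripted roughly as follows. This is a sketch under assumptions: the helper names and the output path are hypothetical, the pid file path is the one from the startup command earlier in the thread, and strace's `-f` (follow children), `-tt` (timestamps), `-p` (attach) and `-o` (output file) options are assumed to behave on the installed version as documented. How `-f` treats the existing threads of an already running process may vary between strace versions.

```python
import signal
import subprocess
import time


def read_pid(pid_file="/var/run/memcached.pid"):
    """Read the daemon's pid from the file written by memcached -P."""
    with open(pid_file) as handle:
        return int(handle.read().strip())


def strace_attach_argv(pid, output_path):
    """Build the strace command line for attaching to a running process."""
    return ["strace", "-f", "-tt", "-p", str(pid), "-o", output_path]


def capture(pid, output_path, seconds=5.0):
    """Attach strace for a few seconds, then stop it as ^C would."""
    proc = subprocess.Popen(strace_attach_argv(pid, output_path))
    time.sleep(seconds)
    proc.send_signal(signal.SIGINT)  # strace detaches and exits on SIGINT
    proc.wait()
```

Running `capture(read_pid(), "/tmp/memcached.strace")` as root during a slow period would leave a trace file that can be gzipped and shared, as requested above.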

Christian Becker

Dec 3, 2009, 3:57:10 AM
to memc...@googlegroups.com
have you also checked your network connection?

We often have the problem that after restarting a server its network link comes up at only 10 Mbit/s.
Given the amount of data in your memcached and the traffic counters, this could also be a bottleneck. Do you really have a Gigabit connection on the memcached machine?
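
The traffic counters in the most recent stats output give a rough lower bound on the bandwidth in use. The sketch below divides bytes_written and bytes_read by the uptime from that output (9508 seconds); the function name is hypothetical, and the result is an average since startup, so peaks will be higher.

```python
def average_mbit_per_second(byte_count, uptime_seconds):
    """Average throughput since startup, in megabits per second."""
    return byte_count * 8 / uptime_seconds / 1_000_000


# Values copied from the stats output posted after the restart.
UPTIME = 9508
written = average_mbit_per_second(19952006747, UPTIME)  # bytes_written
read = average_mbit_per_second(7470287539, UPTIME)      # bytes_read

print(f"out: {written:.1f} Mbit/s, in: {read:.1f} Mbit/s")
```

The outbound average alone comes to nearly 17 Mbit/s, which would not fit through a link that had fallen back to 10 Mbit/s, so checking the negotiated link speed on the memcached host is a cheap test.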
