Memcached performance issues

Patrick Santora

Feb 20, 2011, 12:15:54 PM

to memcached

I am having issues with Memcached at the moment. I have multiple
servers on the front end that each have 100 connections round robining
to memcached. I have 2 memcached servers, each with 512MB of ram and
20 threads (might be a little high) available to each.

What I am seeing is that when my memcached container hits around 10MB
of written traffic it starts to bottleneck, causing my front end
systems to slow WAY down. I've turned on verbose debugging and see no
issues and there are no complaints on the front end stating that the
connection clients are not able to hit memcached.

Has anyone seen anything like this before?

I would appreciate any feedback that could help out with this.

Thanks
-Pat

Paul Gale

Feb 20, 2011, 1:50:24 PM

to memc...@googlegroups.com

How large do you have the cache configured to be? The default is 64MB unless overridden. What do the memcached statistics show when you telnet into each instance? Are you seeing a lot of evictions I wonder?

Also, how many cores does each memcached server have? Generally speaking, the memcached thread count should match the number of available cores. Any more than that increases the likelihood of thread contention on the cache itself. The default is 4 threads (assuming you're using 1.4.3). After all, adding more threads is not a free lunch; otherwise we'd all just set it to 10,000 and be done with it. I doubt you have a 20 core server with only 512MB of RAM. ;)
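As a rough illustration of the "one thread per core" rule of thumb, here is a minimal Python sketch that builds a memcached command line from the machine's core count. The helper name `memcached_args` is hypothetical; `-m` (memory in MB) and `-t` (worker threads) are the standard memcached options.

```python
import os

def memcached_args(memory_mb, cores=None):
    """Build a memcached command line with one worker thread per core."""
    if cores is None:
        # os.cpu_count() can return None; fall back to memcached's default of 4.
        cores = os.cpu_count() or 4
    return ["memcached", "-m", str(memory_mb), "-t", str(cores)]

# A 4-core box with a 512MB cache, as described in this thread.
print(" ".join(memcached_args(512, cores=4)))
```

This only prints the command; whether 4 threads is actually the best setting for a given workload is still something to measure.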

To tune your memcached setup for maximum efficiency spend some time checking out the various load testing and benchmarking tools such as Brutis or memslap. However, if it were me I would immediately correct the thread count setting as mentioned above.

Thanks,
Paul

dormando

Feb 20, 2011, 1:55:35 PM

to memcached

> What I am seeing is that when my memcached container hits around 10MB
> of written traffic it starts to bottleneck, causing my front end
> systems to slow WAY down. I've turned on verbose debugging and see no
> issues and there are no complaints on the front end stating that the
> connection clients are not able to hit memcached.
>
> Has anyone seen anything like this before?
>
> I would appreciate any feedback that could help out with this.

Lower the threads down to 4 or 8 or so. It's rare that it needs adjusting.

Things we'd like to know:

- your version of memcached
- how many queries/second you run
- stats output usually helps

have you gone through this page yet?
http://code.google.com/p/memcached/wiki/Timeouts
then there's this:
http://code.google.com/p/memcached/wiki/NewServerMaint

Patrick Santora

Feb 20, 2011, 2:22:08 PM

to memc...@googlegroups.com

@Paul:
They are 4 core systems, so it looks like I should push it down to 4, which I will do now :)

The rest of the information I believe you are looking for can be found within the stats dump below.

@dormando:
Ok, I will push it to 4 and see what happens

Queries a second are I believe around 16 to 20.

Version: 1.4.5

Here is my current stats output, but right now the issue is not occurring. I will have to toss out a post once it happens again.
stats
STAT pid 22821
STAT uptime 10416
STAT time 1298229426
STAT version 1.4.5
STAT pointer_size 64
STAT rusage_user 32.524055
STAT rusage_system 121.877471
STAT curr_connections 843
STAT total_connections 1312
STAT connection_structures 904
STAT cmd_get 196833
STAT cmd_set 19566
STAT cmd_flush 0
STAT get_hits 192557
STAT get_misses 4276
STAT delete_misses 0
STAT delete_hits 0
STAT incr_misses 0
STAT incr_hits 0
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT auth_cmds 0
STAT auth_errors 0
STAT bytes_read 1466964195
STAT bytes_written 15148562971
STAT limit_maxbytes 536870912
STAT accepting_conns 1
STAT listen_disabled_num 0
STAT threads 20
STAT conn_yields 0
STAT bytes 2532172
STAT curr_items 2657
STAT total_items 19566
STAT evictions 0
STAT reclaimed 4801
END

Thanks!
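For anyone reading a dump like the one above, a small sketch of how the `STAT name value` lines can be parsed and turned into a hit rate. The function names are hypothetical, and the sample is a subset of the values posted above.

```python
def parse_stats(text):
    """Turn 'STAT name value' lines into a dict of strings."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    return stats

def hit_rate(stats):
    """Fraction of get requests that found their key."""
    hits = int(stats["get_hits"])
    misses = int(stats["get_misses"])
    total = hits + misses
    return hits / total if total else 0.0

# Subset of the dump posted above.
sample = """\
STAT cmd_get 196833
STAT get_hits 192557
STAT get_misses 4276
STAT evictions 0
END
"""

stats = parse_stats(sample)
print(f"hit rate: {hit_rate(stats):.1%}")
print("evictions:", stats["evictions"])
```

With zero evictions and a hit rate near 98%, the cache size itself does not look like the problem in this dump.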

Patrick Santora

Feb 20, 2011, 11:40:22 PM

to memc...@googlegroups.com

I forgot to mention that my server's traffic output spikes upwards of 80MB per second, which seems kind of steep. Could it be the binary protocol, since I am passing in a new binary factory for each client instance created on my client machines?

Dustin

Feb 21, 2011, 1:33:33 AM

to memcached

On Feb 20, 11:22 am, Patrick Santora <patwe...@gmail.com> wrote:

> Queries a second are I believe around 16 to 20.

That's close to nothing.

> STAT connection_structures 904

Nearly 1,000 connections.

> STAT get_hits 192557
> STAT get_misses 4276

Decent hit rate.

> STAT bytes_read 1466964195
> STAT bytes_written 15148562971

That's almost 80K transferred per successful get request? That's
1.5MBps given your 20 req/s estimate. Are you perhaps hitting it with
stats requests really hard (that's not shown here)?

The overall average (given an uptime of 22821 seconds) is about 648
kBps. I don't know how you're getting 80MBps, but if it's memcached,
it would've had to have sent all of those bytes in three minutes.

> STAT limit_maxbytes 536870912

How much RAM does your server have?

> STAT bytes 2532172
> STAT curr_items 2657

This is showing an average of closer to ~1k per item. Something
doesn't seem to be adding up here...
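The arithmetic in this reply can be reproduced with a few lines of Python, using only the numbers from the stats dump. One caveat, offered tentatively: in that dump 22821 is the `pid` value and the `uptime` line reads 10416, so the sketch below uses 10416 for the long-run average, which then comes out closer to 1.4 MBps than to 648 kBps.

```python
# Values copied from the stats dump earlier in the thread.
bytes_written = 15148562971
get_hits = 192557
uptime_s = 10416       # STAT uptime (22821 in that dump is the pid)
bytes_stored = 2532172
curr_items = 2657

per_hit = bytes_written / get_hits        # bytes sent per successful get
est_rate = per_hit * 20                   # at the estimated 20 req/s
avg_rate = bytes_written / uptime_s       # long-run average, bytes per second
secs_at_80MBps = bytes_written / 80e6     # time to send it all at 80 MB/s
avg_item = bytes_stored / curr_items      # average stored item size

print(f"per hit:        {per_hit / 1e3:.1f} kB")
print(f"at 20 req/s:    {est_rate / 1e6:.2f} MB/s")
print(f"average rate:   {avg_rate / 1e6:.2f} MB/s")
print(f"at 80 MB/s:     {secs_at_80MBps / 60:.1f} minutes")
print(f"avg item size:  {avg_item:.0f} bytes")
```

The mismatch Dustin points out still stands: roughly 79 kB goes out per hit while the average stored item is under 1 kB.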

Patrick Santora

Feb 21, 2011, 1:44:35 AM

to memc...@googlegroups.com

@Dustin: Thanks for the great feedback Dustin.

We are hitting it with our monitoring software, but I believe that's only every 5 minutes. The 80MB figure was a bad metric on my part, I believe. The monitoring systems say 80 Mbits per second, so that's my fault.

What are your thoughts about either using or not using the BinaryConnectionFactory when instantiating the client?

@All: Here is the stats trace right before it started to complain again:
STAT pid 30375
STAT uptime 15937
STAT time 1298266239

STAT version 1.4.5
STAT pointer_size 64

STAT rusage_user 93.698755
STAT rusage_system 369.808780
STAT curr_connections 811
STAT total_connections 1747
STAT connection_structures 870
STAT cmd_get 514387
STAT cmd_set 50233
STAT cmd_flush 1
STAT get_hits 503458
STAT get_misses 10929

STAT delete_misses 0
STAT delete_hits 0
STAT incr_misses 0
STAT incr_hits 0
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT auth_cmds 0
STAT auth_errors 0

STAT bytes_read 6356720419
STAT bytes_written 63755346246

STAT limit_maxbytes 536870912
STAT accepting_conns 1
STAT listen_disabled_num 0

STAT threads 4
STAT conn_yields 0
STAT bytes 3537642
STAT curr_items 3631
STAT total_items 50233
STAT evictions 0
STAT reclaimed 13167
END

Rohit Karlupia

Feb 21, 2011, 2:31:27 AM

to memc...@googlegroups.com, Patrick Santora

Usually the bottleneck would be either cpu or network.

- Check the cpu usage on both client and server machines when this happens.

- Verify your network capacity using some tool. A 10Mbps network will choke at around 1.25MBps. This could be because of the NIC speed on one of the machines or because of some switch in between.
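The bits-versus-bytes conversion behind that 1.25MBps figure is just a division by 8. A quick sketch (the function name is made up for illustration; protocol overhead is ignored, so real throughput will be somewhat lower):

```python
def mbit_to_mbyte_per_s(mbit_per_s):
    """Convert a link speed in megabits/s to megabytes/s (8 bits per byte)."""
    return mbit_per_s / 8

for link in (10, 100, 1000):
    print(f"{link:>4} Mbit/s link ~ {mbit_to_mbyte_per_s(link):g} MB/s")

# The 80 Mbit/s reported by the monitoring system earlier in the thread:
print(f"80 Mbit/s ~ {mbit_to_mbyte_per_s(80):g} MB/s")
```

So the 80 Mbits per second from the monitoring system corresponds to about 10 MB/s.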

thanks!
rohitk

Patrick Santora

Feb 21, 2011, 3:31:41 AM

to Rohit Karlupia, memc...@googlegroups.com

Heh. I had a funny feeling that was going to be the answer. I was curious mostly because the Binary mode seemed to do quite a deal of good for Facebook when it was used. I'm imagining that they cached images so binary was a good idea, but for simple structures like json, it might not make much sense. So thought I would get some opinions :).

I will do some testing

Thanks again. I'll post back in case I continue to get stuck after these tests.

-Pat

_______

Hard to tell but easy to measure. Just try both and measure the difference.

thanks!

rohitk
_______

@Rohitk

Thanks for this information! I will pass this on to our IT person.

This has dovetailed into another question now. How much of a difference would it be to use the BinaryConnectionFactory and not to use it?

-Pat

Dustin

Feb 21, 2011, 8:59:12 AM

to memcached

On Feb 21, 12:31 am, Patrick Santora <patwe...@gmail.com> wrote:
> Heh. I had a funny feeling that was going to be the answer. I was curious
> mostly because the Binary mode seemed to do quite a deal of good for
> Facebook when it was used. I'm imagining that they cached images so binary
> was a good idea, but for simple structures like json, it might not make much
> sense. So thought I would get some opinions :).

binary protocol doesn't make much of a difference wrt what you're
caching, but can help you optimize some access patterns with a
sufficiently smart client. If you're concerned that it may be making
things worse (it probably doesn't have a huge effect from what I'm
hearing here), you can just try disabling it.

Patrick Santora

Feb 21, 2011, 11:12:34 AM

to memc...@googlegroups.com

@Dustin
Thanks, I will be disabling them to see if that helps.

-Pat

Patrick Santora

Feb 21, 2011, 1:14:50 PM

to memc...@googlegroups.com

Hrmm. Still having issues. Here is the latest stats dump. I also talked with my IT person and he mentioned the following setup, which does not look like an issue?
NIC SETTINGS
The servers should all be autonegotiating to 100/Full, and we apply these additional kernel tuning parameters:
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216

LATEST STATS
STAT pid 1788
STAT uptime 44811
STAT time 1298311271

STAT version 1.4.5
STAT pointer_size 64

STAT rusage_user 178.875806
STAT rusage_system 763.939863
STAT curr_connections 811
STAT total_connections 2012
STAT connection_structures 813
STAT cmd_get 876886
STAT cmd_set 74747
STAT cmd_flush 0
STAT get_hits 858907
STAT get_misses 17979
STAT delete_misses 0
STAT delete_hits 2

STAT incr_misses 0
STAT incr_hits 0
STAT decr_misses 0
STAT decr_hits 0
STAT cas_misses 0
STAT cas_hits 0
STAT cas_badval 0
STAT auth_cmds 0
STAT auth_errors 0

STAT bytes_read 17426408671
STAT bytes_written 180479901035

STAT limit_maxbytes 536870912
STAT accepting_conns 1
STAT listen_disabled_num 0
STAT threads 4
STAT conn_yields 0

STAT bytes 3501518
STAT curr_items 3230
STAT total_items 74747
STAT evictions 0
STAT reclaimed 20950
END
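One thing the dumps do allow is a rough comparison of average outbound traffic against the 100/Full link mentioned in the NIC settings. The sketch below uses `uptime` and `bytes_written` from the three dumps posted so far. It is only an average over the whole uptime, so it says nothing definite about bursts, which could be considerably higher.

```python
LINK_MBIT = 100  # "100/Full" from the NIC settings above

# (label, STAT uptime in seconds, STAT bytes_written) from the three dumps.
dumps = [
    ("pid 22821", 10416, 15148562971),
    ("pid 30375", 15937, 63755346246),
    ("pid 1788", 44811, 180479901035),
]

def avg_mbit_per_s(uptime_s, bytes_written):
    """Average outbound rate in megabits per second over the process uptime."""
    return bytes_written * 8 / uptime_s / 1e6

for label, uptime_s, bytes_written in dumps:
    rate = avg_mbit_per_s(uptime_s, bytes_written)
    print(f"{label}: {rate:.1f} Mbit/s average, {rate / LINK_MBIT:.0%} of the link")
```

An average around a third of a 100 Mbit link leaves limited headroom for peaks, which would be worth checking against the 80 Mbits per second the monitoring system reported.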

dormando

Feb 21, 2011, 2:42:35 PM

to memc...@googlegroups.com

Have you walked through those links I gave you? You haven't mentioned
exactly what you're seeing and those links walk you through narrowing it
down a lot as well as listing a lot of things to look for.

Patrick Santora

Feb 21, 2011, 2:51:46 PM

to memc...@googlegroups.com

I will need to look at those further today. This weekend went a little haywire for me. :)

dormando

Feb 21, 2011, 9:25:10 PM

to memc...@googlegroups.com

Have you been running the connection tester tool while observing the
client slowdown?

The tool is there so you can tell whether your client is the issue or not, i.e.,
if the tool never sees a blip but all/most/some of your clients are seeing
blips, it's the client's fault. If the tool sees a blip, you can see
exactly where it's getting hung up and further narrow it down.

On Mon, 21 Feb 2011, Patrick Santora wrote:

>
> It's just strange. Memcached with verbose logging looks ok but the client machines just take forever to get data. Like in the stats I don't
> see anything out of the ordinary. The nic settings look ok too. Quite frustrating...
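To make the idea of an independent connection tester concrete, here is a minimal Python sketch that times a connect, a `set`, and a `get` over the memcached text protocol. It is not the tool from the wiki page, just an illustration of the approach; the function names, the key `conn_probe`, and the one-second timeout are all assumptions.

```python
import socket
import time

def build_set(key, value, exptime=60):
    """Text-protocol store command: set <key> <flags> <exptime> <bytes>."""
    return b"set %s 0 %d %d\r\n%s\r\n" % (key.encode(), exptime, len(value), value)

def read_until(sock, terminator):
    """Read from the socket until the data ends with the given terminator."""
    data = b""
    while not data.endswith(terminator):
        chunk = sock.recv(4096)
        if not chunk:
            raise ConnectionError("server closed the connection")
        data += chunk
    return data

def probe(host, port, timeout=1.0):
    """Time connect, set, and get against one server; returns seconds for each."""
    key, value = "conn_probe", b"x" * 32
    t0 = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        t1 = time.monotonic()
        sock.sendall(build_set(key, value))
        if read_until(sock, b"\r\n") != b"STORED\r\n":
            raise RuntimeError("set was not stored")
        t2 = time.monotonic()
        sock.sendall(b"get %s\r\n" % key.encode())
        read_until(sock, b"END\r\n")
        t3 = time.monotonic()
    return {"connect": t1 - t0, "set": t2 - t1, "get": t3 - t2}

# Usage (hypothetical address): print(probe("127.0.0.1", 11211))
# A socket timeout or connection error surfaces as an OSError.
```

Run in a loop from both a client box and an idle machine, a probe like this shows whether the stall is in connect, set, or get, and whether it is visible outside the application client.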

Patrick Santora

Feb 21, 2011, 9:20:04 PM

to memc...@googlegroups.com

So initial testing via those links has not triggered the bottleneck I have been seeing. I am going to test further and have also started to investigate the spymemcached client, since it looks to be a recommended client.

boyan

Feb 21, 2011, 10:49:16 PM

to memc...@googlegroups.com

You may want to try xmemcached; here is a benchmark:
http://xmemcached.googlecode.com/svn/trunk/benchmark/benchmark.html

2011/2/22 Patrick Santora <patw...@gmail.com>

> Yeah. I will run it the next time the issue comes up. Does it matter if I run the tester on the same box the clients are on? It should not matter, but I thought I would ask.
>
> Thanks!

--
name:   庄晓丹(伯岩)
email: killm...@gmail.com
            bo...@taobao.com
work:    http://www.taobao.com
twitter: @killme2008
Blog:     http://www.blogjava.net/killme2008

Taobao (China) Software Co., Ltd. / New Business and Development Platform / Java Middleware

Patrick Santora

Feb 21, 2011, 10:55:02 PM

to memc...@googlegroups.com

Yeah, I looked into that one. I have it in my back pocket just in case, unless I'm told I should use it over spymemcached.

dormando

Feb 21, 2011, 11:02:39 PM

to memc...@googlegroups.com

Run two, and keep them running all the time, so you see log from
before/after. You can also enable the "debug" switch and have it log
everything.

So yeah. run one on the client and one on an idle machine elsewhere.

On Mon, 21 Feb 2011, Patrick Santora wrote:

>
> Yeah. I will run it the next time the issue comes up. Does it matter if I run the tester on the same box the clients are on? It should not matter but
> thought I would ask.
>
> Thanks!
>

Boris Partensky

Feb 21, 2011, 11:07:32 PM

to memc...@googlegroups.com, Patrick Santora

Patrick, are you sure the time is not spent GC'ing? Are you using CMS
(-XX:+UseConcMarkSweepGC)? Is your verbose GC logging on?

On Mon, Feb 21, 2011 at 9:23 PM, Patrick Santora <patw...@gmail.com> wrote:
> It's just strange. Memcached with verbose logging looks ok but the client
> machines just take forever to get data. Like in the stats I don't see
> anything out of the ordinary. The nic settings look ok too. Quite
> frustrating...
>

Patrick Santora

Feb 21, 2011, 9:23:14 PM

to memc...@googlegroups.com

It's just strange. Memcached with verbose logging looks ok but the client machines just take forever to get data. Like in the stats I don't see anything out of the ordinary. The nic settings look ok too. Quite frustrating...

Patrick Santora

Feb 21, 2011, 11:18:09 PM

to memc...@googlegroups.com

Yeah, I have the debug switch on. Thanks for the recommendation on running it in two places. I will give that a shot. I saw that the set and get times will be 0 if there is anything wrong via the debug. I take it if that's the case then I should look into what the verbose logs give me? Is there anything specific I should look for? When I look at it, I'm not seeing anything very unusual.

Thanks
-Pat

Patrick Santora

Feb 21, 2011, 10:47:09 PM

to memc...@googlegroups.com

Yeah. I will run it the next time the issue comes up. Does it matter if I run the tester on the same box the clients are on? It should not matter, but I thought I would ask.

Thanks!

Adam Lee

Feb 22, 2011, 11:48:27 AM

to memc...@googlegroups.com

I'm a little late to the party, but I've been reading the emails and following along...

Out of curiosity, what do you mean by this:

> I have multiple servers on the front end that each have 100 connections round robining to memcached.

I mean, I think I understand what you mean by this, but it doesn't really make sense to me-- why does each server need 100 connections to memcached? Beyond that, how does each server have 100 connections to memcached? You said that you're using the spymemcached client, right?

If you could explain exactly how your setup works and what your actual intention was with this design, I think it'd help me a lot. I have quite a bit of experience tuning spymemcached to do hundreds of thousands of requests a second, so I'm hoping I can help you out quite a bit once I can wrap my head around it.

--
awl

Patrick Santora

Feb 22, 2011, 11:55:23 AM

to memc...@googlegroups.com

Sure Adam,

I have 8 production servers that each have a memcache connection pool of 100 connections that round robin as requests to memcached are made. I did not want to have to worry about creating my connection objects on the fly.

In regards to spymemcached, yeah, I just started using it. Before I was using the general client that came with memcached, but saw that the spy version had some additional features.

I would be quite interested in getting your feedback in regards to using the spymemcached client more efficiently. :)

-Pat
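For readers trying to picture the setup described here, a minimal sketch of a round-robin pool of pre-built client objects, plus the connection count it implies. `RoundRobinPool` is a hypothetical illustration, not the actual pool code, and it assumes each pooled client holds one connection to each memcached server.

```python
import itertools

class RoundRobinPool:
    """Hand out pre-built client objects in rotation (hypothetical helper)."""

    def __init__(self, clients):
        clients = list(clients)
        if not clients:
            raise ValueError("pool needs at least one client")
        self.size = len(clients)
        self._cycle = itertools.cycle(clients)

    def next_client(self):
        """Return the next client in rotation."""
        return next(self._cycle)

FRONT_ENDS = 8    # production servers, per the post above
POOL_SIZE = 100   # pooled clients per front end

# Assumption: each pooled client holds one connection to each memcached server.
print("connections per memcached server:", FRONT_ENDS * POOL_SIZE)
```

Under that assumption the total is 800 connections per memcached server, which is in the same range as the `curr_connections` values (811 to 843) in the stats dumps.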

Boris Partensky

Feb 28, 2011, 7:20:07 AM

to memc...@googlegroups.com, Patrick Santora

> memcache connection pool of 100 connections that round robin as
> requests to memcached are made

Hi Patrick, what do you mean by this? Do you have 100 instances of Spy
MemCachedClient to which you round robin requests?

Boris
