Questions related to slabs


Kishore Komya

Oct 16, 2020, 9:59:40 PM
to memcached

  1. I set up memcached with 55GB; however, total_malloced from "stats slabs" says only 41657393216. Will that grow as the data grows?

  2. How many pages are allocated per slab? Is that dynamic or is there a limit?

  3. We use no more than 5-6 slab classes, and our largest slab is 300 bytes. Are there best practices for limiting the object size to 301 so that the slab allocation logic is simplified?

dormando

Oct 17, 2020, 3:23:21 PM
to memcached
Hey,

I'll answer these inline, but up front: Are you having a specific problem
you're tracking down, or is this just out of curiosity?

None of these are things you should waste time thinking about. Memcached
handles it internally.

> 1.
>
> I set up memcached with 55GB; however, total_malloced from "stats slabs" says only 41657393216. Will that grow as the data grows?

Yes. Memory is lazily allocated, one megabyte at a time.
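
You can watch it grow: total_malloced shows up under "stats slabs". As a
minimal sketch with a Java client such as SpyMemcached (assuming an
instance on localhost:11211):

    import java.net.InetSocketAddress;
    import java.net.SocketAddress;
    import java.util.Map;
    import net.spy.memcached.MemcachedClient;

    public class SlabStats {
        public static void main(String[] args) throws Exception {
            // Assumes a local memcached on the default port.
            MemcachedClient client =
                new MemcachedClient(new InetSocketAddress("localhost", 11211));
            // "stats slabs" reports total_malloced: bytes malloc'ed for slab
            // pages so far. It grows lazily, one 1MB page at a time.
            Map<SocketAddress, Map<String, String>> stats = client.getStats("slabs");
            for (Map<String, String> perServer : stats.values()) {
                System.out.println("total_malloced = " + perServer.get("total_malloced"));
            }
            client.shutdown();
        }
    }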

> 2.
>
> How many pages are allocated per slab? Is that dynamic or is there a limit?

It's dynamic. If you have a new enough version (> 1.5) they are also
rebalanced automatically as necessary.

> 3.
>
> We use no more than 5-6 slab classes, and our largest slab is 300 bytes. Are there best practices for limiting the object size to 301 so that the slab
> allocation logic is simplified?

You can probably ignore this, unless you feel like there's significant
memory waste going on. You can change the slab growth factor (-f) to
create more classes at the small end than the large end, but again I
wouldn't bother unless you really need to.

It doesn't "simplify" the logic either way.
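
For a rough picture of what -f does: class sizes grow geometrically from a
minimum chunk size up toward the 1MB page size. A sketch of the progression
(real sizes differ a bit, since memcached also rounds for alignment and
item header overhead):

    public class SlabClasses {
        public static void main(String[] args) {
            double factor = 1.25;    // default -f growth factor
            double chunk = 96;       // approximate first class size with defaults
            int page = 1024 * 1024;  // slab page size: 1MB
            int cls = 1;
            // Each class's chunk size is the previous one times the factor;
            // a smaller factor packs more classes into the small sizes.
            while (chunk < page) {
                System.out.printf("class %2d: ~%.0f bytes%n", cls++, chunk);
                chunk *= factor;
            }
        }
    }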

-Dormando

Kishore Komya

Oct 18, 2020, 5:44:59 PM
to memcached
Thanks. I was asking out of curiosity, and here are the specific problems I am chasing.

We are running memcached locally on each node in our Spark cluster.

1) I saw conn_yields as high as 170065753. We use the SpyMemcached client, which has a batch size of 4096 by default, so I changed the -R value to 4096; after that, conn_yields dropped to zero.
However, I expected an improvement in read latency on the getAll (bulk get) path, yet I am seeing spikes in both p50 and p90. As part of the change, I also increased the memcached size from 15GB to 54GB, after which evictions dropped from 702281306 to zero and the cache hit ratio improved from 70% to 95%. Could the spike in memcached read latency be coming from the increase in cache size?
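
For reference, our reads are multigets along these lines (a simplified
sketch, not our exact code):

    import java.util.List;
    import java.util.Map;
    import net.spy.memcached.AddrUtil;
    import net.spy.memcached.MemcachedClient;

    public class BulkRead {
        // One multiget per batch of keys. With batches near 4096 keys,
        // memcached's -R (max requests per event) needs to be at least
        // that large to avoid conn_yields.
        static Map<String, Object> readBatch(MemcachedClient client,
                                             List<String> keys) {
            return client.getBulk(keys); // single multiget for the batch
        }

        public static void main(String[] args) throws Exception {
            MemcachedClient client =
                new MemcachedClient(AddrUtil.getAddresses("localhost:11211"));
            System.out.println(readBatch(client, List.of("k1", "k2", "k3")));
            client.shutdown();
        }
    }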

2) Next, we want to move from a local memcached to a remote memcached, so that all of our nodes can access the remote cache.

Here are our stats for the data:

Total data = 50GB
Max # of keys read per minute = 700M
Max # of read requests per minute = 700M / 4096 ≈ 171k (call it 175k)
TTL = 24 hours; total # of put requests per day = 500M, so we get a hit ratio > 99.99%
Max chunk size = 152 bytes, based on our existing data

Let's say we are using n1-highmem-16 in GCP, with 16 cores and 104GB RAM. It is clear that one machine can hold the data in RAM, but can it support our read request rate? How many such machines do we need? Are there standard benchmarks available so that we can estimate the number of machines for our data?
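
My back-of-envelope for the request rate (assuming the peak is spread
evenly over the minute):

    public class CapacityEstimate {
        public static void main(String[] args) {
            long keysPerMinute = 700_000_000L; // peak keys read per minute
            int batchSize = 4096;              // keys per multiget
            double multigetsPerSecond = keysPerMinute / (double) batchSize / 60.0;
            double keysPerSecond = keysPerMinute / 60.0;
            // ~2.8k multigets/s, but ~11.7M key lookups/s behind them.
            System.out.printf("multigets/s: ~%.0f, key lookups/s: ~%.0f%n",
                              multigetsPerSecond, keysPerSecond);
        }
    }

So the multiget rate itself is tiny, but each one touches 4096 keys, so the
per-key lookup rate (~11.7M/s) and the egress bandwidth seem like the
numbers to benchmark, not the request count.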

Thanks,
Kishor.

Kishore Komya

Oct 18, 2020, 5:58:31 PM
to memcached
From https://cloud.google.com/compute/docs/machine-types, the maximum egress bandwidth is 32 Gbps. From our data, our outgoing bandwidth is (175k/60) requests/s * 4096 keys * 152 bytes * 8 bits ≈ 14.5 Gbps, so it looks like our outgoing network bandwidth is covered as well.
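
Spelled out, with the bytes-to-bits conversion (this ignores protocol
overhead such as key names and response headers, so the real number is a
bit higher):

    public class BandwidthEstimate {
        public static void main(String[] args) {
            double requestsPerSecond = 175_000 / 60.0; // ~2917 multigets/s
            int keysPerRequest = 4096;
            int bytesPerValue = 152;                   // max chunk size observed
            double bitsPerSecond = requestsPerSecond * keysPerRequest
                                 * bytesPerValue * 8.0; // bytes -> bits
            System.out.printf("~%.1f Gbps%n", bitsPerSecond / 1e9); // ~14.5 Gbps
        }
    }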