warm restart to avoid cold memcached nodes?

Javier Arias Losada

Mar 9, 2022, 11:49:00 AM

to memcached

Hi all,

I recently discovered the warm restart feature... simply awesome!

We use memcached as a look-aside cache and run it in Kubernetes, with autoscaling based on CPU. When the number of requests increases enough, a new memcached node is started. We can tolerate a temporary drop in the hit ratio, but I think we can improve the situation by warming up new memcached nodes.

I'm wondering whether warm restart could be used for that. Is it possible to dump the files before stopping a running node? I was thinking about maintaining periodic dumps that newly started nodes could load. Not sure if this is an option.

Has anyone solved a similar problem? I'd appreciate hearing others' experiences.

Thank you

Javier

dormando

Mar 9, 2022, 5:58:11 PM

to memcached

Hey,

Unfortunately I don't think it works that way. Warm restart is useful for
upgrading or slightly changing the configuration of an independent cache
node without losing the data.

However since you're expanding and contracting a cluster, keys get
remapped between hosts. If you're saving the data of a downed machine
and bringing it back up later, you will have stale cache and still cause
some extra cache misses.
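
To make the remapping point concrete, here is a minimal sketch. It assumes the simplest possible client-side sharding (hash of the key modulo the node count); the thread does not say which scheme the clients use, and a client with consistent hashing would move far fewer keys. The node names and the `node_for` helper are hypothetical.

```python
import hashlib


def node_for(key: str, nodes: list[str]) -> str:
    """Pick a node by hashing the key modulo the node count (naive sharding)."""
    digest = hashlib.md5(key.encode()).digest()
    return nodes[int.from_bytes(digest[:8], "big") % len(nodes)]


def remapped_fraction(keys: list[str], before: list[str], after: list[str]) -> float:
    """Fraction of keys whose owning node differs between two cluster layouts."""
    moved = sum(1 for k in keys if node_for(k, before) != node_for(k, after))
    return moved / len(keys)


# Hypothetical cluster growing from three nodes to four.
keys = [f"user:{i}" for i in range(10_000)]
before = ["mc-0", "mc-1", "mc-2"]
after = before + ["mc-3"]
print(f"{remapped_fraction(keys, before, after):.0%} of keys changed node")
```

Any key that moved is looked up on a node that never stored it, so data restored onto the old node does not help those lookups.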

As an aside, you should be autoscaling based on cache hit/miss rate and not
the number of requests. Unless your rate is huge, memcached will scale
traffic very well. Hopefully you're already doing that :)
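
As a rough illustration of a hit/miss signal, the sketch below computes a hit ratio from the text returned by memcached's `stats` command, using the `get_hits` and `get_misses` counters. The sample values are made up, and how the number would be fed to an autoscaler is left open.

```python
def hit_ratio(stats_text: str) -> float:
    """Compute get_hits / (get_hits + get_misses) from raw `stats` output."""
    stats = {}
    for line in stats_text.splitlines():
        parts = line.split()
        # Stat lines look like: STAT <name> <value>
        if len(parts) == 3 and parts[0] == "STAT":
            stats[parts[1]] = parts[2]
    hits = int(stats.get("get_hits", 0))
    misses = int(stats.get("get_misses", 0))
    total = hits + misses
    return hits / total if total else 0.0


# Made-up sample in the shape of a `stats` response.
sample = "STAT get_hits 900\r\nSTAT get_misses 100\r\nEND\r\n"
print(hit_ratio(sample))
```

These counters are cumulative since the server started, so a scaling decision would more likely use the difference between two samples than the raw totals.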


Javier Arias Losada

Mar 10, 2022, 5:16:23 AM

to memcached

Thank you for your response.

I think I'd better share some more details of our use case so that the rationale for my question is clearer.

Our use case is CPU-bound for memcached.

I mean, a big enough part of the dataset fits into memory. On the other hand, the number of requests grows and shrinks by some orders of magnitude throughout the day, organically with user traffic.

So what we do is run N memcached pods with a relatively small number of cores, each of which can fit a significant part of the dataset in memory. When load increases, we (Kubernetes) start a new, empty node. Our clients replicate all write operations to all memcached nodes and load-balance read operations. This is OK in our case because we are very read-heavy.
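
A minimal sketch of the client pattern described above (write to every node, round-robin the reads). It uses in-memory stand-ins rather than real memcached connections, and the `FakeNode` and `ReplicatedCache` names are hypothetical, not taken from any client library.

```python
class FakeNode:
    """In-memory stand-in for one memcached node (illustration only)."""

    def __init__(self, name: str):
        self.name = name
        self.data = {}

    def set(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)  # None models a cache miss


class ReplicatedCache:
    """Replicates writes to all nodes and spreads reads round-robin."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self._next = 0

    def add_node(self, node):
        # A freshly scaled-up node joins empty.
        self.nodes.append(node)

    def set(self, key, value):
        for node in self.nodes:
            node.set(key, value)

    def get(self, key):
        node = self.nodes[self._next % len(self.nodes)]
        self._next += 1
        return node.get(key)


cache = ReplicatedCache([FakeNode("mc-0"), FakeNode("mc-1")])
cache.set("k", "v")
cache.add_node(FakeNode("mc-2"))  # new empty node after scale-up
print([cache.get("k") for _ in range(3)])  # reads that land on mc-2 miss
```

Because every node is meant to hold the same keys, there is no key-to-node mapping that changes when a node is added; the cost of scaling up is only that reads landing on the new node miss until it fills.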

When scaling up, the new node is empty and we see an increase in the number of misses, but nothing very bad. For us this is more convenient than having a huge number of servers sitting almost idle for over 16 hours.

So we were thinking of leveraging warm restart to warm up newly created memcached nodes faster. It's true that there would still be some inconsistencies between new and old nodes, but far fewer than with our current setup.

This is why I was asking about the option of creating some kind of snapshot from a live node, or otherwise leveraging warm restart to increase our efficiency.

Not sure if this will bring more ideas, but I hope our use case is now clearer.

Again, thank you.

dormando

Mar 12, 2022, 2:00:15 PM

to memcached

Hey,

RE: your rationale, if you're actually fine with the stale cache then just
save the restart file + the .meta file. You can't do a live snapshot; you
have to capture the files once the server has completely stopped. That
works fine though: I've brought up caches on completely different
machines.
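
A minimal sketch of capturing those two files, under a few assumptions: the server was started with a memory file (`-e <path>`), it has already been stopped gracefully (SIGUSR1 as far as I know), and the metadata file sits next to the memory file as `<path>.meta`. The function name and the paths in the usage comment are hypothetical.

```python
import shutil
from pathlib import Path


def snapshot_restart_files(memory_file, dest_dir):
    """Copy the restart file and its .meta file to dest_dir.

    Call this only after the memcached process has fully exited; this
    sketch treats a missing .meta file as a sign that it has not shut
    down cleanly and refuses to copy anything.
    """
    memory_file = Path(memory_file)
    meta_file = Path(str(memory_file) + ".meta")  # assumed naming
    for path in (memory_file, meta_file):
        if not path.is_file():
            raise FileNotFoundError(f"missing restart file: {path}")

    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in (memory_file, meta_file):
        target = dest_dir / path.name
        shutil.copy2(path, target)
        copied.append(target)
    return copied


# Hypothetical usage, after the server has stopped:
# snapshot_restart_files("/tmpfs_mount/memory_file", "/snapshots/latest")
```

A new node would then be started with `-e` pointing at its own copy of the pair. Anything written to the cluster after the source node stopped will be missing or stale in that copy.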

As a curiosity on read performance: it is pretty unusual for memcached
to run out of CPU unless the instances are very slow. Are you running
out of network bandwidth or CPU? Or perhaps you have millions of rps per
instance? :) It'd be operationally a lot simpler to have a small static
number of read replicas, and memcached should excel at read-heavy
workloads.

But if your instances are odd, I guess you do what you gotta do.
